Agentic Tool Use
Build agents that autonomously call tools, handle results, and loop until done
Agentic tool use is the pattern where the model decides which tools to call, interprets the results, and loops until it has enough information to answer. You define the tools, call run(), and Stratus handles the dispatch, validation, parallel execution, and error recovery.
How the tool loop works
When you call run(), Stratus enters an autonomous loop. The model decides what to do next at every step.
Model call
Stratus sends the conversation history and tool definitions to the model. The model either responds with text (done) or requests one or more tool calls.
Argument parsing and validation
Stratus parses the JSON arguments from each tool call and validates them against the tool's Zod schema. If parsing fails, the error is sent back to the model so it can retry.
Tool execution
Each tool's execute function runs with the validated parameters and shared context. If the model requested multiple tools, they run in parallel.
Results injected
Tool results are added to the message history as tool messages, one per tool call.
Loop or finish
Stratus sends the updated message history back to the model. The model can call more tools, or respond with a final text answer. This repeats until the model stops calling tools or the max turn limit is reached.
Stratus handles this entire loop automatically. You define tools and call run() - the SDK manages message passing, JSON parsing, validation, retries, and multi-round execution.
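The five steps above can be sketched in plain TypeScript. This is a simplified illustration of what run() does internally, not the SDK's actual implementation; the mock model, the tool registry, and the `tool:` message prefix are stand-ins invented for this sketch.

```typescript
// Simplified sketch of the run() loop (illustrative, not Stratus internals).
type ToolCall = { name: string; args: Record<string, unknown> };
type ModelTurn = { text?: string; toolCalls?: ToolCall[] };

// Stand-in tool registry: name -> execute function.
const tools: Record<string, (args: any) => string> = {
  get_weather: ({ city }) => `58°F, cloudy in ${city}`,
};

// Mock model: requests a tool on the first turn, answers with text once
// a tool result is present in the history.
function callModel(history: string[]): ModelTurn {
  if (!history.some((m) => m.startsWith("tool:"))) {
    return { toolCalls: [{ name: "get_weather", args: { city: "Seattle" } }] };
  }
  return { text: `It is ${history.at(-1)!.slice(5)}.` };
}

function runLoop(prompt: string, maxTurns = 10): string {
  const history = [`user:${prompt}`];
  for (let turn = 0; turn < maxTurns; turn++) {
    const res = callModel(history);           // 1. model call
    if (!res.toolCalls) return res.text!;     // 5. no tool calls: finished
    for (const tc of res.toolCalls) {
      const out = tools[tc.name](tc.args);    // 2-3. validate + execute
      history.push(`tool:${out}`);            // 4. inject result, then loop
    }
  }
  throw new Error("Max turns exceeded");
}
```

Running `runLoop("What's the weather in Seattle?")` makes two model calls, exactly like the real loop: one that requests the tool and one that produces the final text.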
Quick start
Define a tool with tool(), attach it to an agent, and call run(). The model decides when to call the tool and what to do with the result.
import { AzureResponsesModel } from "stratus-sdk";
import { Agent, run, tool } from "stratus-sdk/core";
import { z } from "zod";
const model = new AzureResponsesModel({
endpoint: process.env.AZURE_ENDPOINT!,
apiKey: process.env.AZURE_API_KEY!,
deployment: "gpt-5.2",
});
const getWeather = tool({
name: "get_weather",
description: "Get the current weather for a city",
parameters: z.object({
city: z.string().describe("City name"),
}),
execute: async (_ctx, { city }) => {
const response = await fetch(
`https://api.weather.example/v1/current?city=${encodeURIComponent(city)}`
);
const data = await response.json();
return `${data.temp}°F, ${data.condition} in ${city}`;
},
});
const agent = new Agent({
name: "weather_assistant",
model,
instructions: "You are a helpful weather assistant.",
tools: [getWeather],
});
const result = await run(agent, "What's the weather in Seattle?");
console.log(result.output);
// "The current weather in Seattle is 58°F and cloudy."

Behind the scenes, run() made two model calls: one that triggered get_weather, and one that produced the final answer using the tool result. You wrote zero dispatch logic.
Parallel tool calls
When the model needs information from multiple sources, it can call several tools at once. Stratus executes them in parallel and sends all results back in one batch.
import { AzureResponsesModel } from "stratus-sdk";
import { Agent, run, tool } from "stratus-sdk/core";
import { z } from "zod";
const model = new AzureResponsesModel({
endpoint: process.env.AZURE_ENDPOINT!,
apiKey: process.env.AZURE_API_KEY!,
deployment: "gpt-5.2",
});
const getWeather = tool({
name: "get_weather",
description: "Get the current weather for a city",
parameters: z.object({
city: z.string().describe("City name"),
}),
execute: async (_ctx, { city }) => {
const response = await fetch(
`https://api.weather.example/v1/current?city=${encodeURIComponent(city)}`
);
const data = await response.json();
return `${data.temp}°F, ${data.condition}`;
},
});
const agent = new Agent({
name: "weather_assistant",
model,
tools: [getWeather],
});
// The model calls get_weather 3 times in parallel
const result = await run(
agent,
"What's the weather in Tokyo, London, and New York?"
);
console.log(result.output);

The model sees all three results at once and produces a single response comparing the three cities. No sequential round-trips needed.
Parallel tool calls are a model behavior, not something you configure. The model decides when to batch calls based on the prompt. You can disable this with modelSettings: { parallelToolCalls: false } if you need sequential execution.
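For example, an agent configured for sequential execution might look like this (reusing the model and getWeather definitions from the example above):

```typescript
// Same agent, but the model emits at most one tool call per turn.
const sequentialAgent = new Agent({
  name: "weather_assistant",
  model,
  tools: [getWeather],
  modelSettings: {
    parallelToolCalls: false, // force sequential tool execution
  },
});
```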
Multi-tool agents
Most real agents have multiple tools. The model picks the right tool based on the user's request. You don't need routing logic.
import { AzureResponsesModel } from "stratus-sdk";
import { Agent, run, tool } from "stratus-sdk/core";
import { z } from "zod";
const model = new AzureResponsesModel({
endpoint: process.env.AZURE_ENDPOINT!,
apiKey: process.env.AZURE_API_KEY!,
deployment: "gpt-5.2",
});
const searchProducts = tool({
name: "search_products",
description: "Search the product catalog by keyword",
parameters: z.object({
query: z.string().describe("Search keywords"),
maxResults: z.number().optional().describe("Max results to return"),
}),
execute: async (_ctx, { query, maxResults }) => {
const results = await productDB.search(query, maxResults ?? 5);
return JSON.stringify(results);
},
});
const getProductDetails = tool({
name: "get_product_details",
description: "Get detailed information about a product by ID",
parameters: z.object({
productId: z.string().describe("The product ID"),
}),
execute: async (_ctx, { productId }) => {
const product = await productDB.findById(productId);
if (!product) return "Product not found";
return JSON.stringify(product);
},
});
const checkInventory = tool({
name: "check_inventory",
description: "Check if a product is in stock at a specific warehouse",
parameters: z.object({
productId: z.string(),
warehouseId: z.string().describe("Warehouse ID, e.g. 'us-west-1'"),
}),
execute: async (_ctx, { productId, warehouseId }) => {
const stock = await inventoryAPI.check(productId, warehouseId);
return JSON.stringify({ inStock: stock.available, quantity: stock.count });
},
});
const calculateShipping = tool({
name: "calculate_shipping",
description: "Calculate shipping cost and estimated delivery date",
parameters: z.object({
productId: z.string(),
zipCode: z.string().describe("Destination ZIP code"),
}),
execute: async (_ctx, { productId, zipCode }) => {
const estimate = await shippingAPI.estimate(productId, zipCode);
return JSON.stringify(estimate);
},
});
const agent = new Agent({
name: "shopping_assistant",
model,
instructions: `You are a shopping assistant. Help customers find products,
check availability, and get shipping estimates. Be concise and helpful.`,
tools: [searchProducts, getProductDetails, checkInventory, calculateShipping],
});
const result = await run(
agent,
"I'm looking for a USB-C monitor. Is the top result in stock? How fast can it ship to 98101?"
);
console.log(result.output);

The model might call search_products first, then check_inventory and calculate_shipping in parallel on the top result. Stratus handles the multi-turn orchestration automatically.
Controlling tool behavior
toolChoice
toolChoice tells the model whether and how to use tools. Set it via modelSettings on the agent.
"auto" (the default): the model decides whether to call a tool or respond with text.
const agent = new Agent({
name: "assistant",
model,
tools: [getWeather],
modelSettings: {
toolChoice: "auto", // default - model decides
},
});

"required": force the model to call at least one tool. Useful when you always want tool execution.
const agent = new Agent({
name: "data_fetcher",
model,
tools: [fetchData, queryDatabase],
modelSettings: {
toolChoice: "required",
},
});

"none": prevent the model from calling any tools, even if tools are defined. Useful for a "summarize what you know" follow-up.
const agent = new Agent({
name: "assistant",
model,
tools: [getWeather],
modelSettings: {
toolChoice: "none",
},
});

Force the model to call a specific tool by name:
const agent = new Agent({
name: "classifier",
model,
tools: [classifyIntent],
modelSettings: {
toolChoice: {
type: "function",
function: { name: "classify_intent" },
},
},
});

toolUseBehavior
toolUseBehavior controls what happens after a tool executes. By default, results go back to the model for another turn. You can change this to stop early.
"run_llm_again" (the default): after tool execution, the model gets the results and decides what to do next.
const agent = new Agent({
name: "assistant",
model,
tools: [getWeather],
toolUseBehavior: "run_llm_again", // default
});

"stop_on_first_tool": stop immediately after the first tool call. The tool's return value becomes the run output, with no second model call. Useful when the tool produces the final answer directly.
const agent = new Agent({
name: "calculator",
model,
tools: [calculate],
toolUseBehavior: "stop_on_first_tool",
});
const result = await run(agent, "What is 42 * 17?");
console.log(result.output); // "714" - raw tool output, no model summary

stopAtToolNames: stop only when specific tools are called; other tools loop normally.
const agent = new Agent({
name: "order_agent",
model,
tools: [lookupOrder, processRefund, sendConfirmation],
toolUseBehavior: {
stopAtToolNames: ["send_confirmation"],
},
});
// lookupOrder and processRefund loop back to the model.
// sendConfirmation stops the run and returns its output directly.

When toolUseBehavior stops early, result.output contains the raw tool output string, not a model-generated response. The model does not get a chance to summarize or format the result.
Tool errors and recovery
When a tool's execute function throws, Stratus catches the error and sends the error message back to the model as the tool result. The model sees the error and can adjust - retry with different parameters, try a different tool, or respond to the user with an explanation.
import { AzureResponsesModel } from "stratus-sdk";
import { Agent, run, tool } from "stratus-sdk/core";
import { z } from "zod";
const model = new AzureResponsesModel({
endpoint: process.env.AZURE_ENDPOINT!,
apiKey: process.env.AZURE_API_KEY!,
deployment: "gpt-5.2",
});
const lookupUser = tool({
name: "lookup_user",
description: "Look up a user by email address",
parameters: z.object({
email: z.string().describe("User email address"),
}),
execute: async (_ctx, { email }) => {
const user = await db.users.findByEmail(email);
if (!user) {
throw new Error(`No user found with email "${email}". Try a different email.`);
}
return JSON.stringify({ id: user.id, name: user.name, plan: user.plan });
},
});
const agent = new Agent({
name: "support",
model,
tools: [lookupUser],
});
const result = await run(agent, "Find the account for typo@exmple.com");
console.log(result.output);
// The model sees the error, tells the user no account was found,
// and may ask for the correct email.

The error message format matters. Write error messages that help the model take the right next step. Compare:
- Bad: "Error: ENOENT" - the model has no idea what to do next.
- Good: "No user found with email \"typo@exmple.com\". Try a different email." - the model can ask the user for the correct email.
Tool errors never crash the run. They flow back to the model as information. Only MaxTurnsExceededError and RunAbortedError will terminate the loop.
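Conceptually, every execute call is wrapped so that a thrown error becomes the tool result string instead of crashing the run. A minimal sketch of that wrapping (illustrative only, assuming nothing about the SDK's actual internals):

```typescript
// Illustrative sketch: a thrown tool error is converted into the tool
// result string, so it flows back to the model as information.
async function safeExecute(fn: () => Promise<string>): Promise<string> {
  try {
    return await fn();
  } catch (e) {
    // This string is appended as the tool message; the model reads it
    // and can retry, switch tools, or explain the failure to the user.
    return `Error: ${(e as Error).message}`;
  }
}
```

This is why descriptive error messages matter: whatever string comes out of this wrapper is the only signal the model gets about what went wrong.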
Streaming with tools
When you use stream(), you receive real-time events during the entire tool loop - including tool call events between model turns.
import { AzureResponsesModel } from "stratus-sdk";
import { Agent, stream, tool } from "stratus-sdk/core";
import { z } from "zod";
const model = new AzureResponsesModel({
endpoint: process.env.AZURE_ENDPOINT!,
apiKey: process.env.AZURE_API_KEY!,
deployment: "gpt-5.2",
});
const getWeather = tool({
name: "get_weather",
description: "Get the current weather for a city",
parameters: z.object({ city: z.string() }),
execute: async (_ctx, { city }) => {
const res = await fetch(
`https://api.weather.example/v1/current?city=${encodeURIComponent(city)}`
);
const data = await res.json();
return `${data.temp}°F, ${data.condition}`;
},
});
const agent = new Agent({
name: "assistant",
model,
tools: [getWeather],
});
const { stream: s, result } = stream(
agent,
"What's the weather in Portland and Miami?"
);
for await (const event of s) {
switch (event.type) {
case "tool_call_start":
console.log(`\n[Calling ${event.toolCall.name}...]`);
break;
case "tool_call_done":
console.log(`[Done]`);
break;
case "content_delta":
process.stdout.write(event.content);
break;
case "done":
console.log(`\n\nTokens: ${event.response.usage?.totalTokens}`);
break;
}
}
const finalResult = await result;
console.log(finalResult.output);

A typical event sequence for a tool-using agent:
1. tool_call_start - model begins a tool call
2. tool_call_delta - incremental JSON arguments arrive (useful for progress UI)
3. tool_call_done - arguments are complete, execution begins
4. done - first model response is finished, tools execute
5. content_delta - second model turn streams the final answer
6. done - final response complete
You see multiple done events in a multi-turn run - one per model call.
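Since each done event marks the end of one model call, counting them tells you how many turns a run took. A small self-contained sketch (the event type here mirrors the switch statement above, not the SDK's full event union):

```typescript
// Minimal event shape for this sketch; real Stratus events carry payloads.
type StreamEvent = {
  type: "tool_call_start" | "tool_call_delta" | "tool_call_done" | "content_delta" | "done";
};

// Each "done" event ends one model call, so the count equals the turn count.
function countModelCalls(events: StreamEvent[]): number {
  return events.filter((e) => e.type === "done").length;
}
```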
Tools with context
Pass shared resources like database clients, API keys, or user info through the context object. This keeps tools pure and testable.
import { AzureResponsesModel } from "stratus-sdk";
import { Agent, run, tool } from "stratus-sdk/core";
import { z } from "zod";
interface AppContext {
userId: string;
db: Database;
apiKeys: { stripe: string; sendgrid: string };
}
const model = new AzureResponsesModel({
endpoint: process.env.AZURE_ENDPOINT!,
apiKey: process.env.AZURE_API_KEY!,
deployment: "gpt-5.2",
});
const getOrders = tool({
name: "get_orders",
description: "Get recent orders for the current user",
parameters: z.object({
limit: z.number().optional().describe("Max orders to return"),
}),
execute: async (ctx: AppContext, { limit }) => {
const orders = await ctx.db.orders.findByUser(ctx.userId, limit ?? 10);
return JSON.stringify(orders);
},
});
const sendEmail = tool({
name: "send_email",
description: "Send an email notification to the current user",
parameters: z.object({
subject: z.string(),
body: z.string(),
}),
execute: async (ctx: AppContext, { subject, body }) => {
const user = await ctx.db.users.findById(ctx.userId);
await sendgrid.send({
to: user.email,
subject,
body,
apiKey: ctx.apiKeys.sendgrid,
});
return `Email sent to ${user.email}`;
},
});
const agent = new Agent<AppContext>({
name: "account_assistant",
model,
instructions: "You help users manage their account and orders.",
tools: [getOrders, sendEmail],
});
const result = await run(agent, "Show me my last 3 orders", {
context: {
userId: "user_abc123",
db: database,
apiKeys: { stripe: STRIPE_KEY, sendgrid: SENDGRID_KEY },
},
});
console.log(result.output);

The context object is passed to every tool's execute function as the first argument. Type it with a generic on Agent<AppContext> for full type safety.
Abort signal
Pass an AbortSignal to cancel a running agent. The signal propagates to every tool's execute function, so you can cancel long-running operations like HTTP requests or database queries.
import { AzureResponsesModel } from "stratus-sdk";
import { Agent, run, tool, RunAbortedError } from "stratus-sdk/core";
import { z } from "zod";
const model = new AzureResponsesModel({
endpoint: process.env.AZURE_ENDPOINT!,
apiKey: process.env.AZURE_API_KEY!,
deployment: "gpt-5.2",
});
const fetchDocs = tool({
name: "fetch_docs",
description: "Fetch documentation from a URL",
parameters: z.object({ url: z.string() }),
execute: async (_ctx, { url }, options) => {
const res = await fetch(url, {
signal: options?.signal,
});
return await res.text();
},
});
const agent = new Agent({
name: "docs_assistant",
model,
tools: [fetchDocs],
});
const controller = new AbortController();
// Cancel after 10 seconds
setTimeout(() => controller.abort(), 10_000);
try {
const result = await run(agent, "Summarize the docs at https://example.com/api", {
signal: controller.signal,
});
console.log(result.output);
} catch (error) {
if (error instanceof RunAbortedError) {
console.log("Run was cancelled");
}
}

The abort signal is checked between every model call and tool execution. When aborted mid-tool, any in-flight fetch calls using the signal are cancelled immediately.
Compared to raw API calls
Without Stratus, function calling requires you to manually manage the message array, parse JSON arguments, dispatch to functions by name, and make multiple API calls in a loop. Stratus eliminates all of this - define your tools and call run().
Here's the same two-tool agent, with and without Stratus:
import { AzureResponsesModel } from "stratus-sdk";
import { Agent, run, tool } from "stratus-sdk/core";
import { z } from "zod";
const model = new AzureResponsesModel({
endpoint: process.env.AZURE_ENDPOINT!,
apiKey: process.env.AZURE_API_KEY!,
deployment: "gpt-5.2",
});
const getWeather = tool({
name: "get_weather",
description: "Get weather for a city",
parameters: z.object({ location: z.string() }),
execute: async (_ctx, { location }) => fetchWeather(location),
});
const getTime = tool({
name: "get_time",
description: "Get current time for a city",
parameters: z.object({ location: z.string() }),
execute: async (_ctx, { location }) => fetchTime(location),
});
const agent = new Agent({
name: "assistant",
model,
tools: [getWeather, getTime],
});
const result = await run(
agent,
"Weather and time in San Francisco, Tokyo, and Paris?"
);
console.log(result.output);

And the manual version, without Stratus (raw API, Python):

import json
from openai import OpenAI
client = OpenAI(
base_url="https://YOUR-RESOURCE.openai.azure.com/openai/v1/",
api_key="YOUR_KEY",
)
# Step 1: Define tools as raw JSON
tools = [
{"type": "function", "function": {
"name": "get_weather", "description": "Get weather",
"parameters": {"type": "object", "properties": {
"location": {"type": "string"}
}, "required": ["location"]}
}},
{"type": "function", "function": {
"name": "get_time", "description": "Get time",
"parameters": {"type": "object", "properties": {
"location": {"type": "string"}
}, "required": ["location"]}
}},
]
# Step 2: First API call
messages = [{"role": "user", "content": "Weather and time in SF, Tokyo, Paris?"}]
response = client.chat.completions.create(
model="gpt-5.2", messages=messages, tools=tools
)
# Step 3: Manually parse and append assistant message
msg = response.choices[0].message
messages.append(msg)
# Step 4: Manually dispatch each tool call
if msg.tool_calls:
for tc in msg.tool_calls:
args = json.loads(tc.function.arguments)
if tc.function.name == "get_weather":
result = fetch_weather(args.get("location"))
elif tc.function.name == "get_time":
result = fetch_time(args.get("location"))
else:
result = json.dumps({"error": "Unknown tool"})
messages.append({
"tool_call_id": tc.id, "role": "tool",
"name": tc.function.name, "content": result,
})
# Step 5: Second API call for the final answer
final = client.chat.completions.create(
model="gpt-5.2", messages=messages, tools=tools
)
print(final.choices[0].message.content)
# Missing: streaming, multi-round loops, validation, retries,
# error recovery, abort signals, parallel execution, type safety

The Stratus version handles parallel tool calls, multi-round tool loops, Zod validation, error recovery, 429 retries, streaming, and abort signals. The manual approach handles none of these.