Testing
Test agents with mock models and response builders
Stratus ships test utilities as a separate entrypoint so they stay out of production bundles.
import { createMockModel, textResponse, toolCallResponse } from "@usestratus/sdk/testing";
Mock Model
createMockModel() returns a Model that serves canned responses in sequence:
import { createMockModel, textResponse } from "@usestratus/sdk/testing";
import { Agent, run } from "@usestratus/sdk/core";
const model = createMockModel([
  textResponse("Hello!"),
  textResponse("Goodbye!"),
]);
const agent = new Agent({ name: "test", model });
const result = await run(agent, "Hi");
expect(result.output).toBe("Hello!");
When responses are exhausted, the mock throws with a clear message including how many calls were made.
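The sequencing and exhaustion behavior described above can be pictured with a small stand-alone sketch. This is not the SDK's implementation: `createResponseQueue` and `callCount` are hypothetical names, and the error text is only an assumption about what a "clear message" might look like.

```typescript
// Hypothetical helper for illustration only; not part of @usestratus/sdk.
function createResponseQueue<T>(responses: readonly T[]) {
  let calls = 0;
  return {
    // Return the next canned response, or throw once the queue is exhausted.
    next(): T {
      calls += 1;
      if (calls > responses.length) {
        throw new Error(
          `Mock exhausted: call #${calls} made, but only ${responses.length} response(s) were queued`,
        );
      }
      return responses[calls - 1];
    },
    // Number of calls made so far, including any call that threw.
    get callCount(): number {
      return calls;
    },
  };
}

const queue = createResponseQueue(["Hello!", "Goodbye!"]);
console.log(queue.next()); // prints "Hello!"
console.log(queue.next()); // prints "Goodbye!"
```

A third `queue.next()` call would throw. In a test, that failure usually means the agent made more model calls than the test author expected.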
Capturing Requests
Pass { capture: true } to record every ModelRequest the mock receives:
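const model = createMockModel(
  [textResponse("ok")],
  { capture: true },
);
// The agent must use the capturing mock for its requests to be recorded.
const agent = new Agent({ name: "test", model });
await run(agent, "Hello");
expect(model.requests).toHaveLength(1);
expect(model.requests[0].messages[0].content).toBe("Hello");
Response Builders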
textResponse(content, options?)
Builds a ModelResponse with text content and no tool calls.
textResponse("Hello world")
// { content: "Hello world", toolCalls: [], finishReason: "stop" }
textResponse("ok", {
  usage: { promptTokens: 10, completionTokens: 5, totalTokens: 15 },
  responseId: "resp_123",
})
toolCallResponse(calls, options?)
Builds a ModelResponse with tool calls. Each call needs a name and args object:
toolCallResponse([
  { name: "search", args: { query: "test" } },
  { name: "save", args: { key: "result", value: "42" } },
])
// toolCalls: [{ id: "tc_0", ... }, { id: "tc_1", ... }]
// finishReason: "tool_calls"
Custom IDs:
toolCallResponse([
  { name: "search", args: { query: "test" }, id: "call_abc" },
])
Testing Tool Calls
Mock a multi-turn conversation where the agent calls a tool and gets a follow-up response:
import { z } from "zod";
import { Agent, run, tool } from "@usestratus/sdk/core";
import { createMockModel, textResponse, toolCallResponse } from "@usestratus/sdk/testing";
const add = tool({
  name: "add",
  description: "Add two numbers",
  parameters: z.object({ a: z.number(), b: z.number() }),
  execute: async (_ctx, { a, b }) => String(a + b),
});
const model = createMockModel([
  toolCallResponse([{ name: "add", args: { a: 2, b: 3 } }]), // LLM calls tool
  textResponse("The answer is 5"), // LLM responds with result
]);
const agent = new Agent({ name: "calc", model, tools: [add] });
const result = await run(agent, "What is 2 + 3?");
expect(result.output).toBe("The answer is 5");
Testing with Streaming
The mock model supports getStreamedResponse — it yields content_delta, tool_call_start/delta/done, and done events matching the real API shape:
import { Agent, stream } from "@usestratus/sdk/core";
import { createMockModel, textResponse } from "@usestratus/sdk/testing";
const model = createMockModel([textResponse("Streamed!")]);
const agent = new Agent({ name: "test", model });
const { stream: s, result } = stream(agent, "Hi");
const deltas: string[] = [];
for await (const event of s) {
  if (event.type === "content_delta") deltas.push(event.content);
}
expect(deltas).toEqual(["Streamed!"]);
expect((await result).output).toBe("Streamed!");
Debug Mode
Enable { debug: true } on run(), stream(), or createSession() to log model calls, tool executions, and handoffs to stderr:
const result = await run(agent, "Hello", { debug: true });
Output goes to process.stderr with [stratus:model], [stratus:tool], and [stratus:handoff] prefixes. Debug logging is a no-op when disabled, so it adds no overhead in production.
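How the SDK implements this logging is not shown on this page. As a rough sketch of the behavior described, a prefixed logger that becomes a no-op when disabled could look like the following. `createDebugLogger`, `DebugScope`, and the injectable `write` sink are hypothetical names introduced for illustration; only the three prefixes come from the text above.

```typescript
// Hypothetical sketch; not the SDK's actual logger.
type DebugScope = "model" | "tool" | "handoff";

function createDebugLogger(
  enabled: boolean,
  // Default sink: console.error, which writes to stderr in Node.js.
  write: (line: string) => void = (line) => {
    console.error(line);
  },
): (scope: DebugScope, message: string) => void {
  if (!enabled) {
    // Disabled: hand back a function that does nothing.
    return () => {};
  }
  // Enabled: prefix each message with its scope, e.g. "[stratus:tool] ...".
  return (scope, message) => {
    write(`[stratus:${scope}] ${message}`);
  };
}

const log = createDebugLogger(true);
log("tool", "add called with { a: 2, b: 3 }");
```

Taking the sink as a parameter makes the sketch easy to assert against in a test without touching the real stderr stream.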