
Finish Reasons

Understand why a model stopped generating and how the run loop responds

Every model response includes a finishReason -- the reason the model stopped generating. The run loop uses this value to decide what happens next: execute tool calls, return a result, or throw an error.

Finish reason values

Value | Meaning | Run loop behavior
stop | The model finished naturally and produced a complete response. | Returns the result. The run is done.
tool_calls | The model wants to call one or more tools. | Executes the tool calls, then calls the model again with the results.
length | The response was truncated because it hit the maxTokens limit. | Returns the partial result. No error is thrown.
content_filter | Azure's content filter blocked the request or response. | Throws a ContentFilterError. The run does not continue.
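As a quick mental model, the table above can be mirrored by a small helper. The FinishReason union matches the documented values, but the helper itself is illustrative, not part of the SDK API:

```typescript
// Illustrative sketch: maps each documented finish reason to the run
// loop's next action. Not a real SDK function.
type FinishReason = "stop" | "tool_calls" | "length" | "content_filter";

function nextAction(reason: FinishReason): string {
  switch (reason) {
    case "stop":
      return "return result"; // run is done
    case "tool_calls":
      return "execute tools"; // then call the model again
    case "length":
      return "return partial result"; // truncated, no error thrown
    case "content_filter":
      return "throw ContentFilterError";
  }
}

console.log(nextAction("stop")); // "return result"
console.log(nextAction("length")); // "return partial result"
```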

How the run loop uses finish reasons

When the model responds, the run loop checks the response and branches:

Model returns a response

The run loop calls the model and receives a ModelResponse containing content, toolCalls, and finishReason.

Check for tool calls

If toolCalls is non-empty (finish reason is tool_calls), the run loop executes all tool calls in parallel, appends the results as tool messages, and calls the model again. This repeats until the model responds without tool calls or maxTurns is exceeded.

No tool calls -- return the result

If toolCalls is empty, the run is finished. The model's text output becomes result.output. The finish reason is stored on result.finishReason -- typically stop or length.

Model response
├── toolCalls present?
│   ├── Yes → execute tools → call model again (loop)
│   └── No  → finishReason is "stop" or "length"
│       └── return RunResult
└── finishReason is "content_filter"?
    └── Yes → throw ContentFilterError
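The branching above can be sketched as a small helper. The ModelResponse shape mirrors the description in this section, but this is not the SDK's actual run-loop implementation:

```typescript
// Illustrative sketch of the run-loop branch. Field and function names
// are assumptions for this example, not the SDK's internals.
interface ModelResponse {
  content: string;
  toolCalls: string[];
  finishReason: "stop" | "length" | "tool_calls";
}

function nextStep(response: ModelResponse): "execute_tools" | "return_result" {
  if (response.toolCalls.length > 0) {
    return "execute_tools"; // run tools, append results, call the model again
  }
  return "return_result"; // finishReason is "stop" or "length"
}

console.log(nextStep({ content: "", toolCalls: ["search"], finishReason: "tool_calls" })); // "execute_tools"
console.log(nextStep({ content: "Done.", toolCalls: [], finishReason: "stop" })); // "return_result"
```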

The content_filter finish reason is intercepted at the model layer before the run loop sees it. Both AzureResponsesModel and AzureChatCompletionsModel throw a ContentFilterError immediately, so the run loop never receives a response with finishReason: "content_filter".
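Because the error is thrown rather than returned, callers handle content filtering with a try/catch. The sketch below uses a local stand-in ContentFilterError class to keep the example self-contained; real code imports the class from the stratus SDK:

```typescript
// Sketch of the catch pattern. ContentFilterError here is a local
// stand-in; in real code it is imported from the stratus SDK.
class ContentFilterError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "ContentFilterError";
  }
}

// Simulates a model call that Azure's content filter blocks.
function callModel(): string {
  throw new ContentFilterError("response blocked by content filter");
}

function runSafely(): string {
  try {
    return callModel();
  } catch (e) {
    if (e instanceof ContentFilterError) {
      // Rephrase the input, or surface a friendly message to the user.
      return "blocked: " + e.message;
    }
    throw e; // other errors propagate unchanged
  }
}

console.log(runSafely()); // "blocked: response blocked by content filter"
```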

Accessing finishReason

From run()

run-finish-reason.ts
import { Agent, run } from "stratus-sdk/core";

const agent = new Agent({ name: "assistant", model });
const result = await run(agent, "What is the capital of France?");

console.log(result.finishReason); // "stop"
console.log(result.output);       // "The capital of France is Paris."

From stream()

stream-finish-reason.ts
import { Agent, stream } from "stratus-sdk/core";

const agent = new Agent({ name: "writer", model });
const { stream: s, result } = stream(agent, "Write a haiku");

for await (const event of s) {
  if (event.type === "content_delta") {
    process.stdout.write(event.content);
  }
  if (event.type === "done") {
    // Per-call finish reason from this model response
    console.log(event.response.finishReason); 
  }
}

const finalResult = await result;
console.log(finalResult.finishReason); // "stop" - from the last model call

From a session

session-finish-reason.ts
import { createSession } from "stratus-sdk/core";

const session = createSession({ model, instructions: "You are a helpful assistant." });

session.send("Explain TypeScript generics");
for await (const event of session.stream()) {
  if (event.type === "content_delta") process.stdout.write(event.content);
}

const result = await session.result;
console.log(result.finishReason); // "stop"

Finish reasons vs errors

A finish reason is part of a successful response. An error means no usable response was produced.

Condition | Type | How it surfaces | Recoverable?
stop | Finish reason | result.finishReason | N/A -- this is the normal case
tool_calls | Finish reason | result.finishReason (of the last call) | N/A -- the run loop handles this
length | Finish reason | result.finishReason | Yes -- increase maxTokens or shorten input
content_filter | Thrown error | catch (e) { e instanceof ContentFilterError } | Depends -- rephrase the input or output
API failure | Thrown error | catch (e) { e instanceof ModelError } | Retry or check credentials
Timeout | Thrown error | catch (e) { e instanceof RunAbortedError } | Increase timeout or simplify the task
Too many turns | Thrown error | catch (e) { e instanceof MaxTurnsExceededError } | Increase maxTurns
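The thrown-error rows translate into an instanceof dispatch in a catch block. The stand-in error classes below keep the sketch self-contained; real code imports them from the stratus SDK:

```typescript
// Stand-in error classes for illustration; real code imports these from
// the stratus SDK rather than defining them locally.
class ContentFilterError extends Error {}
class ModelError extends Error {}
class RunAbortedError extends Error {}
class MaxTurnsExceededError extends Error {}

// Maps a caught error to the recovery advice from the table above.
function recoveryHint(e: unknown): string {
  if (e instanceof ContentFilterError) return "rephrase the input or output";
  if (e instanceof ModelError) return "retry or check credentials";
  if (e instanceof RunAbortedError) return "increase timeout or simplify the task";
  if (e instanceof MaxTurnsExceededError) return "increase maxTurns";
  return "unknown error";
}

console.log(recoveryHint(new ModelError("503 from API"))); // "retry or check credentials"
console.log(recoveryHint(new MaxTurnsExceededError())); // "increase maxTurns"
```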

A length finish reason is not an error. The run completes successfully, but the output may be incomplete. Always check finishReason if you need to guarantee the model finished its response.

Handling truncated responses

When finishReason is "length", the model hit the token limit before finishing. The output is cut off mid-sentence or mid-thought. Here are your options:

Increase maxTokens -- Give the model more room to respond.

increase-max-tokens.ts
const agent = new Agent({
  name: "writer",
  model,
  modelSettings: {
    maxTokens: 4096, 
  },
});

const result = await run(agent, "Write a detailed analysis of TypeScript's type system");
if (result.finishReason === "length") {
  console.warn("Response was truncated - consider increasing maxTokens");
}

Shorten the input -- Reduce the prompt length so more tokens are available for the response.

Split into multiple calls -- Break a large task into smaller, focused prompts that each fit within the token limit.

Detect and retry -- Check the finish reason and automatically retry with a higher limit.

retry-on-truncation.ts
import { Agent, run } from "stratus-sdk/core";

const agent = new Agent({
  name: "writer",
  model,
  modelSettings: { maxTokens: 1024 },
});

let result = await run(agent, "Summarize this document");

if (result.finishReason === "length") { 
  const retryAgent = agent.clone({
    modelSettings: { maxTokens: 4096 },
  });
  result = await run(retryAgent, "Summarize this document");
}

console.log(result.output);

In streaming

During streaming, the finish reason is not available until the model finishes its response. It arrives in the final done event for each model call.

streaming-finish-reason.ts
import { Agent, stream } from "stratus-sdk/core";

const agent = new Agent({ name: "assistant", model });
const { stream: s, result } = stream(agent, "Tell me a story");

for await (const event of s) {
  switch (event.type) {
    case "content_delta":
      process.stdout.write(event.content);
      break;
    case "done":
      // Available here - one 'done' event per model call
      console.log("\nFinish reason:", event.response.finishReason); 
      break;
  }
}

// Also available on the final RunResult
const finalResult = await result;
console.log("Last finish reason:", finalResult.finishReason);

If the run involves tool calls, you will see multiple done events -- one per model call. The finishReason on the RunResult is always from the last model call in the run.
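For example, a run with two tool-call rounds might surface finish reasons like this (a sketch using a plain array, not the SDK's event stream):

```typescript
// Sketch: finish reasons observed across a run's model calls. With tool
// calls, every call but the last typically reports "tool_calls".
const perCallFinishReasons = ["tool_calls", "tool_calls", "stop"];

const lastReason = perCallFinishReasons[perCallFinishReasons.length - 1];
console.log(lastReason); // "stop" -- what RunResult.finishReason reports
```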
