Guardrails

Input and output validation with tripwire support

Guardrails validate agent input and output, allowing you to block harmful or invalid content before it reaches the model or the user.

Input Guardrails

Input guardrails run before the model is called. They check the user's message:

input-guardrail.ts
import { Agent, run } from "@usestratus/sdk/core";
import type { InputGuardrail } from "@usestratus/sdk/core";

const noPersonalInfo: InputGuardrail = {
  name: "no_personal_info",
  execute: async (input) => {
    const hasPII = /\b\d{3}-\d{2}-\d{4}\b/.test(input); // SSN pattern
    return { tripwireTriggered: hasPII };
  },
};

const agent = new Agent({
  name: "assistant",
  model, // your configured model instance, defined elsewhere
  inputGuardrails: [noPersonalInfo],
});

Output Guardrails

Output guardrails run after the model responds. They check the model's output:

output-guardrail.ts
import { Agent } from "@usestratus/sdk/core";
import type { OutputGuardrail } from "@usestratus/sdk/core";

const noCodeInOutput: OutputGuardrail = {
  name: "no_code",
  execute: async (output) => {
    const hasCode = output.includes("```");
    return { tripwireTriggered: hasCode };
  },
};

const agent = new Agent({
  name: "assistant",
  model, // your configured model instance, defined elsewhere
  outputGuardrails: [noCodeInOutput],
});

Guardrail Interface

types.ts
interface InputGuardrail<TContext = unknown> {
  name: string;
  execute: (input: string, context: TContext) => GuardrailResult | Promise<GuardrailResult>;
}

interface OutputGuardrail<TContext = unknown> {
  name: string;
  execute: (output: string, context: TContext) => GuardrailResult | Promise<GuardrailResult>;
}

interface GuardrailResult {
  tripwireTriggered: boolean;
  outputInfo?: unknown; // Optional metadata about why the tripwire fired
}
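
Because execute may return a GuardrailResult directly instead of a promise, a simple check can be written synchronously. The sketch below redeclares the two interfaces locally so that it runs on its own; in an application you would import the types from the SDK. The max_length guardrail, the MAX_INPUT_CHARS limit, and the shape of outputInfo are illustrative assumptions, not part of the SDK.

```typescript
// Local copies of the interfaces above so the sketch is self-contained;
// in a real project, import these types from "@usestratus/sdk/core".
interface GuardrailResult {
  tripwireTriggered: boolean;
  outputInfo?: unknown;
}

interface InputGuardrail<TContext = unknown> {
  name: string;
  execute: (input: string, context: TContext) => GuardrailResult | Promise<GuardrailResult>;
}

// Hypothetical limit chosen for illustration only.
const MAX_INPUT_CHARS = 2000;

const maxLength: InputGuardrail = {
  name: "max_length",
  // Synchronous execute: the interface permits a plain GuardrailResult.
  execute: (input) => {
    if (input.length > MAX_INPUT_CHARS) {
      return {
        tripwireTriggered: true,
        // Metadata describing why the tripwire fired (shape is an assumption).
        outputInfo: { reason: "input_too_long", length: input.length, limit: MAX_INPUT_CHARS },
      };
    }
    return { tripwireTriggered: false };
  },
};
```

Returning structured outputInfo rather than a bare boolean gives the caller something to log or display when the tripwire fires.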

Tripwire Errors

When an input or output guardrail's tripwire triggers, run() throws an error that you can catch:

error-handling.ts
import {
  InputGuardrailTripwireTriggered,
  OutputGuardrailTripwireTriggered,
} from "@usestratus/sdk/core";

try {
  await run(agent, userInput);
} catch (error) {
  if (error instanceof InputGuardrailTripwireTriggered) {
    console.log(`Blocked by: ${error.guardrailName}`);
    console.log(`Details:`, error.outputInfo);
  }
  if (error instanceof OutputGuardrailTripwireTriggered) {
    console.log(`Output blocked by: ${error.guardrailName}`);
  }
}

Using Context

Guardrails receive the same context as tools:

context-guardrail.ts
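import type { InputGuardrail } from "@usestratus/sdk/core";

// AppContext and checkTenantPermissions are application-defined (not shown here).
const tenantGuardrail: InputGuardrail<AppContext> = {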
  name: "tenant_check",
  execute: async (input, ctx) => {
    const isAllowed = await checkTenantPermissions(ctx.tenantId, input);
    return { tripwireTriggered: !isAllowed };
  },
};
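
To make the context parameter more concrete, here is a self-contained sketch of a context-driven guardrail. The AppContext shape (tenantId, blockedTerms) and the tenant_blocklist guardrail are hypothetical; the SDK does not define them, and the interfaces are redeclared locally only so the example runs standalone.

```typescript
// Local copies of the SDK interfaces so the sketch is self-contained.
interface GuardrailResult {
  tripwireTriggered: boolean;
  outputInfo?: unknown;
}

interface InputGuardrail<TContext = unknown> {
  name: string;
  execute: (input: string, context: TContext) => GuardrailResult | Promise<GuardrailResult>;
}

// Hypothetical application context; define whatever your app needs.
interface AppContext {
  tenantId: string;
  blockedTerms: string[];
}

const tenantBlocklist: InputGuardrail<AppContext> = {
  name: "tenant_blocklist",
  execute: (input, ctx) => {
    const lowered = input.toLowerCase();
    // Find the first tenant-specific term that appears in the message.
    const hit = ctx.blockedTerms.find((term) => lowered.includes(term.toLowerCase()));
    return {
      tripwireTriggered: hit !== undefined,
      outputInfo: hit !== undefined ? { tenantId: ctx.tenantId, term: hit } : undefined,
    };
  },
};
```

Typing the guardrail as InputGuardrail<AppContext> lets the compiler check every access to ctx, so a missing or misspelled context field is caught at build time.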

Tool Guardrails

Tool guardrails run before and after individual tool executions. Use them to validate tool arguments or inspect tool results.

ToolInputGuardrail

Runs before a tool's execute function. Receives the tool name, parsed arguments, and context:

tool-input-guardrail.ts
import type { ToolInputGuardrail } from "@usestratus/sdk/core";

const noDeleteOps: ToolInputGuardrail<AppContext> = {
  name: "no_delete_operations",
  execute: async ({ toolName, toolArgs, context }) => {
    if (toolName.startsWith("delete_") && !context.isAdmin) {
      return { tripwireTriggered: true, outputInfo: "Admin access required" };
    }
    return { tripwireTriggered: false };
  },
};
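
A tool input guardrail can also inspect the arguments themselves rather than only the tool name. The sketch below assumes the argument shape seen in the example above ({ toolName, toolArgs, context }) and treats toolArgs as unknown, since the exact type is not shown here. The send_email tool, its to argument, and ALLOWED_DOMAIN are hypothetical.

```typescript
// Local stand-ins so the sketch runs standalone. The ToolInputGuardrail shape
// is inferred from the example above and is an assumption, not the SDK's definition.
interface GuardrailResult {
  tripwireTriggered: boolean;
  outputInfo?: unknown;
}

interface ToolInputGuardrail<TContext = unknown> {
  name: string;
  execute: (args: {
    toolName: string;
    toolArgs: unknown;
    context: TContext;
  }) => GuardrailResult | Promise<GuardrailResult>;
}

// Hypothetical policy: email may only be sent to this domain.
const ALLOWED_DOMAIN = "example.com";

const internalRecipientsOnly: ToolInputGuardrail = {
  name: "internal_recipients_only",
  execute: ({ toolName, toolArgs }) => {
    // Only the hypothetical send_email tool is subject to this check.
    if (toolName !== "send_email") return { tripwireTriggered: false };

    // Narrow the untyped arguments before reading the recipient.
    const to =
      typeof toolArgs === "object" && toolArgs !== null
        ? (toolArgs as Record<string, unknown>)["to"]
        : undefined;
    const isInternal =
      typeof to === "string" && to.toLowerCase().endsWith(`@${ALLOWED_DOMAIN}`);

    return isInternal
      ? { tripwireTriggered: false }
      : { tripwireTriggered: true, outputInfo: { toolName, reason: "external_or_missing_recipient" } };
  },
};
```

Treating a missing or malformed recipient as a violation keeps the guardrail fail-closed: anything it cannot verify is flagged.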

ToolOutputGuardrail

Runs after a tool's execute function. Receives the tool name, result string, and context:

tool-output-guardrail.ts
import type { ToolOutputGuardrail } from "@usestratus/sdk/core";

const noSensitiveData: ToolOutputGuardrail = {
  name: "no_sensitive_data",
  execute: async ({ toolResult }) => {
    const hasPII = /\b\d{3}-\d{2}-\d{4}\b/.test(toolResult); // SSN pattern
    return { tripwireTriggered: hasPII };
  },
};

Passing Tool Guardrails

Tool guardrails are passed via run() / stream() options or SessionConfig:

tool-guardrails-usage.ts
await run(agent, input, {
  toolInputGuardrails: [noDeleteOps],
  toolOutputGuardrails: [noSensitiveData],
});

Unlike input/output guardrails, which throw InputGuardrailTripwireTriggered or OutputGuardrailTripwireTriggered, tool guardrails do not throw. Their results are collected and made available on RunResult.inputGuardrailResults and RunResult.outputGuardrailResults.

Guardrail Results

Guardrail execution results are available on the RunResult:

guardrail-results.ts
const result = await run(agent, input, {
  toolInputGuardrails: [noDeleteOps],
});

for (const gr of result.inputGuardrailResults) {
  console.log(`${gr.guardrailName}: triggered=${gr.result.tripwireTriggered}`);
}
for (const gr of result.outputGuardrailResults) {
  console.log(`${gr.guardrailName}: triggered=${gr.result.tripwireTriggered}`);
}

Each GuardrailRunResult contains:

interface GuardrailRunResult {
  guardrailName: string;
  result: GuardrailResult;
}
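
Since tool guardrails report rather than throw, the caller usually needs to check the collected results for triggered tripwires. Here is a minimal sketch of that check; triggeredGuardrails is a hypothetical helper, and sampleResults is hand-written data standing in for the arrays on a real RunResult.

```typescript
// Local copies of the interfaces above so the sketch is self-contained.
interface GuardrailResult {
  tripwireTriggered: boolean;
  outputInfo?: unknown;
}

interface GuardrailRunResult {
  guardrailName: string;
  result: GuardrailResult;
}

// Hypothetical helper: keeps only the entries whose tripwire fired.
function triggeredGuardrails(results: GuardrailRunResult[]): GuardrailRunResult[] {
  return results.filter((gr) => gr.result.tripwireTriggered);
}

// Hand-written sample data standing in for result.inputGuardrailResults.
const sampleResults: GuardrailRunResult[] = [
  {
    guardrailName: "no_delete_operations",
    result: { tripwireTriggered: true, outputInfo: "Admin access required" },
  },
  { guardrailName: "no_sensitive_data", result: { tripwireTriggered: false } },
];

const blockedNames = triggeredGuardrails(sampleResults).map((gr) => gr.guardrailName);
console.log(blockedNames); // names of the guardrails that triggered
```

With a real run, the same helper could be applied to [...result.inputGuardrailResults, ...result.outputGuardrailResults] to decide whether to discard or flag the response.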

Guardrails in Sessions

Sessions accept the same four guardrail options in their configuration:

session-guardrails.ts
const session = createSession({
  model,
  inputGuardrails: [noPersonalInfo],
  outputGuardrails: [noCodeInOutput],
  toolInputGuardrails: [noDeleteOps],
  toolOutputGuardrails: [noSensitiveData],
});
