📊 Evaluation · also: guardrails ai, ai guardrails, llm guardrails

Guardrails (AI): definition and how it works in 2026

Guardrails (AI): Constraints and filters layered around an LLM that prevent it from producing harmful, off-topic, or policy-violating outputs — applied at input, output, or both.

Guardrails are the production-runtime layer that catches what training-time alignment missed. An input guardrail filters out prompt injections and harmful queries before they reach the model. An output guardrail re-classifies the model's response and blocks or rewrites unsafe content before it reaches the user.
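
The input and output stages described above can be sketched as a thin wrapper around the model call. This is a minimal illustration, not a production design: the names `check_input`, `check_output`, `guarded_call`, and `REFUSAL` are hypothetical, and the keyword lists stand in for the trained classifiers a real deployment would use.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Verdict:
    allowed: bool
    reason: str = ""


# Assumption: simple substring heuristics stand in for real classifiers.
INJECTION_MARKERS = ("ignore previous instructions", "disregard your system prompt")
BLOCKED_OUTPUT_TERMS = ("password:",)

REFUSAL = "Sorry, I can't help with that."


def check_input(prompt: str) -> Verdict:
    """Input guardrail: flag prompts that look like injection attempts."""
    lowered = prompt.lower()
    for marker in INJECTION_MARKERS:
        if marker in lowered:
            return Verdict(False, f"possible prompt injection: {marker!r}")
    return Verdict(True)


def check_output(text: str) -> Verdict:
    """Output guardrail: flag responses that contain blocked content."""
    lowered = text.lower()
    for term in BLOCKED_OUTPUT_TERMS:
        if term in lowered:
            return Verdict(False, f"blocked term in output: {term!r}")
    return Verdict(True)


def guarded_call(prompt: str, model: Callable[[str], str]) -> str:
    """Run the model only if the input passes, and return its
    response only if the output passes as well."""
    if not check_input(prompt).allowed:
        return REFUSAL  # the model never sees the rejected prompt
    response = model(prompt)
    if not check_output(response).allowed:
        return REFUSAL  # the unsafe response never reaches the user
    return response
```

Passing the model in as a plain callable keeps the wrapper independent of any particular provider SDK, so the same checks can sit in front of whichever model the stack uses.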

The common techniques: NeMo Guardrails for declarative policy specs, Llama Guard and ShieldGemma for output classification, regex/heuristic filters for PII redaction, and tool-call allowlists for agent action gating. Modern stacks layer several together; no single guardrail catches everything.
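
As one concrete instance of the regex/heuristic approach, PII redaction can be sketched with the standard `re` module. The patterns below are illustrative assumptions: they cover only simple email addresses and ten-digit phone numbers written with separators, and a real filter would need broader, locale-aware coverage.

```python
import re

# Assumption: deliberately simple patterns for illustration only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


def redact_pii(text: str) -> str:
    """Replace matched emails and phone numbers with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text


if __name__ == "__main__":
    print(redact_pii("Mail jane.doe@example.com or call 555-123-4567."))
```

Because the function is a pure string transform, it can be applied to prompts before they reach the model and to responses before they reach the user.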

For agents specifically, guardrails are non-negotiable when the agent has tools. The agent can be jailbroken in fifty ways you have not thought of; the guardrail layer ensures that even if jailbroken, the agent cannot execute the worst actions (delete data, send unauthorized emails, charge cards).
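
A tool-call allowlist is one way to enforce that last property. In the sketch below, `ToolGate` and `ToolNotAllowed` are hypothetical names, and the example tools are placeholders. The point is that the gate sits between the model's requested action and its execution, so the check does not depend on the model behaving well.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, FrozenSet


class ToolNotAllowed(Exception):
    """Raised when the agent requests a tool outside the allowlist."""


@dataclass
class ToolGate:
    # All tools the runtime knows how to execute.
    tools: Dict[str, Callable[..., Any]]
    # The subset this agent is permitted to invoke.
    allowlist: FrozenSet[str]

    def call(self, name: str, **kwargs: Any) -> Any:
        # Check the allowlist first, so a disallowed tool is never executed
        # no matter what the (possibly jailbroken) model asked for.
        if name not in self.allowlist:
            raise ToolNotAllowed(f"tool {name!r} is not permitted for this agent")
        if name not in self.tools:
            raise KeyError(f"unknown tool {name!r}")
        return self.tools[name](**kwargs)


if __name__ == "__main__":
    gate = ToolGate(
        tools={
            "lookup_order": lambda order_id: f"order {order_id}: shipped",
            "delete_account": lambda user_id: f"deleted {user_id}",
        },
        allowlist=frozenset({"lookup_order"}),
    )
    print(gate.call("lookup_order", order_id="42"))
    try:
        gate.call("delete_account", user_id="u1")
    except ToolNotAllowed as exc:
        print(f"blocked: {exc}")
```

Defaulting to deny (anything not listed is refused) means a newly registered tool stays unreachable until someone explicitly adds it to the allowlist.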

Frequently asked

What is the difference between guardrails and alignment?

Alignment is built into the model at training time. Guardrails are bolted on at deployment time. Use both — guardrails catch what alignment misses, and alignment makes guardrails simpler.

Do open-source guardrail libraries work?

For input filtering and PII redaction: yes, mature. For output classification of nuanced policy violations: partially. Llama Guard and ShieldGemma help, but commercial stacks (Aporia, Lasso, Lakera) still lead on coverage.

Agents that use guardrails (AI)

Sierra A78

Branded customer-facing agents from former Salesforce and Google executives.

🎧 Support · Autonomous · Subscription

Voice · Tool use · Memory · RAG

33k · Feb 18, 2025 · sierra.ai

Decagon A73

Conversational support agents that resolve tickets like your best reps.

🎧 Support · Autonomous · Subscription

Tool use · Memory · RAG

20k · Apr 25, 2025 · decagon.ai

Parloa B65

Voice agents for contact centers — handles tier-1 calls end-to-end.

🎧 Support · Autonomous · Subscription

Voice · Tool use · Memory

8.2k · Apr 11, 2025 · parloa.com
