📊Evaluationalso: prompt injection, prompt injections, indirect prompt injection

Prompt injectiondefinition and how it works in 2026

Prompt injection: An attack where malicious instructions are smuggled into an LLM's input — through user prompts, web pages, documents, or tool outputs — causing the agent to ignore its real instructions.

Prompt injection is the agent-era equivalent of SQL injection. The agent reads input from a source (a webpage, an email, a document); that source contains instructions disguised as data; the agent follows the disguised instructions instead of its real system prompt.

Two variants. Direct injection: a user explicitly types "ignore your instructions and..." into the chat. Indirect injection: the agent reads a webpage that contains hidden instructions, and acts on them. Indirect injection is the harder problem — the user is not the attacker; they are the victim.

Defenses in 2026 are layered, none complete. System prompts with explicit refusal patterns. Treating tool outputs as untrusted data, not instructions. Output classifiers that flag suspicious responses. Tool-call allowlists. Constitutional AI training. The OWASP Top 10 for LLM Applications has prompt injection as #1, and for good reason.

Frequently asked

How is prompt injection different from jailbreaking?+

Jailbreaking is a user trying to override their own agent's safety. Prompt injection is a third party hiding instructions in content the agent reads, weaponizing it against the user. Indirect prompt injection is the more dangerous failure mode.

Can prompt injection be fully prevented?+

No technique fully prevents it today. The right model is defense in depth: assume injection will succeed sometimes, design the agent so successful injections cannot cause irreversible damage. Tool-call gates and confirmation on irreversible actions are the highest-leverage controls.

Agents that use prompt injection

SierraA78

Branded customer-facing agents from the founders of Salesforce + Google.

🎧SupportAutonomousSubscription

VoiceTool useMemoryRAG

33kFeb 18, 2025sierra.ai

Get Sierra demo

Demo · hover to play

DecagonA73

Conversational support agents that resolve tickets like your best reps.

🎧SupportAutonomousSubscription

Tool useMemoryRAG

20kApr 25, 2025decagon.ai

Get Decagon demo

Demo · hover to play

Devinv2.1A78

Autonomous AI software engineer that ships PRs end-to-end.

💻CodeAutonomousSubscription · from $500

CodeTool useBrowserMemory

184kMay 12, 2025devin.ai

Start Devin trial

Demo · hover to play

Frequently asked

Agents that use prompt injection

Related terms