aiagentrank.io
🏗️Architecturealso: reflexion, reflexion pattern, self-reflection agents

Reflexion

An agent design pattern where the agent reflects on its previous attempts, generates a critique, and uses the critique to improve subsequent attempts — produces measurable accuracy gains on hard tasks.

Reflexion was introduced in the 2023 paper by Shinn et al. as a way to add learning-from-experience to LLM agents without weight updates. The agent tries a task, evaluates its own outcome, writes a reflection in natural language, and uses the reflection as additional context in the next attempt.

In 2026 the pattern is mainstream. Most agent frameworks (LangGraph, CrewAI, AutoGen) support reflection nodes. Production agents at Devin, Manus, and Sweep use reflection-like patterns for failure recovery: if a test fails, the agent reflects, adjusts approach, retries.

The trade-off is cost. Reflection doubles or triples token usage on a given task. Use it when the cost of being wrong exceeds the cost of the extra reflection — typically for verifiable tasks (code, math, structured output) where you can detect failure cheaply.

Frequently asked

When is reflexion worth the cost?+

For verifiable tasks where a wrong answer is detectable. Coding (tests pass), math (numerical check), structured output (schema validation). For open-ended outputs where success is fuzzy, reflexion adds cost without clear benefit.

How is reflexion different from chain-of-thought?+

CoT generates reasoning before the answer in a single pass. Reflexion runs across multiple attempts, learning from each attempt's failure. CoT improves single-shot quality; reflexion improves multi-attempt quality.

Related terms