The 25 AI agent terms every buyer should know in 2026 — what they mean in plain English and why they matter.
For the full glossary with all 80+ terms, see /glossary.
Foundational (5 terms)
Agent — Software powered by an LLM that perceives, plans, and acts across multiple steps. It differs from a chatbot because it takes action rather than just generating text.
Agentic AI — AI systems that act autonomously rather than just respond to prompts. The umbrella term for the 2026 wave.
LLM (Large Language Model) — The neural network at the heart of every agent. GPT-5, Claude, and Gemini are LLMs.
Agentic loop — The core pattern: observe → reason → act → observe, repeated until done.
Agentic workflow — A workflow where AI decides next steps dynamically, not a fixed if-this-then-that automation.
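To make the agentic loop concrete, here is a minimal Python sketch. It is only an illustration: the CounterEnv class, the reason function (a stand-in for the LLM call), and the step budget are all hypothetical names chosen for this example, not part of any real agent framework.

```python
from dataclasses import dataclass


@dataclass
class CounterEnv:
    """Toy environment: the 'world' is a single integer the agent can change."""
    value: int = 0

    def observe(self) -> int:
        return self.value

    def act(self, action: str) -> None:
        if action == "increment":
            self.value += 1
        elif action == "decrement":
            self.value -= 1
        else:
            raise ValueError(f"unknown action: {action}")


def reason(observation: int, target: int) -> str:
    """Stand-in for the LLM: choose the next action from the latest observation."""
    if observation < target:
        return "increment"
    if observation > target:
        return "decrement"
    return "done"


def run_agent(env: CounterEnv, target: int, max_steps: int = 50) -> list[str]:
    """Observe -> reason -> act, repeated until 'done' or the step budget runs out."""
    history = []
    for _ in range(max_steps):
        observation = env.observe()           # observe
        action = reason(observation, target)  # reason
        history.append(action)
        if action == "done":
            break
        env.act(action)                       # act, then loop back to observe
    return history


env = CounterEnv(value=0)
print(run_agent(env, target=3))  # ['increment', 'increment', 'increment', 'done']
```

The step budget (max_steps) is the part buyers should ask about: a loop with no cap can keep spending tokens on a task it will never finish.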
Autonomy (4 terms)
Autonomous agent — Plans and executes without asking for approval between steps. Examples: Devin, Manus.
Semi-autonomous agent — Plans and acts unsupervised but pauses at irreversible actions. Examples: Cursor, Cline.
Copilot — Suggests; waits for human to accept. Examples: GitHub Copilot inline, ChatGPT chat.
Human-in-the-loop — Pattern where agent pauses for human approval at key checkpoints.
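The three autonomy tiers above differ mainly in when the agent must stop and ask. A rough Python sketch of that rule follows, assuming a hypothetical Tier enum and a needs_approval helper invented for this illustration; real products draw the line in their own ways.

```python
from enum import Enum


class Tier(Enum):
    COPILOT = "copilot"
    SEMI_AUTONOMOUS = "semi-autonomous"
    AUTONOMOUS = "autonomous"


def needs_approval(tier: Tier, irreversible: bool) -> bool:
    """Return True when a human must approve the action before it runs."""
    if tier is Tier.COPILOT:
        return True          # every suggestion waits for a human to accept it
    if tier is Tier.SEMI_AUTONOMOUS:
        return irreversible  # pause only at irreversible actions
    return False             # fully autonomous: no approval between steps


print(needs_approval(Tier.SEMI_AUTONOMOUS, irreversible=True))   # True
print(needs_approval(Tier.SEMI_AUTONOMOUS, irreversible=False))  # False
```

Human-in-the-loop is the checkpoint itself: wherever needs_approval returns True, the agent pauses until a person signs off.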
Capabilities (6 terms)
Tool use — Agent invokes external functions (APIs, browsers, code). The capability that turns LLMs into agents.
Function calling — The API mechanic for tool use. Model emits structured JSON; runtime executes.
Browser use — Agent drives a web browser like a human (clicks, types, navigates).
Code execution — Agent writes and runs code in a sandbox.
RAG (Retrieval-augmented generation) — Agent retrieves relevant docs before answering. Grounds outputs in your data.
Memory — Agent remembers across sessions. Critical for personal assistants and long-running agents.
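Function calling is easier to picture with a small example. In the Python sketch below, the model's output is a JSON string and the runtime looks up and runs the matching function. The JSON shape ({"name": ..., "arguments": ...}), the TOOLS registry, and both tool functions are assumptions made for illustration; each provider's API defines its own format.

```python
import json


def add(a: float, b: float) -> float:
    return a + b


def get_weather(city: str) -> str:
    # Hypothetical tool: a real one would call a weather API.
    return f"Sunny in {city}"


# Registry of tools the runtime is willing to execute.
TOOLS = {"add": add, "get_weather": get_weather}


def execute_tool_call(raw: str):
    """Parse the model's structured JSON and run the requested tool."""
    call = json.loads(raw)
    name = call["name"]
    arguments = call.get("arguments", {})
    if name not in TOOLS:
        raise KeyError(f"model requested an unknown tool: {name}")
    return TOOLS[name](**arguments)


# Pretend this string came back from the model.
model_output = '{"name": "add", "arguments": {"a": 2, "b": 3}}'
print(execute_tool_call(model_output))  # 5
```

Note that the model never runs anything itself. It only names a tool and its arguments, and the runtime decides whether to execute the call, which is where permission checks belong.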
Infrastructure (5 terms)
MCP (Model Context Protocol) — Open standard for connecting agents to tools and data sources. Published by Anthropic in late 2024 and now widely adopted across the industry.
Vector database — Storage for embeddings used in RAG and semantic search.
Vector embedding — Dense numerical representation of text/image/audio used for semantic similarity.
Context window — Maximum number of tokens the model can consider at once. 200K to 1M tokens is typical in 2026.
Prompt caching — Cost optimization that reuses computation for stable prompt prefixes. Providers typically discount cached input tokens by 50-90%.
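Vector embeddings and vector databases come down to one operation: find the stored vectors closest to a query vector. The Python sketch below shows that idea with cosine similarity. The three-dimensional vectors and document names are made up for illustration; real embeddings come from an embedding model and have hundreds or thousands of dimensions, and a real vector database uses an index rather than a full scan.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: close to 1.0 means similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Toy "vector database": document name -> hand-written embedding.
DOCS = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.9, 0.1],
    "api rate limits": [0.0, 0.2, 0.9],
}


def top_k(query_vec: list[float], k: int = 1) -> list[str]:
    """Return the k document names most similar to the query vector."""
    ranked = sorted(
        DOCS,
        key=lambda name: cosine_similarity(query_vec, DOCS[name]),
        reverse=True,
    )
    return ranked[:k]


# A query whose embedding points mostly along the first axis.
print(top_k([0.8, 0.2, 0.1]))  # ['refund policy']
```

In a RAG pipeline, the documents returned by a search like this are pasted into the prompt before the model answers.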
Reasoning & architecture (3 terms)
Chain of thought — Model reasons step-by-step before answering. Improves accuracy.
Reasoning model — Models that produce internal thinking before responding. Examples: o3, Claude with extended thinking.
Test-time compute — Spending more compute at inference time for better answers without retraining.
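One simple way to spend more compute at inference time is to sample several answers and keep the most common one, a technique often called self-consistency or majority voting. The Python sketch below illustrates only that one approach; reasoning models use other methods as well. The make_sampler helper and its canned answers are hypothetical stand-ins for repeated model calls.

```python
from collections import Counter
from typing import Callable


def majority_vote(sample: Callable[[], str], n: int) -> str:
    """Call the sampler n times and return the most frequent answer."""
    answers = [sample() for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]


def make_sampler(canned: list[str]) -> Callable[[], str]:
    """Hypothetical stand-in for a model: returns the canned answers in order."""
    it = iter(canned)
    return lambda: next(it)


canned_answers = ["41", "42", "42", "40", "42"]

# One sample: whatever the model happens to say first.
print(majority_vote(make_sampler(canned_answers), n=1))  # 41

# Five samples: five times the inference cost, but the outlier is outvoted.
print(majority_vote(make_sampler(canned_answers), n=5))  # 42
```

The trade-off is visible in the call count: the model is the same, but a better answer costs more tokens and more latency.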
Evaluation (2 terms)
AI evals — Systematic test suites for AI systems. Unit tests for prompts and agents.
LLM observability — Tracing every LLM call (prompt, response, cost, latency). Mandatory for production.
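An eval suite can be as simple as a list of inputs with expected outputs and a pass rate. The Python sketch below assumes exact-match scoring and a hypothetical toy_agent function in place of a real agent; production evals usually need fuzzier graders, such as rubric scoring or a second model acting as judge.

```python
from typing import Callable


def toy_agent(prompt: str) -> str:
    """Hypothetical agent under test: answers from a tiny lookup table."""
    known = {
        "2+2": "4",
        "capital of France": "Paris",
        "opposite of hot": "cold",
    }
    return known.get(prompt, "I don't know")


def run_evals(
    agent: Callable[[str], str], cases: list[tuple[str, str]]
) -> tuple[float, list[str]]:
    """Run every case; return the pass rate and the prompts that failed."""
    failures = [prompt for prompt, expected in cases if agent(prompt) != expected]
    pass_rate = (len(cases) - len(failures)) / len(cases)
    return pass_rate, failures


CASES = [
    ("2+2", "4"),
    ("capital of France", "Paris"),
    ("opposite of hot", "cold"),
    ("largest planet", "Jupiter"),
]

pass_rate, failures = run_evals(toy_agent, CASES)
print(pass_rate)  # 0.75
print(failures)   # ['largest planet']
```

Rerunning the same cases after every prompt or model change is what makes evals behave like unit tests: a drop in pass rate flags a regression before customers see it.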
What you actually need to know as a buyer
For purchasing AI agents in 2026, focus on these:
- Autonomy tier (copilot / semi-auto / autonomous)
- Capabilities (tool use, RAG, voice, browser use)
- Pricing model (subscription, per-task, BYO-key)
- Key metrics (deflection rate, PR merge rate, etc.)
- MCP support (for coding agents specifically)
The rest is for builders, not buyers.
The verdict
These 25 terms cover roughly 95% of AI agent buyer conversations in 2026. For the other 5%, read our full glossary, which has 80+ entries.
For deeper guides, see The 15 best AI agents in 2026, AI agent vs LLM, and AI agent vs chatbot.