Tooling terms
Protocols and primitives agents use to talk to the outside world.
- A2A protocol
Google's 2025 open Agent2Agent protocol: a standard for agents from different vendors to discover each other, exchange tasks, and stream results.
- Agent protocols
The umbrella term for the standards agents use to communicate with tools (MCP) and with each other (A2A), the connective-tissue layer of the 2026 agent ecosystem.
- AI gateway
A middleware layer that sits between your application and LLM APIs. It handles routing, fallback, caching, rate limiting, cost tracking, and observability across multiple model providers.
- AI pipeline
A multi-step data processing flow that includes one or more LLM or AI calls. It typically combines preprocessing, retrieval, LLM inference, post-processing, and observability into a single deployable unit.
- AutoGen
Microsoft's open-source framework for multi-agent conversation: agents talk to each other to solve problems collaboratively, with explicit support for code execution and human-in-the-loop.
- Bedrock Agents
AWS's managed agent service inside Amazon Bedrock. It provides agent orchestration, tool integration via OpenAPI/Lambda, and a knowledge-base layer for RAG out of the box.
- Context window
The maximum number of tokens a model can consider at once, covering the system prompt, conversation, tool results, and the answer being generated.
- CrewAI
An open-source Python framework for role-based multi-agent systems: define agents with roles, goals, and tools, then orchestrate them into "crews" that collaborate on tasks.
- DSPy
A Stanford-built framework that treats LLM prompts as compilable programs: you define what you want declaratively, and DSPy optimizes the prompts and few-shot examples automatically.
- Function calling
An LLM API feature that lets the model emit a structured JSON call to a developer-defined function: the model picks the function name and arguments, and the runtime executes the call.
- JSON mode
An LLM API setting that constrains the output to syntactically valid JSON and, with strict mode, to a specific schema. The simplest reliable path to structured output.
- KV cache
A transformer inference optimization that stores key/value attention tensors from previous tokens so they do not need to be recomputed on every new token.
- LangChain
The original Python/TypeScript framework for building LLM applications, providing abstractions for chains, agents, tool use, memory, and retrieval. In 2026, mostly superseded by LangGraph for new projects.
- LangGraph
An open-source framework from LangChain for building stateful, multi-step agent applications as graphs: nodes are agent steps, edges define control flow, and state persists across steps.
- LlamaIndex
A Python framework focused on RAG and data-augmented LLM applications, providing indexing, retrieval, and query pipelines for connecting LLMs to your data.
- LLM gateway
A specific kind of AI gateway focused on LLM API calls. It provides a unified interface to multiple LLM providers (OpenAI, Anthropic, Google, Mistral, etc.) with routing, caching, and observability.
- llms.txt
A proposed convention (like robots.txt) for sites to tell LLMs which content to ingest, in what summary form, and on what terms. Adoption growing through 2025–2026.
- MCP server
A process that exposes tools, resources, or prompts over the Model Context Protocol, so any MCP-compliant agent can connect to it and use what it exposes.
- Model Context Protocol (MCP)
An open standard that lets any AI agent connect to any tool or data source through a single protocol, solving the M×N integration problem (M agents times N tools, each pair otherwise needing its own connector) for the agent ecosystem.
- Model router
A component that selects which LLM to use for each request, based on cost, latency, capability, or content classification. Sits inside an AI gateway or as a standalone routing layer.
- OpenAI Agents SDK
OpenAI's 2025 production framework for building agents: a production-ready upgrade of the earlier experimental Swarm project, with first-class handoffs, guardrails, tracing, and tool use.
- Parallel tool calling
A model capability where the LLM returns multiple tool calls in a single response, so the agent runtime can execute them concurrently rather than serially, cutting latency on independent operations.
- Prompt caching
A vendor-side optimization that reuses computation for shared prompt prefixes across requests. Cached tokens are typically billed at a 75–90% discount compared to fresh prompt tokens.
- Prompt templates
Parameterized prompt patterns stored as reusable, version-controlled assets, the basic abstraction for managing prompts at production scale.
- Prompt versioning
The practice of treating system prompts as first-class code: versioned, tested, and deployed through CI/CD instead of edited inline in source files.
- Pydantic AI
A Python agent framework from the Pydantic team: type-safe agents with structured outputs, model-agnostic, and a thin API designed to feel like FastAPI for LLMs.
- Semantic cache
A cache layer that matches incoming prompts to past prompts by embedding similarity rather than exact match, so it can serve stored responses for paraphrased queries.
- Structured output
A model feature that constrains output to a specific JSON schema, making LLM responses safely parseable by downstream code.
- System prompt
The initial instruction text given to an LLM that sets its persona, tools, constraints, and default behavior for the session.
- Vertex AI Agents
Google Cloud's managed agent platform inside Vertex AI, with a visual builder, code SDK, native Gemini models, and A2A-protocol-first orchestration.
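
To make the "Function calling" and "Parallel tool calling" entries concrete, here is a minimal sketch of the runtime side, using only the Python standard library. The tool names (`get_weather`, `get_time`), the `TOOLS` registry, and the JSON shape of `MODEL_RESPONSE` are all hypothetical and chosen for illustration; real vendor APIs each define their own response format.

```python
import asyncio
import json

# Hypothetical developer-defined tools; the sleep stands in for a network call.
async def get_weather(city: str) -> str:
    await asyncio.sleep(0.1)
    return f"Sunny in {city}"

async def get_time(timezone: str) -> str:
    await asyncio.sleep(0.1)
    return f"12:00 in {timezone}"

# Registry mapping the function names the model may pick to real callables.
TOOLS = {"get_weather": get_weather, "get_time": get_time}

# Assumed shape of a model response that carries two tool calls at once.
MODEL_RESPONSE = json.dumps({
    "tool_calls": [
        {"name": "get_weather", "arguments": {"city": "Paris"}},
        {"name": "get_time", "arguments": {"timezone": "UTC"}},
    ]
})

async def run_tool_calls(raw_response: str) -> list:
    """Parse the model's structured calls and execute them concurrently."""
    calls = json.loads(raw_response)["tool_calls"]
    pending = []
    for call in calls:
        if call["name"] not in TOOLS:
            raise ValueError(f"Model requested unknown tool: {call['name']}")
        # The model chose the name and arguments; the runtime does the calling.
        pending.append(TOOLS[call["name"]](**call["arguments"]))
    # gather runs the independent calls concurrently and keeps request order.
    return await asyncio.gather(*pending)

results = asyncio.run(run_tool_calls(MODEL_RESPONSE))
print(results)
```

Because the two calls are independent, running them through `asyncio.gather` takes roughly as long as the slowest call rather than the sum of both, which is the latency saving the parallel tool calling entry describes.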