LLM gateway
A specific kind of AI gateway focused on LLM API calls — provides a unified interface to multiple LLM providers (OpenAI, Anthropic, Google, Mistral, etc.) with routing, caching, and observability.
LLM gateway is the more common term in 2026 for what we used to call "AI proxy." It overlaps heavily with AI gateway, but it is scoped specifically to LLM calls, whereas an AI gateway can also cover broader AI infrastructure such as vector databases, embedding services, and other components.
The functional split: one unified endpoint (your app talks to this) → routing logic (which model to use for this request) → multiple LLM providers (Anthropic, OpenAI, etc.) → consolidated response back. Observability, cost tracking, caching, and failover all sit in the gateway layer.
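The flow above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular product works: the `Gateway` class, the `routes` table, and the two stand-in provider functions are all hypothetical, and real providers would be HTTP clients rather than local functions. It shows one endpoint (`complete`), a routing table with priority order, failover to the next provider on error, a response cache, and a simple per-provider call counter standing in for observability and cost tracking.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# A provider is modeled as a plain function: prompt in, text out.
# These are hypothetical stand-ins for real provider clients.
Provider = Callable[[str], str]


@dataclass
class Gateway:
    providers: Dict[str, Provider]   # provider name -> callable
    routes: Dict[str, List[str]]     # task -> provider names, in priority order
    cache: Dict[Tuple[str, str], str] = field(default_factory=dict)
    calls: Dict[str, int] = field(default_factory=dict)  # successful calls per provider

    def complete(self, task: str, prompt: str) -> str:
        """Single entry point: check cache, route, fail over, record usage."""
        key = (task, prompt)
        if key in self.cache:
            return self.cache[key]

        errors = []
        for name in self.routes[task]:
            try:
                text = self.providers[name](prompt)
            except Exception as exc:
                # Failover: remember the error and try the next provider.
                errors.append(f"{name}: {exc}")
                continue
            self.calls[name] = self.calls.get(name, 0) + 1
            self.cache[key] = text
            return text

        raise RuntimeError("all providers failed: " + "; ".join(errors))


def flaky_provider(prompt: str) -> str:
    # Simulates an outage at the preferred provider.
    raise TimeoutError("simulated outage")


def backup_provider(prompt: str) -> str:
    return f"backup answer to: {prompt}"


gateway = Gateway(
    providers={"primary": flaky_provider, "backup": backup_provider},
    routes={"chat": ["primary", "backup"]},
)

first = gateway.complete("chat", "hello")   # primary fails, backup answers
second = gateway.complete("chat", "hello")  # served from the cache
```

Because the application only ever calls `gateway.complete`, swapping providers or changing the priority order is a change to the `routes` table rather than to application code, which is the main design point of the gateway layer.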
Production teams in 2026 increasingly treat LLM gateways as critical infrastructure, alongside CDN, database, and monitoring. The 2026 leaders: Portkey, LiteLLM, Helicone, Braintrust, Cloudflare AI Gateway. Pick by ecosystem fit, meaning how well the gateway works with the providers and tooling you already use.
Frequently asked
LLM gateway vs AI gateway: are they the same?
In 2026 practice, mostly. LLM gateway is more specific (LLM calls only); AI gateway is broader (all AI infrastructure). The terms are increasingly used interchangeably.
Can I build my own LLM gateway?
It is possible, but rarely worth it. The off-the-shelf options (LiteLLM as an open-source option, Portkey as a commercial one) cover most needs at a fraction of the build cost. Build your own only when you have specific requirements that no vendor meets.