Usage-based pricing
A pricing model where the customer pays for what they actually use — typically tokens, tool calls, compute minutes, or active agent hours — with no minimum seat commitment.
Usage-based pricing maps spend to consumption. For LLM APIs the unit is tokens (input + output, often priced separately, with cached input cheaper). For agent platforms the unit varies: agent-runtime-hours, tool invocations, workflow executions.
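To make the token arithmetic concrete, here is a minimal sketch of a per-request cost calculation. The prices, the `TokenPrices` type, and the `request_cost` function are illustrative assumptions, not any provider's actual rates or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenPrices:
    """Hypothetical prices in USD per one million tokens."""
    input_per_m: float
    cached_input_per_m: float
    output_per_m: float


def request_cost(prices: TokenPrices, input_tokens: int,
                 cached_input_tokens: int, output_tokens: int) -> float:
    """Return the cost of one request in USD.

    cached_input_tokens is the part of input_tokens served from cache,
    billed at the cheaper cached rate.
    """
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")
    uncached = input_tokens - cached_input_tokens
    total_per_m = (
        uncached * prices.input_per_m
        + cached_input_tokens * prices.cached_input_per_m
        + output_tokens * prices.output_per_m
    )
    return total_per_m / 1_000_000


# Made-up example rates: cached input at one tenth of the normal input price.
EXAMPLE_PRICES = TokenPrices(input_per_m=3.00, cached_input_per_m=0.30,
                             output_per_m=15.00)

cost = request_cost(EXAMPLE_PRICES, input_tokens=10_000,
                    cached_input_tokens=8_000, output_tokens=1_000)
print(f"{cost:.4f}")  # prints 0.0234
```

With these assumed rates, output tokens account for most of the cost even though they are a small share of the total token count, which is why input and output are usually tracked separately.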
The model fits product-led growth motions because adoption can start small. The risk is bill shock — a single misconfigured loop can burn thousands of dollars overnight. Production-grade usage-based products ship budget caps, alerts, and per-tenant quotas.
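A hard per-tenant budget cap could look roughly like the sketch below. `TenantBudgets`, `BudgetExceeded`, and the dollar figures are hypothetical names and values chosen for illustration; a real system would persist spend and handle concurrent charges.

```python
class BudgetExceeded(Exception):
    """Raised when a charge would push a tenant past its cap."""


class TenantBudgets:
    """In-memory per-tenant spend tracker with a hard cap (illustrative only)."""

    def __init__(self, default_cap_usd: float) -> None:
        self.default_cap_usd = default_cap_usd
        self.caps: dict[str, float] = {}
        self.spent: dict[str, float] = {}

    def set_cap(self, tenant: str, cap_usd: float) -> None:
        """Override the default cap for one tenant."""
        self.caps[tenant] = cap_usd

    def charge(self, tenant: str, amount_usd: float) -> float:
        """Record a charge and return the remaining budget.

        The charge is rejected, and nothing is recorded, if it would
        exceed the tenant's cap.
        """
        cap = self.caps.get(tenant, self.default_cap_usd)
        new_total = self.spent.get(tenant, 0.0) + amount_usd
        if new_total > cap:
            raise BudgetExceeded(
                f"{tenant}: charge of {amount_usd} would exceed cap {cap}"
            )
        self.spent[tenant] = new_total
        return cap - new_total


budgets = TenantBudgets(default_cap_usd=100.0)
remaining = budgets.charge("acme", 60.0)  # 40.0 left under the default cap
```

Checking the cap before recording the charge means a runaway loop stops at the cap instead of being discovered on the next invoice.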
In 2026, most agent platforms offer usage-based pricing as the default, with seat-based or outcome-based options on top. Enterprise buyers almost always want a committed-spend discount layered on usage.
Frequently asked
How do I avoid bill shock with usage-based pricing?
Set per-tenant token caps. Alert at 50%, 75%, and 90% of budget. Give every agent loop a default max-step limit. Audit any month-over-month spend increase greater than 50%.
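The alert thresholds, max-step limit, and month-over-month audit can be sketched as three small helpers. All function names, the default of 25 steps, and the sample figures are assumptions made for illustration.

```python
from typing import Callable

ALERT_THRESHOLDS = (0.50, 0.75, 0.90)  # fractions of the budget


def crossed_thresholds(spent_before: float, spent_after: float,
                       budget: float) -> list[float]:
    """Return the alert thresholds crossed by moving from spent_before to spent_after."""
    return [t for t in ALERT_THRESHOLDS
            if spent_before < t * budget <= spent_after]


def run_agent(step: Callable[[], bool], max_steps: int = 25) -> int:
    """Call step() until it returns True; give up after max_steps calls.

    Returns the number of steps taken. Raises RuntimeError when the
    limit is hit, so a misconfigured loop cannot run indefinitely.
    """
    for n in range(1, max_steps + 1):
        if step():
            return n
    raise RuntimeError(f"agent stopped after {max_steps} steps without finishing")


def spend_jumped(previous_month: float, current_month: float,
                 limit: float = 0.50) -> bool:
    """Return True if spend grew by more than `limit` month over month."""
    if previous_month <= 0:
        return current_month > 0  # any spend after a zero month is worth a look
    return (current_month - previous_month) / previous_month > limit


# A jump from 40 to 80 against a budget of 100 crosses the 50% and 75% marks.
alerts = crossed_thresholds(40.0, 80.0, 100.0)
```

Reporting only the thresholds crossed by the latest charge avoids sending the same alert again on every later request.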