“What does one AI answer cost us?” is the question every budget conversation eventually lands on. The good news: it's genuinely easy to calculate. You need four numbers, one formula, and about five minutes with your usage logs.
The formula
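Cost per query = (input tokens × input rate + output tokens × output rate) ÷ 1,000,000, with both rates quoted in dollars per million tokens. The four numbers: your average input tokens and output tokens per query (both are in your provider's usage logs or API responses), and the input and output rates from the provider's pricing page. That's it; everything else is arithmetic.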
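As a minimal sketch, the formula fits in a few lines of Python. The function and parameter names are illustrative, not part of any provider's SDK, and the rates are assumed to be quoted in dollars per million tokens.

```python
def cost_per_query(input_tokens: float, output_tokens: float,
                   input_rate: float, output_rate: float) -> float:
    """Return the dollar cost of one query.

    input_rate and output_rate are assumed to be dollars per 1 million
    tokens, which is how most pricing pages quote them.
    """
    input_cost = input_tokens * input_rate
    output_cost = output_tokens * output_rate
    return (input_cost + output_cost) / 1_000_000


# Hypothetical averages and rates, for illustration only.
print(f"${cost_per_query(2_000, 500, 1.00, 5.00):.4f} per query")
```

Feed it the averages from your own usage logs rather than a single request, since individual queries vary widely in length.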
A realistic worked example
Picture a knowledge assistant answering an employee's policy question. The request that actually hits the model looks like this:
- System prompt: ~400 tokens — instructions, tone, guardrails.
- Retrieved context: ~900 tokens — the relevant policy passages.
- History + question: ~200 tokens — a short running conversation.
- Answer: ~300 tokens — a few clear paragraphs with a citation.
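That's 1,500 tokens in and 300 out. On a mid-tier model at $3 per million input tokens and $15 per million output tokens: input costs $0.0045, output costs $0.0045, for $0.009 per query. Under a cent. Note that the answer is a fifth of the input's token count but half the bill, because output tokens are priced five times higher.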
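To see where the money goes inside that one request, here is a short sketch that splits the cost by prompt component. The token counts and rates are the ones from the example above; the variable names are just for illustration.

```python
# Illustrative mid-tier rates from the example, in dollars per 1M tokens.
INPUT_RATE, OUTPUT_RATE = 3.00, 15.00

# Approximate token counts for each part of the request.
input_parts = {
    "system prompt": 400,
    "retrieved context": 900,
    "history + question": 200,
}
output_tokens = 300

input_tokens = sum(input_parts.values())
input_cost = input_tokens * INPUT_RATE / 1_000_000
output_cost = output_tokens * OUTPUT_RATE / 1_000_000
total_cost = input_cost + output_cost

# Print each component's token count and its share of the total cost.
for name, tokens in input_parts.items():
    part_cost = tokens * INPUT_RATE / 1_000_000
    print(f"{name:<20} {tokens:>5} tokens  {part_cost / total_cost:.0%} of cost")
print(f"{'answer':<20} {output_tokens:>5} tokens  {output_cost / total_cost:.0%} of cost")
print(f"total: ${total_cost:.4f} per query")
```

A breakdown like this shows which lever is worth pulling: here the answer alone is half the cost, and the retrieved context is the largest input component.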
Cost per query across model tiers
| Model tier (input / output, $ per 1M tokens) | Cost per query | Cost per 10,000 queries |
|---|---|---|
| Flagship ($12.50 / $62.50) | $0.0375 | $375 |
| Mid-tier ($3 / $15) | $0.009 | $90 |
| Small / fast ($0.50 / $2.50) | $0.0015 | $15 |
Rates are illustrative; every row prices the identical 1,500-in / 300-out query. The flagship tier costs 25× what the small tier does for the same query, which is why routing matters.
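The table can be reproduced with a short script, which is handy when you want to swap in your own rates. This is a sketch using the illustrative tiers above; the 80/20 routing split at the end is a purely hypothetical assumption, not a recommendation.

```python
# Illustrative (input, output) rates in dollars per 1M tokens.
TIERS = {
    "Flagship": (12.50, 62.50),
    "Mid-tier": (3.00, 15.00),
    "Small / fast": (0.50, 2.50),
}
INPUT_TOKENS, OUTPUT_TOKENS = 1_500, 300


def query_cost(input_rate: float, output_rate: float) -> float:
    """Dollar cost of the fixed 1,500-in / 300-out query at the given rates."""
    return (INPUT_TOKENS * input_rate + OUTPUT_TOKENS * output_rate) / 1_000_000


costs = {name: query_cost(*rates) for name, rates in TIERS.items()}
for name, cost in costs.items():
    print(f"{name:<13} ${cost:.4f} per query   ${cost * 10_000:,.0f} per 10,000")

# Ratio between the most and least expensive tier.
spread = max(costs.values()) / min(costs.values())
print(f"spread: {spread:.0f}x")

# Hypothetical routing: 80% of queries to the small tier, 20% to flagship.
blended = 0.8 * costs["Small / fast"] + 0.2 * costs["Flagship"]
print(f"blended: ${blended:.4f} per query")
```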
What the formula leaves out
API token costs are the visible part of the iceberg. Budget for the rest:
- Retries and fallbacks — failed or re-routed calls are billed too.
- Follow-up turns — each one resends the conversation history.
- Embeddings and indexing — RAG pipelines pay to ingest content, not just query it.
- Evaluation and monitoring — test suites run real tokens.
A practical planning buffer is 20–40% on top of raw per-query math.
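Two of these hidden costs are easy to sketch in code. The first function applies the 20–40% buffer to a raw estimate. The second shows how input tokens grow when each follow-up resends the history. It assumes the 1,300 tokens of system prompt and context are sent on every turn, a hypothetical 50-token question, and the 300-token answer from the example. Your numbers will differ.

```python
def buffered_budget(raw_cost: float, low: float = 0.20, high: float = 0.40):
    """Return the (low, high) planning range for a raw cost estimate."""
    return raw_cost * (1 + low), raw_cost * (1 + high)


def input_tokens_per_turn(turns: int, fixed: int = 1_300,
                          question: int = 50, answer: int = 300):
    """Input tokens billed on each turn when the full history is resent.

    fixed    -- system prompt + retrieved context, assumed sent every turn
    question -- tokens per user question (hypothetical value)
    answer   -- tokens per model answer
    """
    history = 0
    billed = []
    for _ in range(turns):
        billed.append(fixed + history + question)
        history += question + answer  # this exchange joins the history
    return billed


raw = 0.009 * 10_000  # 10,000 mid-tier queries from the table
plan_low, plan_high = buffered_budget(raw)
print(f"raw ${raw:,.0f}, plan for ${plan_low:,.0f} to ${plan_high:,.0f}")
print(input_tokens_per_turn(3))
```

Under these assumptions the third turn bills 2,050 input tokens against 1,350 for the first, which is the kind of growth the buffer is meant to absorb.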
Cost per query vs cost per answer
Here's the comparison that actually matters. When a human expert answers the same policy question, it takes about five minutes. At a fully loaded $53/hour, that's roughly $4.42 per question, around 500× the cost of the mid-tier AI query above. The point isn't that AI replaces the expert; it's that the expert stops being the bottleneck for questions a system can answer instantly.
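The same comparison as a quick sketch, using the figures above. The constant names are illustrative; swap in your own hourly rate and handling time.

```python
HOURLY_RATE = 53.00        # fully loaded cost of the expert, dollars per hour
MINUTES_PER_QUESTION = 5   # time the expert spends on one question
AI_COST_PER_QUERY = 0.009  # mid-tier query from the worked example

# Cost of the expert's time for one question.
human_cost = HOURLY_RATE * MINUTES_PER_QUESTION / 60
# How many times more the human answer costs than the AI query.
ratio = human_cost / AI_COST_PER_QUERY

print(f"human: ${human_cost:.2f} per question")
print(f"AI:    ${AI_COST_PER_QUERY:.3f} per query")
print(f"ratio: about {ratio:.0f}x")
```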
Our AI ROI Calculator runs exactly this math across your team's real salary and question volume — it's the fastest way to see what repetitive questions cost your organization today.
- Cost per query = token counts × rates ÷ 1M. Four numbers, one formula.
- A typical knowledge query costs fractions of a cent to a few cents, depending on model tier.
- Add a 20–40% buffer for retries, follow-ups, embeddings, and evals.
- The real benchmark is the human alternative — minutes of expert time per question, at hundreds of times the cost.