AI Cost per Query: How to Calculate It

One formula, a few realistic numbers, and you can put a dollar figure on every question your AI answers — and compare it to the human alternative.

[Figure: Query (input tokens), AI model (input + output rates), Cost (cost per query)]

“What does one AI answer cost us?” is the question every budget conversation eventually lands on. The good news: it's genuinely easy to calculate. You need four numbers, one formula, and about five minutes with your usage logs.

The formula

Cost per query = (input tokens × input rate + output tokens × output rate) ÷ 1,000,000

The four numbers: your average input tokens and output tokens per query (both are in your provider's usage logs or API responses), and the input and output rates from the provider's pricing page. Rates are quoted in dollars per million tokens, which is why the formula divides by 1,000,000. That's it; everything else is multiplication.
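As a minimal sketch, the formula fits in a single Python function. The name `cost_per_query` and its parameter names are illustrative choices, not any provider's API, and the rates are assumed to be in dollars per 1,000,000 tokens.

```python
def cost_per_query(input_tokens: float, output_tokens: float,
                   input_rate: float, output_rate: float) -> float:
    """Dollar cost of one query.

    input_rate and output_rate are assumed to be dollars per 1,000,000 tokens,
    the unit most pricing pages use.
    """
    if min(input_tokens, output_tokens, input_rate, output_rate) < 0:
        raise ValueError("token counts and rates must be non-negative")
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000
```

Keeping the rates as parameters means a price change only touches the call site.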

A realistic worked example

Picture a knowledge assistant answering an employee's policy question. The request that actually hits the model looks like this:

  • System prompt: ~400 tokens — instructions, tone, guardrails.
  • Retrieved context: ~900 tokens — the relevant policy passages.
  • History + question: ~200 tokens — a short running conversation.
  • Answer: ~300 tokens — a few clear paragraphs with a citation.

That's 1,500 tokens in and 300 out. On a mid-tier model at $3 input / $15 output per million tokens, input costs 1,500 × $3 ÷ 1,000,000 = $0.0045 and output costs 300 × $15 ÷ 1,000,000 = $0.0045, for a total of $0.009 per query. Under a cent.
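The same arithmetic, sketched in Python so the token breakdown can be swapped for your own. The dictionary keys are illustrative labels for the components listed above; the rates are the example's $3 / $15 per million.

```python
# Input-side token counts from the worked example (approximate figures).
request_tokens = {
    "system_prompt": 400,
    "retrieved_context": 900,
    "history_and_question": 200,
}
answer_tokens = 300

INPUT_RATE = 3.00    # dollars per 1M input tokens (example mid-tier rate)
OUTPUT_RATE = 15.00  # dollars per 1M output tokens (example mid-tier rate)

input_tokens = sum(request_tokens.values())
input_cost = input_tokens * INPUT_RATE / 1_000_000
output_cost = answer_tokens * OUTPUT_RATE / 1_000_000
total_cost = input_cost + output_cost

print(f"{input_tokens} in, {answer_tokens} out: ${total_cost:.4f} per query")
```

Note that the 300 output tokens cost as much as the 1,500 input tokens, because the output rate is five times the input rate.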

Cost per query across model tiers

| Model tier (in / out per 1M tokens) | Same query | Per 10,000 queries |
| --- | --- | --- |
| Flagship ($12.50 / $62.50) | $0.0375 | $375 |
| Mid-tier ($3 / $15) | $0.009 | $90 |
| Small / fast ($0.50 / $2.50) | $0.0015 | $15 |

Rates are illustrative; every row prices the identical 1,500-in / 300-out query. Moving from the small tier to the flagship changes the bill 25×, which is why routing matters.
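To reproduce the tier comparison for a different query shape, a short sketch like the following may help. The rates are the illustrative figures from the comparison above, not any provider's real prices, and `tier_costs` is a hypothetical helper name.

```python
# Illustrative (input, output) rates in dollars per 1M tokens.
TIERS = {
    "Flagship": (12.50, 62.50),
    "Mid-tier": (3.00, 15.00),
    "Small / fast": (0.50, 2.50),
}

def tier_costs(input_tokens, output_tokens, tiers=TIERS):
    """Return {tier: (cost per query, cost per 10,000 queries)} in dollars."""
    result = {}
    for name, (input_rate, output_rate) in tiers.items():
        per_query = (input_tokens * input_rate
                     + output_tokens * output_rate) / 1_000_000
        result[name] = (per_query, per_query * 10_000)
    return result

costs = tier_costs(1_500, 300)
for name, (per_query, per_10k) in costs.items():
    print(f"{name:<13} ${per_query:.4f}  ${per_10k:,.0f}")

# Spread between the most and least expensive tier for this query.
spread = costs["Flagship"][0] / costs["Small / fast"][0]
```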

What the formula leaves out

API token costs are the visible part of the iceberg. Budget for the rest:

  • Retries and fallbacks — failed or re-routed calls are billed too.
  • Follow-up turns — each one resends the conversation history.
  • Embeddings and indexing — RAG pipelines pay to ingest content, not just query it.
  • Evaluation and monitoring — test suites run real tokens.

A practical planning buffer is 20–40% on top of raw per-query math. For the $0.009 mid-tier query above, that means budgeting roughly $0.011 to $0.013 per query.
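Applying the buffer is one more multiplication. This sketch assumes the buffer is a flat percentage on the raw cost, which is a planning simplification rather than a model of how retries or embeddings are actually billed; `with_buffer` is a hypothetical helper name.

```python
def with_buffer(raw_cost: float, buffer: float) -> float:
    """Raw per-query cost plus a planning buffer (0.20 means +20%)."""
    if buffer < 0:
        raise ValueError("buffer must be non-negative")
    return raw_cost * (1 + buffer)

raw = 0.009  # mid-tier cost per query from the worked example
low = with_buffer(raw, 0.20)
high = with_buffer(raw, 0.40)

print(f"Planned cost per query: ${low:.4f} to ${high:.4f}")
print(f"Per 10,000 queries: ${low * 10_000:,.0f} to ${high * 10_000:,.0f}")
```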

Cost per query vs cost per answer

Here's the comparison that actually matters. When a person answers the same policy question, it takes an expert about five minutes. At a fully loaded $53/hour, that's 5 ÷ 60 × $53 ≈ $4.42 per question, roughly 490× (call it 500×) the $0.009 cost of the mid-tier AI query above. The point isn't that AI replaces the expert; it's that the expert stops being the bottleneck for questions a system can answer instantly.
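The human side of the comparison can be sketched the same way. The five minutes and $53/hour are the illustrative figures from this section; `human_cost_per_answer` is a hypothetical helper name.

```python
def human_cost_per_answer(minutes: float, hourly_rate: float) -> float:
    """Dollar cost of a person spending `minutes` at a fully loaded hourly rate."""
    return minutes / 60 * hourly_rate

human = human_cost_per_answer(minutes=5, hourly_rate=53)
ai = 0.009  # mid-tier cost per query from the worked example
ratio = human / ai

print(f"Human: ${human:.2f}  AI: ${ai:.3f}  ratio: {ratio:.0f}x")
```

Swapping in a planning-buffered AI cost lowers the ratio somewhat, but it stays in the hundreds.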

Put your own numbers in

Our AI ROI Calculator runs exactly this math across your team's real salary and question volume — it's the fastest way to see what repetitive questions cost your organization today.

Key takeaways
  • Cost per query = token counts × rates ÷ 1M. Four numbers, one formula.
  • A typical knowledge query costs fractions of a cent to a few cents, depending on model tier.
  • Add a 20–40% buffer for retries, follow-ups, embeddings, and evals.
  • The real benchmark is the human alternative — minutes of expert time per question, at hundreds of times the cost.

Frequently asked questions

What's a typical AI cost per query?

For a knowledge-assistant query with moderate context, expect fractions of a cent on small models, around a cent on mid-tier models, and a few cents on flagship models. Context length is the biggest variable.

Do follow-up questions cost more?

Yes — every turn resends the conversation history, so token counts grow as a chat continues. Prompt caching and history summarization keep long conversations affordable.
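A rough sketch of how input tokens grow when the full history is resent on every turn. It assumes no prompt caching or summarization, a fixed 1,300 tokens of system prompt plus retrieved context (as in the worked example), and hypothetical 100-token questions and 300-token answers.

```python
def input_tokens_per_turn(turns: int, fixed_tokens: int,
                          question_tokens: int, answer_tokens: int) -> list:
    """Input tokens billed on each turn when the whole history is resent."""
    history = 0
    per_turn = []
    for _ in range(turns):
        # Each request carries the fixed prompt, all prior turns, and the new question.
        per_turn.append(fixed_tokens + history + question_tokens)
        # After the model answers, both the question and the answer join the history.
        history += question_tokens + answer_tokens
    return per_turn

growth = input_tokens_per_turn(turns=3, fixed_tokens=1_300,
                               question_tokens=100, answer_tokens=300)
print(growth)  # [1400, 1800, 2200]
```

Under these assumptions every extra turn adds 400 input tokens to each later request, so the total input bill grows faster than the number of turns.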

Where do I find my token counts?

Every major provider returns token usage in the API response and aggregates it in a usage dashboard. Average the input and output counts over a representative week of traffic.
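Averaging is straightforward once the usage records are exported. The sketch below assumes each record is a dictionary with `input_tokens` and `output_tokens` keys; those field names are an assumption, since providers label usage fields differently, and the two sample records are made up for illustration.

```python
from statistics import mean

def average_tokens(records):
    """Return (average input tokens, average output tokens) across usage records."""
    if not records:
        raise ValueError("need at least one usage record")
    avg_in = mean(r["input_tokens"] for r in records)
    avg_out = mean(r["output_tokens"] for r in records)
    return avg_in, avg_out

# Hypothetical sample; in practice, load a representative week of traffic.
sample = [
    {"input_tokens": 1_400, "output_tokens": 280},
    {"input_tokens": 1_600, "output_tokens": 320},
]
avg_in, avg_out = average_tokens(sample)
print(avg_in, avg_out)
```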
