Tools · Cost cap

AI Feature Cost Estimator

Before you ship an LLM feature, know what it costs at scale. Put in the traffic and the token shape; get cost per request, per month, and per year. Token rates are editable — set them to your provider’s current pricing.

Prompt + context + retrieved chunks.

Model tier (ballpark — fills the rates below, then edit freely)
Monthly
Per request
Annual

How it works: monthly cost = requests × (input tokens × input rate + output tokens × output rate) ÷ 1,000,000. Deterministic from published per-token pricing — it doesn’t model caching, batch discounts, or retries, so treat it as the floor of a budget, not the ceiling. Nothing is sent anywhere; the math runs in your browser.

Want this number capped in your app, not just estimated? A cost ceiling, a latency budget, and an eval harness are the boring parts I build in. That’s the engagement.

Email me More tools