2 weeks ago

Tues Jan 27, 2026 3:10pm PST

Ask HN: How do you budget for token-based AI APIs?

The norm today for using AI models via APIs is token-based pricing, where you pay for the input and output tokens you consume.

While this isn’t hard to understand, in practice it makes costs hard to predict, because prompt and response lengths vary from request to request. That matters most for small teams moving from experiments to early production. This feels less like a technical problem and more like a budgeting and planning problem.

I’m curious about alternative pricing abstractions, for example a subscription with unlimited tokens but a capped number of requests, aimed at making monthly spend easier to reason about while building.
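To make the comparison concrete, here is a rough Python sketch of the arithmetic. Everything in it is an assumption for illustration: the per-token prices, the flat fee, the request cap, and the names `TokenPricing` and `CappedSubscription` are made up and do not reflect any provider's real rates or plans.

```python
from dataclasses import dataclass


@dataclass
class TokenPricing:
    # Hypothetical prices in USD per 1 million tokens (not real provider rates).
    input_per_million: float
    output_per_million: float

    def monthly_cost(self, requests: int, avg_input_tokens: int, avg_output_tokens: int) -> float:
        # Spend scales with request count AND average tokens per request.
        input_cost = requests * avg_input_tokens * self.input_per_million / 1_000_000
        output_cost = requests * avg_output_tokens * self.output_per_million / 1_000_000
        return input_cost + output_cost


@dataclass
class CappedSubscription:
    # Hypothetical flat plan: fixed monthly fee, unlimited tokens, capped requests.
    monthly_fee: float
    request_cap: int

    def monthly_cost(self, requests: int) -> float:
        if requests > self.request_cap:
            raise ValueError("request cap exceeded; plan does not cover this usage")
        # Cost is independent of how many tokens each request uses.
        return self.monthly_fee


pricing = TokenPricing(input_per_million=2.0, output_per_million=8.0)
plan = CappedSubscription(monthly_fee=200.0, request_cap=50_000)

requests = 40_000
# Same request volume, two different assumed prompt/response sizes.
low = pricing.monthly_cost(requests, avg_input_tokens=500, avg_output_tokens=200)
high = pricing.monthly_cost(requests, avg_input_tokens=4_000, avg_output_tokens=1_000)
flat = plan.monthly_cost(requests)

print(f"token-based, short prompts: ${low:.2f}")
print(f"token-based, long prompts:  ${high:.2f}")
print(f"capped subscription:        ${flat:.2f}")
```

With these made-up numbers, the same 40,000 requests cost anywhere from $104 to $640 under token pricing depending on prompt and response length, while the capped plan stays at $200 until the cap is hit. The tradeoff is that the flat plan moves the uncertainty from tokens to request count.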

For people running AI in production today, does token-based billing give you enough predictability, or would a model like this actually reduce friction? What tradeoffs would matter most to you?
