We’re quickly coming to an inflection point for using AI to build stuff.
Until now, most companies encouraging AI usage among their workforce have cared mainly that people are using it at all. Use more tokens. More token usage means our team uses it. We think that’s a good thing.
However, that will soon change because a developer all-in on AI now costs the company between $500 and $1500 extra per day in tokens.
As with all tool and resource costs, organizations will quickly have to start managing token budgets. Here’s a current list of token costs for the mainstream models, in USD per million tokens:
| Model | Input Cost | Output Cost | Cache Read Cost |
|---|---|---|---|
| glm | 0.6 | 2.2 | 0.11 |
| kimi-k2.5 | 0.5 | 2.8 | — |
| gemini-3-flash | 0.5 | 3.0 | — |
| gpt-5.3-codex | 1.75 | 14.0 | 0.175 |
| gpt-5.4 | 2.5 | 15.0 | 0.25 |
| claude-sonnet-4-5 | 3.0 | 15.0 | 0.3 |
| claude-sonnet-4-6 | 3.0 | 15.0 | 0.3 |
| gemini-3-1-pro | 2.0 | 12.0 | 0.2 |
| claude-opus-4-6 | 5.0 | 25.0 | 0.5 |
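To make those numbers concrete, here is a rough sketch of how you might estimate one developer's daily spend per model. It assumes the prices in the table are USD per 1M tokens, and that models with no listed cache price bill cached reads at the normal input rate. The `daily_cost` helper and the sample workload are made up for illustration, so plug in your own usage numbers.

```python
# Prices copied from the table above, assumed to be USD per 1M tokens.
# Each entry is (input, output, cache_read); None means no cache price is listed.
PRICES = {
    "glm": (0.6, 2.2, 0.11),
    "kimi-k2.5": (0.5, 2.8, None),
    "gemini-3-flash": (0.5, 3.0, None),
    "gpt-5.3-codex": (1.75, 14.0, 0.175),
    "gpt-5.4": (2.5, 15.0, 0.25),
    "claude-sonnet-4-5": (3.0, 15.0, 0.3),
    "claude-sonnet-4-6": (3.0, 15.0, 0.3),
    "gemini-3-1-pro": (2.0, 12.0, 0.2),
    "claude-opus-4-6": (5.0, 25.0, 0.5),
}


def daily_cost(model, input_tokens, output_tokens, cached_tokens=0):
    """Estimate the USD cost of a day's token usage on one model."""
    in_price, out_price, cache_price = PRICES[model]
    if cache_price is None:
        # Assumption: with no listed cache price, cached reads cost the input rate.
        cache_price = in_price
    total = (
        input_tokens * in_price
        + output_tokens * out_price
        + cached_tokens * cache_price
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload for one heavy day of agentic coding.
    workload = dict(
        input_tokens=20_000_000,
        output_tokens=2_000_000,
        cached_tokens=100_000_000,
    )
    # Print models from cheapest to most expensive for this workload.
    for model in sorted(PRICES, key=lambda m: daily_cost(m, **workload)):
        print(f"{model:20s} ${daily_cost(model, **workload):8.2f}")
```

Running the same workload through every row is a quick way to see how far apart the models really are once cache reads and output tokens are counted, not just the headline input price.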
Is Opus 4.6 2X better than GPT-5.4 for software?
What is GLM safe to use for?
I’ve heard Gemini is good at docs. Can we use it for that?
Much like figuring out which person should do which task, we now have to figure out which model should do which task, with token cost as a deciding factor.
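One simple way to think about that assignment, sketched below: keep a list of which models your team considers good enough for each kind of task, then pick the cheapest one on the list. The task names and the `ALLOWED` lists here are placeholders I made up, not recommendations, and ranking by output price alone is a simplification. Prices are the output costs from the table, assumed to be USD per 1M tokens.

```python
# Output prices copied from the table, assumed to be USD per 1M tokens.
OUTPUT_PRICE = {
    "glm": 2.2,
    "gemini-3-flash": 3.0,
    "gpt-5.4": 15.0,
    "claude-opus-4-6": 25.0,
}

# Hypothetical allow-list: which models a team has decided are acceptable
# for each task type. These entries are placeholders, not recommendations.
ALLOWED = {
    "docs": ["gemini-3-flash", "gpt-5.4"],
    "feature_work": ["gpt-5.4", "claude-opus-4-6"],
    "boilerplate": ["glm", "gemini-3-flash"],
}


def pick_model(task_type):
    """Return the cheapest model (by output price) allowed for this task type."""
    candidates = ALLOWED[task_type]
    return min(candidates, key=lambda model: OUTPUT_PRICE[model])


if __name__ == "__main__":
    for task in ALLOWED:
        print(f"{task}: {pick_model(task)}")
```

The hard part is not the lookup, it is filling in the allow-list. That is where questions like the ones above have to be answered with your own evaluations.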
Figure out how to get AI working for you, or you’ll be working for AI.