Cheapest LLM for coding
Coding agents read a lot of code and write a fair amount back, so both input and output count — and they need a large context window to hold the files in play. These models clear a 200K-token floor, ranked cheapest first.
Cheapest models for coding
Estimated monthly cost for a coding agent that uses ~90M input and ~25M output tokens, sorted cheapest first.
| # | Model | Provider | Context | Input $/M | Output $/M | Monthly cost |
|---|---|---|---|---|---|---|
| 1 | Amazon Nova Lite | Amazon | 300K | $0.06 | $0.24 | $11.40 |
| 2 | Qwen-Flash | Alibaba | 1M | $0.05 | $0.40 | $14.50 |
| 3 | Llama 4 Scout (17B-16E Instruct) | Meta | 10M | $0.10 | $0.30 | $16.50 |
| 4 | Ministral 3 8B | Mistral | 256K | $0.15 | $0.15 | $17.25 |
| 5 | Qwen3.5-Flash | Alibaba | 1M | $0.10 | $0.40 | $19.00 |
| 6 | Llama 4 Maverick (17B-128E Instruct) | Meta | 1M | $0.15 | $0.60 | $28.50 |
| 7 | Mistral Small 4 | Mistral | 256K | $0.15 | $0.60 | $28.50 |
| 8 | GPT-5.4 nano | OpenAI | 400K | $0.20 | $1.25 | $49.25 |
| 9 | Gemini 3.1 Flash-Lite | Google | 1M | $0.25 | $1.50 | $60.00 |
| 10 | Qwen3.6-Flash | Alibaba | 1M | $0.25 | $1.50 | $60.00 |
| 11 | Qwen-Plus (Qwen3-series) | Alibaba | 1M | $0.40 | $1.20 | $66.00 |
| 12 | Qwen3.7-Plus | Alibaba | 1M | $0.40 | $1.60 | $76.00 |
Estimate only; excludes prompt caching, batch discounts and free tiers. Different volumes change the ranking, so run your own numbers. Prices verified against official docs; catalog updated 2026-06-28.
Coding workloads carry whole files and diffs into context and generate substantial output, so we weight a 90M-in / 25M-out monthly mix and require ≥200K context to fit a real working set. Cheapest is not always best here — verify the model can actually pass your tests before committing.
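To illustrate how the token mix can change the ranking, here is a minimal sketch. It reprices four rows from the table above under the page's 90M-in / 25M-out mix and under an output-heavy mix of 10M in / 100M out. The second mix, the `PRICES` dictionary and the function names are illustrative assumptions, not part of the catalog.

```python
# Prices copied from the table above: (input $/M, output $/M).
PRICES = {
    "Amazon Nova Lite": (0.06, 0.24),
    "Qwen-Flash": (0.05, 0.40),
    "Llama 4 Scout": (0.10, 0.30),
    "Ministral 3 8B": (0.15, 0.15),
}


def monthly_cost(input_price, output_price, input_m, output_m):
    """USD cost for input_m / output_m million tokens per month."""
    return input_price * input_m + output_price * output_m


def rank(input_m, output_m):
    """Return model names sorted cheapest first for the given mix."""
    return sorted(
        PRICES,
        key=lambda name: monthly_cost(*PRICES[name], input_m, output_m),
    )


page_mix = rank(90, 25)       # the mix used on this page
output_heavy = rank(10, 100)  # hypothetical output-heavy mix (assumption)

print(page_mix)
print(output_heavy)
```

Under the assumed output-heavy mix, Ministral 3 8B moves from fourth to first among these four, because its output price is the lowest of the group.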
What is the cheapest LLM for coding?
Amazon Nova Lite (Amazon) is the cheapest generally-available model we track for coding, at $0.06 per 1M input tokens and $0.24 per 1M output tokens — about $11.40/month for a coding agent burning ~90M input and ~25M output tokens a month. Qwen-Flash is the next cheapest at $14.50/month.
How is "cheapest for coding" calculated?
We price a representative monthly workload — a coding agent burning ~90M input and ~25M output tokens a month — against every generally-available model, then rank by total cost. Only models with at least a 200K-token context window are included. All prices are USD per 1M tokens, sourced from official provider documentation.
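The calculation described above can be sketched roughly as follows: filter by context window, price the fixed workload, then sort by total. This is a reading of the stated method, not the site's actual code. The `Model` class, the helper names, the interpretation of "K" as 1,000 tokens and the `small-context-example` entry are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    context_tokens: int  # context window size in tokens
    input_price: float   # USD per 1M input tokens
    output_price: float  # USD per 1M output tokens


INPUT_M = 90           # million input tokens per month (workload from this page)
OUTPUT_M = 25          # million output tokens per month
MIN_CONTEXT = 200_000  # context floor, assuming 200K means 200,000 tokens


def monthly_cost(m: Model) -> float:
    """Total USD per month for the fixed workload, rounded to cents."""
    return round(m.input_price * INPUT_M + m.output_price * OUTPUT_M, 2)


def rank(models):
    """Drop models below the context floor, then sort cheapest first."""
    eligible = [m for m in models if m.context_tokens >= MIN_CONTEXT]
    return sorted(eligible, key=monthly_cost)


catalog = [
    Model("GPT-5.4 nano", 400_000, 0.20, 1.25),
    Model("Amazon Nova Lite", 300_000, 0.06, 0.24),
    Model("Ministral 3 8B", 256_000, 0.15, 0.15),
    # Hypothetical entry, included only to show the context filter at work.
    Model("small-context-example", 128_000, 0.01, 0.01),
]

ranked = rank(catalog)
for m in ranked:
    print(f"{m.name}: ${monthly_cost(m):.2f}")
```

For the three real rows, this reproduces the table's figures ($11.40, $17.25 and $49.25), and the hypothetical 128K entry is excluded despite its lower price.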
Is the cheapest model always the right choice for coding?
No. Price is one axis; quality, latency, rate limits and reliability matter too. Use this ranking to shortlist, then test the top candidates on your own coding workload before committing. Cost is easy to measure — fit is not.