Question 1

What is the cheapest LLM for vision?

Accepted Answer

Ministral 3 3B (Mistral) is the cheapest generally available vision model we track, at $0.04 per 1M input tokens and $0.04 per 1M output tokens. For an image-understanding workload of ~20M input and ~5M output tokens a month, that comes to about $1.00/month (20 × $0.04 + 5 × $0.04). Ministral 3 8B is the next cheapest at $3.75/month.

Question 2

How is "cheapest for vision" calculated?

Accepted Answer

We price a representative monthly image-understanding workload (~20M input and ~5M output tokens) against every generally available model that accepts image input, then rank by total cost. Total cost is input tokens × input price plus output tokens × output price. All prices are USD per 1M tokens, sourced from official provider documentation.
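The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the site's actual code: the names `monthly_cost`, `rank` and `MODELS` are assumptions made for this example, the prices are a handful of rows copied from the table, and the text-only entry is a made-up placeholder included only to show the image-input filter.

```python
from decimal import Decimal

# Workload stated in the answer: ~20M input and ~5M output tokens per month.
INPUT_TOKENS_M = Decimal("20")
OUTPUT_TOKENS_M = Decimal("5")


def monthly_cost(input_price, output_price):
    """Monthly cost in USD, given prices in USD per 1M tokens."""
    return (INPUT_TOKENS_M * Decimal(input_price)
            + OUTPUT_TOKENS_M * Decimal(output_price))


# (model, input $/1M, output $/1M, accepts image input)
MODELS = [
    ("Claude Haiku 4.5", "1", "5", True),
    ("Ministral 3 8B", "0.15", "0.15", True),
    ("Mistral Small 4", "0.15", "0.6", True),
    ("Ministral 3 3B", "0.04", "0.04", True),
    # Hypothetical placeholder, not a real model or price: shows the filter.
    ("hypothetical-text-only", "0.01", "0.01", False),
]


def rank(models):
    """Drop models without image input, then sort by monthly cost."""
    costs = [(name, monthly_cost(inp, out))
             for name, inp, out, vision in models if vision]
    return sorted(costs, key=lambda row: row[1])


for name, cost in rank(MODELS):
    print(f"{name}: ${cost:.2f}/month")
```

Decimal is used instead of float so that values such as 0.04 and 0.15 add up exactly; the text-only placeholder never appears in the ranking even though its price is the lowest.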

Question 3

Is the cheapest model always the right choice for vision?

Accepted Answer

No. Price is one axis; quality, latency, rate limits and reliability matter too. Use this ranking to shortlist, then test the top candidates on your own vision workload before committing. Cost is easy to measure — fit is not.

#	Model	Provider	Context	Input ($/1M tokens)	Output ($/1M tokens)	Monthly cost
1	Ministral 3 3B	Mistral	—	$0.04	$0.04	$1.00
2	Ministral 3 8B	Mistral	256K	$0.15	$0.15	$3.75
3	Ministral 3 14B	Mistral	—	$0.20	$0.20	$5.00
4	Mistral Small 4	Mistral	256K	$0.15	$0.60	$6.00
5	Mistral Large 3	Mistral	256K	$0.50	$1.50	$17.50
6	Grok Build 0.1	xAI	256K	$1.00	$2.00	$30.00
7	Grok 4.3	xAI	1M	$1.25	$2.50	$37.50
8	Grok 4.20 (0309) Reasoning	xAI	1M	$1.25	$2.50	$37.50
9	Grok 4.20 (0309) Non-Reasoning	xAI	1M	$1.25	$2.50	$37.50
10	Claude Haiku 4.5	Anthropic	200K	$1.00	$5.00	$45.00
11	Mistral Medium 3.5	Mistral	—	$1.50	$7.50	$67.50
12	Claude Sonnet 4.6	Anthropic	1M	$3.00	$15.00	$135.00

Cheapest LLM for vision
