Llama 4 Scout (17B-16E Instruct)

Open-weight, natively multimodal mixture-of-experts (MoE) model: 17B active / 109B total parameters across 16 experts; fits on a single H100. License: Llama 4 Community License Agreement. The open weights support a context of up to 10M tokens; the official Llama API serves the model at 128k (model ID 'Llama-4-Scout-17B-16E-Instruct-FP8'). Hosted pricing: OpenRouter (slug 'meta-llama/llama-4-scout') lists $0.10 in / $0.30 out per 1M tokens (page accessed 2026-06-28); Together AI reported ~$0.08 in / $0.30 out per 1M tokens. No cached-input discount is published. Supports the same 12 languages as Maverick.
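As a rough illustration of how the quoted figures combine, the sketch below estimates the cost of one request at the OpenRouter prices above and checks whether a token count fits the hosted or open-weight context. The function names are hypothetical, the prices are the ones quoted on this page and may change, and reading "128k" as 128,000 tokens is an assumption.

```python
from decimal import Decimal

# Prices per 1M tokens as quoted on this page (OpenRouter listing); may change.
PRICE_IN_PER_M = Decimal("0.10")
PRICE_OUT_PER_M = Decimal("0.30")

# Context limits quoted on this page.
# Assumption: "128k" is read here as 128,000 tokens.
HOSTED_CONTEXT = 128_000
OPEN_WEIGHT_CONTEXT = 10_000_000


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost of one request at the quoted prices."""
    million = Decimal(1_000_000)
    return (input_tokens * PRICE_IN_PER_M + output_tokens * PRICE_OUT_PER_M) / million


def fits_context(total_tokens: int, limit: int) -> bool:
    """Return True if the token count is within the given context limit."""
    return total_tokens <= limit


if __name__ == "__main__":
    # Hypothetical workload: 200k input tokens, 50k output tokens.
    print(estimate_cost_usd(200_000, 50_000))
    print(fits_context(250_000, HOSTED_CONTEXT))
    print(fits_context(250_000, OPEN_WEIGHT_CONTEXT))
```

Decimal is used instead of float so that small per-token prices do not pick up binary rounding error when summed.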

Provider: Meta
Status: GA
Input price: $0.10 / 1M tokens
Output price: $0.30 / 1M tokens
Cached input: not published
Blended price: $0.15 / 1M tokens
Context window: 10,000,000 tokens (10M, open weights)
Max output: not listed
Modality: text, image
Knowledge cutoff: 2024-08
Released: 5 Apr 2025
API string: meta-llama/Llama-4-Scout-17B-16E-Instruct-FP8
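
The page does not say how the blended price is derived. The listed $0.15 is consistent with weighting input and output tokens 3:1, so the sketch below assumes that ratio; the function name and the default weights are illustrative, not taken from the source.

```python
from decimal import Decimal


def blended_price(price_in: Decimal, price_out: Decimal,
                  in_weight: int = 3, out_weight: int = 1) -> Decimal:
    """Weighted average of input and output prices per 1M tokens.

    Assumption: a 3:1 input-to-output weighting, which reproduces the
    listed blended figure; the source does not state the weighting.
    """
    total_weight = in_weight + out_weight
    return (price_in * in_weight + price_out * out_weight) / total_weight


if __name__ == "__main__":
    # Listed input and output prices per 1M tokens.
    print(blended_price(Decimal("0.10"), Decimal("0.30")))
```

With the listed prices this works out to (3 x 0.10 + 1 x 0.30) / 4 = 0.15 per 1M tokens, matching the Blended price entry.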

Source: Meta official documentation