- Yes: the cost to run LLMs (inference) has fallen dramatically since 2022—by orders of magnitude for equivalent quality levels—while training costs for cutting-edge frontier models have generally increased.[1][2][3][4][5][6]
- Early public benchmarks and pricing baselines: models achieving mid-tier MMLU performance (e.g., GPT‑3.5-level) were priced at roughly $20 per million tokens in late 2022, setting an initial reference point for subsequent price declines.[2]
- Frontier training costs surge: estimates for training GPT‑4 and Google’s Gemini rose into the tens to hundreds of millions of dollars, reflecting the growing scale and complexity of top-tier models.[3][5]
- Inference begins rapid decline for same-quality targets: comparing consistent quality levels (e.g., MMLU), prices start falling sharply across providers.[6][2]
- Massive drop in inference price for GPT‑3.5‑equivalent quality: the cost to query a model scoring at about GPT‑3.5’s level on MMLU fell from $20.00/million tokens (Nov 2022) to $0.07/million tokens (Oct 2024), a >280× reduction in roughly two years.[2]
- Training costs remain high across major releases (e.g., Llama 3.1‑405B $170M; Grok‑2 $107M; Mistral Large $41M), underscoring divergence between training and inference economics at the frontier.[4]
- “Inference price declines are rapid but uneven”: analysis shows declines of 9× to 900× per year depending on the task/benchmark, with some of the fastest drops occurring in the past year; it remains uncertain whether the fastest rates will persist.[1][2]
- Cross‑provider dispersion persists: open‑weight models (e.g., Llama‑3.1‑70B, 405B) show wide price spreads across serving providers—from ~$0.20 to ~$2.90 per million tokens for 70B, and ~$0.90 to ~$9.50 for 405B—indicating a non‑commodity market despite overall downtrend.[7]
- Overall “LLMflation” pattern: multiple analyses find roughly an order‑of‑magnitude (10×) per year decrease in inference cost for constant quality, with ~1,000× over three years at lower MMLU targets and ~62× since 2023 at GPT‑4‑class MMLU levels.[6][1]
- Economic shift toward inference/test‑time compute: smaller, more efficient models and reasoning‑time (“thinking”) strategies are changing cost structures: total spend shifts toward inference even as per‑token prices fall, because usage volume and reasoning‑token counts grow.[8]
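As a toy illustration of that shift (all numbers below are assumed for the sketch, not taken from the cited sources), falling per‑token prices can coexist with rising total inference spend once reasoning‑token counts and query volumes grow:

```python
# Illustrative sketch: per-token price falls ~10x in a year, but reasoning
# models emit far more tokens per query and usage grows, so total spend rises.
def yearly_spend(price_per_m, tokens_per_query, queries):
    """Total annual inference spend in USD, given price per million tokens."""
    return price_per_m / 1e6 * tokens_per_query * queries

# Year 0: $20/M tokens, short answers, moderate traffic (assumed figures).
year0 = yearly_spend(price_per_m=20.0, tokens_per_query=500, queries=1_000_000)
# Year 1: price down 10x, but 8x more tokens ("thinking") and 5x more queries.
year1 = yearly_spend(price_per_m=2.0, tokens_per_query=4_000, queries=5_000_000)
print(year0, year1)  # 10000.0 40000.0 -- spend quadruples despite 10x cheaper tokens
```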
- Inference costs: strongly down since 2022 for equivalent performance, with 10× per year as a useful rule‑of‑thumb across many benchmarks; exact declines vary widely by task and provider.[7][1][2][6]
- Training costs: at the frontier, still trending high (often $50M–$200M+) even as some efficient/open models are cheaper to train—so “compute cost” depends on whether discussing training or inference.[5][3][4]
- Market dynamics: significant price dispersion by provider and model size persists despite overall declines, suggesting room for further competition and optimization.[7]
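A quick check of the dispersion implied by the quoted ranges (prices in USD per million tokens, from the cross‑provider observations in [7]):

```python
# Low/high provider prices per million tokens for the same open-weight model.
llama_70b = (0.20, 2.90)   # Llama-3.1-70B Instruct
llama_405b = (0.90, 9.50)  # Llama-3.1-405B

for name, (low, high) in {"70B": llama_70b, "405B": llama_405b}.items():
    # Spread ratio: how much more the most expensive provider charges.
    print(f"{name}: {high / low:.1f}x spread")
# prints:
# 70B: 14.5x spread
# 405B: 10.6x spread
```

A >10× spread for identical open weights suggests pricing reflects serving efficiency and margin strategy, not just model quality.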
- GPT‑3.5‑equivalent inference price: $20.00/million tokens (Nov 2022) → $0.07/million (Oct 2024), >280× decline.[2]
- Inference price declines across tasks: 9× to 900× per year; some fastest drops in the last year.[1][2]
- Three‑year view: ~1,000× decline at lower MMLU target (42) from $60 → $0.06 per million tokens; ~62× decline since GPT‑4 era at higher MMLU target (83).[6]
- Provider spread (example, open‑weight): Llama‑3.1‑70B Instruct ranges ~$0.20–$2.90 per million tokens; Llama‑3.1‑405B ~$0.90–$9.50.[7]
- Frontier training costs (examples): GPT‑4 ~$79M (estimate); Gemini 1.0 Ultra ~$192M; Llama 3.1‑405B ~$170M; Grok‑2 ~$107M; Mistral Large ~$41M.[3][4][5]
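The headline decline ratios above can be sanity‑checked in a few lines of Python (month‑start dates are an approximation):

```python
from datetime import date

def decline_stats(p_start, p_end, d_start, d_end):
    """Overall and compound annualized price-decline factors between two dated prices."""
    years = (d_end - d_start).days / 365.25
    total = p_start / p_end           # overall decline factor
    annual = total ** (1 / years)     # compound per-year decline factor
    return total, annual, years

# GPT-3.5-equivalent quality on MMLU: $20.00 (Nov 2022) -> $0.07 (Oct 2024), per [2].
total, annual, years = decline_stats(20.00, 0.07, date(2022, 11, 1), date(2024, 10, 1))
print(f"{total:.0f}x over {years:.1f} years (~{annual:.0f}x per year)")
# prints: 286x over 1.9 years (~19x per year)
```

The implied ~19×/year rate sits within the 9×–900× per‑year range reported across tasks and near the 10×/year rule of thumb.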
- For inference: compute cost has fallen decisively since 2022, with step‑changes in 2023–2025; the pace varies by task and provider.[1][2][6][7]
- For training: leading‑edge training costs are not falling; they remain very high or are rising for the most capable frontier systems.[4][5][3]
[1] https://epoch.ai/data-insights/llm-inference-price-trends
[2] https://hai.stanford.edu/ai-index/2025-ai-index-report/research-and-development
[3] https://www.statista.com/chart/33114/estimated-cost-of-training-selected-ai-models/
[4] https://www.visualcapitalist.com/the-surging-cost-of-training-ai-models/
[5] https://www.forbes.com/sites/katharinabuchholz/2024/08/23/the-extreme-cost-of-training-ai-models/
[6] https://a16z.com/llmflation-llm-inference-cost/
[7] https://techgov.intelligence.org/blog/observations-about-llm-inference-pricing
[8] https://www.bruegel.org/policy-brief/how-deepseek-has-changed-artificial-intelligence-and-what-it-means-europe
[9] https://skywork.ai/skypage/en/Analysis%20of%20the%20Evolution%20Path%20of%20%22Inference%20Cost%22%20of%20Large%20Models%20in%202025:%20The%20API%20Price%20War%20Erupts/1948243097032671232
[10] https://hai-production.s3.amazonaws.com/files/hai_ai_index_report_2025.pdf
[11] https://epoch.ai/trends
[12] https://www.reddit.com/r/LocalLLaMA/comments/1gpr2p4/llms_cost_is_decreasing_by_10x_each_year_for/