Models tested (best to worst): Gemini 2.5 Pro, Perplexity Sonar Reasoning Pro, Claude Sonnet 3.5/7, DeepSeek V3 (0324), DeepSeek R1 (full), Gemini Flash 2.0, o3-mini (high), GPT-4o/mini
---