@pszemraj
Created March 23, 2025 16:45

| Model | Average CR⬆️ | AGIEval Mean (Min, Max) | AGIEval CR | MMLU-Pro Mean (Min, Max) | MMLU-Pro CR | Math Mean (Min, Max) | Math CR | #Params (B) |
|---|---|---|---|---|---|---|---|---|
| meta-llama/Llama-3.1-70B-Instruct | 72.39 | 72.43 (65.34, 74.66) | 81.79 | 66.63 (55.16, 70.68) | 73.19 | 65.88 (64.58, 67.86) | 62.18 | 0 |
| mistralai/Mistral-Large-Instruct-2407 | 71.93 | 68.78 (61.41, 74.49) | 75.77 | 65.1 (50.28, 69.23) | 72.31 | 71.04 (69.66, 72.72) | 67.71 | 0 |
| meta-llama/Meta-Llama-3-70B-Instruct | 69.11 | 69.71 (60.77, 71.2) | 83.13 | 58.75 (49.3, 63.16) | 75.24 | 51.29 (49.66, 54.2) | 48.96 | 0 |
| 01-ai/Yi-1.5-34B-Chat | 58.43 | 63.89 (50.85, 70.98) | 69.95 | 49.91 (36.47, 55.76) | 57.31 | 53.46 (51.7, 54.42) | 48.04 | 0 |
| meta-llama/Llama-3.1-8B-Instruct | 52.74 | 54.59 (44.62, 59.66) | 62.54 | 45.3 (32.34, 51.94) | 52.79 | 49.21 (46.88, 51.18) | 42.9 | 0 |
| mistralai/Mistral-Nemo-Instruct-2407 | 49.46 | 51.57 (38.46, 63.8) | 58.7 | 40.63 (31.49, 47.65) | 51.43 | 42.91 (40.72, 45.22) | 38.26 | 0 |
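
The Average CR values are consistent with an unweighted mean of the three per-benchmark CR columns (AGIEval, MMLU-Pro, Math). The gist does not state this, so treat it as an assumption. A minimal Python sketch that checks it against the numbers in the table; the names `rows` and `mean_cr` are illustrative, not from the source:

```python
# Values copied from the table: reported Average CR, then
# (AGIEval CR, MMLU-Pro CR, Math CR) for each model.
rows = {
    "meta-llama/Llama-3.1-70B-Instruct": (72.39, (81.79, 73.19, 62.18)),
    "mistralai/Mistral-Large-Instruct-2407": (71.93, (75.77, 72.31, 67.71)),
    "meta-llama/Meta-Llama-3-70B-Instruct": (69.11, (83.13, 75.24, 48.96)),
    "01-ai/Yi-1.5-34B-Chat": (58.43, (69.95, 57.31, 48.04)),
    "meta-llama/Llama-3.1-8B-Instruct": (52.74, (62.54, 52.79, 42.9)),
    "mistralai/Mistral-Nemo-Instruct-2407": (49.46, (58.7, 51.43, 38.26)),
}


def mean_cr(crs):
    """Unweighted mean of the per-benchmark CR values, rounded to 2 decimals.

    Assumption: this is how Average CR was derived; the source does not say.
    """
    return round(sum(crs) / len(crs), 2)


for model, (reported, crs) in rows.items():
    computed = mean_cr(crs)
    status = "matches" if abs(computed - reported) < 1e-6 else "differs"
    print(f"{model}: reported={reported} computed={computed} ({status})")
```

Under this assumption all six rows reproduce the reported Average CR to two decimal places.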