Open models ranked by intelligence and speed — hosted on GPUs in the EU.
Ranked by a composite of publicly reported benchmarks. Higher is better.
| # | Model | Size | MMLU | GPQA | MATH | Code | Index |
|---|---|---|---|---|---|---|---|
| 1 | DeepSeek | 685B | 91.4 | 81.0 | 97.0 | 92.0 | 90.4 |
| 2 | OpenAI | 120B | — | 80.1 | 97.9 | — | 89.0 |
| 3 | DeepSeek | 685B | 90.8 | 71.5 | 97.3 | 90.0 | 87.4 |
| 4 | GLM 4.6 EU (Z.AI) | 357B | 85.5 | 82.0 | 93.5 | — | 87.0 |
| 5 | GLM 4.5 EU (Z.AI) | 358B | 84.6 | 79.1 | 91.0 | — | 84.9 |
| 6 | Qwen | 32B | 83.3 | — | 83.1 | 88.4 | 84.9 |
| 7 | OpenAI | 22B | — | 71.5 | 96.0 | — | 83.8 |
| 8 | Qwen | 235B | 87.8 | 71.1 | 92.2 | — | 83.7 |
| 9 | DeepSeek | 685B | 90.0 | 67.0 | 92.0 | 85.0 | 83.5 |
| 10 | DeepSeek | 685B | 89.5 | 62.7 | 91.6 | 84.0 | 82.0 |
| 11 | Z.AI | 110B | 81.0 | 75.0 | 89.0 | — | 81.7 |
| 12 | Qwen | 14B | 79.7 | — | 80.0 | 83.5 | 81.1 |
| 13 | DeepSeek | 685B | 88.5 | 59.1 | 90.2 | 82.6 | 80.1 |
| 14 | DeepSeek | 70B | — | 65.2 | 94.5 | — | 79.9 |
| 15 | Qwen | 32B | 82.4 | 68.4 | 88.5 | — | 79.8 |
| 16 | DeepSeek | 32B | — | 62.1 | 94.3 | — | 78.2 |
| 17 | Qwen | 14B | 80.7 | 65.8 | 87.7 | — | 78.1 |
| 18 | Qwen | 30B | 79.5 | 65.8 | 86.6 | — | 77.3 |
| 19 | DeepSeek | 14B | — | 59.1 | 93.9 | — | 76.5 |
| 20 | Qwen | 72B | 86.1 | 49.0 | 83.1 | 86.6 | 76.2 |
| 21 | Microsoft | 14B | 84.8 | 56.1 | 80.4 | 82.6 | 76.0 |
| 22 | Meta | 70B | 86.0 | 50.5 | 77.0 | 88.4 | 75.5 |
| 23 | Meta | 406B | 87.3 | 50.7 | 73.8 | 89.0 | 75.2 |
| 24 | Qwen | 32B | — | — | 57.2 | 92.7 | 75.0 |
| 25 | Qwen3 8B EU (Qwen) | 8B | 76.9 | 62.0 | 85.0 | — | 74.6 |
| 26 | Qwen | 7B | — | — | 57.2 | 88.4 | 72.8 |
| 27 | DeepSeek | 7.6B | — | 49.1 | 92.8 | — | 71.0 |
| 28 | Meta | 70B | 83.6 | 41.7 | 68.0 | 80.5 | 68.5 |
| 29 | Qwen | 7B | 74.2 | 36.4 | 75.5 | 84.8 | 67.7 |
| 30 | Microsoft | 3.8B | 67.3 | — | 64.0 | — | 65.7 |
| 31 | Google | 27B | — | 42.4 | 89.0 | — | 65.7 |
| 32 | Google | 12B | — | 34.9 | 83.8 | — | 59.4 |
| 33 | Google | 27B | 75.2 | — | 42.3 | — | 58.8 |
| 34 | Microsoft | 3.8B | 69.0 | — | 48.5 | — | 58.8 |
| 35 | Meta | 8B | 69.4 | 30.4 | 51.9 | 72.6 | 56.1 |
| 36 | Meta | 3B | 63.4 | — | 48.0 | — | 55.7 |
| 37 | Google | 9.2B | 71.3 | — | 36.6 | — | 54.0 |
| 38 | Google | 4B | — | 30.8 | 75.6 | — | 53.2 |
| 39 | Mistral | 12B | 68.0 | — | 38.0 | — | 53.0 |
| 40 | Meta | 1B | 49.3 | — | 30.6 | — | 40.0 |
| 41 | Mistral | 7.2B | 60.1 | — | 11.2 | 30.5 | 33.9 |
Benchmarks are publicly reported figures from model authors and open leaderboards (last updated 2026-06). Methodology varies per model, so the index is an ordering aid, not a controlled benchmark.
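The page does not state how the Index is computed. The listed values are, however, consistent with an unweighted mean of whichever of the four scores (MMLU, GPQA, MATH, Code) a model reports, rounded half-up to one decimal. The sketch below reproduces that under this assumption; `composite_index` is a hypothetical name, not the site's actual code, and missing scores are passed as `None`.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Optional, Sequence


def composite_index(scores: Sequence[Optional[str]]) -> Decimal:
    """Unweighted mean of the reported scores, rounded half-up to 0.1.

    Assumption: this formula is inferred from the table values,
    not documented by the leaderboard itself.
    """
    # Skip benchmarks the model does not report (None).
    reported = [Decimal(s) for s in scores if s is not None]
    if not reported:
        raise ValueError("at least one benchmark score is required")
    mean = sum(reported) / len(reported)
    # Decimal avoids binary-float surprises on values such as 90.35.
    return mean.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


# Rank 1 row: MMLU 91.4, GPQA 81.0, MATH 97.0, Code 92.0
print(composite_index(["91.4", "81.0", "97.0", "92.0"]))  # 90.4
# Rank 2 row: only GPQA and MATH are reported
print(composite_index([None, "80.1", "97.9", None]))  # 89.0
```

Because models with fewer reported scores are averaged over fewer benchmarks, two entries with the same Index are not necessarily comparable, which matches the caveat that the index is an ordering aid only.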
Ranked by the real time-to-respond we measured on our own EU GPUs. Lower is better.
| # | Model | Size | Response |
|---|---|---|---|
| 1 | Microsoft | 4B | 110 ms |
| 2 | Meta | 1.2B | 123 ms |
| 3 | Meta | 71B | 127 ms |
| 4 | Meta | 71B | 131 ms |
| 5 | Meta | 1.2B | 140 ms |
| 6 | Meta | 8B | 147 ms |
| 7 | Google | 2.9B | 150 ms |
| 8 | Meta | 3B | 156 ms |
| 9 | Meta | 8B | 159 ms |
| 10 | HostYourAI | 22B | 161 ms |
| 11 | Qwen | 1.7B | 173 ms |
| 12 | Microsoft | 32B | 197 ms |
| 13 | Google | 2.9B | 218 ms |
| 14 | Qwen | 4B | 234 ms |
| 15 | Mistral | 7.2B | 240 ms |
| 16 | Microsoft | 4B | 262 ms |
| 17 | Google | 3B | 272 ms |
| 18 | Qwen | 4B | 276 ms |
| 19 | Google | 2.5B | 294 ms |
| 20 | Qwen | 4B | 363 ms |
| 21 | Google | 1B | 365 ms |
| 22 | Qwen | 0.6B | 380 ms |
| 23 | DeepSeek | 6.9B | 396 ms |
| 24 | Google | 3B | 404 ms |
| 25 | Meta | 8B | 411 ms |
Response times were measured at our last verification of each model and vary with load and cold starts.
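The page does not describe its measurement procedure. As a rough illustration of how load and cold starts can be kept out of a reported figure, the sketch below discards warm-up calls and reports the median of several timed calls. `median_response_ms` and its parameters are hypothetical, and `call` stands in for whatever request you send to a model endpoint.

```python
import statistics
import time
from typing import Callable


def median_response_ms(call: Callable[[], object],
                       runs: int = 5,
                       warmup: int = 1) -> float:
    """Median wall-clock time of `call` in milliseconds.

    Warm-up calls are made first and not timed, so a cold start
    does not dominate the result.
    """
    if runs < 1:
        raise ValueError("runs must be at least 1")
    for _ in range(warmup):
        call()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        samples.append((time.perf_counter() - start) * 1000.0)
    # The median is less sensitive to a single slow call than the mean.
    return statistics.median(samples)


# Stand-in for a real request: sleep for about 10 ms.
print(f"{median_response_ms(lambda: time.sleep(0.01)):.0f} ms")
```

A single measurement per model, as the note above suggests was used, will show more variation than a median over repeated calls.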