| Rank | Provider | Model | Score | Efficiency | Exactness | LLM Gen Time (s) | Avg Attempts | Avg Query Latency | Avg Rows Read | Avg Data Read |
|---|---|---|---|---|---|---|---|---|---|---|
| -- | human | human | -- | -- | -- | -- | -- | 332.6 ms | 31,006,852 | 759.83 MB |
| #1 | anthropic | claude-opus-4 | 79.04 | 94.00 | 64.07 | 4.243 | 1.97 | 352.457 ms | 28,250,540 | 112.14 MB |
| #2 | google | gemini-2.5-pro | 77.49 | 97.00 | 57.98 | 19.100 | 1.06 | 443.22 ms | 42,878,115 | 826.88 MB |
| #3 | anthropic | claude-3.7-sonnet | 76.82 | 97.55 | 56.08 | 3.149 | 1.10 | 374.224 ms | 40,099,998 | 824.57 MB |
| #4 | moonshotai | kimi-k2 | 76.43 | 97.09 | 55.77 | 4.265 | 1.06 | 589.22 ms | 49,539,148 | 903.11 MB |
| #5 | deepseek | deepseek-chat-v3.1 | 76.10 | 98.18 | 54.02 | 3.914 | 1.04 | 608.681 ms | 45,651,463 | 322.46 MB |
| #6 | openai | o3-mini | 76.08 | 97.33 | 54.83 | 9.886 | 1.14 | 448.84 ms | 49,432,133 | 844.29 MB |
| #7 | openrouter | horizon-beta | 75.83 | 94.89 | 56.78 | 1.401 | 1.02 | 1,202.46 ms | 72,041,510 | 1,141.62 MB |
| #8 | anthropic | claude-3.5-sonnet | 75.55 | 98.68 | 52.41 | 3.234 | 1.02 | 388.96 ms | 37,145,042 | 684.44 MB |
| #9 | qwen | qwen3-coder | 75.37 | 99.26 | 51.48 | 4.571 | 1.04 | 457.224 ms | 46,666,126 | 333.42 MB |
| #10 | qwen | qwen3-235b-a22b-07-25 | 75.37 | 96.63 | 54.11 | 8.620 | 1.18 | 397.755 ms | 38,751,330 | 781.80 MB |
| #11 | openai | gpt-4.1 | 74.92 | 99.77 | 50.08 | 2.074 | 1.00 | 421.6 ms | 52,027,773 | 246.69 MB |
| #12 | openai | o3 | 74.57 | 99.30 | 49.84 | 16.292 | 1.04 | 549.54 ms | 53,315,039 | 303.04 MB |
| #13 | anthropic | claude-opus-4.1 | 74.21 | 94.78 | 53.65 | 6.342 | 1.04 | 580.51 ms | 39,294,543 | 936.76 MB |
| #14 | qwen | qwen3-next-80b-a3b-instruct | 74.18 | 98.36 | 50.00 | 1.474 | 1.00 | 421.14 ms | 38,561,447 | 879.98 MB |
| #15 | openai | gpt-4.5-preview | 74.14 | 97.77 | 50.51 | 2.955 | 1.00 | 442.98 ms | 41,636,677 | 756.28 MB |
| #16 | deepseek | deepseek-chat-v3-0324 | 73.86 | 98.62 | 49.10 | 5.366 | 1.24 | 362.62 ms | 39,914,537 | 612.03 MB |
| #17 | openai | o4-mini | 73.86 | 96.03 | 51.69 | 10.228 | 1.08 | 613.66 ms | 52,581,751 | 940.75 MB |
| #18 | x-ai | grok-4 | 73.49 | 91.18 | 55.81 | 61.602 | 1.00 | 677.06 ms | 49,360,869 | 1,145.95 MB |
| #19 | qwen | qwen3-32b | 73.48 | 92.00 | 54.95 | 37.553 | 1.06 | 761.347 ms | 44,676,197 | 795.72 MB |
| #20 | qwen | qwen3-max | 73.39 | 86.18 | 60.61 | 5.172 | 1.17 | 550.043 ms | 40,419,858 | 895.38 MB |
| #21 | meta-llama | llama-4-maverick | 73.31 | 98.32 | 48.30 | 3.095 | 1.04 | 410.78 ms | 40,161,866 | 793.26 MB |
| #22 | mistralai | mistral-medium-3.1 | 73.15 | 97.49 | 48.81 | 2.088 | 1.04 | 666.02 ms | 53,051,447 | 878.95 MB |
| #23 | openai | gpt-oss-20b | 73.12 | 96.23 | 50.00 | 2.190 | 1.06 | 818.38 ms | 54,736,481 | 995.56 MB |
| #24 | qwen | qwen3-235b-a22b | 73.11 | 96.88 | 49.34 | 36.262 | 1.04 | 439.38 ms | 45,468,824 | 791.67 MB |
| #25 | x-ai | grok-3-mini-beta | 72.94 | 95.36 | 50.52 | 7.127 | 1.06 | 651.74 ms | 55,296,404 | 869.75 MB |
| #26 | google | gemini-2.5-flash | 72.92 | 99.75 | 46.09 | 2.126 | 1.02 | 337.4 ms | 36,295,667 | 262.45 MB |
| #27 | mistralai | codestral-2508 | 72.87 | 98.52 | 47.22 | 0.855 | 1.00 | 775.14 ms | 42,657,411 | 620.15 MB |
| #28 | openrouter | horizon-alpha | 72.76 | 94.13 | 51.40 | 1.362 | 1.02 | 1,358.24 ms | 67,797,316 | 1,137.85 MB |
| #29 | anthropic | claude-3.5-haiku | 72.74 | 99.54 | 45.93 | 2.731 | 1.08 | 522.38 ms | 47,370,988 | 297.58 MB |
| #30 | anthropic | claude-sonnet-4 | 72.60 | 92.79 | 52.41 | 3.915 | 1.02 | 492.708 ms | 41,642,822 | 913.54 MB |
| #31 | openai | o4-mini-high | 72.59 | 94.92 | 50.26 | 21.133 | 1.04 | 702.64 ms | 68,364,075 | 1,005.01 MB |
| #32 | openai | o3-pro | 72.48 | 92.37 | 52.59 | 76.620 | 1.04 | 746.8 ms | 52,804,037 | 936.55 MB |
| #33 | qwen | qwen-2.5-coder-32b-instruct | 72.41 | 96.33 | 48.49 | 2.456 | 1.08 | 732.878 ms | 46,841,414 | 767.00 MB |
| #34 | meta-llama | llama-4-scout | 72.40 | 99.85 | 44.96 | 2.048 | 1.04 | 289.875 ms | 39,101,618 | 134.66 MB |
| #35 | openai | gpt-4o-mini | 72.28 | 99.73 | 44.83 | 2.145 | 1.04 | 690.28 ms | 54,131,214 | 193.58 MB |
| #36 | google | gemini-2.5-pro-preview-06-05 | 72.17 | 96.02 | 48.32 | 20.782 | 1.04 | 579.36 ms | 38,815,820 | 806.77 MB |
| #37 | qwen | qwen3-next-80b-a3b-thinking | 71.54 | 91.12 | 51.96 | 17.344 | 1.02 | 720.837 ms | 54,897,195 | 1,106.61 MB |
| #38 | qwen | qwen3-30b-a3b-thinking-2507 | 71.53 | 95.76 | 47.30 | 19.002 | 1.02 | 602.48 ms | 45,928,106 | 890.50 MB |
| #39 | google | gemini-2.0-flash-001 | 70.90 | 99.76 | 42.04 | 1.426 | 1.02 | 350.146 ms | 44,547,543 | 181.54 MB |
| #40 | x-ai | grok-3-beta | 70.50 | 96.24 | 44.76 | 1.701 | 1.04 | 633.612 ms | 42,572,577 | 720.40 MB |
| #41 | x-ai | grok-code-fast-1 | 70.41 | 94.37 | 46.46 | 6.570 | 1.02 | 830.98 ms | 51,488,460 | 1,077.86 MB |
| #42 | mistralai | devstral-medium | 70.19 | 98.31 | 42.08 | 1.405 | 1.08 | 420.714 ms | 44,380,748 | 715.11 MB |
| #43 | qwen | qwen3-30b-a3b-instruct-2507 | 70.05 | 98.66 | 41.44 | 2.806 | 1.09 | 308 ms | 31,184,916 | 374.88 MB |
| #44 | openai | gpt-oss-120b | 69.93 | 91.17 | 48.68 | 3.205 | 1.09 | 533.957 ms | 41,234,766 | 980.50 MB |
| #45 | google | gemini-2.5-pro-preview-05-06 | 69.80 | 91.76 | 47.83 | 39.798 | 1.10 | 686.857 ms | 53,855,819 | 893.51 MB |
| #46 | nvidia | nemotron-nano-9b-v2 | 69.68 | 94.77 | 44.59 | 12.717 | 1.33 | 483.347 ms | 40,823,966 | 813.29 MB |
| #47 | google | gemini-2.5-flash-preview | 69.16 | 98.42 | 39.90 | 1.622 | 1.00 | 384.551 ms | 42,309,547 | 735.32 MB |
| #48 | openai | gpt-5 | 69.12 | 93.60 | 44.64 | 25.613 | 1.04 | 643.3 ms | 61,356,069 | 1,161.59 MB |
| #49 | mistralai | magistral-small-2506 | 68.51 | 96.06 | 40.96 | 12.425 | 1.18 | 522.531 ms | 39,072,130 | 681.69 MB |
| #50 | deepseek | deepseek-chat-v3-0324:free | 68.27 | 83.13 | 53.41 | 5.875 | 1.11 | 383.682 ms | 38,010,973 | 813.72 MB |
| #51 | google | gemini-2.5-flash-lite | 68.22 | 99.62 | 36.81 | 0.962 | 1.04 | 495.78 ms | 44,671,754 | 328.20 MB |
| #52 | mistralai | devstral-small | 67.14 | 94.70 | 39.58 | 2.412 | 1.09 | 372.696 ms | 44,597,846 | 757.05 MB |
| #53 | mistralai | ministral-8b | 66.84 | 98.82 | 34.86 | 0.925 | 1.00 | 385.911 ms | 40,043,041 | 257.63 MB |
| #54 | openai | gpt-5-nano | 65.18 | 88.23 | 42.12 | 27.104 | 1.08 | 1,015.46 ms | 45,844,074 | 1,126.98 MB |
| #55 | mistralai | mistral-nemo | 63.87 | 97.73 | 30.00 | 3.307 | 1.09 | 680.644 ms | 48,641,279 | 222.69 MB |
| #56 | openai | gpt-4.1-nano | 63.81 | 99.68 | 27.93 | 1.538 | 1.06 | 445.694 ms | 52,428,071 | 239.26 MB |
| #57 | openai | gpt-5-mini | 63.18 | 89.17 | 37.20 | 20.092 | 1.02 | 906.12 ms | 61,641,565 | 1,386.39 MB |
| #58 | mistralai | mistral-small-3.1-24b-instruct | 45.67 | 47.31 | 44.02 | 1.809 | 1.00 | 376.5 ms | 37,893,118 | 912.60 MB |
| #59 | meta-llama | llama-3.3-70b-instruct | 17.78 | 0.00 | 35.56 | 3.501 | 1.21 | 445.242 ms | 38,658,489 | 992.39 MB |