LLM SQL Generation Benchmark Results

We assessed how well popular LLMs generate accurate and efficient SQL from natural language prompts. Using a 200-million-record dataset from GH Archive, uploaded to Tinybird, we asked each LLM to generate SQL for 50 prompts. The results are shown below, alongside a human baseline for comparison.

| Rank | Provider | Model | Score | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| -- | human | human | -- | -- | -- | -- | -- | 332.6 ms | 31,006,852 | 759.83 MB |
| #1 | anthropic | | 79.04 | 94.00 | 64.07 | 4.243 | 1.97 | 352.457 ms | 28,250,540 | 112.14 MB |
| #2 | anthropic | | 76.82 | 97.55 | 56.08 | 3.149 | 1.10 | 374.224 ms | 40,099,998 | 824.57 MB |
| #3 | openai | | 76.08 | 97.33 | 54.83 | 9.886 | 1.14 | 448.84 ms | 49,432,133 | 844.29 MB |
| #4 | anthropic | | 75.55 | 98.68 | 52.41 | 3.234 | 1.02 | 388.96 ms | 37,145,042 | 684.44 MB |
| #5 | openai | | 74.92 | 99.77 | 50.08 | 2.074 | 1.00 | 421.6 ms | 52,027,773 | 246.69 MB |
| #6 | openai | | 74.14 | 97.77 | 50.51 | 2.955 | 1.00 | 442.98 ms | 41,636,677 | 756.28 MB |
| #7 | deepseek | | 73.86 | 98.62 | 49.10 | 5.366 | 1.24 | 362.62 ms | 39,914,537 | 612.03 MB |
| #8 | openai | | 73.86 | 96.03 | 51.69 | 10.228 | 1.08 | 613.66 ms | 52,581,751 | 940.75 MB |
| #9 | meta-llama | | 73.31 | 98.32 | 48.30 | 3.095 | 1.04 | 410.78 ms | 40,161,866 | 793.26 MB |
| #10 | x-ai | | 72.94 | 95.36 | 50.52 | 7.127 | 1.06 | 651.74 ms | 55,296,404 | 869.75 MB |
| #11 | anthropic | | 72.60 | 92.79 | 52.41 | 3.915 | 1.02 | 492.708 ms | 41,642,822 | 913.54 MB |
| #12 | openai | | 72.59 | 94.92 | 50.26 | 21.133 | 1.04 | 702.64 ms | 68,364,075 | 1,005.01 MB |
| #13 | openai | | 72.48 | 92.37 | 52.59 | 76.620 | 1.04 | 746.8 ms | 52,804,037 | 936.55 MB |
| #14 | meta-llama | | 72.40 | 99.85 | 44.96 | 2.048 | 1.04 | 289.875 ms | 39,101,618 | 134.66 MB |
| #15 | openai | | 72.28 | 99.73 | 44.83 | 2.145 | 1.04 | 690.28 ms | 54,131,214 | 193.58 MB |
| #16 | google | | 72.17 | 96.02 | 48.32 | 20.782 | 1.04 | 579.36 ms | 38,815,820 | 806.77 MB |
| #17 | google | | 70.90 | 99.76 | 42.04 | 1.426 | 1.02 | 350.146 ms | 44,547,543 | 181.54 MB |
| #18 | x-ai | | 70.50 | 96.24 | 44.76 | 1.701 | 1.04 | 633.612 ms | 42,572,577 | 720.40 MB |
| #19 | google | | 69.80 | 91.76 | 47.83 | 39.798 | 1.10 | 686.857 ms | 53,855,819 | 893.51 MB |
| #20 | google | | 69.16 | 98.42 | 39.90 | 1.622 | 1.00 | 384.551 ms | 42,309,547 | 735.32 MB |
| #21 | mistralai | | 68.51 | 96.06 | 40.96 | 12.425 | 1.18 | 522.531 ms | 39,072,130 | 681.69 MB |
| #22 | deepseek | | 68.17 | 83.13 | 53.21 | 5.875 | 1.11 | 383.682 ms | 38,010,973 | 813.72 MB |
| #23 | mistralai | | 66.84 | 98.82 | 34.86 | 0.925 | 1.00 | 385.911 ms | 40,043,041 | 257.63 MB |
| #24 | mistralai | | 63.87 | 97.73 | 30.00 | 3.307 | 1.09 | 680.644 ms | 48,641,279 | 222.69 MB |
| #25 | openai | | 63.81 | 99.68 | 27.93 | 1.538 | 1.06 | 445.694 ms | 52,428,071 | 239.26 MB |
| #26 | mistralai | | 45.67 | 47.31 | 44.02 | 1.809 | 1.00 | 376.5 ms | 37,893,118 | 912.60 MB |
| #27 | meta-llama | | 17.78 | 0.00 | 35.56 | 3.501 | 1.21 | 445.242 ms | 38,658,489 | 992.39 MB |
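
In every ranked row above, the first numeric value matches the mean of the next two to within rounding, and the three unit-bearing values (ms, a count, MB) can be read as ratios against the human row. A minimal Python sketch of both checks follows, using a handful of rows copied from the results. The field names (`score`, `part_a`, `part_b`, `time_ms`, `count`, `data_mb`) and the helper functions are illustrative labels chosen here, not the benchmark's own column names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Row:
    # Field names are hypothetical labels, not the benchmark's column names.
    label: str
    score: Optional[float]   # first numeric value in a row
    part_a: Optional[float]  # second numeric value
    part_b: Optional[float]  # third numeric value
    time_ms: float           # value reported in ms
    count: int               # value reported as a plain count
    data_mb: float           # value reported in MB


# Human baseline and a few ranked rows, copied from the results.
HUMAN = Row("human", None, None, None, 332.6, 31_006_852, 759.83)
ROWS = [
    Row("#1 anthropic", 79.04, 94.00, 64.07, 352.457, 28_250_540, 112.14),
    Row("#2 anthropic", 76.82, 97.55, 56.08, 374.224, 40_099_998, 824.57),
    Row("#14 meta-llama", 72.40, 99.85, 44.96, 289.875, 39_101_618, 134.66),
    Row("#27 meta-llama", 17.78, 0.00, 35.56, 445.242, 38_658_489, 992.39),
]


def score_is_mean(row: Row, tol: float = 0.01) -> bool:
    """True if the score equals the mean of the two sub-values, within rounding."""
    return abs(row.score - (row.part_a + row.part_b) / 2) <= tol


def relative_to_human(row: Row, base: Row = HUMAN) -> dict:
    """Ratios of a row's three unit-bearing values to the baseline (1.0 = equal)."""
    return {
        "time": row.time_ms / base.time_ms,
        "count": row.count / base.count,
        "data": row.data_mb / base.data_mb,
    }


if __name__ == "__main__":
    for row in ROWS:
        ratios = {k: round(v, 3) for k, v in relative_to_human(row).items()}
        print(row.label, score_is_mean(row), ratios)
```

A ratio below 1.0 means the model's value is lower than the human baseline for that metric. For example, the #1 row's MB figure is about 15% of the human row's, while its ms figure is slightly higher.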