LLM SQL Generation Benchmark Results

We assessed the ability of popular LLMs to generate accurate and efficient SQL from natural language prompts. Using a 200 million record dataset from the GH Archive uploaded to Tinybird, we asked the LLMs to generate SQL based on 50 prompts. The results are shown below and can be compared to a human baseline.

Show metrics relative to human baseline

human

332.6 ms

31,006,852

759.83 MB

anthropic

claude-opus-4

80.41

96.74

64.07

4.243

1.97

352.457 ms

28,250,540

112.14 MB

google

gemini-2.5-pro

78.17

98.37

57.98

19.100

1.06

443.22 ms

42,878,115

826.88 MB

anthropic

claude-3.7-sonnet

77.38

98.67

56.08

3.149

1.10

374.224 ms

40,099,998

824.57 MB

moonshotai

kimi-k2

77.09

98.42

55.77

4.265

1.06

589.22 ms

49,539,148

903.11 MB

openrouter

horizon-beta

77.00

97.22

56.78

1.401

1.02

1,202.46 ms

72,041,510

1,141.62 MB

openai

o3-mini

76.69

98.55

54.83

9.886

1.14

448.84 ms

49,432,133

844.29 MB

qwen

qwen3-max

76.55

92.49

60.61

5.172

1.17

550.043 ms

40,419,858

895.38 MB

deepseek

deepseek-chat-v3.1

76.52

99.01

54.02

3.914

1.04

608.681 ms

45,651,463

322.46 MB

qwen

qwen3-235b-a22b-07-25

76.14

98.17

54.11

8.620

1.18

397.755 ms

38,751,330

781.80 MB

#10

anthropic

claude-3.5-sonnet

75.85

99.28

52.41

3.234

1.02

388.96 ms

37,145,042

684.44 MB

#11

qwen

qwen3-coder

75.54

99.60

51.48

4.571

1.04

457.224 ms

46,666,126

333.42 MB

#12

x-ai

grok-4

75.51

95.21

55.81

61.602

1.00

677.06 ms

49,360,869

1,145.95 MB

#13

qwen

qwen3-coder-plus

75.41

99.43

51.39

2.303

1.02

679.163 ms

53,829,416

387.08 MB

#14

anthropic

claude-opus-4.1

75.40

97.16

53.65

6.342

1.04

580.51 ms

39,294,543

936.76 MB

#15

qwen

qwen3-32b

75.30

95.65

54.95

37.553

1.06

761.347 ms

44,676,197

795.72 MB

#16

openai

gpt-4.1

74.98

99.88

50.08

2.074

1.00

421.6 ms

52,027,773

246.69 MB

#17

openai

o4-mini

74.77

97.84

51.69

10.228

1.08

613.66 ms

52,581,751

940.75 MB

#18

openai

74.73

99.62

49.84

16.292

1.04

549.54 ms

53,315,039

303.04 MB

#19

openai

gpt-4.5-preview

74.65

98.79

50.51

2.955

1.00

442.98 ms

41,636,677

756.28 MB

#20

qwen

qwen3-next-80b-a3b-instruct

74.55

99.11

50.00

1.474

1.00

421.14 ms

38,561,447

879.98 MB

#21

anthropic

claude-sonnet-4

74.25

96.08

52.41

3.915

1.02

492.708 ms

41,642,822

913.54 MB

#22

openai

o3-pro

74.22

95.85

52.59

76.620

1.04

746.8 ms

52,804,037

936.55 MB

#23

openai

gpt-5-codex

74.20

97.39

51.01

11.124

1.00

596.52 ms

48,389,329

1,097.99 MB

#24

deepseek

deepseek-chat-v3-0324

74.18

99.25

49.10

5.366

1.24

362.62 ms

39,914,537

612.03 MB

#25

anthropic

claude-4.5-sonnet

74.13

98.43

49.84

3.702

1.02

585.62 ms

43,365,288

907.62 MB

#26

openrouter

horizon-alpha

74.10

96.81

51.40

1.362

1.02

1,358.24 ms

67,797,316

1,137.85 MB

#27

x-ai

grok-3-mini-beta

74.00

97.48

50.52

7.127

1.06

651.74 ms

55,296,404

869.75 MB

#28

openai

gpt-oss-20b

73.98

97.95

50.00

2.190

1.06

818.38 ms

54,736,481

995.56 MB

#29

qwen

qwen3-235b-a22b

73.82

98.30

49.34

36.262

1.04

439.38 ms

45,468,824

791.67 MB

#30

openai

o4-mini-high

73.75

97.24

50.26

21.133

1.04

702.64 ms

68,364,075

1,005.01 MB

#31

mistralai

mistral-medium-3.1

73.72

98.64

48.81

2.088

1.04

666.02 ms

53,051,447

878.95 MB

#32

meta-llama

llama-4-maverick

73.69

99.09

48.30

3.095

1.04

410.78 ms

40,161,866

793.26 MB

#33

qwen

qwen3-next-80b-a3b-thinking

73.56

95.17

51.96

17.344

1.02

720.837 ms

54,897,195

1,106.61 MB

#34

qwen

qwen-2.5-coder-32b-instruct

73.25

98.01

48.49

2.456

1.08

732.878 ms

46,841,414

767.00 MB

#35

mistralai

codestral-2508

73.21

99.19

47.22

0.855

1.00

775.14 ms

42,657,411

620.15 MB

#36

thedrummer

cydonia-24b-v4.1

73.16

98.63

47.68

1.966

1.10

412.306 ms

36,265,794

823.24 MB

#37

google

gemini-2.5-pro-preview-06-05

73.08

97.83

48.32

20.782

1.04

579.36 ms

38,815,820

806.77 MB

#38

google

gemini-2.5-flash

72.98

99.87

46.09

2.126

1.02

337.4 ms

36,295,667

262.45 MB

#39

anthropic

claude-3.5-haiku

72.84

99.75

45.93

2.731

1.08

522.38 ms

47,370,988

297.58 MB

#40

x-ai

grok-4-fast

72.81

95.48

50.15

10.611

1.00

762.122 ms

50,156,062

1,088.51 MB

#41

qwen

qwen3-coder-flash

72.64

98.72

46.56

2.453

1.02

556.74 ms

42,185,121

868.81 MB

#42

qwen

qwen3-30b-a3b-thinking-2507

72.50

97.70

47.30

19.002

1.02

602.48 ms

45,928,106

890.50 MB

#43

meta-llama

llama-4-scout

72.44

99.92

44.96

2.048

1.04

289.875 ms

39,101,618

134.66 MB

#44

openai

gpt-4o-mini

72.34

99.85

44.83

2.145

1.04

690.28 ms

54,131,214

193.58 MB

#45

deepseek

deepseek-chat-v3-0324:free

72.12

90.83

53.41

5.875

1.11

383.682 ms

38,010,973

813.72 MB

#46

openai

gpt-oss-120b

71.94

95.20

48.68

3.205

1.09

533.957 ms

41,234,766

980.50 MB

#47

x-ai

grok-code-fast-1

71.70

96.94

46.46

6.570

1.02

830.98 ms

51,488,460

1,077.86 MB

#48

google

gemini-2.5-pro-preview-05-06

71.68

95.52

47.83

39.798

1.10

686.857 ms

53,855,819

893.51 MB

#49

x-ai

grok-3-beta

71.36

97.96

44.76

1.701

1.04

633.612 ms

42,572,577

720.40 MB

#50

google

gemini-2.0-flash-001

70.96

99.87

42.04

1.426

1.02

350.146 ms

44,547,543

181.54 MB

#51

nvidia

nemotron-nano-9b-v2

70.87

97.15

44.59

12.717

1.33

483.347 ms

40,823,966

813.29 MB

#52

openai

gpt-5

70.58

96.52

44.64

25.613

1.04

643.3 ms

61,356,069

1,161.59 MB

#53

mistralai

devstral-medium

70.58

99.08

42.08

1.405

1.08

420.714 ms

44,380,748

715.11 MB

#54

qwen

qwen3-30b-a3b-instruct-2507

70.36

99.27

41.44

2.806

1.09

308 ms

31,184,916

374.88 MB

#55

google

gemini-2.5-flash-preview

69.52

99.14

39.90

1.622

1.00

384.551 ms

42,309,547

735.32 MB

#56

mistralai

magistral-small-2506

69.41

97.86

40.96

12.425

1.18

522.531 ms

39,072,130

681.69 MB

#57

mistralai

devstral-small

68.35

97.12

39.58

2.412

1.09

372.696 ms

44,597,846

757.05 MB

#58

google

gemini-2.5-flash-lite

68.30

99.79

36.81

0.962

1.04

495.78 ms

44,671,754

328.20 MB

#59

openai

gpt-5-nano

67.86

93.60

42.12

27.104

1.08

1,015.46 ms

45,844,074

1,126.98 MB

#60

mistralai

ministral-8b

67.11

99.36

34.86

0.925

1.00

385.911 ms

40,043,041

257.63 MB

#61

openai

gpt-5-mini

65.66

94.11

37.20

20.092

1.02

906.12 ms

61,641,565

1,386.39 MB

#62

mistralai

mistral-nemo

64.38

98.77

30.00

3.307

1.09

680.644 ms

48,641,279

222.69 MB

#63

openai

gpt-4.1-nano

63.88

99.83

27.93

1.538

1.06

445.694 ms

52,428,071

239.26 MB

#64

mistralai

mistral-small-3.1-24b-instruct

57.69

71.36

44.02

1.809

1.00

376.5 ms

37,893,118

912.60 MB

#65

meta-llama

llama-3.3-70b-instruct

40.59

45.63

35.56

3.501

1.21

445.242 ms

38,658,489

992.39 MB

#66

alibaba

tongyi-deepresearch-30b-a3b

27.14

0.00

54.29

20.128

1.13

553.026 ms

57,570,886

1,153.90 MB