Global AI Large Model Leaderboard

🚀 AI Large Model Leaderboard: performance evaluation of the world's leading AI models

Total models: 267
Last updated: 2025-09-30
The source table lists these columns after the model name: Company, Context Length, AI Analysis Index, MMLU-Pro (reasoning and knowledge), GPQA Diamond (scientific reasoning), Humanity's Last Exam, LiveCodeBench, SciCode, HumanEval, MATH-500, AIME 2024, Chatbot Arena, Price per 1M tokens, and Output tokens per second. In this text capture the Company column is empty throughout, the Chatbot Arena, price, and speed cells appear as percentages (apparently normalized bar values from the page widget), and rows with missing benchmarks simply omit those cells without marking which ones. Each row's scores are therefore reproduced below in the source's left-to-right order; only rows with all eleven trailing values align one-to-one with the eleven columns named above. Missing cells are shown as "–".

| Model | Context | AI Analysis Index | Scores (%, in source column order) |
| --- | --- | --- | --- |
| GPT-5 (high) | 400k | 67 | 87, 85, 27, 67, 43, 73, 94, 76, 96, 99, 99 |
| GPT-5 (medium) | 400k | 66 | 87, 84, 24, 70, 41, 71, 92, 73, 92, 99, 98 |
| Grok 4 | 256k | 65 | 87, 88, 24, 82, 46, 54, 93, 68, 94, 99, 98 |
| o3-pro | 200k | 65 | 85 |
| o3 | 200k | 65 | 85, 83, 20, 78, 41, 71, 88, 69, 90, 99, 99 |
| GPT-5 mini (high) | 400k | 62 | 84, 83, 20, 64, 39, 75, 91, 68 |
| GPT-5 (low) | 400k | 62 | 86, 81, 18, 75, 39, 67, 83, 59, 83, 99, 99 |
| GPT-5 mini (medium) | 400k | 61 | 83, 80, 15, 69, 41, 71, 85, 66 |
| Gemini 2.5 Pro | 1m | 60 | 86, 84, 21, 80, 43, 49, 88, 66, 89, 97 |
| Claude 4.1 Opus | 200k | 59 | 88, 81, 12, 65, 41, 55, 80, 66 |
| o4-mini (high) | 200k | 59 | 83, 78, 18, 80, 47, 69, 91, 55, 94, 99, 99 |
| gpt-oss-120B (high) | 131k | 58 | 81, 78, 19, 64, 36, 69, 93, 51 |
| Qwen3 235B 2507 | 256k | 57 | 84, 79, 15, 79, 42, 51, 91, 67, 94, 98, 98 |
| Grok 3 mini Reasoning (high) | 1m | 57 | 83, 79, 11, 70, 41, 46, 85, 50, 93, 99, 98 |
| Claude 4 Sonnet | 1m | 57 | 84, 78, 10, 66, 40, 55, 74, 65, 77, 99 |
| Claude 4 Opus | 200k | 54 | 87, 80, 12, 64, 40, 54, 73, 34, 76, 98 |
| Gemini 2.5 Pro (Mar) | 1m | 54 | 86, 84, 17, 78, 40, 87, 98, 99 |
| DeepSeek V3.1 | 128k | 54 | 85, 78, 13, 78, 39, 42, 90, 53 |
| Gemini 2.5 Pro (May) | 1m | 53 | 84, 82, 15, 77, 42, 84, 99, 99 |
| DeepSeek R1 0528 | 128k | 52 | 85, 81, 15, 77, 40, 40, 76, 55, 89, 98, 97 |
| Gemini 2.5 Flash | 1m | 51 | 83, 79, 11, 70, 39, 50, 73, 62, 82, 98, 96 |
| o3-mini (high) | 200k | 51 | 80, 77, 12, 73, 40, 86, 99 |
| Kimi K2 0905 | 256k | 50 | 82, 77, 6, 61, 31, 42, 57, 52 |
| GLM-4.5 | 128k | 49 | 84, 78, 12, 74, 35, 44, 74, 48, 87, 98, 98 |
| Grok Code Fast 1 | 256k | 49 | 79, 73, 8, 66, 36, 41, 43, 48 |
| GPT-5 nano (high) | 400k | 49 | 78, 68, 8, 55, 37, 68, 84, 42 |
| o3-mini | 200k | 48 | 79, 75, 9, 72, 40, 77, 97, 97 |
| GLM-4.5-Air | 128k | 48 | 82, 73, 7, 68, 31, 38, 81, 44, 67, 97, 93 |
| Kimi K2 | 128k | 48 | 82, 77, 7, 56, 35, 42, 57, 51, 69, 97, 93 |
| o1-pro | 200k | 48 | – |
| GPT-5 nano (medium) | 400k | 48 | 77, 67, 8, 60, 34, 66, 78, 40 |
| o1 | 200k | 47 | 84, 75, 8, 68, 36, 72, 97, 97 |
| Qwen3 30B 2507 | 262k | 46 | 81, 71, 10, 71, 33, 51, 56, 59, 91, 98 |
| Sonar Reasoning Pro | 127k | 46 | 79, 96 |
| MiniMax M1 80k | 1m | 46 | 82, 70, 8, 71, 37, 42, 61, 54, 85, 98 |
| Gemini 2.5 Flash | 1m | 46 | 80, 70, 12, 51, 36, 84, 98 |
| Qwen3 235B 2507 | 256k | 45 | 83, 75, 11, 52, 36, 46, 72, 31, 72, 98, 96 |
| Llama Nemotron Super 49B v1.5 | 128k | 45 | 81, 75, 7, 74, 35, 37, 77, 34, 86, 98, 95 |
| o1-preview | 128k | 45 | 92, 96 |
| gpt-oss-20B (high) | 131k | 45 | 74, 62, 9, 72, 35, 61, 62, 19 |
| DeepSeek V3.1 | 128k | 45 | 83, 74, 6, 58, 37, 38, 50, 45 |
| Claude 4.1 Opus | 200k | 45 | – |
| Claude 4 Sonnet | 1m | 44 | 84, 68, 4, 45, 37, 45, 38, 44, 41, 93, 97 |
| DeepSeek R1 (Jan) | 128k | 44 | 84, 71, 9, 62, 36, 39, 68, 52, 68, 97, 98 |
| GPT-5 (minimal) | 400k | 43 | 81, 67, 5, 56, 39, 46, 32, 25, 37, 86, 95 |
| GPT-4.1 | 1m | 43 | 81, 67, 5, 46, 38, 43, 35, 61, 44, 91, 96 |
| Qwen3 4B 2507 | 262k | 43 | 74, 67, 6, 64, 26, 50, 83, 38 |
| Claude 3.7 Sonnet | 200k | 43 | 84, 77, 10, 47, 40, 49, 95, 98 |
| EXAONE 4.0 32B | 131k | 43 | 82, 74, 11, 75, 34, 36, 80, 14, 84, 98, 97 |
| GPT-4.1 mini | 1m | 42 | 78, 66, 5, 48, 40, 38, 46, 42, 43, 93, 95 |
| Claude 4 Opus | 200k | 42 | 86, 70, 6, 54, 41, 43, 36, 36, 56, 94, 97 |
| Qwen3 Coder 480B | 262k | 42 | 79, 62, 4, 59, 36, 41, 39, 42, 48, 94, 97 |
| Qwen3 235B | 33k | 42 | 83, 70, 12, 62, 40, 39, 82, 0, 84, 93 |
| MiniMax M1 40k | 1m | 42 | 81, 68, 8, 66, 38, 41, 14, 52, 81, 97 |
| GPT-5 mini (minimal) | 400k | 42 | 78, 69, 5, 55, 37, 46, 47, 36 |
| Grok 3 Reasoning Beta | 1m | 41 | – |
| DeepSeek V3 0324 | 128k | 41 | 82, 66, 5, 41, 36, 41, 41, 41, 52, 94, 92 |
| Gemini 2.5 Flash | 1m | 40 | 81, 68, 5, 50, 29, 39, 60, 46, 50, 93, 95 |
| Gemini 2.5 Flash-Lite | 1m | 40 | 76, 63, 6, 59, 19, 50, 53, 51, 70, 97, 97 |
| o1-mini | 128k | 39 | 74, 60, 5, 58, 32, 60, 94, 97 |
| Hermes 4 - Llama-3.1 70B | 128k | 39 | 81, 70, 8, 65, 34, 31, 69, 7 |
| GLM-4.5V | 64k | 39 | 79, 68, 6, 60, 22, 34, 73, 0 |
| Qwen3 32B | 33k | 39 | 80, 67, 8, 55, 35, 36, 73, 0, 81, 96 |
| Llama Nemotron Ultra | 128k | 38 | 83, 73, 8, 64, 35, 38, 64, 7, 75, 95 |
| GPT-4.5 (Preview) | 128k | 38 | – |
| NVIDIA Nemotron Nano 9B V2 | 131k | 38 | 74, 56, 4, 70, 21, 27, 62, 23 |
| QwQ-32B | 131k | 38 | 76, 59, 8, 63, 36, 39, 29, 25, 78, 96, 98 |
| Solar Pro 2 | 66k | 38 | 81, 69, 7, 62, 30, 37, 61, 0, 69, 97, 97 |
| Gemini 2.0 Flash Thinking exp. (Jan) | 1m | 38 | 80, 70, 7, 32, 33, 50, 94 |
| Qwen3 30B 2507 | 262k | 37 | 78, 66, 7, 52, 30, 33, 66, 23, 73, 98, 94 |
| NVIDIA Nemotron Nano 9B V2 | 131k | 37 | 74, 57, 5, 72, 22, 28, 70, 18 |
| Qwen3 8B | 131k | 37 | 74, 59, 4, 41, 23, 34, 19, 75, 90 |
| Qwen3 30B | 33k | 37 | 78, 62, 7, 51, 28, 42, 72, 0, 75, 96 |
| Solar Pro 2 | 64k | 36 | 77, 58, 6, 46, 16, 66, 90 |
| Qwen3 14B | 33k | 36 | 77, 60, 4, 52, 32, 41, 56, 0, 76, 96, 96 |
| Grok 3 | 1m | 36 | 80, 69, 5, 43, 37, 33, 87, 91 |
| Llama 4 Maverick | 1m | 36 | 81, 67, 5, 40, 33, 43, 19, 46, 39, 89, 88 |
| GPT-4o (Mar) | 128k | 36 | 80, 66, 5, 43, 37, 26, 33, 89, 96 |
| Llama 3.3 Nemotron Super 49B | 128k | 35 | 79, 64, 7, 28, 28, 38, 55, 17, 58, 96, 96 |
| Mistral Medium 3.1 | 128k | 35 | 68, 59, 4, 41, 34, 40, 38, 20 |
| DeepSeek R1 0528 Qwen3 8B | 33k | 35 | 74, 61, 6, 51, 20, 20, 64, 13, 65, 93, 91 |
| Mistral Medium 3 | 128k | 35 | 76, 58, 4, 40, 33, 39, 30, 28, 44, 91, 90 |
| Gemini 2.0 Pro Experimental | 2m | 35 | 81, 62, 7, 35, 31, 36, 92, 95 |
| Sonar Reasoning | 127k | 34 | 62, 77, 92 |
| Gemini 2.5 Flash | 1m | 34 | 78, 59, 5, 41, 23, 43, 93 |
| Gemini 2.0 Flash | 1m | 34 | 78, 62, 5, 33, 33, 40, 22, 28, 33, 93, 90 |
| Magistral Medium | 40k | 34 | 75, 68, 10, 53, 30, 25, 40, 0, 70, 92 |
| Claude 3.7 Sonnet | 200k | 34 | 80, 66, 5, 39, 38, 21, 22, 85, 95 |
| EXAONE 4.0 32B | 131k | 33 | 77, 63, 5, 47, 25, 34, 39, 8, 47, 94, 91 |
| Qwen3 Coder 30B | 262k | 33 | 71, 52, 4, 40, 28, 33, 29, 29, 30, 89, 92 |
| DeepSeek V3 (Dec) | 128k | 33 | 75, 56, 4, 36, 35, 35, 26, 29, 25, 89, 91 |
| DeepSeek R1 Distill Qwen 32B | 128k | 33 | 74, 62, 6, 27, 38, 23, 63, 0, 69, 94, 95 |
| Hermes 4 405B | 128k | 33 | 73, 54, 4, 55, 35, 35, 15, 20 |
| Reka Flash 3 | 128k | 33 | 67, 53, 5, 44, 27, 51, 89, 95 |
| Magistral Small | 40k | 32 | 75, 64, 7, 51, 24, 25, 41, 0, 71, 96, 96 |
| Gemini 2.0 Flash (exp) | 1m | 32 | 78, 64, 5, 21, 34, 30, 91, 91 |
| Nova Premier | 1m | 31 | 73, 57, 5, 32, 28, 36, 17, 30, 17, 84, 91 |
| DeepSeek R1 Distill Llama 70B | 128k | 31 | 80, 40, 6, 27, 31, 28, 54, 11, 67, 94, 97 |
| Qwen2.5 Max | 32k | 31 | 76, 59, 5, 36, 34, 23, 84, 93 |
| Solar Pro 2 | 66k | 30 | 75, 56, 4, 42, 25, 34, 30, 0, 41, 89, 88 |
| Gemini 2.5 Flash-Lite | 1m | 30 | 72, 47, 4, 40, 18, 32, 35, 31, 50, 93, 93 |
| Gemini 1.5 Pro (Sep) | 2m | 30 | 75, 59, 5, 32, 30, 23, 88, 90 |
| Solar Pro 2 | 64k | 30 | 73, 54, 4, 39, 27, 30, 87, 88 |
| Qwen3 235B | 33k | 30 | 76, 61, 5, 34, 30, 37, 24, 0, 33, 90 |
| Claude 3.5 Sonnet (Oct) | 200k | 30 | 77, 60, 4, 38, 37, 16, 77, 93 |
| DeepSeek R1 Distill Qwen 14B | 128k | 30 | 74, 48, 4, 38, 24, 22, 56, 0, 67, 95, 93 |
| Mistral Small 3.2 | 128k | 29 | 68, 51, 4, 28, 26, 34, 27, 17, 32, 88, 85 |
| Sonar | 127k | 29 | 69, 47, 7, 30, 23, 49, 82, 82 |
| GPT-5 nano (minimal) | 400k | 29 | 56, 43, 4, 47, 29, 33, 27, 20 |
| Qwen3 14B | 33k | 28 | 68, 47, 4, 28, 27, 24, 58, 0, 28, 87 |
| Sonar Pro | 200k | 28 | 76, 58, 8, 28, 23, 29, 75, 85 |
| Llama 4 Scout | 10m | 28 | 75, 59, 4, 30, 17, 40, 14, 26, 28, 84, 83 |
| Command A | 256k | 28 | 71, 53, 5, 29, 28, 37, 13, 18, 10, 82, 82 |
| QwQ 32B-Preview | 33k | 28 | 65, 56, 5, 34, 4, 45, 91, 87 |
| Llama 3.3 70B | 128k | 28 | 71, 50, 4, 29, 26, 47, 8, 15, 30, 77, 86 |
| Qwen2.5 72B | 131k | 28 | 72, 49, 4, 28, 27, 37, 14, 20, 16, 86, 88 |
| Devstral Medium | 256k | 28 | 71, 49, 4, 34, 29, 30, 5, 29, 7, 71, 94 |
| GPT-4.1 nano | 1m | 27 | 66, 51, 4, 33, 26, 32, 24, 17, 24, 85, 88 |
| GPT-4o (Nov) | 128k | 27 | 75, 54, 3, 31, 33, 34, 6, 0, 15, 76, 93 |
| Gemini 2.0 Flash-Lite (Feb) | 1m | 27 | 72, 54, 4, 19, 25, 28, 87, 88 |
| Exaone 4.0 1.2B | 64k | 27 | 59, 52, 6, 52, 9, 23, 50, 0 |
| Llama Nemotron Super 49B v1.5 | 128k | 27 | 69, 48, 4, 29, 24, 33, 8, 22, 14, 77, 86 |
| Qwen3 30B | 33k | 26 | 71, 52, 5, 32, 26, 32, 22, 0, 26, 86 |
| Qwen3 32B | 33k | 26 | 73, 54, 4, 29, 28, 32, 20, 0, 30, 87, 90 |
| GPT-4o (May) | 128k | 26 | 74, 53, 3, 33, 31, 11, 79, 94 |
| Gemini 2.0 Flash-Lite (Preview) | 1m | 26 | 54, 4, 18, 25, 30, 87, 90 |
| Llama 3.1 Nemotron Nano 4B v1.1 | 128k | 26 | 56, 41, 5, 49, 10, 26, 50, 0, 71, 95 |
| GPT-4o (Aug) | 128k | 26 | 52, 3, 32, 12, 80, 93 |
| Llama 3.3 Nemotron Super 49B v1 | 128k | 26 | 70, 52, 4, 28, 23, 40, 8, 11, 19, 78, 83 |
| GLM-4.5V | 64k | 26 | 75, 57, 4, 35, 19, 29, 15, 0 |
| MiniMax-Text-01 | 4m | 26 | 76, 58, 4, 25, 25, 13, 75, 86 |
| Llama 3.1 405B | 128k | 26 | 73, 52, 4, 31, 30, 39, 3, 0, 21, 70, 85 |
| Qwen3 4B | 32k | 26 | 70, 52, 5, 47, 4, 33, 22, 0, 66, 93, 91 |
| Mistral Large 2 (Nov) | 128k | 26 | 70, 49, 4, 29, 29, 31, 14, 5, 11, 74, 90 |
| Nova Pro | 300k | 25 | 69, 50, 3, 23, 21, 38, 7, 19, 11, 79, 83 |
| Claude 3.5 Sonnet (June) | 200k | 25 | 75, 56, 4, 32, 10, 70, 90 |
| Tulu3 405B | 128k | 25 | 72, 52, 4, 29, 30, 13, 78, 89 |
| GPT-4o (ChatGPT) | 128k | 25 | 77, 51, 4, 33, 53, 10, 80, 94 |
| Pixtral Large | 128k | 25 | 70, 51, 4, 26, 29, 35, 2, 10, 7, 71, 85 |
| Grok 2 | 131k | 25 | 71, 51, 4, 27, 28, 13, 78, 86 |
| Phi-4 | 16k | 25 | 71, 57, 4, 23, 26, 24, 18, 0, 14, 81, 87 |
| Gemini 1.5 Flash (Sep) | 1m | 24 | 68, 46, 4, 27, 27, 18, 83, 84 |
| GPT-4 Turbo | 128k | 24 | 69, 3, 29, 32, 15, 74, 92 |
| Hermes 4 70B | 128k | 24 | 66, 49, 4, 27, 28, 29, 11, 2 |
| Mistral Small 3.1 | 128k | 23 | 66, 45, 5, 21, 27, 30, 4, 14, 9, 71, 86 |
| Grok Beta | 128k | 23 | 70, 47, 5, 24, 30, 10, 74, 87 |
| Qwen3 8B | 33k | 23 | 64, 45, 3, 20, 17, 29, 24, 0, 24, 83 |
| Llama 3.1 Nemotron 70B | 128k | 23 | 69, 47, 5, 17, 23, 31, 11, 7, 25, 73, 82 |
| Qwen2.5 Instruct 32B | 128k | 23 | 70, 47, 4, 25, 23, 11, 81, 90 |
| Llama 3.1 70B | 128k | 23 | 68, 41, 5, 23, 27, 34, 4, 6, 17, 65, 81 |
| Qwen3 1.7B | 32k | 22 | 57, 36, 5, 31, 4, 27, 39, 0, 51, 89, 85 |
| Mistral Large 2 (Jul) | 128k | 22 | 68, 47, 3, 27, 27, 32, 0, 0, 9, 71, 89 |
| Gemma 3 27B | 128k | 22 | 67, 43, 5, 14, 21, 32, 21, 0, 25, 88, 89 |
| Qwen2.5 Coder 32B | 131k | 22 | 64, 42, 4, 30, 27, 12, 77, 90 |
| GPT-4 | 8k | 21 | – |
| Nova Lite | 300k | 21 | 59, 43, 5, 17, 14, 34, 7, 18, 11, 77, 84 |
| Mistral Small 3 | 32k | 21 | 65, 46, 4, 25, 24, 26, 4, 0, 8, 72, 85 |
| GPT-4o mini | 128k | 21 | 65, 43, 4, 23, 23, 31, 15, 12, 79, 88 |
| Jamba 1.7 Large | 256k | 21 | 58, 39, 4, 18, 19, 35, 2, 17, 6, 60, 71 |
| Gemma 3 12B | 128k | 21 | 60, 35, 5, 14, 17, 37, 18, 7, 22, 85, 83 |
| DeepSeek-V2.5 (Dec) | 128k | 21 | 76, 88 |
| Qwen3 4B | 32k | 21 | 59, 40, 4, 23, 17, 21, 84 |
| Claude 3 Opus | 200k | 21 | 70, 49, 3, 28, 23, 3, 64, 85 |
| Exaone 4.0 1.2B | 64k | 20 | 50, 42, 6, 29, 7, 25, 24, 0 |
| Claude 3.5 Haiku | 200k | 20 | 63, 41, 4, 31, 27, 3, 72, 86 |
| Gemini 2.0 Flash Thinking exp. (Dec) | 2m | 20 | 48, 94 |
| DeepSeek-V2.5 | 128k | 20 | 87 |
| Devstral Small (May) | 256k | 20 | 63, 43, 4, 26, 25, 7, 68, 85 |
| Mistral Saba | 32k | 20 | 61, 42, 4, 24, 13, 68, 85 |
| DeepSeek R1 Distill Llama 8B | 128k | 19 | 54, 30, 4, 23, 12, 18, 41, 0, 33, 85, 84 |
| Reka Core | 128k | 19 | 56, 73 |
| Gemini 1.5 Pro (May) | 2m | 19 | 66, 37, 4, 24, 27, 8, 67, 83 |
| R1 1776 | 128k | 19 | 95 |
| Qwen2.5 Turbo | 1m | 19 | 63, 41, 4, 16, 15, 12, 81, 85 |
| Reka Flash | 128k | 19 | 53, 74 |
| Llama 3.2 90B (Vision) | 128k | 19 | 67, 43, 5, 21, 24, 5, 63, 82 |
| Solar Mini | 4k | 19 | 33, 59 |
| Reka Flash (Feb) | 128k | 19 | 33, 61 |
| Reka Edge | 128k | 18 | 22, 41 |
| Grok-1 | 8k | 18 | – |
| Qwen2 72B | 131k | 18 | 62, 37, 4, 16, 23, 15, 70, 83 |
| Devstral Small | 256k | 18 | 62, 41, 4, 25, 24, 0, 64, 85 |
| Nova Micro | 130k | 17 | 53, 36, 5, 14, 9, 29, 6, 10, 8, 70, 80 |
| Gemma 2 27B | 8k | 17 | 57, 36, 4, 28, 13, 30, 54, 76 |
| Llama 3.1 8B | 128k | 17 | 48, 26, 5, 12, 13, 29, 4, 16, 8, 52, 67 |
| Gemini 1.5 Flash-8B | 1m | 16 | 57, 36, 5, 22, 23, 3, 69, 12 |
| Phi-4 Mini | 128k | 16 | 47, 33, 4, 13, 11, 21, 7, 14, 3, 70, 74 |
| Gemma 3n E4B | 32k | 16 | 49, 30, 4, 15, 8, 28, 14, 0, 14, 77 |
| DeepHermes 3 - Mistral 24B | 32k | 16 | 58, 38, 4, 20, 23, 5, 60, 75 |
| Granite 3.3 8B | 128k | 15 | 47, 34, 4, 13, 10, 22, 7, 4, 5, 67, 71 |
| Jamba 1.5 Large | 256k | 15 | 57, 43, 4, 14, 16, 5, 61, 24 |
| Gemma 3 4B | 128k | 15 | 42, 29, 5, 11, 7, 28, 13, 6, 6, 77, 72 |
| Hermes 3 - Llama-3.1 70B | 128k | 15 | 57, 40, 4, 19, 23, 2, 54, 75 |
| Llama 3.2 11B (Vision) | 128k | 15 | 46, 22, 5, 11, 11, 30, 2, 12, 9, 52, 69 |
| DeepSeek-Coder-V2 | 128k | 15 | 74, 87 |
| Qwen3 1.7B | 32k | 14 | 41, 28, 5, 13, 7, 21, 7, 0, 10, 72 |
| Jamba 1.6 Large | 256k | 14 | 56, 39, 4, 17, 18, 5, 58, 70 |
| Qwen3 0.6B | 32k | 14 | 35, 24, 6, 12, 3, 23, 18, 0, 10, 75, 49 |
| Gemini 1.5 Flash (May) | 1m | 14 | 57, 32, 4, 20, 18, 9, 55, 72 |
| Yi-Large | 32k | 13 | 59, 36, 3, 11, 19, 7, 56, 74 |
| Claude 3 Sonnet | 200k | 13 | 58, 40, 4, 18, 23, 5, 41, 71 |
| Codestral (Jan) | 256k | 13 | 45, 31, 5, 24, 25, 4, 61, 85 |
| Phi-3 Mini | 4k | 13 | 44, 32, 4, 12, 9, 24, 0, 2, 4, 46, 25 |
| Llama 3 70B | 8k | 13 | 57, 38, 4, 20, 19, 0, 48, 79 |
| Mistral Small (Sep) | 33k | 13 | 53, 38, 4, 14, 16, 6, 56, 81 |
| Gemini 1.0 Ultra | 33k | 13 | – |
| Gemma 3n E4B (May) | 32k | 13 | 48, 28, 5, 14, 9, 11, 75, 76 |
| Phi-4 Multimodal | 128k | 12 | 49, 32, 4, 13, 11, 9, 69, 73 |
| Qwen2.5 Coder 7B | 131k | 12 | 47, 34, 5, 13, 15, 5, 66, 90 |
| Mistral Large (Feb) | 33k | 12 | 52, 35, 3, 18, 21, 0, 53, 71 |
| Jamba Instruct | 256k | 12 | 34, 27, 5, 5, 8, 24, 0 |
| Mixtral 8x22B | 65k | 12 | 54, 33, 4, 15, 19, 0, 55, 72 |
| Llama 2 Chat 7B | 4k | 11 | 16, 23, 6, 0, 0, 0, 6, 13 |
| Llama 3.2 3B | 128k | 11 | 35, 26, 5, 8, 5, 26, 3, 2, 7, 49, 56 |
| Qwen3 0.6B | 32k | 11 | 23, 23, 5, 7, 4, 22, 10, 0, 2, 52, 34 |
| Qwen1.5 Chat 110B | 32k | 11 | 29 |
| Phi-3 Medium 14B | 128k | 10 | 54, 33, 5, 15, 12, 1, 46, 0 |
| Claude 2.1 | 200k | 10 | 50, 32, 4, 20, 18, 3, 37, 16 |
| Claude 3 Haiku | 200k | 10 | 15, 19, 1, 39, 76 |
| Pixtral 12B | 128k | 9 | 47, 34, 5, 12, 14, 0, 46, 78 |
| DeepSeek R1 Distill Qwen 1.5B | 128k | 9 | 27, 10, 3, 7, 7, 13, 22, 0, 18, 69, 45 |
| Claude 2.0 | 100k | 9 | 49, 34, 17, 19, 0 |
| DeepSeek-V2 | 128k | 9 | 87 |
| Mistral Small (Feb) | 33k | 8 | 42, 30, 4, 11, 13, 1, 56, 79 |
| Mistral Medium | 33k | 8 | 49, 35, 3, 10, 12, 4, 41 |
| GPT-3.5 Turbo | 4k | 8 | 46, 30, 44, 70 |
| LFM2 1.2B | 33k | 8 | 26, 23, 6, 2, 3, 22, 3, 0 |
| Gemma 3n E2B | 32k | 8 | 38, 23, 4, 10, 5, 9, 69 |
| Ministral 8B | 128k | 8 | 39, 28, 5, 11, 12, 4, 57, 77 |
| Gemma 2 9B | 8k | 8 | 50, 31, 4, 13, 1, 0, 52, 65 |
| Arctic | 4k | 8 | 75 |
| Qwen Chat 72B | 34k | 8 | – |
| LFM 40B | 32k | 7 | 43, 33, 5, 10, 7, 2, 48, 51 |
| Llama 3.2 1B | 128k | 7 | 20, 20, 5, 2, 2, 23, 0, 5, 0, 14, 40 |
| Command-R+ | 128k | 7 | 43, 34, 5, 11, 12, 0, 40, 63 |
| Llama 3 8B | 8k | 7 | 41, 30, 5, 10, 12, 0, 50, 71 |
| PALM-2 | 8k | 7 | – |
| Gemini 1.0 Pro | 33k | 6 | 43, 28, 5, 12, 12, 1, 40, 2 |
| Gemma 3 1B | 32k | 6 | 14, 24, 5, 2, 1, 20, 3, 0, 0, 48, 32 |
| DeepSeek Coder V2 Lite | 128k | 6 | 43, 32, 5, 16, 14 |
| Codestral (May) | 33k | 6 | 33, 26, 5, 21, 22, 0, 35, 80 |
| Aya Expanse 32B | 128k | 6 | 38, 23, 5, 14, 15, 0, 45, 68 |
| Llama 2 Chat 70B | 4k | 6 | 41, 33, 5, 10, 0, 32, 34 |
| DeepSeek LLM 67B (V1) | 4k | 6 | 75 |
| Llama 2 Chat 13B | 4k | 6 | 41, 32, 5, 10, 12, 2, 33, 0 |
| Command-R+ (Apr) | 128k | 5 | 43, 32, 5, 12, 12, 1, 28, 64 |
| OpenChat 3.5 | 8k | 5 | 31, 23, 5, 12, 0, 31, 68 |
| DBRX | 33k | 5 | 40, 33, 7, 9, 12, 3, 28, 67 |
| Ministral 3B | 128k | 5 | 34, 26, 6, 7, 9, 0, 54, 74 |
| Mistral NeMo | 128k | 5 | 40, 31, 4, 6, 10, 0, 40, 65 |
| Jamba 1.5 Mini | 256k | 4 | 37, 30, 5, 6, 8, 1, 36, 63 |
| Jamba 1.7 Mini | 258k | 4 | 39, 32, 5, 6, 9, 1, 26, 48 |
| Jamba 1.6 Mini | 256k | 3 | 37, 30, 5, 7, 10, 3, 26, 43 |
| Mixtral 8x7B | 33k | 3 | 39, 29, 5, 7, 3, 0, 30, 1 |
| DeepHermes 3 - Llama-3.1 8B | 128k | 2 | 37, 27, 4, 9, 9, 0, 22, 54 |
| Aya Expanse 8B | 8k | 2 | 31, 25, 5, 7, 8, 0, 32, 44 |
| Llama 65B | 2k | 1 | – |
| Qwen Chat 14B | 8k | 1 | – |
| Claude Instant | 100k | 1 | 43, 33, 4, 11, 0, 26, 2 |
| Codestral-Mamba | 256k | 1 | 21, 21, 5, 13, 11, 0, 24, 80 |
| Mistral 7B | 8k | 1 | 25, 18, 4, 5, 2, 0, 12, 40 |
| Command-R | 128k | 1 | 34, 29, 5, 4, 9, 0, 15, 42 |
| Command-R (Mar) | 128k | 1 | 34, 28, 5, 5, 6, 1, 16, 40 |
| Grok 3 mini Reasoning (low) | 1m | – | – |
| GPT-4o mini Realtime (Dec) | 128k | – | – |
| GPT-4o Realtime (Dec) | 128k | – | – |
| GPT-3.5 Turbo (0613) | 4k | – | – |

📊 Data Visualizations

The original page renders three charts from the table above; only their titles survive in this capture (a sketch of how similar charts could be regenerated from the data follows the list):

- AI Analysis Index comparison
- Price vs. performance scatter plot
- Output speed comparison
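
The sketch below is a minimal illustration, not part of the original page. It assumes the leaderboard has been exported to a CSV file named leaderboard.csv with columns model, ai_analysis_index, and price_per_1m_tokens (all hypothetical names), and regenerates two of the charts with pandas and matplotlib.

```python
# Minimal sketch: rebuild an "AI Analysis Index" bar chart and a
# price-vs-performance scatter from a CSV export of the leaderboard.
# Assumptions (not from the source page): the export is "leaderboard.csv"
# with columns "model", "ai_analysis_index", "price_per_1m_tokens".
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("leaderboard.csv")

# Bar chart: top 20 models by AI Analysis Index.
top = df.nlargest(20, "ai_analysis_index")
plt.figure(figsize=(8, 6))
plt.barh(top["model"], top["ai_analysis_index"])
plt.gca().invert_yaxis()  # highest-scoring model at the top
plt.xlabel("AI Analysis Index")
plt.tight_layout()
plt.savefig("index_comparison.png")

# Scatter: price per 1M tokens vs. AI Analysis Index.
plt.figure(figsize=(8, 6))
plt.scatter(df["price_per_1m_tokens"], df["ai_analysis_index"], s=12)
plt.xscale("log")  # per-token prices typically span orders of magnitude
plt.xlabel("Price per 1M tokens (USD)")
plt.ylabel("AI Analysis Index")
plt.tight_layout()
plt.savefig("price_vs_performance.png")
```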
