Goodbye Claude! Alibaba's Strongest Trillion-Parameter Model Outperforms Opus 4 in Programming: Real-World Test Results Inside.

Alibaba’s Largest Model Yet: Qwen3-Max-Preview Arrives with a Trillion Parameters!

The highly anticipated Qwen3-Max-Preview (Instruct) has been released, boasting an impressive one trillion parameters.

This is a major jump from its predecessor, Qwen3 (235B): at one trillion parameters, the new model is roughly four times larger (1,000B / 235B ≈ 4.3). The added scale is intended to give users a markedly more capable model.

According to official statements, the new version offers substantial enhancements:

Compared to the 2.5 series, Qwen3-Max-Preview shows significant improvements in English and Chinese comprehension, complex instruction following, and tool use. It also sharply reduces knowledge hallucinations, making the model more reliable.

The model is available immediately across multiple platforms, including the Tongyi app, the Qwen Chat web interface, and Alibaba Cloud APIs.

Official benchmarks further indicate that Qwen3-Max-Preview not only surpasses its predecessor, Qwen3 (235B), but also outperforms international competitors such as Claude Opus 4 on various tasks.

The release has been met with considerable enthusiasm globally, accompanied by a wave of early evaluations and user feedback.

Trillion-parameter scale reflects how quickly AI development is moving, and many users are eager to try the model firsthand.

Qwen continues to impress with its innovative capabilities.

To assess its real-world performance, let’s look at some practical tests.

Multimodal Support and Effortless Coding: A Paradigm Shift

Based on official evaluations and user experiences, we will focus on Qwen3-Max-Preview’s ability to solve complex problems and generate code.

Note that Alibaba moved away from a hybrid thinking mode starting with Qwen3 (235B). Consequently, this Instruct version supports only a non-thinking mode.

We began by presenting the model with an AIME mathematics competition problem via the Qwen Chat interface.

(AIME: the American Invitational Mathematics Examination, an invitational competition that sits between the AMC 10/12 and the USAMO.)

Leveraging its multimodal capabilities, Qwen3-Max-Preview can directly process image inputs. The model swiftly provided a detailed solution and the correct answer, “204,” which matches the official solution for this AIME problem.

After this initial success, we moved on to coding challenges.

Task: Create a colorful, interactive animation using p5.js.

Qwen3-Max-Preview delivered a complete and functional p5.js code snippet almost instantaneously. The generated code successfully created an interactive animation where user mouse clicks triggered visual responses.

The interactive effect is captivating, responding fluidly to mouse input.

Task: Generate a Minesweeper game.

The model generated a classic Minesweeper game from a single prompt, requiring no iterative refinement or adjustments. The code ran correctly on the first attempt.

The game ran flawlessly, though our brief play session ended quickly when we stepped on a mine.
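For readers curious what the core of such a game involves, here is a minimal Python sketch of the two pieces of logic any Minesweeper needs: counting neighboring mines and flood-fill revealing empty regions. This is not the code Qwen3-Max-Preview produced (the article does not reproduce it); the function names `place_mines`, `adjacent_mine_counts`, and `reveal` are illustrative assumptions.

```python
import random


def place_mines(rows, cols, n_mines, seed=None):
    """Pick n_mines distinct (row, col) positions uniformly at random."""
    rng = random.Random(seed)
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    return set(rng.sample(cells, n_mines))


def adjacent_mine_counts(rows, cols, mines):
    """Build the number grid: -1 marks a mine, otherwise the count of neighboring mines."""
    grid = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if (r, c) in mines:
                grid[r][c] = -1
                continue
            grid[r][c] = sum(
                (r + dr, c + dc) in mines
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
            )
    return grid


def reveal(grid, start):
    """Return the set of cells uncovered by clicking `start`, or None if it is a mine."""
    rows, cols = len(grid), len(grid[0])
    if grid[start[0]][start[1]] == -1:
        return None  # stepped on a mine
    revealed, stack = set(), [start]
    while stack:
        r, c = stack.pop()
        if (r, c) in revealed:
            continue
        revealed.add((r, c))
        if grid[r][c] != 0:
            continue  # numbered cells are shown but do not spread the reveal
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in revealed:
                    stack.append((nr, nc))
    return revealed
```

A playable version would add rendering and input handling on top, for example `grid = adjacent_mine_counts(9, 9, place_mines(9, 9, 10))` followed by calls to `reveal` on each click.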

Task: Create an introductory webpage for Qwen3-Max-Preview.

The model generated the code for an interactive webpage to introduce Qwen3-Max-Preview. Users can either save the code as a file for direct access or utilize a convenient “preview” function at the end of the response for immediate viewing.

The preview shows a clean, intuitive website design with basic interactive elements.

Users with stronger prompt engineering skills can get even more impressive results, as shown by one user-created celebratory webpage for Qwen3-Max-Preview.

User benchmarks have also measured Qwen3-Max-Preview’s generation speed, which is notably fast.

One user reported generating 4467 tokens at a speed exceeding 107 tokens per second.
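To put that figure in wall-clock terms, the short Python sketch below divides the token count by the reported throughput. The helper name `generation_time_seconds` is an assumption for illustration; because the user reported a speed exceeding 107 tokens per second, the result is an upper bound on the actual generation time.

```python
def generation_time_seconds(num_tokens, tokens_per_second):
    """Time needed to emit num_tokens at a steady tokens_per_second rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


# Figures from the user report: 4467 tokens at (just over) 107 tokens/s.
elapsed = generation_time_seconds(4467, 107)
print(f"{elapsed:.1f} s")  # prints "41.7 s"
```

In other words, the full 4467-token response would have arrived in under 42 seconds.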

These practical tests point to the new model’s stronger capabilities, particularly in programming, where every task succeeded on the first try.

Alibaba Cloud’s Bailian platform has also disclosed the API pricing for the model. The current version uses a tiered pricing structure based on the number of input tokens.
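As a rough illustration of how input-based tiered pricing can work, the Python sketch below selects a tier from the request’s input token count and bills the whole request at that tier’s rates. Everything here is an assumption: the tier boundaries and prices are placeholders rather than Alibaba Cloud’s published figures, the names `Tier`, `HYPOTHETICAL_TIERS`, and `request_cost` are made up for the example, and the real billing rule may differ, so check the official price list before estimating costs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    max_input_tokens: int            # upper bound (inclusive) of this tier
    input_price_per_million: float   # price per 1M input tokens
    output_price_per_million: float  # price per 1M output tokens


# Placeholder values for illustration only, NOT the real price list.
HYPOTHETICAL_TIERS = (
    Tier(32_000, 1.0, 4.0),
    Tier(128_000, 2.0, 8.0),
    Tier(256_000, 3.0, 12.0),
)


def request_cost(input_tokens, output_tokens, tiers=HYPOTHETICAL_TIERS):
    """Bill a request at the rates of the first tier that fits its input size."""
    for tier in tiers:
        if input_tokens <= tier.max_input_tokens:
            return (
                input_tokens * tier.input_price_per_million
                + output_tokens * tier.output_price_per_million
            ) / 1_000_000
    raise ValueError("input exceeds the largest tier")
```

With these placeholder numbers, `request_cost(10_000, 1_000)` falls in the first tier and `request_cost(50_000, 2_000)` in the second, so longer prompts raise the per-token rate for both input and output.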

The natively supported context length and maximum input/output parameters are as follows:

While an open-source release of this specific model has not yet been announced, Qwen’s prominent role in the open-source community has many watching for future developments.

One More Thing

Following the release of the foundational version of Qwen3-Max-Preview, Lin Junyang, the head of Tongyi Qianwen’s open-source efforts, shared on social media that the official release is imminent.

He also expressed his personal sentiment about the model:

“It’s truly the most engaging model we’ve worked on. While the core architecture hasn’t undergone a radical overhaul, it’s significantly improved compared to the previous 235B version.”

This experience has further bolstered his confidence in scaling model capabilities.

There is also speculation about what comes next. Following Alibaba’s previous release patterns, a dedicated reasoning version is likely to launch soon, possibly within days (the reasoning and non-reasoning versions of Qwen3-235B were released four days apart).

Finally, have you had a chance to try this new model? We encourage you to share your experiences in the comments section below!

Experience Now:

Qwen Chat:

Alibaba Cloud Bailian API Service:


About the author: Rain科技
