At the World Artificial Intelligence Conference (WAIC) held in Shanghai from July 26th to 29th, Huawei unveiled the Ascend 384 Ultra-Node, also known as the Atlas 900 A3 SuperPoD, for the first time at booth H1-A301. This groundbreaking system, the largest ultra-node of its kind in the industry, has captured significant attention as a ‘gem of the exhibition’.
The Ascend 384 Ultra-Node represents a departure from the traditional CPU-centric Von Neumann architecture, introducing an innovative peer-to-peer computing model. This architecture extends the internal server bus to an entire rack, and even across multiple racks, fundamentally transforming data transmission and processing methods. Traditional AI training clusters, built by stacking servers, storage, and network devices, often suffer from low resource utilization and frequent failures, posing significant challenges to AI development.

The Ascend Ultra-Node, by connecting multiple NPUs (Neural Processing Units) via a high-speed bus, overcomes interconnection bottlenecks, enabling the ultra-node to function collaboratively as a single, powerful computing unit.
Key advancements include:
Communication Bandwidth Leap: Cross-node communication bandwidth has been increased by 15 times, leading to significantly faster data transfer speeds.
Communication Latency Reduction: Communication latency has been reduced tenfold, from 2μs to 0.2μs, minimizing data processing waiting times.
Superior Interconnection Capabilities: The system supports point-to-point interconnection of up to 384 NPUs at extreme bandwidth. Notably, it is the industry’s only product that can run all expert parallelism (EP) schemes for Mixture-of-Experts (MoE) models within a single ultra-node domain, making it an optimal solution for MoE training and inference and greatly enhancing efficiency.
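To see why keeping expert parallelism inside one ultra-node domain matters, consider how a MoE layer routes tokens: each device must ship its tokens to whichever device hosts the chosen expert, producing an all-to-all exchange whose cost is set by interconnect bandwidth and latency. The sketch below illustrates that routing pattern in plain Python; it uses no Ascend APIs, and all names and parameter values are hypothetical.

```python
import random

def moe_all_to_all(num_devices=8, experts_per_device=2,
                   tokens_per_device=16, seed=0):
    """Illustrative MoE expert-parallel dispatch: each device routes its
    tokens to the device hosting the selected expert (an all-to-all
    exchange). Hypothetical sketch -- not Ascend/CANN API code."""
    rng = random.Random(seed)
    num_experts = num_devices * experts_per_device
    # send_counts[src][dst] = tokens device `src` must ship to device `dst`
    send_counts = [[0] * num_devices for _ in range(num_devices)]
    for src in range(num_devices):
        for _ in range(tokens_per_device):
            expert = rng.randrange(num_experts)   # top-1 routing decision
            dst = expert // experts_per_device    # device hosting that expert
            send_counts[src][dst] += 1
    # Nearly every device has traffic for every other device, which is why
    # EP is so sensitive to cross-node bandwidth and latency.
    cross_device = sum(send_counts[s][d]
                       for s in range(num_devices)
                       for d in range(num_devices) if s != d)
    return send_counts, cross_device

counts, cross = moe_all_to_all()
print(f"tokens crossing devices: {cross} of {8 * 16}")
```

With random top-1 routing, roughly (num_devices − 1)/num_devices of all tokens leave their home device every MoE layer, which is why fitting the whole EP group inside one high-bandwidth ultra-node domain pays off.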

The Ascend 384 Ultra-Node boasts three primary advantages:
Massive Bandwidth: The communication bandwidth between any two AI processors within the ultra-node is 15 times higher than in traditional architectures. Furthermore, single-hop communication latency within the ultra-node is cut to one-tenth, ensuring smoother data interaction.
Ultra-Low Latency: The Ascend Ultra-Node supports unified global memory addressing, enabling more efficient memory-semantic communication. Its low-latency, instruction-level memory-semantic communication suits the small-packet traffic of large model training and inference, improving the efficiency of small-packet transfers and of the discrete random access patterns found in expert networks. Critically, the Ascend 384 Ultra-Node is reportedly the industry’s first solution to break the 15ms decode latency barrier, meeting the demands of real-time, in-depth reasoning user experiences.
Exceptional Performance: Actual tests indicate that on an Ascend Ultra-Node cluster, training performance for dense models with hundreds of billions of parameters, such as LLaMA 3, can exceed 2.5 times that of traditional clusters. For multimodal and MoE models such as Qwen and DeepSeek, which incur higher communication overheads, the improvement can exceed 3 times.
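The small-packet argument above can be made concrete with the standard first-order transfer-time model t ≈ latency + size / bandwidth: below some packet size, fixed latency dominates and extra bandwidth barely helps, which is why cutting per-hop latency (the quoted 2μs → 0.2μs) matters so much for the discrete random access of expert networks. The numbers below merely illustrate the model; the 100 GB/s bandwidth figure is an arbitrary illustrative value, not an Ascend measurement.

```python
def transfer_time_us(size_bytes, latency_us, bandwidth_GBps):
    """First-order model: transfer time = fixed latency + serialization time.
    bandwidth_GBps * 1e3 converts GB/s into bytes per microsecond."""
    return latency_us + size_bytes / (bandwidth_GBps * 1e3)

# Compare a small MoE routing packet with a large tensor transfer under the
# 2 us vs 0.2 us latencies quoted in the text (illustrative 100 GB/s link).
for size in (4 * 1024, 64 * 1024 * 1024):  # 4 KiB vs 64 MiB
    slow = transfer_time_us(size, latency_us=2.0, bandwidth_GBps=100)
    fast = transfer_time_us(size, latency_us=0.2, bandwidth_GBps=100)
    print(f"{size:>10} B: {slow:8.2f} us -> {fast:8.2f} us "
          f"({slow / fast:.1f}x faster)")
```

For the 4 KiB packet the tenfold latency cut speeds the transfer up several times over, while for the 64 MiB transfer it is negligible: small-packet workloads are latency-bound, large ones bandwidth-bound.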