Huawei Unveils AI Training Breakthrough Using Ascend Chips, Surpassing DeepSeek
- tech360.tv
- Jun 6
Huawei Technologies has introduced a new artificial intelligence training method that it claims outperforms DeepSeek’s approach, using its own Ascend chips to boost efficiency and performance.

The company’s Pangu research team published a paper detailing a new architecture called Mixture of Grouped Experts (MoGE), an evolution of the Mixture of Experts (MoE) technique used in cost-effective AI models like DeepSeek.
While MoE is known for its low execution costs and enhanced learning capacity, Huawei researchers said it suffers from inefficiencies due to uneven activation of sub-models, or “experts,” across devices.
MoGE addresses this by grouping experts during selection, resulting in better workload balance and more efficient execution during both training and inference, according to the paper.
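To make the distinction concrete, the sketch below contrasts conventional top-k MoE routing with a grouped routing scheme in the spirit of MoGE, where each token activates an equal number of experts in every group (for example, one group of experts per device). The function names, tensor shapes and use of PyTorch are illustrative assumptions, not Huawei's implementation.

```python
# Minimal sketch (not Huawei's code): vanilla MoE top-k routing vs. grouped routing.
import torch

def vanilla_moe_route(logits: torch.Tensor, k: int) -> torch.Tensor:
    # logits: [num_tokens, num_experts]; pick the global top-k experts per token.
    # "Hot" experts can cluster on one device, causing uneven activation.
    return logits.topk(k, dim=-1).indices

def grouped_moe_route(logits: torch.Tensor, num_groups: int, k_per_group: int) -> torch.Tensor:
    # Split the experts into num_groups equal groups (e.g. one group per device)
    # and pick k_per_group experts inside every group, so each group receives the
    # same number of activated experts for every token.
    num_tokens, num_experts = logits.shape
    group_size = num_experts // num_groups
    grouped = logits.view(num_tokens, num_groups, group_size)
    local_idx = grouped.topk(k_per_group, dim=-1).indices           # indices within each group
    offsets = (torch.arange(num_groups) * group_size).view(1, -1, 1)
    return (local_idx + offsets).reshape(num_tokens, -1)            # global expert indices

if __name__ == "__main__":
    torch.manual_seed(0)
    router_logits = torch.randn(4, 16)            # 4 tokens, 16 experts
    print(vanilla_moe_route(router_logits, k=4))  # may concentrate on a few experts
    print(grouped_moe_route(router_logits, num_groups=4, k_per_group=1))
```

Because every group contributes the same number of activated experts per token, the per-device workload stays balanced by construction, which is the load-balancing property the paper attributes to MoGE.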
The architecture was tested on Huawei’s Ascend neural processing units (NPUs), which are designed to accelerate AI tasks. Researchers found that MoGE led to improved expert load balancing and execution efficiency.
Huawei’s Pangu model, trained using this method, achieved state-of-the-art performance on most general English benchmarks and all Chinese benchmarks. It also demonstrated higher efficiency in long-context training compared to models like DeepSeek-V3, Alibaba’s Qwen2.5-72B and Meta’s Llama-405B.
The Pangu model also excelled in general language-comprehension and reasoning tasks, the researchers said.
The development comes as Chinese AI firms focus on improving model training and inference through algorithmic and hardware-software integration, amid US restrictions on advanced AI chip exports.
Huawei’s Ascend chips are seen as domestic alternatives to Nvidia processors, which are subject to US export restrictions.
Pangu Ultra, a large language model with 135 billion parameters optimised for NPUs, showcases Huawei’s advances in model architecture and system design.
Training proceeds in three stages: pre-training, long-context extension and post-training. The model was pre-trained on 13.2 trillion tokens, and the long-context extension stage used 8,192 Ascend chips.
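As a rough illustration only, the staged pipeline described above could be written down as a configuration sketch; the stage names and figures come from the article, while the structure and field names are assumptions.

```python
# Hypothetical summary of the three-stage pipeline described above; not Huawei's config.
training_plan = {
    "pre_training": {"tokens": 13_200_000_000_000},     # 13.2 trillion tokens
    "long_context_extension": {"ascend_npus": 8_192},   # extend the context window
    "post_training": {},                                 # e.g. fine-tuning / alignment
}

for stage, config in training_plan.items():
    print(f"{stage}: {config}")
```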
Huawei said the model and system will soon be available to its commercial customers.
- Huawei introduced MoGE, a new AI training method improving on MoE
- MoGE tested on Ascend chips showed better efficiency and load balancing
- Pangu model outperformed DeepSeek, Qwen2.5-72B and Llama-405B
Source: SCMP