
Huawei Unveils AI Training Breakthrough Using Ascend Chips, Surpassing DeepSeek

  • Writer: tech360.tv
  • Jun 6
  • 2 min read

Huawei Technologies has introduced a new artificial intelligence training method that it claims outperforms DeepSeek’s approach, using its own Ascend chips to boost efficiency and performance.


Image: Huawei store interior with gadgets on display. Credit: HUAWEI GLOBAL

The company’s Pangu research team published a paper detailing a new architecture called Mixture of Grouped Experts (MoGE), an evolution of the Mixture of Experts (MoE) technique used in cost-effective AI models like DeepSeek.


While MoE is known for its low execution costs and enhanced learning capacity, Huawei researchers said it suffers from inefficiencies due to uneven activation of sub-models, or “experts,” across devices.


MoGE addresses this by grouping experts during selection, resulting in better workload balance and more efficient execution during both training and inference, according to the paper.
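
The paper ships no reference code, but the grouping idea is easy to illustrate with a short routing sketch. The snippet below contrasts plain global top-k MoE routing with a grouped variant in PyTorch; the expert count, group count and top-k values are illustrative assumptions, not figures from Huawei’s paper.

    # Minimal sketch of grouped expert routing, assuming PyTorch.
    # Expert/group counts and top-k values are illustrative, not from the paper.
    import torch

    num_tokens, hidden = 8, 32
    num_experts, num_groups = 16, 4          # 4 groups of 4 experts each
    group_size = num_experts // num_groups

    router = torch.nn.Linear(hidden, num_experts)
    x = torch.randn(num_tokens, hidden)
    scores = router(x)                       # (tokens, experts) routing logits

    # Plain MoE: a global top-k can pile many tokens onto a few "hot"
    # experts, leaving the devices that host them as stragglers.
    moe_vals, moe_idx = torch.topk(scores, k=4, dim=-1)

    # MoGE-style grouped routing: pick the best expert(s) inside each
    # group, so every group (e.g. one device) serves the same number of
    # activated experts per token -- balanced load by construction.
    grouped = scores.view(num_tokens, num_groups, group_size)
    g_vals, g_idx = torch.topk(grouped, k=1, dim=-1)
    expert_idx = g_idx.squeeze(-1) + torch.arange(num_groups) * group_size
    weights = torch.softmax(g_vals.squeeze(-1), dim=-1)  # combine weights

    print(expert_idx)  # one expert index chosen from each of the 4 groups

Because each token activates exactly one expert in every group, hosting one group per device fixes the number of activated experts on each device, which is the load-balance property the paper attributes to MoGE.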


The architecture was tested on Huawei’s Ascend neural processing units (NPUs), which are designed to accelerate AI tasks. Researchers found that MoGE led to improved expert load balancing and execution efficiency.


Huawei’s Pangu model, trained using this method, achieved state-of-the-art performance on most general English benchmarks and all Chinese benchmarks. It also demonstrated higher efficiency in long-context training compared to models like DeepSeek-V3, Alibaba’s Qwen2.5-72B and Meta’s Llama-405B.


The Pangu model also excelled in general language-comprehension and reasoning tasks, the researchers said.


The development comes as Chinese AI firms focus on improving model training and inference through algorithmic and hardware-software integration, amid US restrictions on advanced AI chip exports.


Huawei’s Ascend chips are seen as domestic alternatives to Nvidia processors, the most advanced of which are barred from export to China under US restrictions.


Pangu Ultra, a large language model with 135 billion parameters optimised for NPUs, showcases Huawei’s architectural and systems-level advances.


The training process comprises three stages: pre-training, long-context extension and post-training. Pre-training covers 13.2 trillion tokens, while long-context extension runs on 8,192 Ascend chips.
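
As a rough sketch of how such a staged recipe might be laid out (the stage names, token count and NPU figure come from the article; the structure itself and everything else are hypothetical):

    # Hypothetical outline of the three-stage recipe described above.
    training_pipeline = [
        {"stage": "pre-training", "tokens": 13_200_000_000_000},   # 13.2 trillion tokens
        {"stage": "long-context extension", "ascend_npus": 8192},  # per the article
        {"stage": "post-training"},                                # e.g. fine-tuning/alignment
    ]

    for phase in training_pipeline:
        details = {k: v for k, v in phase.items() if k != "stage"}
        print(f"{phase['stage']}: {details or 'details not disclosed'}")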


Huawei said the model and system will soon be available to its commercial customers.

  • Huawei introduced MoGE, a new AI training method improving on MoE

  • MoGE tested on Ascend chips showed better efficiency and load balancing

  • Pangu model showed higher long-context training efficiency than DeepSeek-V3, Qwen2.5-72B and Llama-405B


Source: SCMP
