
Alibaba launches more efficient Qwen3-Next artificial intelligence model

2025/09/12 07:27
1 min read

PANews reported on September 12th that Alibaba's Tongyi Qianwen (Qwen) team released its next-generation foundation model architecture, Qwen3-Next, and open-sourced the Qwen3-Next-80B-A3B series of models built on it. Compared with the Qwen3 MoE architecture, Qwen3-Next introduces four core improvements: a hybrid attention mechanism, a highly sparse MoE structure, a series of optimizations for stable and training-friendly behavior, and a multi-token prediction mechanism that improves inference efficiency. Using this architecture, Alibaba trained the Qwen3-Next-80B-A3B-Base model, which has 80 billion total parameters but activates only 3 billion per token. The Base model matches or slightly exceeds the performance of the Qwen3-32B dense model, while its training cost (in GPU hours) is less than one-tenth of Qwen3-32B's, and its inference throughput on contexts longer than 32K tokens is more than ten times higher, delivering strong cost-effectiveness for both training and inference.
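The "80 billion total, 3 billion active" figure comes from sparse mixture-of-experts routing: each token's router scores all experts but runs only the top-k of them, so most parameters sit idle on any given forward pass. The sketch below is purely illustrative (it is not Qwen's implementation, and the expert count and k value are made up for the example); it shows the routing idea in minimal Python.

```python
# Illustrative sketch of top-k expert routing in a sparse MoE layer.
# NOT Qwen3-Next's actual code; num_experts and k are arbitrary example values.
# The point: with k experts chosen out of many, only a small fraction of
# parameters is "activated" per token -- the idea behind 80B total / 3B active.

import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]


def route_token(router_logits, k=2):
    """Pick the top-k experts for one token and renormalize their gate weights.

    Returns a list of (expert_index, gate_weight) pairs whose weights sum to 1.
    """
    topk = sorted(range(len(router_logits)),
                  key=lambda i: router_logits[i],
                  reverse=True)[:k]
    gates = softmax([router_logits[i] for i in topk])
    return list(zip(topk, gates))


if __name__ == "__main__":
    num_experts = 64
    # Stand-in for a learned router's output for one token.
    logits = [random.gauss(0.0, 1.0) for _ in range(num_experts)]
    chosen = route_token(logits, k=2)
    print(chosen)  # only 2 of the 64 experts run for this token
    print(f"active expert fraction: {2 / num_experts:.3f}")
```

In a real MoE layer the selected experts' feed-forward outputs are combined using these gate weights, so compute per token scales with k rather than with the total expert count.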
