Nvidia dropped new benchmark numbers Wednesday. The data shows its latest AI server boosts mixture-of-experts model performance by 10 times.
The tests included the Kimi K2 Thinking model from China’s Moonshot AI. DeepSeek’s models saw similar gains.
This matters because the AI game is changing. Companies are shifting from training models to deploying them for millions of users.
That’s a market where Nvidia doesn’t have the same dominance. AMD and Cerebras are nipping at its heels.
Mixture-of-experts models work differently than traditional AI. Instead of running the entire model for every request, they split the work into pieces and route each piece to a few specialized “experts” within the system.
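The routing idea can be sketched in a few lines. This is an illustrative toy, not Nvidia’s or Moonshot AI’s implementation; the sizes, the router, and the experts here are all made up for demonstration:

```python
import numpy as np

rng = np.random.default_rng(0)

D, N_EXPERTS, TOP_K = 8, 4, 2  # hidden size, total experts, experts used per token

# Each "expert" is a small feed-forward layer; a router scores experts per token.
experts = [(rng.standard_normal((D, D)) * 0.1, rng.standard_normal(D) * 0.1)
           for _ in range(N_EXPERTS)]
router_w = rng.standard_normal((D, N_EXPERTS)) * 0.1

def moe_forward(x):
    """Send a token vector to its top-k experts and mix their outputs."""
    logits = x @ router_w
    top = np.argsort(logits)[-TOP_K:]          # indices of the k highest-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                   # softmax over only the chosen experts
    out = np.zeros(D)
    for w, i in zip(weights, top):
        W, b = experts[i]
        out += w * np.maximum(x @ W + b, 0.0)  # weighted sum of expert outputs
    return out

y = moe_forward(rng.standard_normal(D))
print(y.shape)
```

Only `TOP_K` of the `N_EXPERTS` experts run for any given token, which is why these models are cheaper to train and run per query than dense models of the same total size.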
The approach exploded after DeepSeek released its open-source model in early 2025. That model trained faster and cheaper than competitors.
OpenAI adopted the technique for ChatGPT. France’s Mistral followed suit. Moonshot AI released its own version in July.
These models need less training on expensive chips. But Nvidia argues they still need powerful hardware for deployment.
Nvidia packed 72 of its top chips into one machine. Fast connections link the chips together.
The company says this setup delivered the 10x performance boost for Moonshot AI’s Kimi K2 model. Previous-generation servers couldn’t match these numbers.
The gains come from two things. First, cramming more chips into each box. Second, the speed of chip-to-chip communication.
These are areas where Nvidia still beats rivals. For now.
AMD isn’t sitting still. The company is building its own multi-chip server.
That system should hit the market next year. It will pack multiple powerful processors together, matching Nvidia’s strategy.
The competitive pressure is real. While Nvidia owns the AI training market, inference is different territory.
Inference means serving trained models to end users. Multiple companies can compete here.
Nvidia released this data to prove a point. Even efficient models need serious hardware for deployment.
The benchmark focused on real-world models currently in production. Moonshot AI’s system represents the new generation of efficient AI architecture.
These models train faster. They cost less to develop. But according to Nvidia’s numbers, deployment still demands top-tier servers.
The 10x improvement applies specifically to inference workloads. That’s the process of running queries through trained models at scale.
Nvidia published the data Wednesday, showing concrete metrics for mixture-of-experts performance. The company tested multiple models beyond just Moonshot AI’s and DeepSeek’s.
The server combines raw chip count with connection speed. Both factors contribute to the performance gains Nvidia claims.
AMD’s competing product launch next year will test whether Nvidia can maintain its advantages in chip density and interconnect speed.
The post Nvidia (NVDA) Stock: New Server Benchmark Shows 10x Performance Increase appeared first on Blockonomi.