The next earthquake in AI: Why is the real danger not a SaaS killer, but a computing power revolution?

2026/02/12 12:33
5 min read

Written by: Bruce

Recently, the entire tech and investment world has been focused on the same thing: how AI applications are "killing" traditional SaaS. Ever since Claude Cowork of @AnthropicAI demonstrated how easily it can help you write emails, create PowerPoint presentations, and analyze Excel spreadsheets, a panic about the "death of software" has begun to spread. It's certainly frightening, but if your attention stops there, you might be missing the truly major upheaval.

It's like we're all looking up at the drone aerial battles, but no one notices that the entire continental plate beneath our feet is quietly shifting. The real storm is hidden beneath the surface, in a corner most people can't see: the computing power foundation that supports the entire AI world is undergoing a "silent revolution."

This revolution may cause the grand party meticulously organized by Nvidia (@nvidia), the shovel seller of the AI gold rush, to end sooner than anyone imagined.

Two revolutionary paths are converging.

This revolution is not a single event, but rather the result of two seemingly independent technological paths intertwined. Like two armies closing in, they are launching a pincer attack on Nvidia's GPU dominance.

The first path is a revolution in algorithm slimming.

Have you ever wondered whether a super-brain really needs to fire every one of its brain cells to think through a single problem? Obviously not. DeepSeek figured this out and built its models around the MoE (Mixture of Experts) architecture.

You can think of it as a company with hundreds of experts in different fields. But each time you hold a meeting to solve a problem, you only need to invite the two or three most relevant people, instead of having everyone brainstorm together. That's the clever thing about MoE: it allows a huge model to activate only a small group of "experts" for each calculation, thus greatly saving computing power.
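To make the analogy concrete, here is a minimal, toy sketch of top-k expert routing in PyTorch. It is purely illustrative: the layer sizes, expert count, and routing details are invented for this example, not taken from DeepSeek's actual implementation. But it shows the key trick: every token is cheaply scored against all experts, yet only the top two are actually computed.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoE(nn.Module):
    """Toy sparse Mixture-of-Experts layer: the router scores every
    expert for each token, but only the top-k experts actually run."""

    def __init__(self, d_model=64, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts)   # cheap scoring step
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model),
                          nn.ReLU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                     # x: (n_tokens, d_model)
        scores = self.router(x)               # (n_tokens, n_experts)
        weights, idx = scores.topk(self.k, dim=-1)
        weights = F.softmax(weights, dim=-1)  # blend the chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e      # tokens routed to expert e
                if mask.any():                # experts nobody chose never run
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

x = torch.randn(10, 64)          # 10 tokens
print(TinyMoE()(x).shape)        # torch.Size([10, 64])
```

With k=2 of 8 experts, only a quarter of the expert weights are touched per token; scale the same idea up and you get the sub-9% activation ratio described below.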

What is the result? The DeepSeek-V2 model has 236 billion parameters in total, but activates only 21 billion of them for each token it processes, less than 9% of the total. Yet its performance is comparable to GPT-4, which runs at full power every time. What does this mean? AI capability is being decoupled from the computing power it consumes!
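You can check that ratio yourself; these are simply the two figures quoted above:

```python
total_params  = 236e9   # DeepSeek-V2 total parameter count
active_params = 21e9    # parameters activated per token

print(f"activated share: {active_params / total_params:.1%}")  # -> 8.9%
```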

In the past, we all assumed that the more powerful the AI, the more GPUs it would burn. Now DeepSeek is telling us that, with clever algorithms, the same results can be achieved at a tenth of the cost. That puts a huge question mark over how indispensable Nvidia's GPUs really are.

The second path is a hardware revolution.

AI workloads fall into two phases: training and inference. Training is like going to school: it requires reading enormous amounts of material, and GPUs, with their massive parallel-computing muscle, are genuinely well suited to it. Inference, on the other hand, is what happens every time we actually use AI, and there the priority is response speed.

GPUs have an inherent limitation during inference: their high-bandwidth memory (HBM) sits off-chip, so every data transfer adds latency. It's like a chef whose ingredients are stored in the refrigerator next door; he has to run over to fetch them every time he cooks, and that will never be fast. Companies like Cerebras and Groq have taken a different approach, designing dedicated inference chips that build SRAM directly into the silicon, keeping the ingredients within arm's reach for near-zero-latency access.
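A quick back-of-envelope calculation shows why this matters. During autoregressive decoding, every generated token has to stream the model's active weights through memory, so memory bandwidth, not raw compute, sets the ceiling on tokens per second. The bandwidth figures below are rough illustrative assumptions, not vendor specifications:

```python
# Weight-streaming ceiling on decode speed:
#   tokens/s <= bandwidth / bytes-moved-per-token
# All numbers here are illustrative assumptions, not measured specs.
active_params = 21e9        # active parameters per token (sparse MoE model)
bytes_per_param = 2         # 16-bit weights

memories = {
    "external HBM  (assumed ~3 TB/s)": 3e12,
    "on-chip SRAM (assumed ~80 TB/s)": 80e12,
}

for name, bandwidth in memories.items():
    ceiling = bandwidth / (active_params * bytes_per_param)
    print(f"{name}: ~{ceiling:.0f} tokens/s per stream")
```

Under these assumptions the on-chip design is more than an order of magnitude faster at the exact same model size, which is the whole pitch of the dedicated-inference-chip camp.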

The market has already voted with real money. Even as OpenAI complained that Nvidia's GPUs fell short for inference, it signed a $10 billion deal with Cerebras to lease inference capacity. Nvidia itself panicked and promptly spent $20 billion to acquire Groq, just to avoid falling behind in this new arena.

When the two paths intersect: a cost avalanche

Okay, now let's put these two things together: take a DeepSeek model that has been algorithmically "slimmed down" and run it on "zero-latency" Cerebras hardware.

What happens?

A cost avalanche.

First, the slimmed-down model is small enough to fit entirely into the chip's on-board memory. Second, with the external-memory bottleneck gone, the AI's response speed becomes astonishingly fast. The net result: training costs drop by 90% thanks to the MoE architecture, and inference costs fall by another order of magnitude thanks to dedicated hardware and sparse computation. All told, owning and running a world-class AI might cost only 10%-15% of a traditional GPU solution.
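As a toy illustration of how those two claimed savings compound (the cost factors are the article's claims, and the training/inference spend split is my assumption, not data):

```python
# Blend the article's two claimed savings into one combined figure.
train_share, infer_share = 0.3, 0.7   # assumed split of total AI spend

train_factor = 0.10   # MoE architecture: ~90% cheaper training (claimed)
infer_factor = 0.10   # dedicated inference chips: ~10x cheaper (claimed)

blended = train_share * train_factor + infer_share * infer_factor
print(f"blended cost vs. GPU baseline: {blended:.0%}")   # ~10%
```

Nudge either factor up a little and you land in the 10%-15% band quoted above.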

This is not an improvement, this is a paradigm shift.

The rug is being quietly pulled out from under Nvidia's throne.

Now you should understand why this is more deadly than the "Cowork panic".

Nvidia's trillion-dollar market capitalization today is built on a simple story: AI is the future, and the future of AI depends on my GPUs. But now, the foundation of this story is being shaken.

Even if Nvidia continues to monopolize the training market, the overall market could shrink dramatically if customers can get the job done with a tenth of the cards.

In the inference market, a pie ten times larger than training, Nvidia not only lacks an absolute advantage but is being boxed in on all sides by rivals such as Google and Cerebras. Even its largest customer, OpenAI, is defecting.

Once Wall Street realizes that Nvidia's "shovel" is no longer the only option, or even the best one, what happens to a valuation built on the expectation of a permanent monopoly? I think everyone knows the answer.

Therefore, the biggest black swan of the next six months may not be one AI application defeating another, but a seemingly insignificant piece of technical news: a new paper on MoE efficiency, say, or a report showing dedicated inference chips taking a meaningful bite of market share, quietly announcing that the computing-power war has entered a new stage.

When the shovel seller's shovel is no longer the only shovel worth buying, his golden age may be drawing to a close.
