Alibaba’s Qwen AI Outsmarts Global Peers in Math Benchmarks

2025/11/06 05:21

TL;DR

  • Alibaba’s Qwen3-Max-Thinking achieved perfect scores in AIME and HMMT, marking China’s first flawless AI math performance.
  • OpenAI’s GPT-5 Pro also self-reported perfect results, setting up a new East–West rivalry in reasoning AI.
  • Verification concerns linger, as Alibaba’s results lack third-party validation or evidence of closed-book testing.
  • API access opens doors for developers and investors, with potential cost-performance advantages across Asia-Pacific markets.

Alibaba’s artificial intelligence division has unveiled Qwen3-Max-Thinking, an advanced reasoning model that stunned observers by scoring a perfect 100% in two of the world’s toughest mathematics competitions, the American Invitational Mathematics Examination (AIME) and the Harvard-MIT Mathematics Tournament (HMMT).

This marks a significant milestone for China’s AI industry. It is reportedly the first time a Chinese-developed model has matched or exceeded Western-developed systems on reasoning-heavy academic tests.

The announcement places Alibaba’s AI efforts shoulder-to-shoulder with OpenAI’s GPT-5 Pro, which also self-reported flawless results in the same contests earlier this year.

A Leap for China’s AI Ambitions

According to Alibaba, Qwen3-Max-Thinking is built atop Qwen3-Max, the company’s largest AI model boasting over one trillion parameters. Released in late September, the Qwen3-Max architecture represents Alibaba’s boldest step toward creating general-purpose reasoning models that can compete globally in complex problem-solving tasks.

The math victories are symbolic as much as technical. For years, elite competitions like the AIME and HMMT have been used as unofficial benchmarks for evaluating the reasoning depth and abstract thinking capacity of large language models (LLMs). Perfect accuracy in such events signals that Qwen3-Max-Thinking is closing the performance gap with Western-developed systems.

However, questions remain about transparency and verification. Alibaba’s claims, while headline-grabbing, lack third-party confirmation. Neither the AIME nor HMMT maintains public leaderboards for AI models, and no independent audit has yet verified whether the results were achieved under closed-book, internet-free conditions, a crucial factor in determining authenticity.

Verification Gaps Raise Skepticism

Despite the celebration, experts have urged caution. The absence of public verification means it is unclear whether Qwen3-Max-Thinking truly achieved 100% accuracy under standardized conditions. Unverified results have become a recurring issue in AI benchmarking, as companies race to claim superiority in domains like reasoning, coding, and mathematics.

Further complicating the picture, it remains unclear whether the 2025 versions of the contest problems were used, or whether the model had prior exposure to similar data during training. Without contamination controls (safeguards ensuring the model had not seen the test data before), perfect scores are difficult to validate.
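
As a rough illustration of what one contamination control can look like, the Python sketch below measures how many word-level n-grams of a test problem also appear verbatim in a training document. It is a generic technique, not a description of how Alibaba or any other lab audits its data. The function names and the default window of 8 words are assumptions chosen for this example.

```python
def ngrams(text: str, n: int = 8) -> set:
    """Return the set of word-level n-grams in text (lowercased)."""
    tokens = text.lower().split()
    # range() is empty when the text has fewer than n tokens
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def overlap_ratio(problem: str, corpus_doc: str, n: int = 8) -> float:
    """Fraction of the problem's n-grams that also occur in corpus_doc."""
    problem_grams = ngrams(problem, n)
    if not problem_grams:
        return 0.0  # problem too short to form a single n-gram
    shared = problem_grams & ngrams(corpus_doc, n)
    return len(shared) / len(problem_grams)
```

A ratio near 1.0 suggests the problem text appears almost verbatim in the document. A ratio of 0.0 does not prove the model never saw a paraphrase or a translated copy, so a check like this is a first filter and not proof of a clean evaluation.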

While Alibaba’s announcement has sparked excitement, critics warn that without reproducibility, the victory could remain symbolic rather than scientific.

Developers and Investors Eye API Potential

Beyond benchmark bragging rights, Alibaba’s AI strategy has real commercial implications. The company recently opened API access to Qwen3-Max-Thinking, inviting developers to test its reasoning capabilities in real-world applications.

For software and data teams, this introduces new possibilities for cost-performance routing: dynamically choosing between AI providers based on pricing, accuracy, or latency. Developers in the Asia-Pacific region, particularly those seeking local AI infrastructure options, may find Qwen’s ecosystem attractive if it offers competitive pricing and reliable regional support beyond Singapore.
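
A minimal sketch of cost-performance routing, in Python, might look like the following. It picks the cheapest provider that meets an accuracy floor and a latency ceiling. Everything here is hypothetical: the provider names, prices, accuracy scores, and latencies are placeholder values for illustration, not real figures for Qwen3-Max-Thinking, GPT-5 Pro, or any other model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Provider:
    name: str                  # hypothetical label, not a real endpoint
    cost_per_1k_tokens: float  # USD, placeholder value
    accuracy: float            # score from a team's own evals, 0.0 to 1.0
    p95_latency_ms: float      # latency measured from the team's own region


def pick_provider(
    providers: List[Provider],
    min_accuracy: float,
    max_latency_ms: float,
) -> Optional[Provider]:
    """Return the cheapest provider meeting both constraints, or None."""
    eligible = [
        p for p in providers
        if p.accuracy >= min_accuracy and p.p95_latency_ms <= max_latency_ms
    ]
    return min(eligible, key=lambda p: p.cost_per_1k_tokens, default=None)


# Placeholder data for illustration only
PROVIDERS = [
    Provider("provider-a", 0.012, 0.95, 900.0),
    Provider("provider-b", 0.004, 0.91, 400.0),
    Provider("provider-c", 0.020, 0.97, 1500.0),
]
```

In practice the accuracy and latency numbers would come from a team's own evaluation runs rather than vendor-reported benchmarks, which matters given the verification concerns described above.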

Investors are also watching closely. If Qwen3-Max-Thinking can handle complex reasoning tasks while maintaining affordability, Alibaba could carve out a niche among enterprise developers and AI startups looking for alternatives to U.S. providers. The success of such models could signal a new balance in global AI infrastructure, where Chinese models rival or even outperform Western ones in specific tasks.

The post Alibaba’s Qwen AI Outsmarts Global Peers in Math Benchmarks appeared first on CoinCentral.
