Microsoft (MSFT) Releases Three In-House AI Models, Cutting Ties With OpenAI

2026/04/02 22:21

TLDR

  • Microsoft launched three in-house AI models: MAI-Transcribe-1, MAI-Voice-1, and MAI-Image-2, available through Microsoft Foundry.
  • MAI-Transcribe-1 claims best-in-class accuracy on 25 languages, beating OpenAI’s Whisper and Google Gemini Flash benchmarks.
  • Microsoft renegotiated its OpenAI contract in late 2025, freeing it to independently build frontier AI models for the first time.
  • Each model was built by teams of fewer than 10 engineers, using roughly half the GPU resources of competing products.
  • Microsoft AI CEO Mustafa Suleyman confirmed plans to build a frontier large language model, targeting full AI independence.

Microsoft has taken its most concrete step yet toward building its own AI models from scratch, launching three new tools on Wednesday that put it in direct competition with OpenAI, Google, and a range of AI startups.


The three models (MAI-Transcribe-1, MAI-Voice-1, and MAI-Image-2) are available now through Microsoft Foundry and a new MAI Playground. They cover speech-to-text, text-to-speech, and image generation, respectively. Microsoft AI CEO Mustafa Suleyman described the launch as the opening move from his “superintelligence team,” which he formed just six months ago.

MSFT stock has just closed its worst quarter since 2008 and is down roughly 17% year-to-date. The model launch represents Suleyman’s first public answer to investor pressure for returns on the company’s AI spending.

MAI-Transcribe-1 is the headliner. It achieves the lowest average Word Error Rate on the FLEURS benchmark across the top 25 languages by Microsoft product usage, with an average rate of 3.8%. Microsoft claims it outperforms OpenAI’s Whisper-large-v3 on all 25 languages, and Google’s Gemini 3.1 Flash on 22 of 25. It processes MP3, WAV, and FLAC files up to 200MB, and Microsoft says its batch speed is 2.5 times faster than Azure’s existing offering. It’s already being tested inside Teams and Copilot Voice.
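For readers unfamiliar with the metric, Word Error Rate is conventionally the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the model's output, divided by the number of words in the reference. The Python sketch below illustrates that standard definition only. The function name is hypothetical, it splits on whitespace without the text normalization that benchmark evaluations usually apply, and it is not Microsoft's or FLEURS's scoring code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Illustrative WER: word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (standard Levenshtein dynamic programming).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one substitution ("sat" -> "sit") and one deletion ("the")
# against a six-word reference gives 2 / 6, roughly 0.33.
example = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

Under this definition, an average rate of 3.8% corresponds to roughly four word-level errors per 100 reference words, averaged across the 25 languages tested.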

MAI-Voice-1 generates 60 seconds of natural-sounding audio in a single second and supports custom voice creation from just a few seconds of sample audio. It’s priced at $22 per million characters. MAI-Image-2 ranks in the top three on the Arena.ai leaderboard and is rolling out across Bing and PowerPoint, priced at $5 per million input tokens and $33 per million image output tokens. WPP is among the first enterprise partners using it at scale.
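To make the per-million pricing concrete, the following Python sketch turns the quoted rates into cost estimates. Only the three rates come from the article; the function names and the character and token counts in the example are hypothetical, and actual billing may differ (for instance through rounding, minimums, or tiering that the article does not describe).

```python
from decimal import Decimal

# Rates quoted in the article, in USD per one million units.
VOICE_USD_PER_MILLION_CHARS = Decimal("22")
IMAGE_USD_PER_MILLION_INPUT_TOKENS = Decimal("5")
IMAGE_USD_PER_MILLION_OUTPUT_TOKENS = Decimal("33")
MILLION = Decimal(1_000_000)


def voice_cost_usd(characters: int) -> Decimal:
    """Estimated MAI-Voice-1 cost for a given number of input characters."""
    return VOICE_USD_PER_MILLION_CHARS * characters / MILLION


def image_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimated MAI-Image-2 cost for given input and image output token counts."""
    return (
        IMAGE_USD_PER_MILLION_INPUT_TOKENS * input_tokens
        + IMAGE_USD_PER_MILLION_OUTPUT_TOKENS * output_tokens
    ) / MILLION


# Hypothetical workloads, chosen only for illustration.
script_cost = voice_cost_usd(50_000)        # a 50,000-character script
picture_cost = image_cost_usd(200, 4_000)   # 200 prompt tokens, 4,000 output tokens
```

At these rates, the hypothetical 50,000-character script would cost $1.10 to narrate, and the hypothetical image request would cost about $0.13.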

A New Contract Made It Possible

The launch would not have been possible a year ago. Until October 2025, Microsoft was contractually blocked from independently pursuing artificial general intelligence under its original 2019 deal with OpenAI.

When OpenAI sought to expand its compute footprint beyond Microsoft — striking deals with SoftBank and others — Microsoft renegotiated. The revised terms freed Microsoft to build its own frontier models while retaining license rights to everything OpenAI builds through 2032.

Suleyman told VentureBeat: “Back in September of last year, we renegotiated the contract with OpenAI, and that enabled us to independently pursue our own superintelligence.” He made clear that the partnership with OpenAI remains in place until at least 2032.

Small Teams, Big Claims

One of the more surprising details from the launch: each model was built by a team of roughly 10 engineers or fewer. Suleyman said the audio model was built by 10 people, and that the performance gains came from model architecture and data choices, not headcount.

Microsoft says the pricing is deliberately aggressive and designed to undercut Amazon and Google. Suleyman described it as “the cheapest of any of the hyperscalers.” The company also plans to build frontier-scale GPU clusters over the next 12 to 18 months.

Suleyman confirmed that a large language model is on the roadmap, saying Microsoft’s goal is to be “completely independent” and deliver “state of the art models across all modalities.”

The post Microsoft (MSFT) Releases Three In-House AI Models, Cutting Ties With OpenAI appeared first on CoinCentral.
