ChatGPT named least reliable work chatbot in new AI reliability report

2025/12/11 02:38

ChatGPT may dominate the AI chatbot market, but a new report suggests popularity does not equal trustworthiness. A December 2025 study examining how leading AI chatbots perform in everyday work scenarios has ranked ChatGPT as the least reliable option for professional tasks. The findings raise fresh concerns for businesses that increasingly depend on AI tools for daily operations.

Relum, which conducted the study, didn’t just look at specs on paper; it stress-tested ten major AI chatbots in real-world professional scenarios. The results? A massive disconnect between hype and reality.

The study assessed each chatbot across four key criteria: hallucination rate, customer product ratings, response consistency across tasks, and downtime frequency. Each factor contributed to a composite reliability risk score, with higher scores indicating a greater potential for workplace problems.
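The report does not spell out how the four factors were combined, so the following is only a rough sketch of how such a composite could work, not Relum's actual method. It assumes each factor is rescaled to a 0-to-1 range, flipped where a higher value is better, averaged with equal weights, and mapped onto a 0-to-99 scale. The names `ChatbotMetrics`, `min_max`, and `risk_scores`, the equal weighting, and the sample inputs are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ChatbotMetrics:
    name: str
    hallucination_rate: float  # share of answers with errors, 0..1 (higher is worse)
    user_rating: float         # product rating out of 5 (higher is better)
    consistency: float         # 0..1 (higher is better)
    outages: int               # number of downtime incidents (higher is worse)


def min_max(values):
    """Rescale values to 0..1; if all values are equal, return zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def risk_scores(bots, max_score=99):
    """Return {name: risk score}; a higher score means higher risk.

    Equal weights are an assumption; the study's weighting is not public here.
    """
    halluc = min_max([b.hallucination_rate for b in bots])
    # Ratings and consistency are "higher is better", so invert them into risk.
    rating = [1.0 - x for x in min_max([b.user_rating for b in bots])]
    consist = [1.0 - x for x in min_max([b.consistency for b in bots])]
    outage = min_max([b.outages for b in bots])

    scores = {}
    for i, b in enumerate(bots):
        mean_risk = (halluc[i] + rating[i] + consist[i] + outage[i]) / 4
        scores[b.name] = round(mean_risk * max_score)
    return scores


# Made-up inputs for illustration only; these are not figures from the study.
bots = [
    ChatbotMetrics("tool_a", 0.30, 4.5, 0.60, 8),
    ChatbotMetrics("tool_b", 0.10, 4.0, 0.90, 0),
]
print(risk_scores(bots))
```

In this toy example, a tool with a higher user rating can still end up with the higher risk score once hallucinations, inconsistency, and outages are factored in, which is the kind of gap between popularity and reliability the study describes.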

Here is the stat that should keep business leaders up at night: Despite controlling a massive 81% of the market and boasting high user ratings, ChatGPT recorded a hallucination rate of 35%.

In plain English, that means more than one out of every three answers it gives contains fabricated or incorrect information. If you are using it to draft a fantasy novel, that’s fine, but if you are using it for compliance reports or financial decision-making, that is a recipe for disaster. Consequently, the study slapped ChatGPT with a reliability risk score of 99 out of 99, the worst in the group.

Google didn’t fare any better. While Gemini had better uptime, it actually performed worse on pure accuracy, registering the highest hallucination rate of the entire group at 38%. The result highlights a weird paradox in the current AI market: the tools we use the most are often the ones struggling the hardest to keep their facts straight.

Claude and Meta AI occupy a murky middle ground. Claude, despite being a favourite for its writing style, ranked as the second least reliable due to frequent downtime and a 17% hallucination rate. Meta AI was more accurate (a 15% hallucination rate), but users do not seem to like the experience, giving it the lowest satisfaction rating of the bunch (3.4 out of 5).

The “underdogs” – Grok and DeepSeek steal the show from ChatGPT

If the big names are dropping the ball, who is actually doing the work? Surprisingly, the study points to Grok and DeepSeek as the most reliable tools for professional use. They don’t have the massive marketing budgets or brand recognition of OpenAI, but they simply worked better. DeepSeek recorded zero service outages and kept hallucinations to a minimum.

Kimi also scored well, finding a sweet spot between consistency and uptime. Meanwhile, paid options like Perplexity AI were solid but raised questions about whether the subscription cost is worth it when cheaper, lesser-known alternatives are outperforming them.

Relum’s Chief Product Officer, Razvan-Lucian Haiduc, warned that reliability should be a central factor in AI adoption decisions. He noted that around 65% of US companies now use AI chatbots in daily workflows. Nearly 45% of employees admit to sharing sensitive company information with these tools.

As AI becomes more embedded in routine work, the risks of misinformation multiply. Haiduc emphasised that the most widely used chatbot is not always the best fit for every industry. Accuracy, uptime and task-specific performance should outweigh brand familiarity.

The report serves as a reality check for the industry. Trust shouldn’t be given just because a chatbot is famous; it should be earned through consistent, verifiable truth. Right now, it looks like the market leaders have some serious catching up to do.
