ChatGPT may dominate the AI chatbot market, but a new report suggests popularity does not equal trustworthiness. A…ChatGPT may dominate the AI chatbot market, but a new report suggests popularity does not equal trustworthiness. A…

ChatGPT named least reliable work chatbot in new AI reliability report

2025/12/11 02:38

ChatGPT may dominate the AI chatbot market, but a new report suggests popularity does not equal trustworthiness. A December 2025 study examining how leading AI chatbots perform in everyday work scenarios has ranked ChatGPT as the least reliable option for professional tasks. The findings raise fresh concerns for businesses that increasingly depend on AI tools for daily operations.

The study, conducted by Relum, didn’t just look at specs on paper; they stress-tested ten major AI chatbots in real-world professional scenarios. The results? A massive disconnect between hype and reality.

The study assessed each chatbot across four key criteria. These were hallucination rate, customer product ratings, response consistency across tasks, and downtime frequency. Each factor contributed to a composite reliability risk score, with higher scores indicating greater potential workplace issues.

Here is the stat that should keep business leaders up at night: Despite controlling a massive 81% of the market and boasting high user ratings, ChatGPT recorded a hallucination rate of 35%.

In plain English, that means more than one out of every three answers it gives contains fabricated or incorrect information. If you are using it to draft a fantasy novel, that’s fine, but if you are using it for compliance reports or financial decision-making, that is a recipe for disaster. Consequently, the study slapped ChatGPT with a reliability risk score of 99 out of 99, the worst in the group.

ChatGPT named least reliable work chatbot in new AI reliability reportChatGPT

Google didn’t fare any better. While Gemini had better uptime, it actually performed worse on pure accuracy, registering the highest hallucination rate of the entire group at 38%. It highlights a weird paradox in the current AI market: the tools we use the most are often the ones struggling the hardest to keep their facts straight.

Claude and Meta AI occupy a murky middle ground. Claude, despite being a favourite for its writing style, ranked as the second least reliable due to frequent downtime and a 17% hallucination rate. Meta AI was more accurate (15% hallucination), but users seem not to like the experience, giving it the lowest satisfaction rating of the bunch (3.4 out of 5).

The “underdogs” – Grok and DeepSeek steal the show from ChatGPT

If the big names are dropping the ball, who is actually doing the work? Surprisingly, the study points to Grok and DeepSeek as the most reliable tools for professional use. They don’t have the massive marketing budgets or brand recognition of OpenAI, but they simply worked better. DeepSeek recorded zero service outages and kept hallucinations to a minimum.

Kimi also scored well, finding a sweet spot between consistency and uptime. Meanwhile, paid options like Perplexity AI were solid but raised questions about whether the subscription cost is worth it when cheaper, lesser-known alternatives are outperforming them.

ChatGPT named least reliable work chatbot in new AI reliability report

Relum’s Chief Product Officer, Razvan-Lucian Haiduc, warned that reliability should be a central factor in AI adoption decisions. He noted that around 65% of US companies now use AI chatbots in daily workflows. Nearly 45% of employees admit to sharing sensitive company information with these tools.

As AI becomes more embedded in routine work, the risks of misinformation multiply. Haiduc emphasised that the most widely used chatbot is not always the best fit for every industry. Accuracy, uptime and task-specific performance should outweigh brand familiarity.

The report serves as a reality check for the industry. Trust shouldn’t be given just because a chatbot is famous; it should be earned through consistent, verifiable truth. Right now, it looks like the market leaders have some serious catching up to do.

Disclaimer: The articles reposted on this site are sourced from public platforms and are provided for informational purposes only. They do not necessarily reflect the views of MEXC. All rights remain with the original authors. If you believe any content infringes on third-party rights, please contact service@support.mexc.com for removal. MEXC makes no guarantees regarding the accuracy, completeness, or timeliness of the content and is not responsible for any actions taken based on the information provided. The content does not constitute financial, legal, or other professional advice, nor should it be considered a recommendation or endorsement by MEXC.

You May Also Like

China Blocks Nvidia’s RTX Pro 6000D as Local Chips Rise

China Blocks Nvidia’s RTX Pro 6000D as Local Chips Rise

The post China Blocks Nvidia’s RTX Pro 6000D as Local Chips Rise appeared on BitcoinEthereumNews.com. China Blocks Nvidia’s RTX Pro 6000D as Local Chips Rise China’s internet regulator has ordered the country’s biggest technology firms, including Alibaba and ByteDance, to stop purchasing Nvidia’s RTX Pro 6000D GPUs. According to the Financial Times, the move shuts down the last major channel for mass supplies of American chips to the Chinese market. Why Beijing Halted Nvidia Purchases Chinese companies had planned to buy tens of thousands of RTX Pro 6000D accelerators and had already begun testing them in servers. But regulators intervened, halting the purchases and signaling stricter controls than earlier measures placed on Nvidia’s H20 chip. Image: Nvidia An audit compared Huawei and Cambricon processors, along with chips developed by Alibaba and Baidu, against Nvidia’s export-approved products. Regulators concluded that Chinese chips had reached performance levels comparable to the restricted U.S. models. This assessment pushed authorities to advise firms to rely more heavily on domestic processors, further tightening Nvidia’s already limited position in China. China’s Drive Toward Tech Independence The decision highlights Beijing’s focus on import substitution — developing self-sufficient chip production to reduce reliance on U.S. supplies. “The signal is now clear: all attention is focused on building a domestic ecosystem,” said a representative of a leading Chinese tech company. Nvidia had unveiled the RTX Pro 6000D in July 2025 during CEO Jensen Huang’s visit to Beijing, in an attempt to keep a foothold in China after Washington restricted exports of its most advanced chips. But momentum is shifting. Industry sources told the Financial Times that Chinese manufacturers plan to triple AI chip production next year to meet growing demand. They believe “domestic supply will now be sufficient without Nvidia.” What It Means for the Future With Huawei, Cambricon, Alibaba, and Baidu stepping up, China is positioning itself for long-term technological independence. Nvidia, meanwhile, faces…
Share
BitcoinEthereumNews2025/09/18 01:37