AI Chatbots Routinely Distort News and Facts, Major European Study Finds
A comprehensive survey by the European Broadcasting Union (EBU), supported by the BBC, has revealed alarming accuracy problems with popular AI chatbots when handling news content. The study found that these tools frequently distort information, confuse sources, and present outdated data as current facts.
The extensive research project involved 22 editorial teams from 18 countries who methodically tested four leading AI systems: ChatGPT, Microsoft Copilot, Google Gemini, and Perplexity. Researchers submitted thousands of standardized queries and meticulously compared the AI responses with actual published information.
The findings paint a troubling picture of AI reliability in news contexts. Approximately half of all responses contained significant errors, while a staggering 80% included at least minor inaccuracies. Breaking the figures down, 45% of answers had at least one significant issue, 31% confused or misattributed their sources, and 20% contained major factual errors, including invented details and incorrect dates.
Google’s Gemini performed particularly poorly in the evaluation, with 72% of its responses containing incorrect or unverified sources. By comparison, ChatGPT demonstrated such errors in 24% of cases, while both Perplexity and Microsoft Copilot performed better at 15% each.
The study highlighted several egregious examples of misinformation. Gemini incorrectly insisted that NASA had never had astronauts stranded in space, despite the well-documented case of two astronauts who spent nine months aboard the International Space Station awaiting their return to Earth. In another instance, ChatGPT asserted that Pope Francis was continuing his ministry weeks after his death, presenting long-outdated information as current fact.
Researchers noted a particularly concerning pattern where chatbots would deliver incorrect information with a confident tone that masked their ignorance. In one case, a bot explicitly warned against mistaking fiction for reality while simultaneously providing fictional information as fact.
Despite these serious accuracy issues, public reliance on AI for information gathering is growing rapidly. According to an accompanying Ipsos survey of 2,000 UK residents, 42% now use chatbots to get news summaries. The figure rises significantly among users under 35, nearly half of whom rely on AI-generated summaries. However, 84% of respondents indicated that even a single factual error dramatically reduces their trust in these systems.
This trend creates significant reputational risks for media organizations, as increased public reliance on automated summaries magnifies the potential damage from inaccuracies. The project, described as the largest study on the accuracy of journalistic AI assistants to date, demonstrates that these problems are systemic rather than isolated incidents.
AI developers have begun acknowledging these limitations. In September, OpenAI published a report admitting that model training sometimes encourages guesswork rather than honest admissions of ignorance. In a more dramatic example, Anthropic’s lawyers were forced to apologize to a court for submitting documents containing false quotes generated by their Claude AI model.
To address these challenges, project participants have developed practical recommendations for both developers and editors. These include requirements for transparent sourcing, principles for handling questionable data, and implementing pre-publication review mechanisms. The core recommendation is straightforward: AI systems should clearly notify users when they are uncertain, rather than inventing plausible-sounding responses.
The European Broadcasting Union warns that the proliferation of convincing but inaccurate information threatens to undermine public trust in news generally. To prevent this erosion of trust, the organization suggests that newsrooms and technology companies must establish common standards prioritizing accuracy over speed and verification over impact.
As AI continues to integrate into the information ecosystem, this research underscores the critical importance of developing systems that prioritize factual reliability over confident-sounding but potentially false narratives.
10 Comments
It’s disheartening to see how often these AI systems are getting basic facts wrong. In an age of increasing digital information consumption, we need to be able to trust the sources we rely on. This study suggests the technology still has a long way to go before it can be considered a reliable replacement for human journalism and fact-checking.
This is a wake-up call for the AI industry. Chatbots that regularly distort facts and spread misinformation could have serious consequences, especially in sensitive domains like current events and news reporting. I hope the findings of this study lead to meaningful changes to improve the reliability and accountability of these systems.
While AI chatbots can be useful tools, this research demonstrates the need for significant improvement in their handling of news and factual content. The high rates of error, source confusion, and outright fabrication are unacceptable, especially for systems that are being increasingly integrated into mainstream media and information platforms. More rigorous testing and accountability measures are clearly needed.
The performance of Google’s Gemini system is particularly concerning given the company’s market dominance and influence. If their AI is prone to such high rates of inaccuracy, it raises questions about the integrity of the information being shared on a massive scale. Stricter regulation and accountability measures may be necessary to address this problem.
Wow, 80% of responses containing inaccuracies is a pretty staggering statistic. I wonder what the implications are for the widespread use of AI chatbots, particularly in journalism and other fields where factual integrity is paramount. This study highlights the need for more rigorous testing and oversight.
This study highlights the critical importance of human editorial oversight and fact-checking, even as AI continues to advance. Relying too heavily on chatbots to handle news and information could lead to the widespread propagation of misinformation. I hope this serves as a wake-up call for the industry to prioritize reliability and accuracy over speed and efficiency.
It’s disappointing to see such poor performance from these leading AI systems. While the technology is advancing rapidly, it’s clear there is still a lot of work to be done to ensure chatbots can reliably handle news content and fact-based information. I hope this study prompts further research and improvements in this area.
As an avid consumer of news, I find these results quite troubling. If AI chatbots cannot be trusted to accurately convey factual information, that poses a real risk of contributing to the spread of disinformation. More rigorous testing and oversight seems essential to ensure these tools are not inadvertently harming the public’s access to reliable information.
This is certainly concerning. AI chatbots should be reliable sources of information, especially when it comes to news and current events. It’s worrying to see how often they distort facts and present outdated data. We need to carefully evaluate the accuracy of these tools before relying on them too heavily.
I’m glad to see this comprehensive study shine a light on the accuracy issues with popular AI chatbots. As these tools become more ubiquitous, it’s critical that we understand their limitations and vulnerabilities, particularly when it comes to sensitive domains like current events and news reporting. This research should prompt a closer look at how we can ensure these systems are reliable and trustworthy.