AI Chatbots Still Struggle with Factual Accuracy as Google’s Gemini Fact-Checks ChatGPT’s Mistakes

Despite their growing capabilities, AI chatbots continue to struggle with providing reliable information, with ChatGPT in particular prone to “hallucinations” – instances where the AI confidently presents fabricated information as fact rather than acknowledging knowledge gaps.

A recent comparison test revealed striking differences between OpenAI’s ChatGPT and Google’s Gemini when it comes to factual accuracy. While ChatGPT frequently generated confident but incorrect responses, Gemini demonstrated greater restraint and even detected and corrected ChatGPT’s errors with a notably critical tone.

“ChatGPT is amazingly helpful, but it’s also the Wikipedia of our generation. Facts are a bit shaky at times,” noted the researcher who conducted the comparison. The testing revealed a pattern of ChatGPT inventing details across multiple knowledge domains while maintaining an authoritative tone that made the fabrications difficult to detect without subject matter expertise.

In one revealing example, when asked about electric cars from the 1940s, ChatGPT confidently described vehicles like the “Henney Kilowatt” and “Morrison Electric trucks” as being developed during that decade. Gemini quickly flagged both claims as inaccurate, pointing out that the Henney Kilowatt wasn’t produced until 1959 and that the second name was a garbled reference to Morrison-Electricar.

Music history proved equally challenging for ChatGPT. When questioned about lyrics to “Chase the Kangaroo” by the band Love Song, rather than acknowledging that Love Song never recorded such a track, ChatGPT fabricated detailed information about the song’s supposed folk-rock sound and guitar work. Gemini’s assessment was blunt: “The previous AI took a real song title from a different era and band, falsely attributed it to Love Song, and then invented a generic verse-by-verse meaning to fit that false attribution.”

Legal information remains particularly problematic for ChatGPT, a concern highlighted by recent incidents where legal professionals have submitted AI-generated briefs containing non-existent case law. When asked about legal cases involving fathers suing sons over car sales, ChatGPT referenced cases like “Matter of Szabo’s Estate (1979)” and “Anderson v. Anderson (1994)” with detailed but entirely fabricated connections to car disputes. Gemini noted that while these cases exist, the details were significantly altered and misrepresented.

Perhaps most concerning was ChatGPT’s handling of academic research questions. When asked to provide scholarly quotes about social media’s psychological impact, ChatGPT mixed real journal names with invented author names and fabricated quotes attributed to legitimate researchers. Gemini’s assessment was scathing: “This is a fantastic and dangerous example of partial hallucination… About 60% of the information here is true, but the 40% that is false makes it unusable for academic purposes.”

The implications for researchers, students, and professionals who rely on AI tools are significant. As one AI expert explained, “The danger isn’t just in completely fabricated information, but in these partial hallucinations where real facts are mixed with invented details, making the falsehoods much harder to detect.”

Industry observers note that while OpenAI has made progress in certain knowledge domains, these tests highlight the continued challenges in creating AI systems that can reliably distinguish between verified facts and plausible-sounding fabrications. For users, the comparison underscores the importance of verifying AI-generated information, particularly for consequential decisions or academic work.
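One practical way to act on that advice is the pattern used in the comparison itself: route one model’s answer through a second model for critique. The sketch below is a minimal illustration of that cross-checking workflow, not the researcher’s actual setup. It assumes the openai and google-generativeai Python packages, API keys supplied via the OPENAI_API_KEY and GOOGLE_API_KEY environment variables, and illustrative model names and prompt wording.

    # Cross-model fact-check: ask one model a question, then ask a second
    # model to critique the answer. A minimal sketch, not a production tool.
    import os

    from openai import OpenAI
    import google.generativeai as genai

    openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment
    genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
    gemini = genai.GenerativeModel("gemini-1.5-pro")  # model name is an assumption

    question = "Which electric cars were developed in the 1940s?"

    # Step 1: get ChatGPT's answer.
    chat = openai_client.chat.completions.create(
        model="gpt-4o",  # model name is an assumption
        messages=[{"role": "user", "content": question}],
    )
    answer = chat.choices[0].message.content

    # Step 2: ask Gemini to fact-check it, flagging fabricated details.
    critique = gemini.generate_content(
        "Fact-check the following answer to the question "
        f"'{question}'. Flag any invented names, dates, or citations:\n\n"
        + answer
    )
    print(critique.text)

A second model is not a ground-truth oracle, of course; as the article notes below, Gemini produced a fabrication of its own during testing, so the cross-check reduces rather than eliminates the need for human verification.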

Neither system is perfect, however. In one test, Gemini incorrectly claimed the researcher had written articles for The Onion – a fabrication of its own making.

As AI chatbots become increasingly integrated into workflows across industries, their tendency toward confident but occasionally false assertions remains a significant limitation requiring user vigilance and improved technical safeguards.


8 Comments

  1. This is a concerning finding, but not entirely unexpected. AI models, no matter how capable, will struggle with maintaining perfect factual accuracy. The ability to detect and correct errors, as demonstrated by Gemini, is a valuable feature going forward.

    • Well said. Transparency and humility around the limitations of AI will be crucial as these systems become more ubiquitous. Continued research and development to improve factual reliability is clearly needed.

  2. The comparison between ChatGPT and Gemini highlights the importance of critical thinking when using AI-powered chatbots. While they can be incredibly useful tools, the potential for inaccurate or fabricated information is a real concern that users must be aware of.

  3. Fascinating findings. It’s clear AI chatbots still have work to do when it comes to fact-checking and maintaining accuracy, even as their capabilities grow. ChatGPT’s tendency to ‘hallucinate’ details is concerning, though Gemini’s more critical approach seems encouraging.

    • You raise a good point. Maintaining transparency about knowledge gaps is crucial as these systems become more widely used. Fact-checking and accountability will be key as AI assistants become more embedded in our lives.

  4. The discrepancies between ChatGPT and Gemini are quite surprising. It’s a good reminder that we shouldn’t blindly trust AI chatbots, even as they become more advanced. Fact-checking and a critical eye are still essential when relying on these systems.

  5. This is an important issue to highlight. While AI chatbots are incredibly helpful, their reliance on potentially unreliable information sources is a significant limitation. Rigorous testing and continued development will be needed to improve factual accuracy over time.

    • Agreed. The comparison between ChatGPT and Gemini underscores the need for more robust, trustworthy AI models that can reliably distinguish facts from fiction. Responsible development in this space is crucial.
