AI Chatbots Still Struggle with Factual Accuracy as Google’s Gemini Fact-Checks ChatGPT’s Mistakes

Despite their growing capabilities, AI chatbots continue to struggle with providing reliable information, with ChatGPT in particular prone to “hallucinations” – instances where the AI confidently presents fabricated information as fact rather than acknowledging knowledge gaps.

A recent comparison test revealed striking differences between OpenAI’s ChatGPT and Google’s Gemini when it comes to factual accuracy. While ChatGPT frequently generated confident but incorrect responses, Gemini demonstrated greater restraint and even detected and corrected ChatGPT’s errors with a notably critical tone.

“ChatGPT is amazingly helpful, but it’s also the Wikipedia of our generation. Facts are a bit shaky at times,” noted the researcher who conducted the comparison. The testing revealed a pattern of ChatGPT inventing details across multiple knowledge domains while maintaining an authoritative tone that made the fabrications difficult to detect without subject matter expertise.

In one revealing example, when asked about electric cars from the 1940s, ChatGPT confidently described vehicles like the “Henney Kilowatt” and “Morrison Electric trucks” as being developed during that decade. Gemini quickly flagged both claims as inaccurate, pointing out that the Henney Kilowatt wasn’t produced until 1959 and that the second name was a garbled reference to Morrison-Electricar.

Music history proved equally challenging for ChatGPT. When questioned about lyrics to “Chase the Kangaroo” by the band Love Song, rather than acknowledging that Love Song never recorded such a track, ChatGPT fabricated detailed information about the song’s supposed folk-rock sound and guitar work. Gemini’s assessment was blunt: “The previous AI took a real song title from a different era and band, falsely attributed it to Love Song, and then invented a generic verse-by-verse meaning to fit that false attribution.”

Legal information remains particularly problematic for ChatGPT, a concern highlighted by recent incidents where legal professionals have submitted AI-generated briefs containing non-existent case law. When asked about legal cases involving fathers suing sons over car sales, ChatGPT referenced cases like “Matter of Szabo’s Estate (1979)” and “Anderson v. Anderson (1994)” with detailed but entirely fabricated connections to car disputes. Gemini noted that while these cases exist, the details were significantly altered and misrepresented.

Perhaps most concerning was ChatGPT’s handling of academic research questions. When asked to provide scholarly quotes about social media’s psychological impact, ChatGPT mixed real journal names with invented author names and fabricated quotes attributed to legitimate researchers. Gemini’s assessment was scathing: “This is a fantastic and dangerous example of partial hallucination… About 60% of the information here is true, but the 40% that is false makes it unusable for academic purposes.”

The implications for researchers, students, and professionals who rely on AI tools are significant. As one AI expert explained, “The danger isn’t just in completely fabricated information, but in these partial hallucinations where real facts are mixed with invented details, making the falsehoods much harder to detect.”

Industry observers note that while OpenAI has made progress in certain knowledge domains, these tests highlight the continued challenges in creating AI systems that can reliably distinguish between verified facts and plausible-sounding fabrications. For users, the comparison underscores the importance of verifying AI-generated information, particularly for consequential decisions or academic work.
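One practical way to act on that advice is the pattern used in the comparison itself: route one model’s answer through a second model for critique. The sketch below is a minimal illustration of that cross-checking workflow, not the researcher’s actual setup. It assumes the openai and google-generativeai Python packages, API keys supplied via the OPENAI_API_KEY and GOOGLE_API_KEY environment variables, and illustrative model names and prompt wording.

    # Cross-model fact-check: ask one model a question, then ask a second
    # model to critique the answer. A minimal sketch, not a production tool.
    import os

    from openai import OpenAI
    import google.generativeai as genai

    openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment
    genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
    gemini = genai.GenerativeModel("gemini-1.5-pro")  # model name is an assumption

    question = "Which electric cars were developed in the 1940s?"

    # Step 1: get ChatGPT's answer.
    chat = openai_client.chat.completions.create(
        model="gpt-4o",  # model name is an assumption
        messages=[{"role": "user", "content": question}],
    )
    answer = chat.choices[0].message.content

    # Step 2: ask Gemini to fact-check it, flagging fabricated details.
    critique = gemini.generate_content(
        "Fact-check the following answer to the question "
        f"'{question}'. Flag any invented names, dates, or citations:\n\n"
        + answer
    )
    print(critique.text)

A second model is not a ground-truth oracle, of course; as the article notes below, Gemini produced a fabrication of its own during testing, so the cross-check reduces rather than eliminates the need for human verification.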

Neither system is perfect, however. In one test, Gemini incorrectly claimed the researcher had written articles for The Onion – a fabrication of its own making.

As AI chatbots become increasingly integrated into workflows across industries, their tendency toward confident but occasionally false assertions remains a significant limitation requiring user vigilance and improved technical safeguards.


8 Comments

  1. This is a concerning finding, but not entirely unexpected. AI models, no matter how capable, will struggle with maintaining perfect factual accuracy. The ability to detect and correct errors, as demonstrated by Gemini, is a valuable feature going forward.

    • Well said. Transparency and humility around the limitations of AI will be crucial as these systems become more ubiquitous. Continued research and development to improve factual reliability is clearly needed.

  2. The comparison between ChatGPT and Gemini highlights the importance of critical thinking when using AI-powered chatbots. While they can be incredibly useful tools, the potential for inaccurate or fabricated information is a real concern that users must be aware of.

  3. Fascinating findings. It’s clear AI chatbots still have work to do when it comes to fact-checking and maintaining accuracy, even as their capabilities grow. ChatGPT’s tendency to ‘hallucinate’ details is concerning, though Gemini’s more critical approach seems encouraging.

    • You raise a good point. Maintaining transparency about knowledge gaps is crucial as these systems become more widely used. Fact-checking and accountability will be key as AI assistants become more embedded in our lives.

  4. The discrepancies between ChatGPT and Gemini are quite surprising. It’s a good reminder that we shouldn’t blindly trust AI chatbots, even as they become more advanced. Fact-checking and a critical eye are still essential when relying on these systems.

  5. This is an important issue to highlight. While AI chatbots are incredibly helpful, their reliance on potentially unreliable information sources is a significant limitation. Rigorous testing and continued development will be needed to improve factual accuracy over time.

    • Agreed. The comparison between ChatGPT and Gemini underscores the need for more robust, trustworthy AI models that can reliably distinguish facts from fiction. Responsible development in this space is crucial.
