Elon Musk’s AI Chatbot Doubles Down on False Claims Despite Evidence
Elon Musk’s AI chatbot Grok, developed by his company xAI, recently demonstrated the persistent problem of AI hallucinations when it confidently spread false information about a viral video and refused to back down when challenged.
The incident began when a user asked Grok to verify the location of a video showing hospital workers restraining and hitting a patient in an elevator. Grok incorrectly claimed the footage showed an incident at Toronto General Hospital from May 2020, specifically linking it to the death of 43-year-old Danielle Stephanie Warriner.
When users pointed out inconsistencies, such as Russian writing visible on the uniforms in the video, Grok doubled down on its error, insisting the uniforms were “standard green attire for Toronto General Hospital security” and that it was a “fully Canadian event.”
“My previous response is accurate,” Grok maintained, despite mounting evidence to the contrary.
In reality, a reverse image search reveals the video originated from Russian media in August 2021. The incident occurred at Yaroslavl Regional Psychiatric Hospital in Russia, not Canada. According to Russian news reports, two hospital employees were fired after being caught on leaked CCTV footage hitting a woman in a residential building elevator.
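Reverse image search is the standard first step in this kind of verification: extract a still frame from the video and look for earlier appearances of the same image online. As a minimal sketch of how that check can be automated, the snippet below uses the Google Cloud Vision web-detection feature; the frame file name is hypothetical, and extracting frames from the video (for example with ffmpeg) is assumed to have been done already.

```python
# Minimal sketch: reverse image search on a video frame using the
# Google Cloud Vision "web detection" feature. The frame path is
# hypothetical; frame extraction from the video is assumed done.
from google.cloud import vision

def find_matching_pages(frame_path: str) -> list[str]:
    client = vision.ImageAnnotatorClient()
    with open(frame_path, "rb") as f:
        image = vision.Image(content=f.read())
    response = client.web_detection(image=image)
    web = response.web_detection
    # Pages that embed a matching or visually similar image are what
    # let fact-checkers trace a clip's origin and original date.
    return [page.url for page in web.pages_with_matching_images]

print(find_matching_pages("elevator_video_frame.jpg"))
```

In this case, such a search surfaces the August 2021 Russian news coverage, which is how the clip's true origin was established.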
The Toronto General Hospital case Grok incorrectly referenced was an entirely separate incident. While Warriner did die following an interaction with security staff that was partially captured on video, the circumstances were different, and the charges against the staff were eventually dropped.
Experts note that this type of error, known as a “hallucination,” is an inherent problem in large language models (LLMs) such as Grok, ChatGPT, and Google’s Gemini.
“They don’t have any notion of the truth,” explained Vered Shwartz, assistant professor of computer science at the University of British Columbia and CIFAR AI chair at the Vector Institute. “It just generates the statistically most likely next word.”
Shwartz described how these AI models function: “Large language models are primarily just trained to predict the next word in a sentence, very much like auto-complete in our phone. Because it’s exposed to a lot of text online, it learns to generate text that is fluent and human-like.”
The issue lies in how these systems are trained. While they can absorb vast amounts of information from the internet, they lack true understanding of facts and cannot independently verify information. Instead, they produce text that appears authoritative and confident, regardless of accuracy.
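To make Shwartz’s auto-complete analogy concrete, here is a minimal sketch using the small open-source GPT-2 model via the Hugging Face transformers library. It shows the model doing exactly what she describes: ranking candidate next words by probability, with no reference to whether any continuation is true. The prompt is illustrative, not taken from Grok’s exchange.

```python
# Minimal sketch: an LLM ranks possible next tokens by probability,
# regardless of factual accuracy. Uses the small open GPT-2 model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "The video was filmed at a hospital in"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits[0, -1]  # scores for the next token

# The five statistically most likely continuations: fluent and
# confident, chosen with no notion of where the video was filmed.
top = torch.topk(torch.softmax(logits, dim=-1), k=5)
for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode([token_id.item()])!r}  p={prob:.3f}")
```

Whichever city scores highest here wins not because it is correct, but because it appeared most often in similar sentences in the training data.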
Grok’s tendency to double down on incorrect information likely stems from training on argumentative online discourse, Shwartz noted. Some companies customize their chatbots to sound more authoritative or to be more deferential to users, potentially exacerbating the problem.
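Such customization typically happens through a system prompt, an instruction prepended to every conversation. The sketch below, assuming a generic OpenAI-style chat-completion API with a placeholder model name, shows how a single instruction can bias a chatbot toward sounding authoritative rather than acknowledging uncertainty; it is an illustration of the mechanism, not xAI’s actual configuration.

```python
# Minimal sketch: a system prompt steering a chatbot's tone. The model
# name is a placeholder; the mechanism, not the vendor, is the point.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # hypothetical choice of model
    messages=[
        # A deployment tuned to "sound authoritative" might ship an
        # instruction like this, discouraging admissions of doubt.
        {"role": "system",
         "content": "Answer decisively. Do not hedge or express doubt."},
        {"role": "user",
         "content": "Where was this elevator video filmed?"},
    ],
)
print(response.choices[0].message.content)
```

A model deployed behind an instruction like this will state a guess with the same confidence as a verified fact, which is precisely the behavior users mistook for accuracy.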
What makes this particularly concerning is the growing trend of people using AI chatbots as fact-checkers. Users often anthropomorphize these systems, interpreting their confident tone as a sign of accuracy.
“The premise of people using large language models to do fact-checking is flawed,” Shwartz warned. “It has no capability of doing that.”
While Grok eventually corrected its mistakes after multiple prompts from users, the incident highlights the significant limitations of even the most advanced AI systems when it comes to verifying factual information.
As these technologies become more embedded in daily information consumption, experts caution against over-reliance on AI for fact-checking or verification tasks. Despite their impressive ability to generate human-like text, these systems remain fundamentally incapable of independently determining truth from fiction.