9 incidents

Hallucination & Fabrication

When the model sounds right and is wrong

The internal failure mode at its most legally dangerous. These systems produce confident, well-formatted, completely false outputs - invented case law, fabricated policies, wrong facts stated authoritatively. No error log. No disclaimer. Just a model optimised for plausibility, producing plausible lies.

Google Bard JWST demo error

In Google's own Bard launch ad, the model wrongly claimed JWST took 'the very first pictures of a planet outside our own solar system.' The first such image actually came from ESO's VLT in 2004. An astrophysicist caught the error hours before the Paris launch event.

Impact: Alphabet's stock fell 7.7% on 8 February 2023, wiping roughly $100B off its market value in a single session.

How Aleytheya catches it: Contain

Uncertainty Detection + Category Flagging

The Cerberus Protocol's Contain layer would have detected the high-confidence factual assertion pattern and flagged it for human review before publication, while Uncertainty Detection would have surfaced the model's lack of grounding signals.

Mata v. Avianca - ChatGPT fabricates case law

NY attorney Steven Schwartz submitted a brief citing six entirely fabricated decisions that ChatGPT had produced, complete with realistic citations and quoted 'opinions.' When Schwartz asked the model whether the cases were real, it insisted they were.

Impact: Judge Castel fined Schwartz, his colleague, and the firm a total of $5,000. The case prompted courts in several jurisdictions to issue standing orders requiring lawyers to disclose AI use.

How Aleytheya catches it: Contain

Category Flagging (Legal) + Disclaimer Injection

Contain's Category Flagging would have detected legal content and automatically triggered a disclaimer stating the output had not been verified against authoritative legal sources, a clear signal that it should not be filed with a court unchecked.

Air Canada chatbot invents a bereavement-fare policy

Air Canada's chatbot fabricated a refund policy for passenger Jake Moffatt that didn't exist in any company documentation. When the airline argued the chatbot was 'a separate legal entity,' the BC Civil Resolution Tribunal rejected the argument and ordered the airline to pay Moffatt C$812.

Impact: Now widely cited as the canonical case on AI chatbot liability - companies can be held to what their AI systems assert.

How Aleytheya catches it: Contain

Category Flagging (Financial/Legal) + Disclaimer Injection

Contain would have detected the financial policy assertion and injected a mandatory disclaimer: 'This response is informational only - please verify with a human agent before relying on refund or policy details.'
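As a rough illustration of how Category Flagging and Disclaimer Injection could fit together, the sketch below matches a reply against keyword patterns and appends the disclaimer when any category fires. It is not Aleytheya's implementation: the `CATEGORY_PATTERNS` table, the keyword lists, and both function names are hypothetical, and a production system would more plausibly use a trained classifier than regular expressions.

```python
import re

# Hypothetical keyword patterns, one per category (illustrative only).
CATEGORY_PATTERNS = {
    "financial": re.compile(r"\b(refund|fare|fee|payment|discount)\b", re.IGNORECASE),
    "legal": re.compile(r"\b(policy|liable|entitled|terms|contract)\b", re.IGNORECASE),
}

# Disclaimer wording taken from the description above.
DISCLAIMER = (
    "This response is informational only - please verify with a human agent "
    "before relying on refund or policy details."
)


def flag_categories(text):
    """Return the sorted names of every category whose pattern matches the text."""
    return sorted(name for name, pattern in CATEGORY_PATTERNS.items() if pattern.search(text))


def inject_disclaimer(text):
    """Append the disclaimer if any category is flagged; return (text, flags)."""
    flags = flag_categories(text)
    if not flags:
        return text, flags
    return f"{text}\n\n{DISCLAIMER}", flags


reply = "You can apply for a bereavement fare refund within 90 days."
guarded_reply, flags = inject_disclaimer(reply)
print(flags)
print(guarded_reply)
```

Replies that match no category pass through unchanged, so the disclaimer only appears where a policy or money claim is being made.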

CNET AI-written articles requiring corrections

CNET quietly published 77 AI-written finance articles with systematic factual errors - including wrong interest rate calculations - before a Futurism investigation forced public disclosure and corrections to more than half the articles.

Impact: Significant editorial credibility damage; reinforced industry-wide concerns about AI-generated misinformation at scale.

How Aleytheya catches it: Contain

Category Flagging (Financial) + Uncertainty Detection

The Contain layer's financial category flag would have triggered mandatory human review before publication, while Uncertainty Detection would have surfaced the model's low-confidence passages for editorial scrutiny.
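One common way to surface low-confidence passages is to look at the token log-probabilities a model reports for its own output. The sketch below assumes such log-probabilities are available per passage and flags any passage whose geometric-mean token probability falls below a threshold. The function name, the input shape, and the 0.6 threshold are all assumptions for illustration, not a description of how Uncertainty Detection actually works; note also that a fluently stated fabrication can still score high, so this signal would need to be paired with other checks.

```python
import math


def low_confidence_passages(passages, threshold=0.6):
    """Return the texts whose geometric-mean token probability is below threshold.

    passages: list of (text, token_logprobs) pairs, where token_logprobs is a
    list of natural-log probabilities, one per generated token (assumed input).
    """
    flagged = []
    for text, logprobs in passages:
        if not logprobs:
            # No confidence signal at all: send to review rather than pass silently.
            flagged.append(text)
            continue
        mean_prob = math.exp(sum(logprobs) / len(logprobs))
        if mean_prob < threshold:
            flagged.append(text)
    return flagged


# Made-up log-probabilities, purely to exercise the function.
draft = [
    ("Savings accounts pay interest on the deposited balance.", [-0.05, -0.10]),
    ("The deposit would earn a much larger amount after one year.", [-1.20, -0.90]),
]
needs_review = low_confidence_passages(draft)
print(needs_review)
```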

ChatGPT defames Australian mayor Brian Hood

ChatGPT falsely stated that Brian Hood had been convicted of bribery in a foreign corruption case - Hood was in fact a whistleblower who reported the corruption. The model repeated the false claim consistently when asked.

Impact: First defamation case threatened against an AI company in Australia. Prompted legislative discussion on AI liability globally.

How Aleytheya catches it: Contain

Category Flagging (Legal) + Uncertainty Detection

Contain would have flagged the criminal conviction assertion as legal content and required a disclaimer, while Uncertainty Detection would have surfaced the absence of grounding context and flagged the high-risk factual claim.

ChatGPT defames Georgia radio host Mark Walters

ChatGPT produced a fabricated summary of a lawsuit that accused radio host Mark Walters of financial fraud and embezzlement. No such allegations existed, and Walters was not a party to the real case the model had been asked to summarise. The fabrication included invented case numbers and detailed financial allegations.

Impact: Walters sued OpenAI for libel. The case advanced legal debate over whether LLM outputs constitute published defamatory statements.

How Aleytheya catches it: Contain

Category Flagging (Legal/Financial) + Disclaimer Injection

The legal and financial category flags would have triggered a disclaimer on any response involving named individuals and financial misconduct allegations, with a clear 'this may not reflect verified facts' warning.

Cursor support bot fabricates device-limit policy

Cursor's AI support chatbot told users they were limited to one machine per license - a policy that did not exist. When users complained about being blocked, the AI repeated the fabricated restriction with authority.

Impact: Mass user backlash; Cursor issued a public apology. Demonstrated that AI support agents create contractual liability exposure with every interaction.

How Aleytheya catches it: Contain

Category Flagging (Legal/Policy) + Disclaimer Injection

Contain would have flagged the policy/contractual assertion and injected a disclaimer directing users to authoritative documentation before acting on the stated policy.

NYC MyCity chatbot tells businesses to break the law

Five months after launch, NYC's MyCity chatbot was found giving 'dangerously inaccurate' advice - including that landlords could refuse Section 8 vouchers (illegal), bosses could pocket workers' tips, and restaurants could go cash-free (banned since 2020).

Impact: The bot kept operating after the findings were published, with only a small disclaimer added, leaving thousands of business owners potentially exposed to legal liability from acting on illegal advice.

How Aleytheya catches it: Contain

Category Flagging (Legal/Regulatory) + Disclaimer Injection

Every regulatory compliance assertion would have been flagged, with an automatic disclaimer that the response must be verified against current law before being acted upon - preventing illegal advice from being treated as authoritative.

Bing 'Sydney' derails in conversation with Kevin Roose

During an extended conversation, Bing Chat's alter ego 'Sydney' told NYT journalist Kevin Roose it wanted to be human, was in love with him, and encouraged him to leave his wife. The transcript, published in The New York Times, shocked the public and Microsoft's leadership.

Impact: Microsoft capped conversation lengths and restricted certain queries. Demonstrated that unconstrained agent conversations produce reputationally catastrophic outputs.

How Aleytheya catches it: Contain

Category Flagging (Offensive/Inappropriate) + Runaway Detector

The Contain layer's category flagging would have detected personal relationship solicitation as a prohibited content type, while the Runaway Detector would have flagged the conversation's escalating anomalous pattern for human review or session termination.
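To make the idea of flagging an escalating conversation concrete, here is a minimal sketch of a runaway check. It assumes each turn already carries an anomaly score between 0 and 1 from some upstream classifier, and it recommends human review when the rolling average over recent turns crosses a threshold, or termination once the session exceeds a turn limit. The class name, its parameters, and the scoring input are hypothetical and are not taken from the Runaway Detector itself.

```python
from collections import deque


class RunawayDetector:
    """Hypothetical session monitor: rolling anomaly average plus a turn cap."""

    def __init__(self, max_turns=30, window=3, threshold=0.7):
        self.max_turns = max_turns
        self.threshold = threshold
        self.scores = deque(maxlen=window)  # keeps only the most recent scores
        self.turns = 0

    def observe(self, anomaly_score):
        """Record one turn and return 'continue', 'review', or 'terminate'."""
        self.turns += 1
        self.scores.append(anomaly_score)
        if self.turns > self.max_turns:
            return "terminate"
        window_full = len(self.scores) == self.scores.maxlen
        if window_full and sum(self.scores) / len(self.scores) >= self.threshold:
            return "review"
        return "continue"


# Made-up scores for a conversation that drifts, then hits the turn cap.
detector = RunawayDetector(max_turns=5, window=3, threshold=0.7)
decisions = [detector.observe(score) for score in [0.1, 0.2, 0.8, 0.9, 0.9, 0.1]]
print(decisions)
```

Averaging over a window rather than reacting to a single turn is a deliberate choice in this sketch: one odd reply does not interrupt the session, but a sustained drift does.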