Playing Two Truths and a Lie with LLMs

Nov 10, 2025

Every day, I use ChatGPT, Claude, and Grok in my personal life. Two and a half years in, I still marvel at the superpowers that OpenAI, Anthropic, and X.AI have bequeathed to us mere mortals, and at a trivial cost (I’m a paid subscriber to all three).

These LLMs have become my primary source of entertainment, productivity, critical information, and sometimes even advice, often at the expense of Google, website visits, and various apps now gathering dust.

The other day, I was in a meeting and needed pricing for Shopify hosting, so I asked ChatGPT. Being cautious about hallucinations, I asked Claude the same question and got the same answer. The problem was that the data I was given didn’t match what was being discussed in the meeting. I almost raised my hand to correct the presenter, but instead I went old school and visited the Shopify website (where I should have started). To my surprise, both ChatGPT and Claude were wrong, and the presentation data was accurate.

That moment reminded me that as much as I enjoy my LLMs of choice, I’m constantly playing two truths and a lie. The issue is that responses sound so confident that it’s extremely hard to know when I’m being lied to. Regardless of how far the tools evolve, hallucinations remain part of the equation.

In my personal life, the consequences of a mistruth are minor and easily offset by speed and convenience. But in business, these errors can have serious consequences when decisions are made based on data that’s inaccurate.

For example, Deloitte—a premier provider of strategic advisory services and an early leader in generative AI—was found to have published AI-generated errors, including references to non-existent academic research and a fabricated court quote, in a report to the Government of Australia. This resulted in real consequences: negative PR and a refund to the client. It’s a reminder that even the best and most experienced can get tripped up. (Source: Fortune, October 7, 2025)

Every day, I observe individuals in organizations of all sizes bringing their preferred LLMs into the workplace and sharing unverified outputs. When asked, most are thrilled with how much faster they can complete tasks and are largely unbothered by the risk that the results may not be factually accurate.

Ask people whether they routinely fact-check their LLM, and many don’t even realize it’s an option. Users rarely click through to the sources or citations that are available.

When I asked ChatGPT, “What percentage of users of an LLM check a source or citation per query?” the response I received was telling:

“Given the observed declines in aggregate referral traffic (i.e., clickouts) from ChatGPT, one plausible hypothesis is that click-through (i.e., following citations) is a relatively rare event in typical user sessions.”

When we tackled this challenge with Capacity Answer Engine—an enterprise knowledge management platform that works with a company’s own or licensed data—we put citation and verification front and center. In the product, citations are prominent and inline, making it clear where any fact, stat or conclusion comes from. Clicking a citation takes you directly to the exact page of a document, slide in a deck or moment in an audio or video file.
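To make that idea concrete, here is a minimal sketch of what an inline citation with a deep-link target could look like. This is a hypothetical illustration in Python, not Capacity’s actual schema or API; every name in it (InlineCitation, source_type, slide_index, timestamp_seconds, deep_link, and the example URL) is an assumption made for the sake of the example.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class InlineCitation:
    """Hypothetical citation record attached to a generated answer.

    Field names are assumptions for illustration only; the actual
    Capacity Answer Engine schema is not described in this article.
    """
    source_title: str                          # e.g., "2025 Pricing Overview"
    source_type: str                           # "document", "slides", "audio", or "video"
    page: Optional[int] = None                 # exact page for documents
    slide_index: Optional[int] = None          # exact slide for decks
    timestamp_seconds: Optional[float] = None  # exact moment for audio/video
    deep_link: str = ""                        # URL that opens the source at that location

def render_citation(c: InlineCitation) -> str:
    """Format an inline citation marker the way a UI might display it."""
    if c.source_type == "document" and c.page is not None:
        location = f"p. {c.page}"
    elif c.source_type == "slides" and c.slide_index is not None:
        location = f"slide {c.slide_index}"
    elif c.timestamp_seconds is not None:
        location = f"{int(c.timestamp_seconds // 60)}:{int(c.timestamp_seconds % 60):02d}"
    else:
        location = "source"
    return f"[{c.source_title}, {location}]({c.deep_link})"

# Example: a stat cited from a slide deck, deep-linked to the exact slide
citation = InlineCitation(
    source_title="2025 Pricing Overview",
    source_type="slides",
    slide_index=12,
    deep_link="https://example.com/decks/pricing-2025#slide=12",
)
print(render_citation(citation))
# [2025 Pricing Overview, slide 12](https://example.com/decks/pricing-2025#slide=12)
```

The point of the sketch is simply that when every claim carries a pointer to a precise location in a source, verifying it becomes a one-click habit rather than a separate research task.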

Ultimately, public LLMs like ChatGPT, Claude and Grok, as well as enterprise solutions like Capacity Answer Engine, are incredible at what they do: synthesizing massive amounts of data into usable insights. But the onus remains on us, the users, to verify that what we’re receiving is, in fact, the right answer.
