AI Hallucinations and Reliability cartoon - Marketoonist

The NYT published a fascinating article last month on the conundrum of AI accuracy and reliability. They found that even as AI models were getting more powerful, they generated more errors, not fewer.

In OpenAI’s own tests, their newest models hallucinated at higher rates than their previous models. One of their benchmarks, SimpleQA, is based on general questions. On that test, OpenAI found their most powerful o3 model hallucinated 51% of the time, up from 44% in their earlier o1 model.

In their PersonQA test, based on questions about public figures, the o3 model hallucinated 33% of the time, double the rate of their earlier model.

Some of this growing problem relates to the nature of reasoning systems, as AI works through more complex problems in multiple steps, compounding the errors of each step.
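
To make the compounding idea concrete, here is a rough back-of-the-envelope sketch. It assumes every step fails independently at the same fixed rate, which is a simplification of mine rather than anything from the NYT article or OpenAI's tests, and the 5% figure and the function name `chance_of_clean_run` are made up for illustration.

```python
def chance_of_clean_run(per_step_error: float, steps: int) -> float:
    """Chance that a multi-step chain finishes with no errors at all.

    Assumption (hypothetical, for illustration): each step goes wrong
    independently, with the same probability `per_step_error`.
    """
    if not 0.0 <= per_step_error <= 1.0:
        raise ValueError("per_step_error must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    # Every step has to be right, so the per-step success rates multiply.
    return (1.0 - per_step_error) ** steps


if __name__ == "__main__":
    # Illustrative 5% error rate per step; not a measured number.
    for steps in (1, 5, 10, 20):
        clean = chance_of_clean_run(0.05, steps)
        print(f"{steps:>2} steps: {clean:.0%} chance of an error-free answer")
```

Under those assumptions, a step that is right 95% of the time leaves only about a 60% chance of a clean answer after ten steps. Real reasoning systems are messier than this, but it shows why longer chains can mean more errors, not fewer.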

Amr Awadallah, former Google exec and CEO of Vectara, claims that hallucinations are just part of the nature of AI models. As he put it:

“Despite our best efforts, they will always hallucinate. That will never go away.”

Last month, I wrote about the “Garbage In, Garbage Out” challenge of AI systems. I quoted Greg Kihlström, who termed the outputs “confident nonsense.”

With AI adoption full steam ahead, this raises the urgency for business leaders to figure out how to work around “confident nonsense.” Yet 64% of marketing teams are adopting AI without an AI roadmap or strategy, according to the AI Marketing Institute.

Some are trying to solve the hallucination problem by adding multiple AI systems to fact-check each other. Yet with each AI model bringing its own baggage, I’ve heard this described as a “turtles all the way down” problem, which inspired this week’s cartoon.

I like how Pratik Verma, CEO of Okahu, framed the challenge:

“You spend a lot of time trying to figure out which responses are factual and which aren’t. Not dealing with these errors properly basically eliminates the value of AI systems, which are supposed to automate tasks for you.”

Here are a few related cartoons I’ve drawn over the years: