Google AI Overviews: 90% accurate, yet millions of errors remain: Analysis

Google’s AI Overviews answered questions from a standard factual benchmark correctly 91% of the time in February, up from 85% in October, according to a New York Times analysis conducted with AI startup Oumi.

However, Google handles more than 5 trillion searches per year, so even a 9% error rate could mean tens of millions of wrong answers every hour.
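As a rough check on that estimate, the arithmetic can be sketched as follows. This is a back-of-envelope calculation, not the Times' method: it assumes a 365-day year and, as an upper bound, that every search shows an AI Overview, which is not the case in practice. The variable names are illustrative.

```python
# Back-of-envelope estimate of wrong AI Overview answers per hour.
# Assumptions (not from the Times analysis): a 365-day year, and that
# every search triggers an AI Overview, which makes this an upper bound.

SEARCHES_PER_YEAR = 5_000_000_000_000  # "more than 5 trillion" per the article
ACCURACY = 0.91                        # February benchmark result
HOURS_PER_YEAR = 365 * 24

searches_per_hour = SEARCHES_PER_YEAR / HOURS_PER_YEAR
wrong_per_hour = searches_per_hour * (1 - ACCURACY)

print(f"Searches per hour: {searches_per_hour:,.0f}")
print(f"Potentially wrong answers per hour: {wrong_per_hour:,.0f}")
```

Under those assumptions the upper bound works out to roughly 51 million wrong answers per hour, which is consistent with "tens of millions." The real figure would be lower to the extent that many searches do not display an AI Overview.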

Why we care. We’ve watched Google shift from linking to sources to summarizing them for more than two years. This report suggests AI Overviews are improving, but still mix correct answers, weak sourcing, and clear errors in ways that can mislead searchers and reshape which publishers get visibility and clicks.

The details. Oumi tested 4,326 Google searches using SimpleQA, a widely used benchmark for measuring factual accuracy in AI systems, the Times reported. It found AI Overviews were accurate 85% of the time with Gemini 2 and 91% after an upgrade to Gemini 3.

The bigger problem may be sourcing. Oumi found that more than half of the correct February responses were “ungrounded,” meaning the linked sources didn’t fully support the answer. That makes verification harder. The answer may be right, but the cited pages may not clearly show why.

What changed. Accuracy improved between October and February, but grounding worsened. In October, 37% of correct answers were ungrounded; in February, that rose to 56%.
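One way to read those figures together is to multiply accuracy by the grounded share of correct answers, which gives the fraction of all answers that were both correct and fully supported by their linked sources. The sketch below does that arithmetic. It is a derived figure, not one reported by Oumi or the Times, and the function name is hypothetical.

```python
def correct_and_grounded(accuracy: float, ungrounded_share: float) -> float:
    """Fraction of all answers that are correct AND fully supported by sources.

    accuracy: share of all answers that were correct.
    ungrounded_share: share of *correct* answers whose linked sources
    did not fully support them.
    """
    return accuracy * (1 - ungrounded_share)


# Figures as reported in the article for each test period.
october = correct_and_grounded(accuracy=0.85, ungrounded_share=0.37)
february = correct_and_grounded(accuracy=0.91, ungrounded_share=0.56)

print(f"October:  {october:.3f}")
print(f"February: {february:.3f}")
```

Read this way, the share of answers that were both correct and fully grounded fell from roughly 54% in October to about 40% in February, even as headline accuracy rose.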

Examples. The Times highlighted several misses:

For a query about when Bob Marley’s home became a museum, Google answered 1987. The correct year was 1986, according to the Times, and the cited sources either didn’t support the answer or gave conflicting information.
For a query about Yo-Yo Ma and the Classical Music Hall of Fame, Google linked to the organization’s site but still said there was no record of his induction.
In another case, Google gave the correct age at Dick Drago’s death but misstated his date of death.

Google’s response. Google disputed the Times analysis, saying the study used a flawed benchmark and didn’t reflect what people actually search. Google spokesperson Ned Adriance told the Times the study had “serious holes.”

Google also said AI Overviews use search ranking and safety systems to reduce spam, and that it has long warned that AI responses can contain mistakes.

The report. How Accurate Are Google’s A.I. Overviews? (subscription required)
