Two weeks ago, I was stumped by several TikTok accounts that used AI avatars to spread misinformation.
The videos appeared to use deepfakes of real people, but I didn’t recognize the faces. I tried using Google Lens to reverse search stills from the videos, but the results were mixed. The tool – which is accessible by right-clicking on an image in Chrome – often refused my request with the message, “Results for people are limited.” While I recognized some of the deepfaked speakers myself, the lack of matches meant I had to leave several others out of my article.

I tried the same searches with AI Mode, a feature in Google Search. It was more willing to give results – but also more likely to make mistakes. For example, it misidentified former Univision anchor Jorge Ramos as CNN’s Anderson Cooper, even though the two are hardly similar.

Multimodality is one of the most promising aspects of generative AI: you should be able to combine text and images in a single prompt and receive a response that engages with both, courtesy of computer vision.
This should in principle make AI search tools a great ally for fact-checkers and other investigators. Images and videos shared out of context are among the most common forms of misinformation, and deepfaked videos and AI-generated images of people are also surging.
And yet AI tools can badly bungle image searches. Full Fact found several instances of Google’s AI tools repeating false claims about miscaptioned footage that the British fact-checking organization had already debunked. A Tow Center study out this week found that seven mainstream AI chatbots failed to consistently identify the location and source of ten different images, which aligns with recent geolocation experiments from Bellingcat.
When it comes to identifying people, the most accurate results typically come from a dedicated facial recognition tool. But as Craig recently wrote for Indicator, these services often have murky ownership and privacy practices, which makes me hesitant to use them.
I wanted to see if LLMs might offer a reasonable alternative for my use case of recognizing individuals featured in videos. I ran more than 100 tests across three of Google’s AI tools (Lens, AI Mode, and Gemini), as well as with ChatGPT, Claude, Mistral, and Perplexity. I also conducted some of the same searches on TinEye, a traditional reverse image search tool, to compare results.
Here’s a breakdown of what I found.