
Deepfake labels and detectors still don't work

Catholic page uses Google to spread unlabeled AI slop on Meta; deepfake image detectors flop a basic test; and Kent Walker thinks crowd-checking on YouTube shows potential

Alexios Mantzarlis

Jan 22, 2025

HEADLINES

Donald Trump ordered his Attorney General to investigate federal interventions against misinformation in the past four years and recommend “remedial actions.” Douyin blocked more than 10,000 accounts pretending to be foreign “TikTok refugees.” Thailand’s Prime Minister received a call from a deepfaked foreign leader. GitHub is still hosting projects tied to deepfake porn. Misleading content was the second biggest source of user reports on Bluesky last year. Russian influence operation Doppelgänger ran ads worth $328,000 on Meta in the EU. Apple paused its AI news summarization feature that kept getting stuff terribly wrong.

TOP STORIES

This Catholic AI slop page loves Gemini

In October, I wrote about a network of Italian Facebook pages spreading Catholic clickbait and unrelated AI slop to tens of thousands of users.

I checked in on the operation and noticed that its largest page, La luce di Gesù e Maria, appears substantially reliant on Google’s Gemini to create its synthetic spam. Take this bucolic image of newlyweds eliciting seemingly sincere reactions:

Humans and animals in the picture are deepfaked using Gemini. To its credit, Google is one of the companies that has agreed to add a marker of fakeness to the outputs of its AI image-generation tools (more here — and see screenshot below).

Using this metadata, I found that La luce di Gesù e Maria used Gemini to generate 22 of the 33 images it published in January. Of the 108 images posted in the past month or so, 39 were created with Google’s AI. This apparent increase may be an artifact of whether the metadata survived depending on how the photos were uploaded, so it can’t definitively prove a growing reliance on Google.

Using Gemini to create nonsensical images doesn’t violate any product policy (whether it’s aligned with Google’s mission or the long-term health of its core product is another question).

Facebook’s failure to label images that carry Google’s metadata breaks a promise it made in February 2024 to build “industry-leading tools that can identify invisible markers at scale – specifically, the “AI generated” information in the C2PA and IPTC technical standards – so we can label images from Google, OpenAI, Microsoft, Adobe, Midjourney, and Shutterstock as they implement their plans for adding metadata to images created by their tools.”

I reached out to Meta for comment and was pointed back to the February 2024 blog post statement that “it’s not yet possible to identify all AI-generated content” — but wasn’t given a reason for the lack of labels on images with Google’s metadata.

AI detectors keep promising more than they can deliver

Worry not, though, because the industry is flush with deepfake detectors that will make spotting AI slop at scale extremely easy. Right? Wrong.

Take AIorNot, which raised $5M last week and claims to have 200,000 clients for its AI image detector. Or if you prefer, use Winston AI, which claims a “99.98% accuracy rate” (h/t Libération journalist Enzo Quenescourt) and has been featured in The New York Times, Wired, The Guardian and more.

When I tested six AI-generated images of varying provenance and formats on these two tools, they both failed at least half of the time. Both successfully spotted that the corn stomach and the Italian soldier below were AI-generated. But Winston AI failed to spot an image I created of the United Nations building in ruins, a picture of three tech CEOs with their tongues out that spread on social media, and a composite photo of a couple purportedly posing on the same motorcycle over the years. It even failed on a pretty classic case of a GAN-generated profile picture (AIorNot got it right).
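For readers who want to run a similar spot check, here is a minimal harness (my own sketch; detector stands in for whatever classifier you are probing, and the file names are hypothetical):

```python
# Tiny harness for spot-checking an AI-image detector against a handful
# of labeled samples, mirroring the six-image test described above.
def spot_check(detector, samples):
    """detector: callable(path) -> True if the image is judged AI-generated.
    samples: list of (path, is_ai) ground-truth pairs.
    Returns (accuracy, list of misclassified paths)."""
    misses = [path for path, is_ai in samples if detector(path) != is_ai]
    return 1 - len(misses) / len(samples), misses
```

Even a handful of labeled images is enough to sanity-check a “99.98% accuracy” claim before trusting it.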

To recap: platforms aren’t methodically labeling AI-generated images for their users. And the detectors promising to do the job fail when probed even a handful of times.

Walker backs Community Notes

In a letter to the European Commission seen by Axios, Google SVP Kent Walker said the company will “pull out of all fact-checking commitments” in the Code of Practice on Disinformation before it becomes a more binding document under the Digital Services Act. The relevant section in the Code is from page 37 onwards here, and here’s a TL;DR from the Commission back in 2021 about what was at stake:

❝

[S]ignatories should commit to extend the cooperation with fact-checkers. Increasing the impact of fact-checking can be also achieved through a better incorporation and visibility of content produced by fact-checkers. Signatories should look into efficient labelling systems as well as the creation of a common repository of fact-checks, which would facilitate its efficient use across platforms to prevent the resurgence of disinformation that has been debunked by fact-checkers. Cooperation with fact-checkers should ensure their independence, fair remuneration, foster cooperation and facilitate the flow across services.

Google hasn’t really done anything new with fact checks since the pandemic, so this isn’t a true change of plans as much as a restated commitment to avoid integrating external fact checks in product labeling and ranking decisions.

Walker, who was my boss’ boss’ boss’ boss when I was at Google, also told the EU that he thinks YouTube’s Community Notes-like feature had “significant potential.”

And yet the program, which I wrote about at launch, has so far failed to visibly take off. I have found no updates from YouTube since the initial announcement, nor have I encountered any notes in the wild. The feature still appears to be in a highly restricted pilot phase in which most users do not get access to the “add note” button. If you have signed up for it or spot any notes in the wild, do write to me at [email protected].

Misinformation leads to policy rollback in Brazil

Misinformation about Pix, a payments platform run by the Brazilian Central Bank, led the Lula government to change course on a proposal to increase oversight of payments totaling more than R$5,000 per month.

According to Lupa, falsehoods about the program go back to its very inception, but the new bill led to an explosion of messages claiming the government was planning to tax, rather than track, Pix transactions. The fact-checkers estimate that misinformation about the program reached as many as 9 million Brazilians on WhatsApp and Telegram.

Quantifying MrDeepFakes

This useful preprint gives a sense of what goes on in the largest platform for deepfake porn, MrDeepFakes. The data collection took place in November 2023, so take it as a snapshot of that period rather than a live view.

At that time, the site hosted 43,000 videos that had been viewed a collective 1.5 billion times.

90% of these videos mentioned the individual depicted. The researchers were thus able to identify 3,803 unique individuals, noting that “the ten most targeted celebrities are depicted in 400–900 videos each, but over 40% of targets are depicted in only one video.”

Of the celebrities who account for 95% of the videos, one third were American and one tenth were Korean. The primary occupations were actors, musicians and models.

Fully 271 of the top 1,942 named individuals in this subsample were not celebrities, contravening MrDeepFakes’ modicum of a policy restricting deepfake video generation to famous individuals. As the researchers write:

❝

For 29 targets, we found no online presence, and for another 242 targets, we could not find profiles on any of the listed social media platforms that satisfied the minimum following criteria. In one example, we find that 38 Guatemalan newscasters with little to no social media following appear in over 300 videos. All of these videos were posted by two users, who both describe their focus on Latin American individuals in their profiles.

The researchers also looked at the paid requests for deepfake videos posted in community forum pages. They flag that most requests likely occur in private exchanges, but from the 58 that were public they surmised an average advertised price of $87.50.

Finally (and this is where the timing of the data collection matters most), the researchers find anecdotal evidence that the falling barrier to powerful AI tools is reshaping the marketplace for nonconsensual deepfake porn: it is becoming easier for people to produce videos themselves rather than seek out an “expert.”

NOTED

  1. The many errors and inaccuracies in Il Post’s article on fact-checking (Facta)

  2. Wild Claims About L.A. Wildfires Get Millions of Views (NewsGuard)

  3. Alice Weidel with Elon Musk: these are the claims we checked (Correctiv)

  4. Judge rebukes Minnesota over AI errors in 'deepfakes' lawsuit (Reuters)

  5. ‘I doorknocked for Labour then racist deepfake ruined my life’ (The Times)

  6. Sock puppet zoo – an attack on Wikipedia (ARD Audiothek)

  7. More than half of the misinformation most relevant to Latino Communities during the US elections did not prompt any action from the main digital platforms (Maldita)

  8. Algorithmic Behaviors Across Regions: A Geolocation Audit of YouTube Search for COVID-19 Misinformation between the United States and South Africa (arXiv)

  9. Singapore actor Laurence Pang loses S$35,000 to online love scam in the Philippines (CNA)

  10. That Sports News Story You Clicked on Could Be AI Slop (Wired)


Indicator is your essential guide to understanding and investigating digital deception.
