This is a regularly updated collection of academic studies and industry reports about digital deception. It currently includes short descriptions of 49 academic studies and systematic reports.
This library is organized in five clusters:
Prevalence and characteristics of misinformation
Effects of fact-checking interventions
The role of familiarity in correcting inaccurate information
Prevalence, effects, formats, and labeling of AI-generated deceptive content
Synthetic non-consensual intimate imagery
A final "Other" section collects studies that don't fit neatly into any of the clusters above.
Email me studies you think I should add at [email protected]!
Prevalence and characteristics of misinformation
📇 arXiv | Mar 2025 | Erik J. Schlicht
This analysis of PolitiFact's fact checks over time, posted on arXiv, caught my eye. The author collected the ratings assigned by the Pulitzer Prize-winning fact-checker and found a significant increase in the rate of false ratings assigned starting in 2020. (Like other fact-checkers, PolitiFact also assigns "True" ratings when they are warranted.) The rise in misinformation labels appears to have started in 2016-2017, perhaps encouraged by PolitiFact's partnership with Meta, which financially incentivized targeting fake claims over true ones. But the real surge coincided with the COVID-19 pandemic.
The analysis also found that the average sentiment of the claims that PolitiFact covers has become markedly more emotional and more negative since 2016.
📇 Sociological Science | Dec 2024 | Sandra González-Bailón, David Lazer, Pablo Barberá, et al.
Made possible by access to Meta data negotiated by Talia Stroud and Josh Tucker, this study tries to characterize the spread of misinformation flagged by fact-checkers on Facebook and Instagram during the 2020 US elections. It concludes that while information as a whole primarily spread in a broadcast manner through Pages, misinformation flipped the script “and relie[d] much more on viral spread, powered by a tiny minority of users who tend to be older and more conservative.”
The study also found a steep decrease in “misinformation trees” on election day (see chart on the right below). Counts then climbed back up shortly after the election and kept rising until January 6. The researchers suggest, but cannot definitively conclude, that the dip is due to Meta’s “break glass” measures introduced to reduce the viral reach of content on its platforms.
📇 Science | Nov 2024 | Killian L. McLoughlin, William J. Brady, Aden Goolsbee, Ben Kaiser, Kate Klonick, and M. J. Crockett
I know this is the definition of confirmation bias, but it remains nice to see a study that makes intuitive sense. Researchers at Northwestern, Princeton and St John’s universities conclude that, by and large, online misinformation exploits outrage to reach its audiences. They ran eight studies across Facebook and Twitter data and found that “misinformation sources evoke more outrage than do trustworthy news sources” and that “outrage facilitates the spread of misinformation at least as strongly as trustworthy news.”
📝 Reuters Institute for the Study of Journalism | November 2024 | Waqas Ejaz, Richard Fletcher, Rasmus Kleis Nielsen, and Shannon McGregor
The Oxford-based journalism institute asked ~2,000-strong representative samples of the populations of eight countries around the world (Argentina, Brazil, Germany, Japan, South Korea, Spain, the UK, and the USA) a range of questions about the role of digital platforms in contemporary media environments.
69% of respondents thought platforms have made spreading misinformation easier, with only 11% believing the contrary. And across every single country surveyed, wide majorities believe that platforms should be held responsible for helping misinformation reach users.
Of course, misinformation is in the eye of the beholder. But remember this next time a handful of American commentators and elected officials suggest content moderation is unwanted censorship.
📇 Journal of the Royal Society Interface | Nov 2024 | Kailun Zhu, Songtao Peng, Jiaqi Nie, Zhongyuan Ruan, Shanqing Yu, and Qi Xuan
A group of researchers at the Zhejiang University of Technology claim that Reddit threads about false claims tend to have more back-and-forth and be more negative in tone than those about true claims. (They based their analysis on a previously published dataset of Reddit posts tied to fact checks by PolitiFact, Snopes and Emergent.info.)
📇 Science Advances | Oct 2024 | Kevin T. Greene, Nilima Pisharody, Lucas Augusto Meyer, Mayana Pereira, Rahul Dodhia, Juan Lavista Ferres, and Jacob N. Shapiro
Researchers at Princeton and Microsoft studied two large samples of Bing results to try and understand the manner and extent to which the Microsoft search engine returned unreliable news sites.
(The study was published in Science Advances, a highly reputable peer-reviewed journal, but it is worth noting the conflict of interest of the Microsoft co-authors, who assert the company did not have pre-publication approval.)
Across the two samples, the researchers collected a total of almost 14 billion search result pages (SERPs) that included at least one of the 8,000 domains whose reliability has been rated by NewsGuard. The researchers argue their largest sample, dating back to June-August 2022, provides “a representative sampling of heavily searched queries.”
Overall, the study finds that unreliable sites were returned in about 1% of the SERPs, far less frequently than reliable sites (27% to 41% depending on the sample).
More important still, the likelihood of being exposed to an unreliable site was far higher (20x in sample 1, more in sample 2) for navigational queries, i.e. those that included the website’s name. This is a significant distinction to make because it helps tease out the role a search engine has in discovery versus retrieval of low quality information. Think of it as the difference between getting to infowars dot com from the query [infowars] versus the query [sandy hook].
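To make the navigational/non-navigational distinction concrete, here is a minimal sketch (my own illustration, not the paper's classifier) of how one might label a query as navigational toward a flagged domain:

```python
from urllib.parse import urlparse

def is_navigational(query: str, result_url: str) -> bool:
    """Rough heuristic: treat a query as navigational toward a result if it
    contains the site's brand token (e.g. [infowars] -> infowars.com).
    Illustrative stand-in, not the study's actual classification."""
    domain = urlparse(result_url).netloc.lower()
    brand = domain.removeprefix("www.").split(".")[0]
    return brand in query.lower().replace(" ", "")

print(is_navigational("infowars", "https://www.infowars.com/some-post"))    # True
print(is_navigational("sandy hook", "https://www.infowars.com/some-post"))  # False
```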
📇 Nature | October 2024 | Mohsen Mosleh, Qi Yang, Tauhid Zaman, Gordon Pennycook & David G. Rand
This study argues that politically asymmetrical suspensions of social media users may be explainable by an asymmetrical sharing of misinformation by those accounts, rather than by platform bias.
The researchers found that Twitter “accounts that had shared #Trump2020 during the election were 4.4 times more likely to have been subsequently suspended than those that shared #VoteBidenHarris2020.”
This could have been for a range of reasons, including bot activity or incitement to violence. Still, the pro-Trump accounts were also far more likely to share links to low-quality news sites that may have been flagged for misinformation. Crucially, this discrepancy held even when the news sites were rated by a balanced sample of laypeople rather than by referring to existing lists compiled by fact-checkers and other media monitors.
The researchers also found that this disparity largely held on Facebook, in survey experiments, and across 16 different countries.
📇 Public Opinion Quarterly | July-August 2024 | The Electoral Misinformation Nexus: How News Consumption, Platform Use, and Trust in News Influence Belief in Electoral Misinformation and A Matter of Misunderstanding? Explaining (Mis)Perceptions of Electoral Integrity across 25 Different Nations | Camila Mont’Alverne et al. and Rens Vliegenthart et al.
In this special issue of Public Opinion Quarterly on election misinformation, I was particularly interested in two papers analyzing how consumption of and trust in news media affect belief in misinformation, which the good folks at RQ1 helpfully summarized.
📇 Science Advances | May 2024 | Jennifer Allen, Duncan J. Watts, and David G. Rand
Researchers at MIT and Penn first assessed the impact of COVID-19 vaccine-related headlines on Americans’ propensity to take the shot. Then, they built a dataset of 13,206 vaccine-related public Facebook URLs that were shared more than 100 times between January and March 2021. Finally, they used crowd workers and a machine-learning model to attempt to predict the impact of the 13K URLs on vaccination intent.
That’s a lot to digest, but the graph below does a great job of delivering most of the results. On the left side you can see that the median URL flagged as false by Facebook’s fact-checking partners was predicted to decrease the intention to vaccinate by 1.4 percentage points. That’s significantly worse than the 0.3-point decrease from the median unflagged URL.
But there’s a catch. Unflagged articles with headlines suggesting vaccines were harmful had a similarly negative impact on predicted willingness to jab — and were seen a lot more. Whereas flagged misinformation received 8.7 million views, the overall sample of 13K vaccine-related URLs got 2.7 billion views.
There are two takeaways for me here:
For one, it looks like (flagged) misinformation was a relatively small part of COVID-19 vaccine content in the US. Whether this should be interpreted as validation for Facebook’s fact-checking program or an indication that a big chunk of misinformation evaded fact-checker scrutiny would make for a valuable follow-up study.
The second message is that headlines matter. Because vaccine-skeptical headlines reached so many more people than flagged misinfo, they are more likely to have depressed vaccination rates. Here’s a notable bit from the study:
a single vaccine-skeptical article published by the Chicago Tribune titled “A healthy doctor died two weeks after getting a COVID vaccine; CDC is investigating why” was seen by >50 million people on Facebook (>20% of Facebook’s US user base) and received more than six times the number of views than all flagged misinformation combined.
I remember this article. Even at the time, there were questions about its framing of an individual case in a way that alluded to causality. A coroner’s investigation was unable to confirm or deny a connection to the vaccine. It now seems likely that the article had a non-trivial effect on the propensity of US Facebook users to vaccinate.
📇 PLOS One | May 2024 | Matthew R. DeVerna, Rachith Aiyappa, Diogo Pacheco, John Bryden, and Filippo Menczer
The OSoMe crew at Indiana University is behind this paper seeking to define and identify misinformation superspreaders on Twitter. The researchers first isolated almost half a million accounts that shared content from sources on the Iffy+ list. Then, they identified the most influential based on their number of retweets and a repurposed h-index, finding that these are far better predictors of influence than an account’s bot score. They conclude that “just 10 superspreaders (0.003% of accounts) were responsible for originating over 34% of the low-credibility content” between March and October 2020.
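For readers unfamiliar with the metric, here is a minimal sketch of an h-index computed over retweet counts, which is roughly how I read the repurposing (the paper's exact definition may differ):

```python
def h_index(retweet_counts: list[int]) -> int:
    """Largest h such that the account has at least h posts
    with at least h retweets each (analogous to the citation h-index)."""
    counts = sorted(retweet_counts, reverse=True)
    h = 0
    for i, c in enumerate(counts, start=1):
        if c >= i:
            h = i
        else:
            break
    return h

# toy example: an account whose posts got these retweet counts
print(h_index([120, 40, 9, 5, 5, 3, 1, 0]))  # 5
```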
📇 arXiv | May 2024 | Nicholas Dufour, Arkanath Pathak, Pouya Samangouei, Nikki Hariri, Shashi Deshetti, Andrew Dudfield, Christopher Guess, Pablo Hernández Escayola, Bobby Tran, Mevan Babakar, Christoph Bregler
Several good humans I used to work with released this preprint taxonomizing media-based misinformation. The primarily Google-based authors trained 83 raters to annotate 135,862 English language fact checks carrying ClaimReview markup. (They are releasing their database under the suitably laborious backronym of AMMeBa.)
The study finds that almost 80% of fact-checked claims are now in some way related to a media item, typically video. This high proportion can’t be ascribed only to Facebook’s money drawing the fact-checking industry away from textual claims given that the trend precedes the program’s launch in 2017.
Unsurprisingly, AI disinformation shot up since the advent of ChatGPT and its ilk.
📇 Nature Human Behaviour | March 2020 | Andrew M. Guess, Brendan Nyhan & Jason Reifler
This study tracked the online behavior of a roughly representative sample of 2,525 Americans from 7 October to 14 November 2016. It found that 44% of them visited an “untrustworthy news website” at least once even as these sites represented only 6% of the overall news diet in the sample. The consumption of lower-quality information was driven by the most conservative 20% of the population, and by use of Facebook. Finally, the researchers found that fewer than one in two Americans who were exposed to an untrustworthy website also visited a fact-checking website in the same period (see right-hand column on the chart below).
Effects of fact-checking interventions
📇 PsyArXiv | April 2025 | Thomas Renault, Mohsen Mosleh, and David Rand
Someone better not show Mark Zuckerberg or Elon Musk this working paper by Thomas Renault, Mohsen Mosleh, and David Rand. The billionaire owners of Meta and X based their support for community-driven correction labels on the alleged neutrality of the crowd compared to that of pesky professional fact-checkers.
Set aside that this support extends only as long as they agree with the findings of the community. And set aside, too, that a greater focus on right-leaning falsehoods may simply reflect their greater prevalence online.
It now turns out that Community Notes also disproportionately target Republicans.
Renault et al. extracted the 281,382 English-language notes written between January 2023 and June 2024. They then classified the users that these notes were correcting as Democratic or Republican based on the accounts they follow, resolving any uncertainty by getting an LLM to rate 500 of their tweets. (It's possible that this ended up including some users outside the U.S. who follow American politicians.)
The results are lopsided: 60% of proposed notes are on 'Republican' tweets; 40% on 'Democratic' ones. The difference gets starker when looking at notes that are rated helpful, 70% of which appear on Republican posts. This matters because only helpful notes are appended to the offending tweet and shown to all X users.
So not only do Republican tweets get targeted more by notes; those notes are also far more likely to be considered helpful (10.4% vs 6.8% for those proposed on Democratic tweets). Overall, the researchers find that a note on a Republican user's tweet is 64% more likely to be deemed helpful, holding constant a user's verified status, follower count, tweeting volume and the topic of the tweet.
Given Community Notes' bridging algorithm, this research is strongly suggestive that right-leaning American social media users post more content that third parties find misleading. Other recent research has found that extremely conservative users are more susceptible to misinformation and less likely to recognize it.
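As an aside on the methods: to illustrate what "holding constant" verified status, follower count, tweeting volume and topic means in practice, here is a minimal sketch of that kind of logistic regression on simulated placeholder data (my reconstruction, not the authors' exact specification; the column names are hypothetical):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# simulated placeholder data: one row per proposed note
df = pd.DataFrame({
    "helpful":       np.random.binomial(1, 0.08, 5000),  # note rated helpful?
    "is_republican": np.random.binomial(1, 0.6, 5000),   # noted tweet from a Republican-classified user?
    "verified":      np.random.binomial(1, 0.3, 5000),
    "log_followers": np.random.normal(8, 2, 5000),
    "log_tweets":    np.random.normal(5, 1.5, 5000),
})

model = smf.logit(
    "helpful ~ is_republican + verified + log_followers + log_tweets", data=df
).fit()

# exponentiated coefficient: how much likelier (in odds terms) a note on a
# Republican-classified tweet is to be rated helpful, net of the controls
print(np.exp(model.params["is_republican"]))
```

On the real data, an exponentiated coefficient of roughly 1.64 on the party variable would correspond to the 64% figure reported, assuming it is expressed in odds terms.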
There were a couple of other valuable tidbits in this preprint. For one, a plurality of Community Notes focus on politics (35.1%), significantly ahead of science and health (12.7% and 7.8% respectively) and economics (5.3%).
Additionally, at least until June 2024, there were more Democratic posts on X than Republican ones, even though the share of the latter greatly increased following Musk's takeover of the platform. This evolving user base and the updates to the Community Notes bridging algorithm mean a follow-up analysis will be invaluable (Renault told me he and his co-authors are working on it).
📇 arXiv | March 2025 | Kirill Solovev, Nicolas Pröllochs
This preprint found that Community Notes linking to fact-checkers are rated as more helpful than notes linking to any other type of source studied.
📇 HKS Misinformation Review | February 2025 | Claire Betzer et al.
A very helpful finding from a big group of scholars for folks who work to fight misinformation on platforms:
state-affiliated media tags on Twitter were less effective at reducing the perceived accuracy of false claims from state media outlets than previous research suggests. On the other hand, our results reinforce the finding that fact-checks are effective at combating misinformation.
The study imitated Twitter's labeling format faithfully, adding a small "state-affiliated media" tag under the account handle and/or a "false information: Checked by independent fact-checkers" warning below the post content.
As you can see below, the state media label had essentially no effect on the perceived accuracy of the underlying tweet, while the fact check label reduced it by ~10%. This is not terribly surprising in principle: state-affiliation might even provide a sheen of respectability and resources amid a miasma of creators you've never heard of before. But X and other platforms explicitly used these labels as a halfway solution to avoid fact-checking individual posts while warning users that the content may not be all that trustworthy. In that sense, at least, this study suggests the solution doesn't work.
📇 PsyArXiv | February 2025 | Thomas H. Costello, Gordon Pennycook, and David Rand
You may remember the paper that found that a three-round conversation with an AI chatbot reduced conspiracy beliefs among US study participants. The paper's authors are out with a preprint exploring why the intervention might have been so effective, which the lead author summarized in one word: "facts".
In this study, the researchers tried a range of communication styles, including framing the conversations as a debate or an attempt to change the conspiracy believer's mind. While there was some variation in the efficacy of each intervention, only one of them was significantly worse – the conversations that didn't provide any evidence to contradict the conspiracy belief (see chart below). I'm not surprised the no-evidence interventions didn't work, given the examples are barely even rebuttals. But the point here is that nothing other than omitting the facts appears to prevent the fact-checking exchanges from having a noticeable impact, at least in this experimental setup.
📇 arXiv | October 2024 | Mitchell Linegar, Betsy Sinclair, Sander van der Linden, and R. Michael Alvarez
In a preprint, Linegar et al. test the efficacy of an AI-generated “prebunking” article on five common US election falsehoods. On average, prebunks reduced belief in these myths by 0.5 points on a 10-point scale, while also improving confidence in election integrity. Overall, Democrats and Republicans did have differing belief levels in the misinformation — but preemptive corrections worked regardless of party affiliation.
📇 Scientific Reports | September 2024 | Hendrik Bruns, François J. Dessart, Michał Krawczyk, Stephan Lewandowsky, Myrto Pantazi, Gordon Pennycook, Philipp Schmid & Laura Smillie
This paper looked at corrections in the context of misinformation about COVID-19 or climate change in Germany, Greece, Ireland, and Poland. The paper tested two different variables: the timing of the correction (before or after the user was served an article making false claims about climate change) and the correction’s source (either absent, or attributed to the European Commission).
The authors conclude that for most conditions tested and in most locales, corrections worked (see aggregate results below). Overall, debunking had a slightly bigger effect than prebunking. The effect was pretty large, too, reducing strong agreement with the main false claims by almost half. Greater trust in the EU was also associated with an increased acceptance of the correction attributed to the Commission, though the authors caution that study design may have affected this finding.
📇 Nature Human Behaviour | September 2024 | Cameron Martel and David G. Rand
Important takeaway in this study in Nature Human Behaviour for those designing anti-misinformation interventions in online spaces.
Fact check warning labels reduce Americans’ propensity to believe or share false news despite differential trust in fact-checkers across political persuasions. As you can see in this chart, trust in fact-checkers is higher among Democratic voters:
But even when trust in fact-checkers is lowest (left on the x-axis below), fact checks still reduced the perceived accuracy of the labeled false news.
The chart looks the same for sharing intentions. Even though higher trust does lead on average to a higher reduction in sharing labeled false news, the effect is consistent across the board.
📇 Science | September 2024 | Thomas H. Costello, Gordon Pennycook, and David G. Rand
A three-round conversation with ChatGPT reduced belief in a conspiracy theory of the participant’s choice by an average of 20%. That’s…pretty good! The effect was observed across a wide range of conspiracy theories and persisted even two months after the intervention.
You can get a good sense of the study design in the graphic below. I am inclined to agree with the authors’ assessment that the reason the intervention was so successful is that the LLM could tailor its responses to the unique reasons each participant had to believe in the conspiracy theory. You can also play around with their AI fact-checker at debunkbot.com.
📇 arXiv | September 2024 | Yuwei Chuai, Moritz Pilarski, Thomas Renault, David Restrepo-Amariles, Aurore Troussel-Clément, Gabriele Lenzini, and Nicolas Pröllochs
This preprint by researchers in Luxembourg, France and Germany claims Community Notes on X reduced the spread of tweets they were attached to by up to 62 percent and doubled their chance of being deleted. The study also found that the labels typically came too late to affect the overall virality of the post. (This is a bit of a chicken-and-egg problem, since a viral fake is more likely to be seen by people who can debunk it.)
📇 Misinformation Review | September 2024 | John C. Blanchar, Catherine J. Norris
This peer-reviewed paper found that the “disputed” labels that (then) Twitter was appending to false claims of election fraud increased belief in the false claim by Trump supporters. It’s worth noting that this was a survey, rather than an analysis of platform data, and that no information beyond the label was provided.
📇 Journal of Online Trust & Safety | September 2023 | Andy Zhao, Mor Naaman
This is a great deep dive into the efficacy of the Taiwanese crowdsourced fact-checking website CoFacts. The study concludes that by and large … it works? While CoFacts covers slightly different topics from the professionally staffed websites MyGoPen and Taiwan FactCheck Center, it does so at much greater scale and speed. Moreover, disagreements on the rating of identical claims were rare.
📇 Perspectives on Psychological Science | August 2023 | Cameron Martel, Jennifer Allen, Gordon Pennycook, and David G. Rand
This review of several studies on the efficacy of crowdsourced fact-checking concludes that “current evidence supports the promise of leveraging the wisdom of crowds to identify misinformation via aggregate layperson evaluations.”
📇 CHI ‘22 | April 2022 | Jennifer Allen, Cameron Martel, David G Rand
This study concludes that users of Birdwatch (now Community Notes) tend to rate counter-partisans more negatively than those whose party affiliation they share. While the researchers note that “the preferential flagging of counter-partisan tweets we observe does not necessarily impair Birdwatch’s ability to identify misleading content,” it is nonetheless striking when compared to “other theoretically relevant features, like the number of sources cited in the note.”
📇 PNAS | July 2021 | Ethan Porter and Thomas J. Wood
This study is invaluable in looking at the effect of misinformation and related corrections across different locales. The researchers tested 22 fact checks on at least 1,000 respondents in each of Argentina, Nigeria, South Africa and the United Kingdom. They found that “every fact-check produced more accurate beliefs,” even when the topic of misinformation was politically charged. The study helpfully tested two identical fact checks across the four countries to correct for any confounding factors, and found this remained true (see chart below).
📇 Political Communication | October 2019 | Nathan Walter, Jonathan Cohen, R. Lance Holbert, and Yasmin Morag
This meta-analysis of findings on the impact of fact-checking contains my go-to citation about what we know about this field:
Simply put, the beliefs of the average individual become more accurate and factually consistent, even after a single exposure to a fact-checking message. To be sure, effects are heterogeneous and various contingencies can be applied, but when compared to equivalent control conditions, exposure to fact-checking carries positive influence.
However, the results also raise substantial concerns. In line with the motivated reasoning literature (Kunda, 1990; Nir, 2011), the effects of fact-checking on beliefs are quite weak and gradually become negligible the more the study design resembles a real-world scenario of exposure to fact-checking.
📇 Journal of Public Economics | July 2019 | Oscar Barrera Rodríguez, Sergei Guriev, Emeric Henry and Ekaterina Zhuravskaya
[excerpted from an article originally published on Poynter]
This study found that providing factual information on immigration improved French voters’ understanding but didn’t reduce their likelihood to vote for the fact-checked politician. This finding is in line with an earlier study conducted on American voters.
The researchers surveyed 2,480 French individuals online in four regions where the far-right Front National party (FN) had done best in the most recent regional elections. The sample was otherwise representative of the French population in terms of age and gender.
Respondents were put into one of four groups. The first group received false claims on immigration made by Marine Le Pen, the FN’s presidential candidate. The second group obtained statistics on the same issues. The other two groups were given both or neither, respectively.
Across all groups, the researchers tested respondents’ understanding of the facts, their support for Le Pen on immigration and their voting intentions.
The variation of factual understanding among these four treatments is immediately clear. In one of the three claims tested, Le Pen used photos from the migration influx into Germany and Hungary to claim that 99 percent of refugees were men. UN stats indicate that the actual share of adult males among migrants coming into Europe from the Mediterranean was 58 percent.
In the graph below, respondents are divided into deciles, and the correct answer marked with a red vertical line. The “informed” group correctly determined the share of men more than 60 percent of the time. Individuals given no information or the false claims by Le Pen were far more likely to suggest a higher percentage.
Overall, knowledge of the facts was negatively affected when respondents only read Le Pen’s claims but improved when they were offered the facts alone or both the facts and Le Pen’s claims.
📇 PNAS | April 2019 | Joshua Becker, Ethan Porter, and Damon Centola
In this study, groups of 35 Democratic and Republican voters recruited on Amazon Mechanical Turk were asked to respond over three rounds to four factual questions known to elicit partisan responses (e.g. “What was the unemployment rate in the last month of Barack Obama’s presidential administration?”). In some conditions, they were given the average responses of four other participants, not knowing that these were their co-partisans. Even though significant partisan differences remained, participants in the ‘crowd’ condition showed improved accuracy across the board (see chart). It is important to note that compensation was tied to the accuracy of a final answer, so respondents had a financial incentive to be correct.
📇 Political Behavior | January 2019 | Brendan Nyhan, Ethan Porter, Jason Reifler & Thomas J. Wood
This study looked at the response of US-based respondents to two fact checks about crime and unemployment during the 2016 election. It concludes that “people express more factually accurate beliefs after exposure to fact-checks.” As you can see in the chart below, belief that crime had increased in America (which was not true at the time of Trump’s fact-checked statement) decreased for both Clinton and Trump voters after they were exposed to a fact check. The effects even held up after respondents were also exposed to denials by the Trump campaign and the candidate himself.
The role of familiarity in correcting inaccurate information
📇 Journal of Experimental Psychology: Learning, Memory, and Cognition | May 2017 | Briony Swire, Ullrich K H Ecker, and Stephan Lewandowsky
[excerpted from an article originally published on Poynter]
This study looked at the influence of different variables on the effectiveness of a correction in light of the familiarity effect. The researchers concluded that more detailed explanations help people remember corrections longer and that individuals over 65 are comparatively worse at holding on to corrective information (i.e. likelier to misremember a myth as a fact).
Researchers used a pilot group to select claims — both true and false — that were “common and at least midrange believable.” They then presented participants with either a brief affirmation/retraction or a more detailed one. The result? Belief in facts shot up while belief in myths tumbled (see chart below).
The effect of the correction on belief in a myth did wear off a little over time, however. “Belief change was more sustained after a fact affirmation compared with myth retraction,” the researchers write, noting that “this asymmetry could be partially explained by familiarity.” It is also worth noting that among over-65s, the regression was significantly larger.
The beneficial effect of a correction was recorded not just when participants were asked explicitly to rate their belief in a claim, but also when they were asked an “inference question.” (For the myth that you can tell if someone is lying through physical tells like eye movement, the inference question was “What percentage of lies can FBI detectives catch just by looking at physical tells?”).
The psychologists believe there are some practical lessons that can be gleaned from their findings.
First, detailed fact checks are more effective. In the study’s case, detailed explanations were a mere three or four sentences long. This is a relatively low bar for fact-checkers to clear. But it does suggest that headlines or tweets alone may not be as effective in correcting a misperception.
Second, fact-checkers may have to repeat their corrections frequently in order to counteract the regression towards thinking a myth is a fact that we experience over time. The study acknowledges this recommendation is “somewhat ironic,” but making a correction more familiar to readers might be worth the drawback of reminding them of the underlying myth.
📇 Political Behavior | January 2018 | Thomas Wood & Ethan Porter
[excerpted from an article originally published on Poynter]
This paper fails to replicate — and ultimately contradicts — one of the most cited findings on the relationship between facts and partisan beliefs, namely the “backfire effect.”
The study showed 8,100 subjects corrections to claims made by political figures on 36 different topics. Only on one of the 36 issues (the misperception that WMD were found in Iraq) did they detect a backfire effect. Even then, a simpler phrasing of the same correction led to no backfire. The paper’s co-authors conclude that “by and large, citizens heed factual information, even when such information challenges their partisan and ideological commitments.”
📇 Royal Society Open Science | March 2017 | Briony Swire, Adam J. Berinsky, Stephan Lewandowsky, and Ullrich K. H. Ecker
[excerpted from an article originally published on Poynter]
This study focused on statements — both factual and inaccurate — made by Donald Trump during the Republican primary campaign. The basic conclusion of the study is that fact-checking changes people’s minds but not their votes.
The authors presented 2,023 participants on Amazon’s Mechanical Turk with four inaccurate statements and four factual statements. Misinformation items included Trump’s claims that unemployment was as high as 42 percent and that vaccines cause autism.
These claims were presented two ways: unattributed or clearly indicating that they were uttered by Trump. Participants were asked to rate whether they believed each of them on a scale of zero to 10.
Each falsehood was then corrected (or confirmed) with reference to a nonpartisan source like the Bureau of Labor Statistics. Participants were then asked to rate their belief in that claim again, either immediately or a week later.
The results are clear: Regardless of partisan preference, belief in Trump falsehoods fell after these were corrected (see dotted lines below). The belief score fell significantly for Trump-supporting Republicans, Republicans favoring other candidates and Democrats.
There’s more good news for fact-lovers in the research. Generally speaking, misinformation items were from the start less believed than factual claims. Moreover, the research found no evidence of a “backfire effect,” i.e. an increase in inaccurate beliefs post-correction.
The rub, of course, is that Trump supporters were just as likely to vote for their candidate after being exposed to his inaccuracies. The paper found that voting preferences didn’t vary among Republicans who didn’t support Trump, either. Only Democrats said they were even less likely to vote for Trump.
Prevalence, effects, formats, and labeling of AI-generated deceptive content
📇 arXiv | March 2025 | Binh M. Le, Jiwon Kim, Simon S. Woo, Kristen Moore, Alsharif Abuadbba, Shahroz Tariq
A preprint by researchers based in Australia and South Korea found that deepfake detection is still imperfect. While the sixteen detectors tested performed passably on known datasets of facial deepfakes, the best of them scored only 69% on real-world data.
This echoes a seminal 2024 literature review (see FU#6) that found that minimal edits such as cropping and resizing short-circuited many of the detection techniques in the literature. My experience with commercial detectors has also been that they promise precision rates that they fail to achieve.
If all this wasn't enough, apparently generative AI is getting really good at stripping watermarks off images anyway!
📇 CCS '24: Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security | December 2024 | Kevin Warren, Tyler Tucker, Anna Crowder, Daniel Olszewski, Allison Lu, Caroline Fedele, Magdalena Pasternak, Seth Layton, Kevin Butler, Carrie Gates, Patrick Traynor
This study had 1,200 users listen to twenty audio clips and determine whether they thought they were synthetic, how confident they were, and what they based their decision on. While a few individuals did correctly identify 100% of the deepfakes they were exposed to, the average response was less impressive. Mean accuracy was as low as 65% on audio samples from the Wavefake dataset, 71% on ASVspoof2021 and 81% on FakeAVCeleb.
Helpfully, the researchers also coded the qualitative responses based on common characteristics that respondents used to determine whether something was synthetic or not (see chart below). Prosody — which includes things like cadence, tone and pitch — was the most frequently cited element.
📇 Scientific Reports | October 2024 | Sarah Barrington, Emily A. Cooper, Hany Farid
Humans don’t appear all that well equipped to detect faked voices.
UC Berkeley researchers used ElevenLabs’s Instant Voice Cloning API to clone 220 speakers. They then had survey respondents discern whether two clips were from the same person and whether any of them were deepfaked. In almost 80% of the cases, a real voice and its audio clone were deemed to be from the same speaker (real clips of the same voice were correctly attributed with slightly higher precision). Slicing the data another way, Barrington and Farid found that respondents correctly flagged an audio clip as synthetic ~60% of the time. That is not much better than flipping a coin.
📇 PNAS Nexus | October 2024 | Sacha Altay and Fabrizio Gilardi
In a study on PNAS Nexus, two political scientists at the University of Zurich concluded that “labeling headlines as AI-generated reduced the perceived accuracy of the headlines and participants’ intention to share them, regardless of the headlines’ veracity (true vs. false) or origin (human- vs. AI-generated).” Still, the effect was relatively small: a 2.66 percentage point decrease for a “generated by AI” headline compared to a 9.33 percentage point decrease for content labeled as “false.”
📇 RAID '24: Proceedings of the 27th International Symposium on Research in Attacks, Intrusions and Defenses | September 2024 | Jonas Ricker, Dennis Assenmacher, Thorsten Holz, Asja Fischer and Erwin Quiring
Holz, Quiring, et al. tried to quantify the reach of AI accounts on Twitter by selecting a random 1 percent of all public posts in one week in March 2023 and seeing how many associated accounts used deepfaked profile pictures. Their estimate is 7,723 accounts, or 0.05% of the total.
While “fake-image accounts” did not post more than real-image accounts, they were on average far newer and were primarily focused on large-scale spamming attacks. In that limited time frame of study, most content published was in English, Turkish and Arabic and the principal topics were politics and finance.
📇 PLOS One | June 2024 | Peter Scarfe, Kelly Watcham, Alasdair Clarke, and Etienne Roesch
Psychology researchers at the universities of Reading and Essex tested the proposition that generative AI can be reliably used to cheat in university exams. The answer is a resounding Yes.
The researchers used GPT-4 to produce 63 submissions for the at-home exams of five different classes in Reading’s undergraduate psychology department. The researchers did not touch the AI output other than to remove reference sections and re-generate responses if they were identical to another submission.
Only four of the 63 exams got flagged as suspicious during grading, and only half of those were explicitly called out as possibly AI-generated. The kicker is that the average AI submission also got a better grade than the average human.
The study concludes that “from a perspective of academic integrity, 100% AI written exam submissions being virtually undetectable is extremely concerning.”
📇 arXiv | June 2024 | Dmitry Kobak, Rita González-Márquez, Emőke-Ágnes Horvát, Jan Lause
A group of machine learning researchers claim in a preprint that as many as 10% of the abstracts published on PubMed in 2024 were “processed with LLMs,” based on the excess usage of certain words like “delves.” Seems significant.
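A minimal sketch of the "excess word usage" idea, on toy data (the marker words beyond "delves" and the comparison years are my own illustrative assumptions, not the preprint's actual vocabulary list or methodology):

```python
import re

MARKER_WORDS = {"delves", "underscores", "showcasing"}  # illustrative; only "delves" is cited above

def marker_rate(abstracts: list[str]) -> float:
    """Share of abstracts containing at least one marker word."""
    hits = sum(
        any(w in re.findall(r"[a-z]+", a.lower()) for w in MARKER_WORDS)
        for a in abstracts
    )
    return hits / len(abstracts)

# toy corpora standing in for PubMed abstracts from a pre-LLM and a post-LLM year
abstracts_2019 = [
    "We study the effect of X on Y using a randomized design.",
    "This paper examines protein folding under heat stress.",
]
abstracts_2024 = [
    "This study delves into the effect of X on Y.",
    "Our work underscores the importance of protein folding.",
]

# the 'excess' is the jump relative to the historical baseline
print(f"excess usage: {marker_rate(abstracts_2024) - marker_rate(abstracts_2019):.0%}")
```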
📇 Nature | June 2024 | Sebastian Farquhar, Jannik Kossen, Lorenz Kuhn & Yarin Gal
In Nature, computer scientists at the University of Oxford presented the results of their effort to detect LLM hallucinations with LLMs (see also the WaPo write-up). In a skeptical riposte, RMIT computer scientist Karin Verspoor warns that this approach could backfire “by layering multiple systems that are prone to hallucinations and unpredictable errors.”
If I understand the figure below correctly, the accuracy of this method, billed “semantic entropy,” is still only about 80%.
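As I loosely understand it, the method samples several answers to the same question, groups answers that mean the same thing, and computes the entropy over those meaning clusters; high entropy suggests the model is confabulating. A toy sketch of that idea (the actual paper clusters answers with a bidirectional entailment model; I substitute trivial string normalization):

```python
import math
from collections import Counter

def semantic_entropy(answers: list[str]) -> float:
    """Entropy over clusters of equivalent answers.
    Toy version: cluster by lowercased, stripped text; the paper
    uses an entailment model to decide equivalence."""
    clusters = Counter(a.strip().lower().rstrip(".") for a in answers)
    total = sum(clusters.values())
    probs = [c / total for c in clusters.values()]
    return -sum(p * math.log(p) for p in probs)

# the same question asked of an LLM five times:
consistent = ["Paris", "paris", "Paris.", "Paris", "paris"]
scattered = ["Paris", "Lyon", "Marseille", "Paris", "Toulouse"]
print(semantic_entropy(consistent))  # 0.0 -> answers agree, likely grounded
print(semantic_entropy(scattered))   # higher -> likely hallucinating
```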
📇 arXiv | May 2024 | Allison Koenecke, Anna Seo Gyeong Choi, Katelyn X. Mei, Hilke Schellmann, Mona Sloane
In a great paper presented at FAccT, Allison Koenecke and colleagues tested Whisper, OpenAI’s transcription service, on 13,140 audio snippets. They found that in 187 cases (~1.4%), Whisper consistently transcribed things that the speakers never said.
More worryingly, one third of these hallucinations were not innocuous substitutions of homophones but truly wild additions that could have material consequences if taken at face value.
Also concerning was the fact that Whisper performed markedly worse on speakers with aphasia, a language disorder, than with those in the control group.
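If you want to spot-check transcripts for inserted content in a similar spirit, here is a minimal sketch using the open-source whisper package and a human reference transcript (my own illustration, not the authors' pipeline; the audio file name and reference text are placeholders):

```python
import difflib
import whisper  # pip install openai-whisper

model = whisper.load_model("base")
hypothesis = model.transcribe("clip.wav")["text"]  # placeholder audio file

reference = "thank you for calling please hold"     # placeholder human-made ground truth

# words Whisper produced that are absent from the reference ("insertions")
diff = difflib.ndiff(reference.lower().split(), hypothesis.lower().split())
insertions = [token[2:] for token in diff if token.startswith("+ ")]
print("possible hallucinated words:", insertions)
```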
📇 arXiv | May 2024 | Cameron R. Jones and Benjamin K. Bergen
In this preprint by two cognitive scientists at UC San Diego, 500 participants spent 5 minutes texting with either a human or one of three AI systems through an interface that concealed who was on the other side. 54% of the respondents assigned to GPT-4 thought they were chatting with a human, not much lower than the share of respondents who rated the actual human as a human (67%).
📇 IEEE Security & Privacy | June 2024 | Diangarti Tariang, Riccardo Corvi, Davide Cozzolino, Giovanni Poggi, Koki Nagano, Luisa Verdoliva
This paper claims post-processing typical of image sharing — such as cropping, resizing and compression — can have a strong impact on detector accuracy. Compare the “without PP” results and the “with PP” results to get a sense of how significant this impact can be.
More hopefully, the paper finds that low-level forensic artifacts can still be used as artificial fingerprints of a particular model. Look for instance below, from left to right, at the spectral analysis of images generated by Latent Diffusion, Stable Diffusion, Midjourney v5, DALL·E Mini, DALL·E 2 and DALL·E 3.
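For the curious, this kind of spectral fingerprint can be approximated at home with a few lines of numpy: average the log-magnitude Fourier spectrum over a batch of images from one generator and look for periodic peaks. A minimal sketch (the folder path is a placeholder, and this is a simplification of the paper's forensic analysis):

```python
import glob
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt

def average_spectrum(paths: list[str], size: int = 256) -> np.ndarray:
    """Mean log-magnitude 2D Fourier spectrum of grayscale images.
    Generators tend to leave periodic artifacts that show up as peaks."""
    acc = np.zeros((size, size))
    for p in paths:
        img = np.asarray(Image.open(p).convert("L").resize((size, size)), dtype=float)
        spectrum = np.abs(np.fft.fftshift(np.fft.fft2(img - img.mean())))
        acc += np.log1p(spectrum)
    return acc / len(paths)

# placeholder folder of images from one generator
paths = glob.glob("generated_images/*.png")
plt.imshow(average_spectrum(paths), cmap="viridis")
plt.axis("off")
plt.show()
```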
📇 CHI ‘23 | April 2023 | Maurice Jakesch, Advait Bhat, Daniel Buschek, Lior Zalmanson, Mor Naaman
In this worrying study on the possible persuasiveness of AI assistants, participants were asked to answer "Is social media good for society?" Those with a virtual assistant primed to be pro-social media were 2x more likely to answer affirmatively than the control group. The exercise appears to have affected reported opinions in the same direction, too.
Synthetic non-consensual intimate imagery
Characterizing the MrDeepFakes Sexual Deepfake Marketplace
📇 arXiv | January 2025 | Catherine Han, Anne Li, Deepak Kumar and Zakir Durumeric
This useful preprint gives a bit of a sense of what the largest platform for deepfake porn, MrDeepFakes, looked like in November 2023. At that time, the site hosted 43,000 videos that had been viewed a collective 1.5 billion times.
90% of these videos mentioned who was depicted in them. The researchers were thus able to identify 3,803 unique individuals, who follow a long-tail distribution: “the ten most targeted celebrities are depicted in 400–900 videos each, but over 40% of targets are depicted in only one video.”
Of the celebrities who account for 95% of the videos, one third were American and one tenth were Korean. The primary occupations were actors, musicians and models.
Fully 271 of the top 1,942 named individuals in this subsample were not celebrities, contravening MrDeepFakes’ modicum of a policy restricting deepfake video generation to famous individuals. As the researchers write:
For 29 targets, we found no online presence, and for another 242 targets, we could not find profiles on any of the listed social media platforms that satisfied the minimum following criteria. In one example, we find that 38 Guatemalan newscasters with little to no social media following appear in over 300 videos. All of these videos were posted by two users, who both describe their focus on Latin American individuals in their profiles.
The researchers also looked at the paid requests for deepfake videos posted in community forum pages. They flag that most requests likely occur in private exchanges, but from those that were public (only 58) they surmised that the average price advertised was $87.50. Finally, and this is where the timing of the data collection matters the most, the researchers find anecdotal evidence that the decreasing barrier to powerful AI tools is affecting the marketplace for non-consensual deepfake porn by making it easier for people to create videos themselves rather than purchase them.
In Deep Trouble: Surfacing Tech-Powered Sexual Harassment in K-12 Schools
📝 Center for Democracy & Technology | September 2024 | Elizabeth Laird, Maddy Dwyer, Kristin Woelfel
The Center for Democracy & Technology surveyed American public school students in grades 6-12 (roughly speaking, ages 11 to 18). 15% of these students said that they know of a deepfake depicting individuals associated with their school being shared in the past school year.
With 15M students in U.S. public high schools, this suggests the number of deepfake nudes in school settings around the country may be as high as 225,000. Even if some cases were double counted because multiple pupils from the same school took the survey, and even if half of the students answered falsely, we’d still be looking at tens of thousands of cases. And this would assume that every case only affected one victim (unlikely) and that there were no cases in private schools (impossible).
You should read the whole report, but two other things stood out to me. First, I was surprised to see most students report that non-consensual intimate imagery (authentic and deepfaked) is shared primarily via social media, not messaging apps.
Second, ~60% of teachers and students report that their school has not communicated their procedures for addressing deepfake nudes.
The new face of digital abuse: Children’s experiences of nude deepfakes
📝 Internet Matters | October 2024
A survey of British teenagers by the industry-funded nonprofit Internet Matters found that at least 13% of them had either created a deepfake nude, knew someone who had, or had encountered this type of content online.
The most interesting finding of the report, to me, was that 55% of the teens thought being targeted by a deepfake nude was worse than by a real nude. One girl, aged 16, had this to say in an open-ended response: “If a nude image was sent of me currently that I consented to filming even though it's sad/unfortunate I would know that (it) was my choice that led to that image being shared. However, with a deepfake I didn't choose for that image to be created and its not realistic to me.”
📇 arXiv | September 2024 | Qiwei Li, Shihui Zhang, Andrew Timothy Kasper, Joshua Ashkinaze, Asia A. Eaton, Sarita Schoenebeck, Eric Gilbert
In a preprint, researchers at the University of Michigan and Florida International University tested X’s responsiveness to takedown requests for AI-generated non-consensual intimate imagery.
The study created 5 deepfake nudes of AI-generated personas and posted them from 10 different X accounts. The researchers then flagged the images through the in-platform reporting mechanisms. Half were reported to X as a copyright violation and the other half as a violation of the platform’s non-consensual nudity policy. While the copyright violations were removed within a day, none of the images flagged as NCII had been removed three weeks after being reported.
“Violation of my body:” Perceptions of AI-generated non-consensual (intimate) imagery
arXiv | June 2024 | Natalie Grace Brigham, Miranda Wei, Tadayoshi Kohno, Elissa M. Redmiles
In this preprint, researchers at the University of Washington and Georgetown studied the attitudes of 315 US respondents towards AI-generated non-consensual intimate imagery (AIG-NCII). They found that vast majorities thought the creation and dissemination of AIG-NCII was “totally unacceptable.” That remained true whether the subject of the deepfake was a stranger or an intimate partner, though there was some fluctuation based on the intent.
In an indication that generalized access could normalize AIG-NCII, respondents were notably more accepting of people seeking out this content.
Non-Consensual Synthetic Intimate Imagery: Prevalence, Attitudes, and Knowledge in 10 Countries
CHI '24 | May 2024 | Rebecca Umbach, Nicola Henry, Gemma Beard, Colleen Berryessa
Earlier this year, a separate group of researchers across four institutions shared the results of a mid-2023 survey of 1,600 people across 10 countries that found a similar distinction between creation/dissemination and consumption. In the chart below, on a scale from -2 to +2, you can see the mean attitude towards criminalizing the behaviors listed in the left column. (The redder the square, the more people favored criminalization.)
Other
A computational analysis of potential algorithmic bias on platform X during the 2024 US election
QUT ePrints | November 2024 | Timothy Graham & Mark Andrejevic
An analysis by researchers at Queensland University of Technology and Monash University claims there was a significant increase in engagement with right-leaning accounts after Elon Musk’s endorsement of Donald Trump on July 13, 2024.
While this increase may have been organic in nature and tied to the increasingly partisan nature of the platform, one metric caught my eye. Views of Musk’s own tweets went up by 138% compared to his average for the first part of the year.
If this was intentional, it would not be unprecedented. In 2023, Platformer reported that Musk had ordered X engineers to inflate the reach of his posts by a factor of 1,000. If he was willing to do that because (allegedly) his Super Bowl tweet performed less well than Joe Biden’s, it is hard to imagine that he wouldn’t have done the same in the interest of an election he described in cataclysmic terms.
Reliability Criteria for News Websites
ACM Transactions on Computer-Human Interaction | January 2024 | Hendrik Heuer and Elena L. Glassman
In a paper whose study design I can only define as chaotic good, two computer scientists asked 23 local politicians from Germany and 20 journalists from the US to describe how they would rate three low-credibility news sources. The researchers found there were 11 criteria that respondents tended to use to assess the trustworthiness of the site in front of them. These included the website’s content, reputation and self-description.
While the two groups mentioned some criteria at similar rates (see chart below, where “experts” are the journalists and “laypeople” are the elected officials), two exceptions stand out.
18 out of 20 journalists looked for the author of an article and considered their biography. Only 9 out of 23 politicians did the same.
Equally interesting was the differential reliance on ads as a proxy for credibility. This was the single most cited (negatively impacting) factor for politicians but only came up for 4 out of 20 journalists. I wonder whether this is because the latter are more familiar with the reality of online business models for news and more inured to the dissonance of seeing a lousy ad next to a credible news article.
arXiv | September 2024 | Andreas Plesner, Tobias Vontobel, Roger Wattenhofer
In this preprint, researchers at ETH Zurich claim that advanced YOLO models can solve 100% of reCAPTCHAv2 bot-filtering tests (I swear those were all real words). The paper shows that VPN use, mouse movement and user history all affect the likelihood of detection. The authors conclude that “we are now officially in the age beyond captchas.”
On the one hand, great! I won’t miss these capricious and deranged puzzles. But reCAPTCHAv2 is one of the internet’s main defenses against automated bots. So this is probably not the best thing to happen just as generative AI unloads hordes of imitation humans in our online spaces.
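For context on how such a solver works at its core, here is a minimal sketch of the object-detection step (using the ultralytics package; a simplified illustration under my own assumptions, not the authors' pipeline): run a pretrained detector on each challenge tile and keep the tiles where the target class appears.

```python
from ultralytics import YOLO  # pip install ultralytics

# a pretrained COCO model; reCAPTCHA targets (buses, traffic lights, bicycles...)
# overlap heavily with COCO classes
model = YOLO("yolov8n.pt")

def tiles_to_click(tile_paths: list[str], target: str = "traffic light") -> list[str]:
    """Return the challenge tiles in which the requested object is detected."""
    selected = []
    for path in tile_paths:
        result = model(path, verbose=False)[0]
        labels = {result.names[int(c)] for c in result.boxes.cls}
        if target in labels:
            selected.append(path)
    return selected

# hypothetical 3x3 grid of tile screenshots
print(tiles_to_click([f"tile_{i}.png" for i in range(9)]))
```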