The Editors Protecting Wikipedia from AI Hoaxes



A group of Wikipedia editors has formed WikiProject AI Cleanup, “a collaboration to combat the increasing problem of unsourced, poorly-written AI-generated content on Wikipedia.”

“A few of us had noticed the prevalence of unnatural writing that showed clear signs of being AI-generated, and we managed to replicate similar ‘styles’ using ChatGPT,” Ilyas Lebleu, a founding member of WikiProject AI Cleanup, told me in an email. “Discovering some common AI catchphrases allowed us to quickly spot some of the most egregious examples of generated articles, which we quickly wanted to formalize into an organized project to compile our findings and techniques.”

In many cases, WikiProject AI Cleanup finds AI-generated content on Wikipedia with the same methods others have used to find AI-generated content in scientific journals and Google Books, namely by searching for phrases commonly used by ChatGPT. One egregious example is the Wikipedia article about the Chester Mental Health Center, which in November of 2023 included the phrase “As of my last knowledge update in January 2022,” referring to the last time the large language model’s training data was updated.
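The phrase-searching method the editors describe can be sketched in a few lines. This is a minimal illustration, not WikiProject AI Cleanup’s actual tooling, and the phrase list below is a hypothetical sample of the kind of boilerplate the group looks for:

```python
# Sketch of phrase-based detection of LLM-generated text.
# The phrase list is illustrative; the project's actual list differs.
TELLTALE_PHRASES = [
    "as of my last knowledge update",
    "as an ai language model",
    "it is important to note that",
]

def flag_suspect_text(text: str) -> list[str]:
    """Return any telltale phrases found in `text` (case-insensitive)."""
    lowered = text.lower()
    return [phrase for phrase in TELLTALE_PHRASES if phrase in lowered]

sample = ("As of my last knowledge update in January 2022, the Chester "
          "Mental Health Center is a psychiatric hospital in Illinois.")
print(flag_suspect_text(sample))  # → ['as of my last knowledge update']
```

A simple substring scan like this only catches the most egregious cases; as the editors note below, subtler AI-generated text requires human review.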

Other instances are harder to detect. Lebleu and another WikiProject AI Cleanup founding member who goes by Queen of Hearts told me that the most “impressive” example of AI-generated content they have found on Wikipedia so far is an article about the Ottoman fortress of Amberlihisar.

“Amberlihisar fortress was built in 1466 by Mehmed the Conqueror in Trabzon, Turkey. The fortress was designed by Armenian architect, Ostad Krikor Baghsarajian.[7] Construction of the fortress was completed using a combination of stone and brick materials, with craftsmen and builders being brought in from the Rumelia region to work on the project. The timber for the fortress was sourced from the forests along the coast of the Black Sea. The duration of construction is not specified, but it is known that the fortress was completed in 1466. It is likely that construction took several years to complete.[7]”

The more than 2,000-word article is filled with cogent paragraphs like the one above, divided into sections about the fortress’s name, its construction, the various sieges it faced, and even restoration efforts after it “sustained significant damages as a result of bombardment by Russian forces” during World War I.

“One small detail, the fortress never existed,” Lebleu said. Aside from a few tangential facts mentioned in the article, like that Mehmed the Conqueror, or Mehmed II, was a real person, everything else in the article is fake. “The entire thing was an AI-generated hoax, with well-formatted citations referencing completely nonexistent works.”

Fake citations, Lebleu said, are a more “pernicious” issue because they might stay undetected for months. Even if someone used an LLM trained on a corpus of data relevant to the Wikipedia article, and it generated text that read well, with correctly formatted citations of real sources, it still wouldn’t be able to correctly match a citation to a specific claim made in a specific body of work. As an example, Lebleu pointed to a Wikipedia article about an obscure species of beetle that cited a real journal article in French.

“Only thing, that article was about a completely unrelated species of crab, and made no mention of the beetle at all,” Lebleu said. “This adds a layer of complications if the sources are not in English, as it makes it harder for most readers and editors to notice the issue.”
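The failure mode Lebleu describes, a real citation attached to an unrelated source, lends itself to a crude first-pass check: does the cited source mention the article’s subject at all? A minimal sketch, with entirely hypothetical source text and subject terms:

```python
def source_mentions_subject(source_text: str, subject_terms: list[str]) -> bool:
    """Crude check: does the cited source mention any of the subject terms?

    A False result flags the citation for human review; it does not prove
    the citation is fake, and True does not prove the claim is supported.
    """
    lowered = source_text.lower()
    return any(term.lower() in lowered for term in subject_terms)

# Hypothetical example: a French paper about a crab cited for a beetle.
crab_paper = "Description d'une nouvelle espèce de crabe du genre Xantho."
print(source_mentions_subject(crab_paper, ["coléoptère", "beetle"]))  # → False
```

Even this trivial check requires knowing the source’s language and obtaining its text, which is why, as Lebleu notes, non-English sources make the problem harder for most readers and editors.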

Other examples of AI-generated content appearing on Wikipedia that WikiProject AI Cleanup has removed are more subtle, but can cause just as much confusion. For example, an article about Darul Uloom Deoband, a real Islamic seminary in India, at one point featured an image which, like the images in many Wikipedia articles, looks like a period-appropriate painting related to the subject of the article.

Upon closer examination, however, you can see the telltale signs of a poorly AI-generated image: people with mangled hands and a foot with seven toes. According to a page where WikiProject AI Cleanup tracks the removal of AI-generated images on Wikipedia (and which was previously highlighted by The Signpost), this image was removed because “the image contributes little to the article, could be mistaken for a contemporary artwork, and is anatomically incorrect.”

The WikiProject AI Cleanup page that tracks AI-generated images on Wikipedia also clarifies that it doesn’t remove AI-generated images just because they are AI-generated, and that in some cases they are appropriate. If the article is about or mentions an AI-generated image, it makes sense to include it. For example, the article about the viral, botched Willy’s Chocolate Experience, which was advertised with an AI-generated image, includes that image.

The article about the baseless claims promoted by Donald Trump that Haitian immigrants were eating pets in Springfield, Ohio, includes an AI-generated image, tweeted by the Republican-controlled United States House Committee on the Judiciary, of Trump holding a goose and a kitten. The pastoral science fiction article also features an image generated with Stable Diffusion, which the team is not trying to remove because “the image is a high quality rendition of the ideas in its section.”

In some ways, it seems, Wikipedia is, at least for now, better than other major internet services at detecting and filtering out misleading AI-generated content, because the site has always relied on human volunteers to review new articles and track down any claims they make to reliable sources. That stands in contrast to Facebook, Google, Amazon, and other large platforms, which have human moderators but, as we’ve repeatedly reported, have failed to catch misleading AI-generated content and usually only remove it in response to our reporting or complaints from users.

“Wikipedia articles have a more specific format (not only in terms of presentation, but also of content) than Google results, and a LLM that isn’t familiar with it is likely to produce something that is much more easy to spot,” Lebleu said. “Things like verifying references also help: as Wikipedia aims to be a tertiary source (synthesizing other sources without adding original research itself), it should theoretically be possible to verify if the written content matches the sources.”

“While I’d like to think Wikipedians are decent at detecting and removing AI content, there is undoubtedly a lot that slips through the cracks and we’re all volunteers,” Queen of Hearts said. “While major companies’ failure to detect and remove AI slop is concerning, I believe they could do better than us with properly allocated resources.”

Lebleu said that the editors have discussed using AI to detect AI with tools like GPTZero, but so far have found them to have “varying levels of success.”

“There is ultimately no ‘oracle machine’ that could perfectly distinguish AI text from non-AI text,” Lebleu said. “These AI-detecting tools are often imprecise, and only effective on older models like GPT-2. Also, like LLMs themselves, LLMs detectors haven’t been trained specifically on Wikipedia articles, a corpus that is much more homogenous than a much larger training set, and thus easier to distinguish from the outputs of models trained on this larger set. Because of this, humans familiar with both Wikipedia writing guidelines and common LLM keywords are often better at spotting AI content in this specific context.”

About the author

Emanuel Maiberg is interested in little known communities and processes that shape technology, troublemakers, and petty beefs. Email him at emanuel@404media.co
