Wikipedia Will Survive A.I.
Rumors of Wikipedia’s death at the hands of ChatGPT are greatly exaggerated.
Wikipedia is, to date, the largest and most-read reference work in human history. But the editors who update and maintain Wikipedia are hardly complacent about its place as the preeminent information resource, and many worry that generative A.I. could displace it. At last week’s Wikimania, the site’s annual user conference, one of the sessions was “ChatGPT vs. WikiGPT,” and a panelist at the event noted that rather than visiting Wikipedia, people seem to be going to ChatGPT for their information needs. Veteran Wikipedians have cast ChatGPT as an existential threat, predicting that A.I. chatbots will supplant Wikipedia in the same way that Wikipedia famously dethroned Encyclopedia Britannica back in 2005.
But it seems to me that rumors of the imminent “death of Wikipedia” at the hands of generative A.I. are greatly exaggerated. Sure, A.I. will undoubtedly alter how Wikipedia is used and transform the user experience. At the same time, the features and bugs of large language models, or LLMs, like ChatGPT intersect with human interests in ways that support Wikipedia rather than threaten it.
For context, there have been elements of artificial intelligence and machine learning on Wikipedia since 2002. Automated bots on Wikipedia must be approved, as set forth in the bot policy, and generally must be supervised by a human. Content review is assisted by bots such as ClueBot NG, which identifies profanity and unencyclopedic punctuation like “!!!11.” Another use case is machine translation, which has helped provide content for the 334 different language versions of the encyclopedia, again generally with human supervision. “At the end of the day, Wikipedians are really, really practical—that’s the fundamental characteristic,” said Chris Albon, director of machine learning at the Wikimedia Foundation, the nonprofit organization that supports the project. “Wikipedians have been using A.I. and M.L. from 2002 because it just saved time in ways that were useful to them.”
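To make that concrete, here is a minimal, hypothetical sketch of the kind of surface-level check a vandalism filter might run. To be clear, this is not ClueBot NG’s actual code; the real bot scores edits with machine-learning classifiers rather than simple pattern matching, and every name below is invented for illustration.

```python
import re

# Hypothetical illustration only: ClueBot NG actually scores edits with
# machine-learning classifiers, not a handful of regexes. This toy filter
# just shows the kind of surface signal, like the "!!!11" punctuation
# mentioned above, that editors want flagged automatically.

UNENCYCLOPEDIC_PATTERNS = [
    re.compile(r"!{2,}1*"),                             # e.g. "!!!11"
    re.compile(r"\b(?:lol|omg|wtf)\b", re.IGNORECASE),  # chat-speak
]

def looks_unencyclopedic(text: str) -> bool:
    """Return True if any crude vandalism signal matches the text."""
    return any(pattern.search(text) for pattern in UNENCYCLOPEDIC_PATTERNS)

# A human still reviews anything the filter flags, the kind of
# supervision the bot policy requires.
print(looks_unencyclopedic("This band rules!!!11"))      # True
print(looks_unencyclopedic("The band formed in 1994."))  # False
```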
In other words, bots are old news for Wikipedia—it’s the offsite LLMs that present new challenges. Earlier this year, I reported on how Wikipedians were grappling with the then-new ChatGPT and deciding whether chatbot-generated content should be used in composing Wikipedia articles. At the time, editors were understandably concerned about how LLMs hallucinate, responding to prompts with outright fabrications complete with fake citations. Users who copy ChatGPT text directly into Wikipedia risk polluting the project with misinformation. But an outright ban on generative A.I. seemed both too harsh and too Luddite—a failure to recognize new ways of working. Some editors have reported that ChatGPT answers were useful as a starting point or a skeletal outline. While banning generative A.I. could keep low-quality ChatGPT content off of Wikipedia, it could also curtail the productivity of human editors.
Wikipedians are now drafting a policy for how LLMs can be used on the project. What’s being discussed is essentially a “take care and declare” framework: The human editor must disclose in an article’s public edit history that an LLM was used, and must take personal responsibility for vetting the LLM’s output and ensuring its accuracy. (In practice, that disclosure might be as simple as an edit summary noting that a first draft was generated with ChatGPT and then checked against the cited sources, though the exact format is still under discussion.) The proposed LLM policy closely parallels the human supervision that most Wikipedia bots already require. Leash your bots, your dogs, and now your LLMs.