Ideas

The Word

Wikipedia’s copy editor army

How do you edit a giant crowd-sourced encyclopedia? With volunteers — and lots of fighting

globe staff illustration

Bryan Henderson, a software engineer who lives in San Jose, was once just an ordinary person irritated, like many, by language errors he saw on the Internet. But around 2004, with the rise of a volunteer-generated encyclopedia site called Wikipedia, he was able to start actually fixing those problems—particularly the use of the phrase “comprised of” to mean “composed of.” “Comprise” is most properly used in the active voice to express all the components of something, as in “the week comprises seven days.” Under the Wikipedia username Giraffedata, Henderson set about quietly fixing the faulty “comprised of”—about 48,000 times so far, he told me last week.

Henderson might have continued his labors in obscurity if it weren’t for a profile in Medium.com earlier this month that revealed his name and briefly made him a hero of sorts, a man undertaking a lonely, perhaps pointless quest against an error that not everyone agrees is wrong. (Even the OED includes the “composed of” definition for “comprises.”) The story also hinted at a fascinating underground world: the obsessive, contentious subculture of people who copy edit Wikipedia.

Wikipedia, now a monumental collection of over 4.5 million articles created by 24 million registered users, has become a first-stop resource for writers, researchers, students, and other Internet searchers. Though it’s correct a surprising amount of the time, it doesn’t have a great reputation for accuracy, in content or style. When I spoke to some of the people who copy edit the site, however, it became clear that these amateur editors—like Henderson—are even more passionate about the perfection of language than many professionals.

Advertisement

Traditionally, most publications—newspapers, magazines, journals—employ a team of copy editors, headed by a copy chief. (Books are typically copy edited by freelancers, but their work is overseen by an in-house managing editor.) They usually begin with a published style guide as a basis: Many newspapers use the AP Stylebook, while trade publishers often rely on the Chicago Manual of Style. On top of that, publications typically keep an in-house style guide, an evolving set of rules to cover questions that are ambiguous or not covered.

Get Today in Opinion in your inbox:
Globe Opinion's must-reads, delivered to you every Sunday-Friday.
Thank you for signing up! Sign up for more newsletters here

At Wikipedia, this process is a little different. Anne Fanelli, better known on Wikipedia as Miniapolis, is retired with adult children and started editing six years ago after her mother passed away. Now she spends two to three hours a day working on the site and has made over 25,000 edits. She is a site administrator and the lead coordinator of the Guild of Copy Editors, a group of about 400 people—about 20 of them active, Fanelli says—who organize regular “drives” to work through backlogs of articles. Sometimes the editors use “bots,” or computer programs, to find errors; sometimes writers send their pieces directly. Anyone can join the guild, without any training required, but it tends to be a self-selecting group, another coordinator known as Philg88 told me: “People get the message pretty quickly whether they can cut it or not.”

Wikipedia copy editors can also officially or unofficially “adopt a typo,” a mini version of Henderson’s crusade. Editor DocWatson42 corrects “moeity” (for “moiety”), “Phildelphia,” “assistent,” and “sub-genre” or “sub genre” for “subgenre,” a change he told me he has made 1,316 times.

The copy editors’ task is “Sisyphean,” Philg88 said, due to Wikipedia’s sheer size and the challenges of editing amateur writers, who may not speak English as a first language. Maintaining a common house style is also a daunting project. Wikipedia has a “Manual of Style” (usually abbreviated as MOS), which, like any style guide, is constantly in flux as usages evolve and new names and words enter the lexicon. Unlike at a newspaper copy desk, however, the long and fervent discussions that lead to decisions are anonymous, and can easily degenerate into flame wars.

Not long ago, Fanelli said, Wikipedia erupted into what she called “the hyphen wars: when do you use a hyphen, when do you use an em-dash, when do you use an en-dash. It was insane.” (The Chicago Manual of Style dictates using a slightly longer “en-dash” to show ranges, like 1947–48, and to join up prefixes with compound terms, like pre–World War II; the AP Manual of Style indicates using only hyphens. “Em-dashes”—like these—are legal in both.) The debate raged for years, with groups forming on all sides, including the “Hyphen Luddites,” Wikipedia editors in fervent revolt against what they called an MOS-supported overuse of em-dashes and en-dashes. A battle over “Mexican–American War” vs. “Mexican-American War” led to accusations of lying and “sockpuppetry” and the short-term banning of several editors, then culminated in a May 2011 “arbitration motion regarding hyphens and dashes,” in which all editors were temporarily blocked from making further hyphen/en-dash changes to article titles. “Trivia question: which was longer, the Mexican+American war, or the argument over what to call it?” asked editor Avanu in March 2011.

Advertisement

Editors have also clashed over whether the “the” before “Beatles” should be upper- or lower-case; whether cougar (the animal species) should be capitalized; whether the entry for “Daylight Savings Time” wouldn’t be more accurately written “Daylight-
Saving Time.” Tensions recently spiked over whether four-letter words such as “like” and “with” should be capitalized in titles. According to an e-mail from Wikipedia administrator and copy editor Fluffernutter (who, like other editors, wished to remain anonymous for privacy reasons), debate is pretty much constant: “Wikipedians can manage to fight about just about anything...because everyone does the editing they do because that happens to be the thing they care about/are interested in.”

The intense passions of Wikipedia editors, Fanelli said, can scare people off. But they also explain why she and her colleagues spend hours each day copy editing the work of strangers, for free. The copy editor army clearly feels a sense of pride and responsibility; Fanelli compared the site to the library of Alexandria in terms of size and scope, although she lays no claim on completion. Wikipedia is, she said, “a tremendous work in progress.”

Related:

Text ‘aye,’ matey! The Pirate Party’s push for direct democracy

The Word: Why we love the language police

2012 | The Word: Crowdsourcing the dictionary

Advertisement

2011: How crowdsourcing is changing science

Britt Peterson is an Ideas columnist. She lives in Washington, D.C. Follow her on Twitter @brittkpeterson.