
Codex Hackathon, a two-day marathon of tech for books

One hundred and sixty programmers and literary nerds put their heads together to solve the future of publishing.

Jessica Rinaldi

CAMBRIDGE — Luke Van Seters and Vignesh Mohankumar were camped out on a pair of purple couches on the third floor of the MIT Media Lab, attempting to hack written language itself.

The code that the two Northeastern students had cobbled together over the previous 48 hours analyzes an excerpt of text, then rewrites it slightly to be about a different subject, often with bizarre results. Van Seters tested it by feeding in a New York Times article about Sean Penn's interview with cartel kingpin Joaquín Guzmán — but with instructions to rewrite it to be less about "drugs" and more about "candy." The algorithm dutifully produced a surreal, almost-coherent account of a meeting between "Sean Hershey" and a "Mexican chocolate lord" that had taken place in a "jungle gumdrop."


Van Seters and Mohankumar were two of about 160 participants in the CODEX Hackathon, a weekend-long event at MIT timed to coincide with the American Library Association's midwinter conference in Boston. The second of its kind, it united programmers with librarians, students, and others with an interest in books to dream up and create apps and websites meant to expand the frontiers of publishing — all in the space of two days.

"There are a lot of really talented developers who love books, and who would love to participate if they knew the problems that needed to be solved," said organizer Jennifer 8. Lee, the author of the best-selling 2008 book "The Fortune Cookie Chronicles" and cofounder of literary studio Plympton.

Participants ranged from high schoolers to bearded hackers and entrepreneurs to gray-haired librarians. Though some projects were as whimsical as Van Seters and Mohankumar's, others targeted major current dilemmas in library science and publishing. One common goal was to help readers better access and use online repositories of information — plentiful these days, but often clunky or hard to navigate.


George Baier IV, a vice president of business and workflow systems at publisher Macmillan, decided to attend the hackathon in order to rub shoulders with book enthusiasts outside the publishing industry. On Saturday, he joined a team that formed to build an app called Infinite Library, an e-reader that could access any of the 50,000 volumes in Project Gutenberg, a public database of books that have fallen into the public domain. Unlike Project Gutenberg's site, which displays books as plain text, Infinite Library would render them in a modern, aesthetically pleasing format.

"There's all this public content available, but it's not necessarily serving the reading community," said Baier, whose collaborators on Infinite Library included a publishing colleague from Hachette Book Group named Michael Gaudet, a programmer and English PhD student at Columbia University named Jonathan Reeve, and a visual artist named Fiona Sergeant who plans to open a restaurant in Baltimore later this year.

By Sunday, the Infinite Library team had sprawled across two tables in a back room of the Media Lab, their MacBooks surrounded by empty water bottles and the remains of a catered lunch from Clover Food Lab. At an adjacent table, another team was working on a project called subTEXT, an app intended to annotate text automatically by searching the Internet for relevant images and information to display in the margins. The hard part was that the team wanted the software to determine what information was relevant by context — to figure out, for example, whether a reference in the text was to Rome, Italy, or Rome, N.Y.


"What if the book could know what you didn't know?" said Franny Gaede, a librarian at Butler University who worked on the subTEXT team. "So we turned that problem over to the computer."

The CODEX Hackathon at MIT united programmers with librarians, students, and others to dream up apps and websites meant to expand the frontiers of publishing. (Jessica Rinaldi)

In the high-ceilinged front room, which buzzed with the sounds of urgent conversations and keystrokes, another team was building a website called Glasnost that lets users report books and Web pages that have been censored by governments around the world — and files the reports into a database that's distributed over many users' computers to make it difficult to suppress. At a neighboring table, a team headed by University of British Columbia student Peter Siemens was laboring to build a service called Stanza, which recommends a poem depending on the user's mood.

Many CODEX participants said they were amused by Paige Turner, an automated Twitter account also created over the weekend. Its profile picture shows the silhouette of a woman with a severe haircut and Gary Larson-style glasses, and the account sends users a daily tweet reminding them to read a book instead of looking at their phone during the morning commute.

"I find the experience of reading a book to be very rewarding," said Gabe Stein, a programmer from Brooklyn who masterminded Paige Turner. "But books don't come with things that nag you to read them. Books don't have access to your phone."

Hackathons, events in which teams attack a problem with technology over a short stretch of time, date to the late 1990s. Few have been devoted to the world of publishing, though, which led Lee to organize the first CODEX Hackathon in San Francisco last summer. The name is a riff on "code" and "codex," which means a book with pages.


"In order to get developers to solve actual practical problems for publishing, you need to pair them up with people from publishing and literary backgrounds," Lee said.

For the latest hackathon, Lee secured about $35,000 in funding from sponsors including Google and the Harvard Book Store. That money went toward travel stipends for attendees (organizers prioritized stipends for women and underrepresented minorities, Lee said) and toward hacker fuel: sandwiches, Indian cuisine, and a steady supply of coffee, soft drinks, and bags of chips.

For those who work on digital libraries full time, the burst of energy from a hackathon can be exhilarating. "There's a huge amount of free content available to the public," said Dan Cohen, the executive director of the Boston-based Digital Public Library of America, another organization that provided funding for the event. "Events like the CODEX Hackathon provide a way for developers and other people to try new things out on this wonderful digital content."

Lee pointed out that even if a team's project is unfinished or glitchy at the end of the weekend, the relationships that came about because of the work remain. Still, pressure mounted as the 3 p.m. deadline started to loom on Sunday afternoon.


Back at the purple couches, Van Seters had the core code working, but Mohankumar was still struggling to build it into a browser plug-in.

"If I can get this to work in the next 20 minutes, it might be the greatest accomplishment of my life," Mohankumar said, hunching over his laptop.

Jon Christian can be reached at