
Boston startup can search Shakespeare’s plays in a test tube of DNA

Boston startup Catalog said it encoded eight of Shakespeare’s tragedies onto strands of synthetic DNA and then successfully conducted text searches among the 200,000 words stored in a test tube. David L. Ryan/Globe Staff

Parting is such sweet sorrow, Shakespeare’s doomed protagonist Juliet tells her lover Romeo. But how many times does the Bard use the word “sorrow” in the play?

You can use a computer to search the text and come up with the answer (10).
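For readers curious what that conventional search looks like, here is a minimal Python sketch. The function name `count_word` and the sample sentence are invented for illustration (only its first five words come from the play), and the count you get on a full text can vary with the edition and with how word boundaries are defined.

```python
import re


def count_word(text: str, word: str) -> int:
    """Count whole-word, case-insensitive occurrences of `word` in `text`."""
    # \b anchors the match at word boundaries, so "sorrows" is not counted.
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return len(pattern.findall(text))


# Invented sample string, not the text of the play.
sample = "Parting is such sweet sorrow. Sorrow not; the sorrows pass."
print(count_word(sample, "sorrow"))
```

Running the same function on the full text of the play would produce the kind of tally described above.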

But Boston startup Catalog announced on Monday that it can duplicate the feat with a DNA-based computer that relies on chemical instead of electrical processing. The six-year-old company said it had encoded eight of Shakespeare’s tragedies onto strands of synthetic DNA and then successfully conducted text searches among the 200,000 words stored in a test tube.

The feat marks a new advance in the lengthy effort by Catalog and rival startups to develop a functioning DNA-based computer that could not only duplicate the tasks of electronic computers but also vastly surpass them in search speed and energy efficiency.


Catalog’s search process is similar to the common COVID test that looks for signs of the virus on cells swabbed from a person’s nose, Hyunjun Park, Catalog’s chief executive and cofounder, said in an interview. The swab contains a mix of DNA, and the test uses a chemical agent to find the particular genetic markers of the virus.

Catalog CEO Hyunjun Park standing by a DNA data writer. David L. Ryan/Globe Staff

“Consider a database that is also in DNA form, but containing not genetic information, but digital information,” he said. “And you’re searching through that not for COVID sequences but a search phrase or a pattern that you’re looking for.”
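To make the idea of digital information in DNA form concrete, here is a toy Python sketch. It is not Catalog’s encoding, which the company has not described here; it assumes a simple textbook-style mapping of two bits per base (00 to A, 01 to C, 10 to G, 11 to T), and the names `encode`, `decode`, and `find_all` are hypothetical. The search runs in ordinary software, standing in for the chemical matching Park describes.

```python
BASES = "ACGT"  # assumed mapping: 00 -> A, 01 -> C, 10 -> G, 11 -> T


def encode(text: str) -> str:
    """Turn text into a base string, four bases per UTF-8 byte."""
    out = []
    for byte in text.encode("utf-8"):
        for shift in (6, 4, 2, 0):  # most significant bit pair first
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def decode(strand: str) -> str:
    """Invert encode(): read four bases at a time back into bytes."""
    data = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        data.append(byte)
    return data.decode("utf-8")


def find_all(strand: str, query: str) -> list[int]:
    """Return byte offsets where the encoded query occurs in the strand."""
    probe = encode(query)
    hits = []
    start = strand.find(probe)
    while start != -1:
        if start % 4 == 0:  # keep only matches aligned to a byte boundary
            hits.append(start // 4)
        start = strand.find(probe, start + 1)
    return hits


strand = encode("sweet sorrow, more sorrow")
print(strand[:16])
print(find_all(strand, "sorrow"))
```

In this sketch the search phrase is encoded the same way as the stored text, so looking for a phrase becomes looking for a matching run of bases.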

Next year, the company hopes to demonstrate a search through a DNA-encoded database of 100 million words. Ultimately, the technology could be used to search for signs of financial fraud across millions of transactions or comb through vast sets of scientific results like those produced by particle accelerators.

Using chemistry to do the searching and pattern matching has several advantages over electronic computers, at least in theory. Typical computers load data from storage into memory and run computations through a central processor. Catalog’s system avoids those bottlenecks, as the storage and computing happen in a single test tube all at once, greatly speeding up the process. After the data is written to DNA, the storage and computing also don’t require electricity, saving energy.


Still, Catalog isn’t about to start selling a DNA computer. Even if everything goes according to plan, it will take a couple more years to prove fully that the concepts work and a few more years after that to create commercial products, the company said.

“We’re inventing the nuts and bolts of it,” Swapnil Bhatia, a computer scientist at the company, said. Referencing the basic components inside silicon chips, Bhatia explained: “We’re not even at the transistor level yet, we’re inventing kind of the wires. And maybe next year, the gates and so on.”

Analysts agree DNA-based computing is still a ways off. “It’s always refreshing and exciting to see demonstrable advancements in promising new technologies and innovations,” said Earl Joseph, chief executive of Hyperion Research, who was briefed on Catalog’s advance. “While advancements are occurring...and Catalog is heavily contributing to each, much work still needs to be done.”

At its office in the old Schrafft’s candy building in Charlestown, Catalog has built a proprietary machine, dubbed Shannon, to encode data onto strands of DNA. The minivan-sized device uses technology similar to inkjet printers to squirt tiny strands of DNA onto a plastic film that are then transferred into a liquid solution for storage and computing.


Colored ink cartridges for the DNA data writer at Catalog. David L. Ryan/Globe Staff

By using DNA molecules for data storage and computing, Catalog can take advantage of technological advances such as the rapidly decreasing cost to sequence DNA.

“There’s a huge amount of tools that we can use for our purposes,” Park said. “Sequencing is a great example. All of that has been developed with billions of dollars behind it.”

Park and his cofounder, Nathaniel Roquet, met at MIT and got help getting the company off the ground from the school’s StartMIT program and the San Francisco accelerator IndieBio. (Roquet left Catalog in 2020 and is the lead scientist at Tessera Therapeutics in Cambridge.)

Despite the West Coast assistance, Boston offered a more compelling community of techies, Park said.

“This area is home to a lot of great academic institutions and other companies in the synthetic biology field, which is a big draw,” Park said. “Having a collection of great companies in the same area is a huge value.”

Follow Aaron Pressman @ampressman.