HIAWATHA BRAY | TECH LAB
For the founders of a new Boston startup, there’s no such thing as too much information.
Catalog, a company founded by scientists from Harvard University and the Massachusetts Institute of Technology, designs systems that store data on manmade versions of microscopic DNA molecules, instead of on bulky magnetic tapes or silicon chips.
DNA is nature’s own hard drive, storing data by assembling itself in millions of different combinations of just four chemical compounds found in nearly every living cell. Human DNA is so small you need a microscope to see it, but the strand of DNA in a single human cell contains about 800 megabytes of information.
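One way to arrive at a figure of that size is a back-of-the-envelope estimate. The sketch below assumes roughly 3.2 billion base pairs for one copy of the human genome and 2 bits per base (four possible letters). Both numbers are illustrative assumptions rather than figures from Catalog or the researchers quoted here.

```python
# Rough estimate only: both constants are assumptions for illustration.
BASE_PAIRS = 3_200_000_000   # approximate size of one copy of the human genome
BITS_PER_BASE = 2            # four possible letters (C, A, T, G) fit in 2 bits

total_bytes = BASE_PAIRS * BITS_PER_BASE // 8   # 8 bits per byte
megabytes = total_bytes / 1_000_000             # decimal megabytes

print(f"{megabytes:.0f} MB")  # prints "800 MB"
```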
Scientists have been working on replicating the idea behind DNA with artificial versions made in a lab that could store computerized data using the same sequencing techniques found in human genes. The process is laborious and expensive, but the tech giant Microsoft has said it expects to deliver a commercial version by the end of this decade.
But now Catalog, a tiny startup with just $9 million in new funding, threatens to beat Microsoft to the punch, pledging to deliver the first commercial DNA storage product sometime next year. Big-data generators such as corporations and government agencies could use it to store billions of gigabytes of information in a space the size of a bedroom closet.
“It’s a new generation of information storage technology that’s got a million times the information density, compared to flash” storage, said Catalog’s chief executive, Hyunjun Park, referring to the flash memory chips used in digital cameras and thumb drives. “You can shrink down entire data centers into shoeboxes of DNA.”
Though the technology is daunting, the idea behind it seems simple enough.
You take the four main chemical compounds in DNA — cytosine, adenine, thymine and guanine, or C, A, T, and G for short — and use them as shorthand for digital information, just as a computer uses the numbers 1 and 0 as a binary code for all the data it stores.
Scientists can manipulate the order of the compounds so that different sequences of C, A, T, and G correspond to certain sequences of 0’s and 1’s, and huge strings of these compounds can represent the information of a massive data file — every Hollywood film ever made, for instance, or the complete Library of Congress.
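The mapping described above can be illustrated with a toy example. The sketch below assumes the simplest possible scheme, in which each pair of bits becomes one letter. The table `BITS_TO_BASE` and the functions `encode` and `decode` are hypothetical names chosen for illustration; this is not Catalog's or Microsoft's actual encoding, and practical systems would likely add error correction and other safeguards.

```python
# Toy illustration: a hypothetical 2-bits-per-base mapping, not a real product's scheme.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a string of C, A, T, G letters, two bits per letter."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Reverse the mapping: letters back to bits, then bits back to bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"Hi")
print(strand)          # prints "CAGACGGC"
print(decode(strand))  # prints b'Hi'
```

Under this assumed scheme every byte takes four letters, so the length of the strand grows in direct proportion to the size of the file.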
The beauty of the technology is that DNA is so dense you can pack a ridiculous amount of data into a microscopic amount of material. At Catalog, the finished recording resembles a thin, almost invisible film. To access the stored data, the DNA is mixed with water and put into a machine that reads the CATG sequences, translating them back to binary form.
Moreover, DNA is far more durable than magnetic tapes and hard drives, which deteriorate in a few decades. Park said a DNA data archive could remain readable for thousands of years.
Park said Catalog has stored the Douglas Adams science fiction novel “The Hitchhiker’s Guide to the Galaxy” in DNA, to demonstrate that the concept works. Now, all he, Catalog, and the few other entrepreneurs and researchers in the field have to do is prove it can be a viable commercial storage product.
“I think there’s going to be massive challenges, and the biggest challenge is cost,” said Sri Kosuri, assistant professor of chemistry and biochemistry at the University of California Los Angeles.
Kosuri, who worked on DNA data storage at Harvard Medical School, said the process currently is “about six orders of magnitude more expensive than it needs to be.” Making it a practical tool for everyday use “requires some technological breakthrough that I haven’t seen, as yet,” Kosuri said.
Catalog cofounder Park was a postdoctoral associate at MIT, and his cofounder Nathaniel Roquet earned a doctorate in biophysics from Harvard. In 2016, they began developing the company at Indiebio, a San Francisco incubator for biotech startups, and moved to Boston after winning a spot in Harvard’s biotech incubator, the Life Lab. Now they have attracted funding from a host of venture investors, including New Enterprise Associates, OS Fund, Day One Ventures, Data Collective, and Green Bay Ventures.
At least one other startup, Iridia, based in San Diego, is developing DNA storage systems. And in 2017, Microsoft said that its research department was hard at work on a similar system that it planned to offer as a commercial product by 2020. Microsoft declined to comment.