Science In Mind

Learning from Ebola, Harvard researchers suggest more open data

Pardis Sabeti, a computational biologist at Harvard University, is calling for a summit to discuss how to share infectious disease data in the midst of an epidemic, such as Ebola. David L. Ryan/Globe Staff/File/2008/Boston Globe

Nearly two decades ago, science and policy leaders gathered in Bermuda to hammer out a strategy for sharing the data from the international effort to sequence the human genome. The group drafted the bold and unusual “Bermuda Principles” that have profoundly shaped biomedical research since: the genome data would be made public within 24 hours of production.

A group of Harvard University researchers who have been sequencing the genomes of Ebola virus samples from Africa throughout the recent epidemic now see a need for a similar meeting of the minds among people working on disease outbreaks.

In a commentary published in the journal Nature, a group of researchers led by Harvard computational biologist Pardis Sabeti calls for a new summit to discuss how to share infectious disease data in the midst of an epidemic.


“As sequencing technology and capabilities have spread, it’s going to be increasingly easier to sequence viruses during outbreaks and because of this . . . we need to think very carefully about how and when that data is shared,” said Nathan Yozwiak, a senior staff scientist at the Broad Institute, a genomics research powerhouse in Cambridge, and a coauthor of the paper. “We want to use this as an opportunity to think what are we going to do about the next outbreak, whether that is Ebola or West Nile, or some other pathogen we have yet to discover.”

In science, the norm is often to keep data private until a group has published a definitive analysis. But this summer, Sabeti’s team went to great lengths to make its Ebola data publicly available as soon as the sequences were produced from samples shipped to Cambridge from the Kenema Government Hospital in Sierra Leone.

Their sequence data helped resolve how the virus spread so widely during the outbreak, which has claimed more than 9,300 lives. After Ebola appeared in the human population, the genetic data show it spread from person to person, instead of from repeated human contact with bats or other animals that harbor the virus. The data also helped scientists understand how quickly the virus was mutating and compare changes in its molecular sequence to tests and potential treatments, to understand whether they will continue to work.


This summer, when Sabeti and colleagues sequenced 99 Ebola genomes from Sierra Leone and made the data available, they were surprised by the rapid and widespread response. The laboratory typically studies the evolution and genetics of pathogens, but its members quickly found themselves fielding inquiries from drug companies, vaccine developers, and virologists who wanted to use the data to pursue their own research ideas and therapy development.

The solution to the infectious disease data-sharing puzzle may be a bit more involved than the principles used for the genome. Privacy concerns could arise because some of the information might include clinical data about patients. There are also diverse interest groups involved in a disease outbreak who don’t always work together, including local public health officials, aid groups, and academic scientists.

“The challenge with outbreaks is it is often a novel pathogen, or a pathogen we didn’t necessarily think had pathogen potential. You create a new research community on the fly, which has new players,” Yozwiak said.

Such an impromptu collaboration of brain power and varied expertise could also be part of the strength of modern approaches to disease outbreaks — as long as the data is there to draw the key players together.


Carolyn Y. Johnson can be reached at cjohnson@globe.com. Follow her on Twitter @carolynyjohnson.