‘Very harmful’ lack of data blunts US response to outbreaks

Boxes of COVID-19 test results at the Alaska health department's epidemiology division in Anchorage, Alaska, on Tuesday, July 5, 2022. Major data gaps, the result of decades of underinvestment in public health, have undercut the government response to the coronavirus and now to monkeypox. ASH ADAMS/NYT

ANCHORAGE, Alaska — After a middle-aged woman tested positive for COVID-19 in January at her workplace in Fairbanks, public health workers sought answers to questions vital to understanding how the virus was spreading in Alaska’s rugged interior.

The woman, they learned, had existing conditions and had not been vaccinated. She had been hospitalized but had recovered. Alaska and many other states have routinely collected that kind of information about people who test positive for the virus. Part of the goal is to paint a detailed picture of how one of the worst scourges in American history evolves and continues to kill hundreds of people daily, despite determined efforts to stop it.


But most of the information about the Fairbanks woman, and about tens of millions more infected Americans, remains effectively lost to state and federal public health researchers. Decades of underinvestment in public health information systems have crippled efforts to understand the pandemic, stranding crucial data in incompatible data systems so outmoded that information often must be repeatedly typed in by hand. The data failure, a salient lesson of a pandemic that has killed more than 1 million Americans, will be expensive and time-consuming to fix.

The precise cost in needless illness and death cannot be quantified. The nation’s comparatively low vaccination rate is clearly a major factor in why the United States has recorded the highest COVID death rate among large, wealthy nations. But federal experts are certain that the lack of comprehensive, timely data has also exacted a heavy toll.

“It has been very harmful to our response,” said Dr. Ashish K. Jha, who leads the White House effort to control the pandemic. “It’s made it much harder to respond quickly.”

Details of the Fairbanks woman’s case were scattered among multiple state databases, none of which connect easily to the others, much less to the Centers for Disease Control and Prevention, the federal agency in charge of tracking the virus. Nine months after she fell ill, her information was largely useless to public health researchers because it was impossible to synthesize most of it with data on the roughly 300,000 other Alaskans and the 95 million-plus other Americans who have gotten COVID.


Those same antiquated data systems are now hampering the response to the monkeypox outbreak. Once again, state and federal officials are losing time trying to retrieve information from a digital pipeline riddled with huge holes and obstacles.

“We can’t be in a position where we have to do this for every disease and every outbreak,” Dr. Rochelle P. Walensky, the CDC director, said in an interview. “If we have to reinvent the wheel every time we have an outbreak, we will always be months behind.”

The federal government invested heavily over the past decade to modernize the data systems of private hospitals and health care providers, doling out more than $38 billion in incentives to shift to electronic health records. That has enabled doctors and health care systems to share information about patients much more efficiently.

But while the private sector was modernizing its data operations, state and local health departments were largely left with the same fax machines, spreadsheets, emails and phone calls to communicate.

States and localities need $7.84 billion for data modernization over the next five years, according to an estimate by the Council of State and Territorial Epidemiologists and other nonprofit groups. Another organization, the Healthcare Information and Management Systems Society, estimates those agencies need nearly $37 billion over the next decade.


The pandemic has laid bare the consequences of neglect. Countries with national health systems like Israel and, to a lesser extent, Britain were able to get solid, timely answers to questions such as who is being hospitalized with COVID and how well vaccines are working. American health officials, in contrast, have been forced to make do with extrapolations and educated guesses based on a mishmash of data.

Facing the wildfirelike spread of the highly contagious omicron variant last December, for example, federal officials urgently needed to know whether omicron was more deadly than the delta variant that had preceded it and whether hospitals would soon be flooded with patients. But they could not get the answer from testing, hospitalization or death data, Walensky said, because it failed to sufficiently distinguish cases by variant.

Instead, the CDC asked Kaiser Permanente of Southern California, a large private health system, to analyze its COVID patients. A preliminary study of nearly 70,000 infections from December showed patients infected with omicron were less likely to be hospitalized, need intensive care or die than those infected with delta.

But that was only a snapshot, and the agency only got it by going hat in hand to a private system. “Why is that the path?” Walensky asked.

The drought of reliable data has also repeatedly left regulators high and dry in deciding whether, when and for whom additional shots of coronavirus vaccine should be authorized. Such decisions turn on how well the vaccines perform over time and against new versions of the virus. And that requires knowing how many vaccinated people are getting so-called breakthrough infections and when.


But almost two years after the first COVID shots were administered, the CDC still has no national data on breakthrough cases. A major reason is that many states and localities, citing privacy concerns, strip out names and other identifying information from much of the data they share with the CDC, making it impossible for the agency to figure out whether any given COVID patient was vaccinated.

“The CDC data is useless for actually finding out vaccine efficacy,” said Dr. Peter Marks, the top vaccine regulator at the Food and Drug Administration. Instead, regulators had to turn to reports from various regional hospital systems, knowing that picture might be skewed, and marry them with data from other countries like Israel.

The jumble of studies confused even vaccine experts and sowed public doubt about the government’s booster decisions. Some experts partly blame the disappointing uptake of booster doses on squishy data.

The FDA now spends tens of millions of dollars annually for access to detailed COVID-related health care data from private companies, Marks said. About 30 states now also report cases and deaths by vaccination status, showing that the unvaccinated are far more likely to die of COVID than those who got shots.


But those reports are incomplete, too: The state data, for instance, does not reflect prior infections, an important factor in trying to assess vaccine effectiveness.

And it took years to get this far. “We started working on this in April of 2020, before we even had a vaccine authorized,” Marks said.

Now, as the government rolls out reformulated booster shots before a possible winter virus surge, the need for up-to-date data is as pressing as ever. The new boosters target the version of a fast-evolving virus that is currently dominant. Pharmaceutical companies are expected to deliver evidence from human clinical trials showing how well they work later this year.

“But how will we know if that’s the reality on the ground?” Jha asked. Detailed clinical data that includes past infections, history of shots and brand of vaccine “is absolutely essential for policymaking,” he said. “It is going to be incredibly hard to get.”

New Outbreak, Same Data Problems

When the first U.S. monkeypox case was confirmed May 18, federal health officials prepared to confront another information vacuum. Federal authorities cannot generally demand public health data from states and localities, which have legal authority over that realm and zealously protect it. That has made it harder to organize a federal response to a new disease that has now spread to nearly 24,000 people nationwide.

Three months into the outbreak, more than half of the people reported to have been infected were not identified by race or ethnicity, clouding the disparate impact of the disease on Black and Hispanic men.

To find out how many people were being vaccinated against monkeypox, the CDC was forced to negotiate data-sharing agreements with individual jurisdictions, just as it had to do for COVID. That process took until early September, even though the information was important to assess whether the taxpayer-funded doses were going to the right places.

The government’s declaration in early August that the monkeypox outbreak constituted a national emergency helped ease some of the legal barriers to information-sharing, health officials said. But even now, the CDC’s vaccine data is based on only 38 states, plus New York City.

Some critics say the CDC could compensate for its lack of legal clout by exercising its financial muscle, since its grants help keep state and local health departments afloat. But others say such arm-twisting could end up harming public health if departments then decide to forgo funding and not cooperate with the agency.

Nor would that address the outmoded technologies and dearth of scientists and information analysts at state and local health departments, failings that many experts say are the biggest impediment to getting timely data.

Alaska is a prime example.

Early in the pandemic, many of the state’s COVID case reports arrived by fax on the fifth floor of the state health department’s office in Anchorage. National Guard members had to be called in to serve as data entry clerks.

The health department’s highly trained specialists “didn’t have the capacity to be the epidemiologists that we needed them to be because all they could do was enter data,” said Dr. Anne Zink, Alaska’s chief medical officer, who also heads the Association of State and Territorial Health Officials.

All too often, she said, the data that was painstakingly entered was too patchy to guide decisions.

A year ago, for instance, Zink asked her team whether racial and ethnic minorities were being tested less frequently than whites to assess whether testing sites were equitably located.

But public health researchers could not tell her because for 60% of those tested, the person’s race and ethnicity were not identified, said Megan Tompkins, a data scientist and public health researcher who until this month managed the state’s COVID data operation.

Boom and Bust Funding

State and local public health agencies have been shriveling, losing an estimated 15% of their staffs between 2008 and 2019, according to a study by the de Beaumont Foundation, a public-health-focused philanthropy. In 2019, public health accounted for 3% of the $3.8 trillion spent on health care in the United States.

The pandemic has prompted Congress to loosen its purse strings. The CDC’s $50 million annual budget for data modernization was doubled for the current fiscal year, and key senators seem optimistic it will double again next year. Two pandemic relief bills provided an additional $1 billion, including funds for a new center to analyze outbreaks.

But public health funding has traced a long boom-and-bust pattern, rising during crises and shrinking once they end. Although COVID still kills about 400 Americans each day, Congress’ appetite for public health spending has waned.

While $1 billion-plus for data modernization sounds impressive, it is roughly the cost of shifting a single major hospital system to electronic health records, Walensky said.

For the first two years of the pandemic, the CDC’s disease surveillance database was supposed to track not just every confirmed COVID infection, but whether infected individuals were symptomatic, had recently traveled or attended a mass gathering, had existing medical conditions, had been hospitalized, required intensive care and had survived. State and local health departments reported data on 86 million cases.

But the vast majority of data fields are usually left blank, an analysis by The New York Times found. Even race and ethnicity, factors essential to understanding the pandemic’s unequal impact, are missing in about one-third of the cases. Only the patient’s sex, age group and geographic location are routinely recorded.

While the CDC said the basic demographic data remains broadly useful, swamped health departments were too overwhelmed or too ill-equipped to provide more. In February, the agency recommended that they stop trying and focus on high-risk groups and settings instead.

The CDC has patched together other disparate sources of data, each imperfect in its own way. A second database tracks how many COVID patients turn up in about 70% of the nation’s emergency departments and urgent care centers. It is an early warning signal of rising infections. But it is spotty: Many departments in California, Minnesota, Oklahoma and elsewhere do not participate.

Another database tracks how many hospital inpatients have COVID. It, too, is not comprehensive, and it is arguably inflated because totals include patients admitted for reasons other than COVID but who tested positive during their stay. The CDC nevertheless relies partly on those hospital numbers for its rolling, county-by-county assessment of the virus’s threat.

There are bright spots. Wastewater monitoring, a new tool that helps spot incipient coronavirus surges, is now conducted at 1,182 sites around the country. The government now tests enough viral specimens to detect whether a new version of the virus has begun to circulate.

In the long run, officials hope to leverage electronic health records to modernize the disease surveillance system that all but collapsed under the weight of the pandemic. Under the new system, if a doctor diagnoses a disease that is supposed to be flagged to public health authorities, the patient’s electronic health record would automatically generate a case report to local or state health departments.