When President Trump predicted on May 3 that around 100,000 Americans might die of COVID-19, he contradicted a more dire report (disavowed by the Centers for Disease Control and Prevention but carrying its logo) showing 3,000 deaths per day by June 1. Trump may have been influenced by a model prepared by Kevin Hassett, former chair of the Council of Economic Advisers. Since dubbed the “cubic model,” it is by all accounts a travesty of data science, a naive forecast based on extending an existing trend line, the kind of analysis that would get a failing grade in a high school statistics class. Among other issues, the model appeared to predict zero daily deaths by May 15 (although Hassett later said the chart was not meant to be used as a prediction). If the extrapolation were continued, it would imply a negative number of deaths afterwards. It’s the equivalent of concluding that since today was colder than yesterday, we are due for an ice age next week.
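To see why extending a cubic trend line can go so wrong, here is a minimal sketch in Python. It makes no attempt to reproduce Hassett's actual model or data: the four daily counts are invented for illustration, and the helper names (`cubic_through`, `first_nonpositive_day`) are hypothetical. For simplicity it draws the unique cubic through four points instead of running a regression fit, but the extrapolation fails in the same way.

```python
from fractions import Fraction

# Hypothetical (day, daily deaths) pairs, invented for illustration only.
# These are NOT real COVID-19 figures and NOT the inputs to the "cubic model".
observed = [(0, 1000), (10, 2000), (20, 2400), (30, 2000)]


def cubic_through(points):
    """Return a function that evaluates the unique cubic passing through
    four points, using Lagrange interpolation with exact arithmetic."""
    def f(x):
        total = Fraction(0)
        for i, (xi, yi) in enumerate(points):
            term = Fraction(yi)
            for j, (xj, _) in enumerate(points):
                if i != j:
                    term *= Fraction(x - xj, xi - xj)
            total += term
        return total
    return f


def first_nonpositive_day(f, start, horizon):
    """Return the first whole day in [start, horizon] on which the curve
    is at or below zero, or None if it stays positive."""
    for day in range(start, horizon + 1):
        if f(day) <= 0:
            return day
    return None


trend = cubic_through(observed)

# Extend the curve past the last observed day (day 30).
for day in (40, 50, 60):
    print(day, float(trend(day)))

print("first day at or below zero:", first_nonpositive_day(trend, 30, 100))
```

With these made-up numbers, the curve matches all four observations exactly, then falls to 600 deaths on day 40, reaches zero or below on day 43, and sits at negative 2,000 by day 50. Nothing in the arithmetic knows that a death count cannot be negative; the curve simply keeps following its own shape.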
Predicting the scale of the coronavirus pandemic is an especially difficult task because the forecasts can be self-defeating: any good news may cause people to take more risks and negate the assumptions about social behavior underlying the model. So a laughably optimistic forecast like the “cubic model” has the impressive potential to prove itself doubly wrong — first as an obvious undercount and then again for encouraging lax behavior that drives the count higher still. But being right or wrong might not have been the point. As the statistical truism goes, all models are wrong, but some are useful. The utility of this particular model was the support it could lend to the narrative that the virus is under control and we can safely reopen the economy. The rosy forecasts are therefore best understood not as predictions but as world-building, the construction of facts that would have to be true to justify a decision already made.
Evaluating models based on the outcomes that followed is hard. Since a model reflects the information put into it, uncertain inputs and incorrect assumptions can leave us blindsided by actual events. At the same time, while statistical models are based on our existing knowledge, they’re useful only if they point us to something we didn’t already know. So often the best we can ask for is transparency: an honest accounting of the model’s assumptions and a good-faith commitment to let forecasts drive policy and not the other way around. The administration’s “cubic model,” though cloaked in authoritative mathematical language, failed spectacularly by these criteria. Like a line drawn in Sharpie on a weather map, it is an apparent product of motivated reasoning, the kind of modeling that’s sadly easy. To quote another aphorism, “It is difficult to make predictions, especially about the future.” That’s true, if they’re done right.
Aubrey Clayton is a mathematician living in Boston and the author of the forthcoming book “Bernoulli’s Fallacy.” Follow him on Twitter: @aubreyclayton.