In November 1839, a group of five men convened in Boston to draft a constitution for a society charged with “the purpose of collecting, preserving and diffusing statistical information in the different departments of human knowledge.”
Far from being specialists, these early statisticians were prominent in other domains — a clergyman, a publisher, a lawyer, a congressman, and a physician, brought together by a shared interest in accruing information about the world with crude tools that often boiled down to simply tallying things up.
This weekend, about 6,000 statisticians are visiting Boston for the annual meeting of the American Statistical Association, highlighting the astonishing arc the field has taken since that inaugural meeting. Once practitioners of a nascent science that was simply trying to collect reliable data on crops and population, statisticians now deploy powerful computational tools to sift meaning from a tsunami of data that can range from the full genomes of thousands of people to the casual interactions people have online.
The number of undergraduates majoring in statistics or biostatistics has tripled during the past decade, and the number of people earning master’s degrees has doubled. Statistician is one of the hottest jobs around; a report by the research firm McKinsey and Co. predicts that by 2018, the United States will face a statistician shortage, needing at least 140,000 people with the analytical skills to work on big data problems.
Nearly everything can now be turned into a data point, from credit card use to the makeup of microbes in our stomachs, and statistics is returning to its roots in the mainstream, as a force that will shape society.
In its early days, the Statistical Association’s members included US President Martin Van Buren and the British nurse Florence Nightingale. The nascent science was essential to a young country, shaped understanding of the importance of sanitary conditions in wartime hospitals, and allowed a growing country of immigrants to understand who was in it.
Today, statistics has begun to creep back into the public consciousness, used to assemble winning baseball teams and predict elections with greater accuracy.
For years, working as a statistician meant dealing with public misconceptions. Ron Wasserstein, executive director of the American Statistical Association, recalled that when he earned his graduate degree in statistics in the early 1980s, friends and family mistakenly thought that it involved memorizing lots of numerical trivia. He would have to explain that the methods of statistics were used to figure out how to understand patterns in data, such as whether a drug is effective, whether cigarettes cause lung cancer, or whether a poll of 1,000 people is likely to be indicative of general opinion on a topic. Now, things have changed.
“You can tell people, ‘I’m like “Moneyball,” but for biomedical applications,’ ” said Rafael Irizarry, a professor of biostatistics at the Harvard School of Public Health.
Over the years, the methods used to gather and analyze data grew more sophisticated, thanks in part to the efforts of statisticians themselves. Francis A. Walker, a president of the Massachusetts Institute of Technology and former president of the American Statistical Association, helped improve the field and the quality of the census.
“He helped bring the census from a rather crude counting in the 1790s into something that begins to give a meticulously detailed picture of a nation that was growing,” said Stephen Stigler, a professor of statistics at the University of Chicago. “We’ve been a field that’s been changing many times over the years, but there’s some constancy to that and one of them is: How can we process information? If you have a collection of measurements or observations, how can you reduce it to something you can understand?”
Surprisingly, even as the tools of statisticians have changed radically, the underlying quest has remained remarkably constant. Extracting meaning from data is still the basic problem, even if today the challenge is more likely to be how to sift meaning from what would once have been an unthinkable amount of information, not how to assemble it in the first place.
At the meeting this week, presentations span a wide range, from the application of statistics to understanding neuroscience in President Obama’s BRAIN Initiative, to the challenges in designing cancer clinical trials, to projections of the finish times for the nearly 6,000 runners who were unable to complete the 2013 Boston Marathon after the bombing halted the race.
In fact, Irizarry said, statisticians have become so essential to, and so integrated into, the many disciplines that need their tools to make sense of data that some experience something close to an existential crisis.
“Some people who are trained as statisticians become very much involved in a specific field, they sometimes start wondering, ‘Am I still a statistician?’ ” Irizarry said. “Now, all of a sudden they almost have become a geneticist and some of them even start to do experiments. That’s one of the things I hope, that we can continue to be a discipline and continue to have this core knowledge we all share.”