Facebook is confronting another privacy scandal after it was reported that the company has formal agreements to share user data with at least four Chinese electronics companies, including one flagged as a national security threat to the United States. The latest revelation contributes to a growing unease about how Facebook and other tech companies protect user privacy.
From undisclosed data sharing that puts health and financial information at risk to political ads targeted using unauthorized data, the consequences of big data have become vaster and more threatening than anyone could have imagined. Our research finds traditional approaches to safeguarding privacy stretched to the limit as thousands of data points are collected about each of us every day and maintained indefinitely by a host of technology platforms.
For all our fears, many people have a sense that the erosion of privacy is simply the price we must pay for the benefits of technology. However, the death of privacy is not inevitable, and the proof comes from another field that collects and protects large amounts of sensitive personal information: scientific research.
After facing its own series of ethical crises nearly 50 years ago, particularly the infamous Tuskegee syphilis experiment, the scientific community developed a rigorous system for collecting and using research data while protecting the people behind the data. This system of research ethics is a foundation of modern science, helping academics maintain the trust of the people who participate in their research.
To understand what a company like Facebook can learn from scientists, consider the Framingham Heart Study. One of the longest-running scientific studies in history, it has been tracking the health outcomes of 15,000 people over seven decades and three generations. By collecting detailed health information, psychological and social data, and even DNA samples over the course of people’s lives, researchers have made breakthroughs in understanding cardiovascular disease and other health problems.
Longitudinal studies like Framingham have to grapple with many of the same problems that confront private companies operating in the realm of big data. Yet few of the common practices and techniques used to protect privacy in academia have gained widespread adoption in the business world. Understanding how scientists address data privacy can be instructive for technology companies navigating similar challenges.
A vital oversight mechanism in academia is the institutional review board. Before launching a study, researchers must receive approval from an independent committee that reviews what data they will collect, how they will use and share the data, and how they will protect participants’ privacy.
In contrast, the approach to protecting privacy in the tech industry is much less rigorous and standardized. Discussions about how data will be used and what privacy risks are acceptable often take place behind closed doors among company employees with biased incentives. Decisions are made in an ad hoc fashion without systematic, industry-wide standards for guidance. Further, protections often focus on simply removing information deemed to be “identifying,” rather than evaluating and addressing privacy risks holistically.
Facebook is one of the few tech companies that has created internal committees to review research ethics and privacy issues. However, the members of these bodies are Facebook employees, and they only review activities that the company considers research. Facebook’s recent announcement of an academic research collaboration that relies on oversight by a university-based review board is a step in the right direction.
Another cornerstone of the academic approach to protecting research participants is informed consent. Researchers disclose in plain language exactly how people’s data will be used and must obtain their consent before proceeding. If opportunities to use or share data in unanticipated ways arise — as is common in long-term studies like the Framingham Heart Study — researchers must renew consent for each new application of the data.
The approach in the tech industry seems to be the opposite: overly broad, hard-to-read terms of service allow companies to collect massive amounts of data and decide later how to use them, often without disclosing the details to users. Companies seem to take an attitude of ownership over people’s data and view securing fully informed consent as more optional than obligatory. When Facebook users agreed to the company’s terms of service, it’s doubtful they imagined having their data used by a private firm like Cambridge Analytica to manipulate voters and influence a presidential election.
Another lesson from academia is that you can have clear, consistent standards while still allowing for flexibility and customization. Scientists use tiered access to provide different levels of security and privacy protection depending on who the end user of the data is, from the general public to academic peers to trusted research colleagues. Some tech companies, like Google, have begun to implement tiered access to their data in limited settings and could consider expanding this approach.
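The tiered-access idea can be sketched in a few lines of code: each tier of user is mapped to the classes of data it may see, and every request is checked against that mapping. The tier names and data classes below are illustrative assumptions for the sketch, not any specific institution's or company's actual policy.

```python
# Hypothetical access tiers, ordered from least to most trusted.
# Each tier lists the classes of data its members may retrieve.
ACCESS_TIERS = {
    "public": {"aggregate_statistics"},
    "academic_peer": {"aggregate_statistics", "deidentified_records"},
    "trusted_collaborator": {
        "aggregate_statistics",
        "deidentified_records",
        "linked_records",
    },
}


def can_access(tier: str, data_class: str) -> bool:
    """Return True if a requester in `tier` may see `data_class`."""
    return data_class in ACCESS_TIERS.get(tier, set())
```

In this scheme, a member of the general public can see only aggregate statistics, while a vetted collaborator under a data-use agreement can work with linked individual-level records; the standards stay consistent even though the protections vary by audience.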
Flexibility allows the academic system to continue evolving to address new opportunities and risks and to better balance privacy and research value. Researchers are developing innovative technical approaches like differential privacy, which offers protection against a wide range of potential privacy violations, including types of attacks currently unknown or unforeseen. The Harvard Dataverse, a massive repository of social science data, was one of the first entities to begin exploring implementations of differential privacy, and the approach has since been adopted by companies like Apple and Uber.
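At its core, differential privacy works by adding carefully calibrated random noise to query results, so that the presence or absence of any single person in the data barely changes what an observer sees. A minimal sketch of the classic Laplace mechanism for a counting query follows; the function names and the choice of epsilon are illustrative, and real deployments use vetted libraries rather than hand-rolled samplers.

```python
import math
import random


def laplace_noise(scale: float) -> float:
    # Sample from Laplace(0, scale) via the inverse-CDF transform.
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))


def private_count(records, predicate, epsilon: float) -> float:
    # A counting query has sensitivity 1: adding or removing one
    # person changes the true count by at most 1. Laplace noise with
    # scale 1/epsilon then gives epsilon-differential privacy.
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon)
```

The privacy parameter epsilon sets the trade-off: a smaller epsilon means more noise and stronger privacy, a larger epsilon means more accurate answers. Because the guarantee is a property of the noise itself, it holds against any attack on the released numbers, including ones not yet invented.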
While the tech industry is beginning to embrace some of the privacy approaches that are common in science, academia generally manages privacy risks far more effectively than most commercial entities. Admittedly, scientific studies do not involve as many participants sharing as much data as Facebook gathers on a daily basis. But what academia demonstrates is that you don’t have to choose between privacy and valuable data.
Critics may argue that academics can be more generous in protecting privacy because they don’t have companies to run and shareholders to please. But at the end of the day, the research ethics system is equally self-interested. The scientific community understands that without people willing to participate in its research, there would be no research. Facing a moment of reckoning, academics got out in front of the problem and developed a system that protects privacy while still providing rich data sources for researchers.
As the tech industry confronts its own crisis — with users scrambling to review privacy settings and threatening to #DeleteFacebook — it may want to follow the scientific community’s lead. Protecting privacy is possible and is being done all the time in academia, countering the claims of tech titans who insist it’s too difficult.
While not all of academia’s approaches are applicable, there is plenty for the tech industry to learn from if it wants to make safeguarding privacy a real priority.

Alexandra Wood is a fellow at the Berkman Klein Center for Internet & Society at Harvard University. Micah Altman is director of research at MIT Libraries. They are members of the Harvard University Privacy Tools Project.