data dive

The looming battle for clean data


With the science increasingly certain and environmental rules tightening, the 21st-century contest for a better environment and a stable climate is shaping up to be a battle over clean data.

The wider problem of data pollution seems to grow by the hour. Volkswagen has seen its market share (and reputation) shredded over its cheating on emissions tests. Hyundai and Kia agreed last year to pay hundreds of millions of dollars in penalties for overestimating the fuel economy of some 1.2 million vehicles. And carmakers are only among the latest to be caught manipulating data. With the Paris climate talks and tighter greenhouse gas limits looming, China has just disclosed it has substantially underreported the size of its coal industry — and thus its total greenhouse gas emissions.

Given society’s headlong embrace of data and Internet-based monitoring, the VW scandal should be an “early warning signal that we shouldn’t have too much reliance on the sensors, data collection, and data, without some vigilance and cross-checks,” according to Michael Zimmer, a longtime energy industry attorney and a fellow at Ohio University. He adds: “It’s a canary in the coal mine.”


What’s more, while Volkswagen was particularly brazen in its flouting of the rules, more subtle gaming is increasingly a reality as the United States moves into a more advanced stage of environmental rules.

Corruption of environmental regulators and auditors has long been a huge problem in many developing economies. US embassies in China, India, and elsewhere famously monitor pollution in those countries because official indexes can’t be trusted, particularly in Chinese cities where data reporting on air quality consistently comes in just below the government’s cutoff.

In the West, however, it is much more often companies that stand accused of skirting the rules. For some air pollutants and nearly all water pollutants, in fact, the integrity of the data is often uncertain, as regulators must rely on self-reporting by companies. In such cases, discretion is a big factor. “It is those gray areas where the potential for gaming the system rises most significantly,” says Jay P. Shimshack, an expert on the effects of environmental regulations at the University of Virginia.

Gamesmanship comes in many varieties, researchers have noted.

In a paper for The American Economic Review, Maximilian Auffhammer of the University of California Berkeley and Ryan Kellogg of the University of Michigan found that restrictions on the chemical composition of gasoline and use of volatile organic compounds — intended to reduce ground-level ozone — were largely foiled because refiners were given too much discretion. Many refiners took the cheaper route and simply reduced volatile organic compounds that didn’t have much impact on ozone. But their data showed they were in compliance. Only California, which had stricter regulations on which chemicals had to be minimized, saw lower ozone levels, the researchers found.


Another study saw similar exploitation of wide discretion, showing how companies made a mockery of a voluntary greenhouse gas disclosure program sponsored by the Department of Energy. That research, by Eun-Hee Kim of George Washington University and Thomas P. Lyon of the University of Michigan, found that, thanks to corrupted data, firms that emitted more greenhouse gases reported exactly the opposite. (Indeed, some social science research suggests that voluntary regulations that are easily skirted actually induce more unethical outcomes than having no regulations at all.)

Moreover, Shimshack’s newest research suggests that data disclosed by publicly owned waste-water treatment plants “seem to show some signatures of strategic reporting.” More generally, the pervasive practice of “greenwashing,” or companies misleading consumers about environmental performance, has continued to be documented across the economy.

Convenient selection of data is only one tactic, experts say.

There are also sins of omission: Because accidental violations of air or water rules are not penalized as stringently (regulators are more interested in consistent, intentional violations), some firms stay understaffed or don’t invest in updated technology or training in order to keep costs lower. Thus, minor and recurring accidents or breakdowns — which may not attract especially hard penalties — are more likely.

Hedging thresholds or established data limits — and therefore staying just below the regulatory radar screen — is a temptation, too. For example, power plants and industrial facilities in the “major” or large category — there are about 16,000 now in the United States — come under heavy scrutiny in terms of emissions. These are continuously monitored, with detailed emissions data available online. To reduce compliance costs, then, it makes some sense to keep facilities’ production levels in the less-scrutinized medium- or minor-size categories. There are about 120,000 facilities in those categories domestically. “There are clearly incentives to remain below the threshold,” Shimshack notes. “And you’d expect some facilities to really work to do that.”


Rules that allow for grandfathering — what the experts call “vintage differentiated regulations” — of older machines and facilities invite strategies to keep dirtier cars and power plants running longer. Plus, consumers sometimes find their own creative solutions to rules, resulting in perverse net outcomes: In Mexico City, rules that attempted to get cars off the road one day a week (based on the last digit of license plates) ended up backfiring, as citizens bought second cars to get around the rules. A 2008 study by Lucas W. Davis, now at the University of California Berkeley, found that Mexico City’s rules increased the “total number of vehicles in circulation” and changed the “composition of vehicles toward older, higher-emitting vehicles.”

Issues of bad data are also particularly acute in newer energy-industry areas, such as with methane leaks during hydraulic fracturing, or fracking, and across natural gas systems. Scientific evidence has continued to mount that methane leaks around oil and gas facilities, and associated data, have been significantly underestimated by industry and the Environmental Protection Agency. Technology advances are needed to develop low-cost, continuous methane sensors. “It is very difficult to see how accurate the data is,” says David Lyon, a scientist with the Environmental Defense Fund, who researches oil and gas emissions issues. “I don’t think many companies are intentionally cheating, but a lot of the data seems pretty low quality.”

And the stakes for ensuring higher-quality reporting are real. Climate change aside, MIT and Harvard researchers recently estimated that the VW manipulations produced, from 2008 to 2015, an excess 37 million kilograms of nitrogen oxides, a group of air pollutants linked to lung disease.

John Wihbey is an assistant professor of journalism at Northeastern University.