It’s time for another ferocious federal election, and, if it’s anything like 2016, troublemakers both foreign and domestic will unleash a flood of fake news stories on the Internet.
But I’ve seen the future of fake news, and so can you. Just go into your e-mail program and open the spam folder.
Remember spam? Once upon a time, billions of fraudulent messages were supposed to make our e-mail systems virtually useless. But spam filters have become so good that we now almost never see them, even though spammers send more messages than ever.
“It’s a manageable disease,” said John Reed, a researcher at Spamhaus, an international spam-fighting organization.
Fake news is probably destined for the same fate. While Facebook and Google are scrambling to fend off another wave of phony political stories, the fakes will probably never be completely eradicated.
But around the country and throughout the world, people are working on ways to identify and flag the fakes. Some efforts rely on human reviewers; Facebook alone has hired 20,000 people to monitor the content posted by its users. But there can never be enough human eyeballs to spot all the phony stories.
So scientists at many institutions, including the Massachusetts Institute of Technology, the University of Michigan, and Clemson University, are teaching computers to spot such fakes, using the same artificial intelligence techniques that Reed uses to stifle spam.
“It’s able to identify things that we as humans wouldn’t even have seen,” said Reed. “Why not turn it to something like looking at articles?”
Marten Risius, an assistant professor of management at Clemson, had the same idea not long after the 2016 election. He and a colleague in Germany, Christian Janze, went to work on an automatic fake news detector.
“We heard it wasn’t possible to predict fake news on social media, and we thought that doesn’t make a lot of sense to us,” said Risius. After all, he reasoned, computers can be taught to recognize all kinds of patterns. Perhaps fake news stories, like spam e-mails, shared many common characteristics.
Risius and Janze obtained a collection of more than 2,000 articles about the election posted on Facebook and compiled by the news site Buzzfeed. The stories fell into four categories: “mostly true,” “somewhat true,” “mostly false,” and “entirely false.” There was no point looking at entirely true stories, and the researchers skipped those that were entirely false because they mainly came from obvious satire sites such as The Onion.
Since fake news stories always contain at least a little truth, they had their software look for common features of stories that were somewhat true or mostly false. For example, Risius and Janze looked at the descriptive text found at the top of most Facebook postings. Fake stories tend to have an unusually large number of capitalized words and exclamation points.
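Counting cues like these takes only a few lines of code. The Python sketch below is an illustration, not the researchers' software: the function name `surface_features` is invented here, and it assumes that a "capitalized word" means a word of two or more letters written entirely in capitals, which the study may have defined differently.

```python
import re


def surface_features(text):
    """Count simple surface cues in a post's descriptive text.

    Assumption: a "capitalized word" is a word of 2+ letters written
    entirely in capitals. The article does not define the term.
    """
    # Treat runs of letters and apostrophes as words.
    words = re.findall(r"[A-Za-z']+", text)
    all_caps = [w for w in words if len(w) >= 2 and w.isupper()]
    return {
        "word_count": len(words),
        "all_caps_words": len(all_caps),
        "exclamation_points": text.count("!"),
    }


features = surface_features("SHOCKING news!! You WON'T believe it!")
```

Raw counts like these prove nothing by themselves; a detector would feed them, along with many other signals, into a model trained on labeled stories.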
They also found clues in the way other Facebook readers responded to fake stories. Readers weren’t as likely to reward these stories by clicking the “love” or “laughter” icons. But they did share fake news stories more frequently than legitimate news.
Once the software was trained, they tested it on another batch from the Buzzfeed compilation that it had not seen before, and Risius said it accurately identified the fakes 88 percent of the time. Unfortunately, there were also plenty of false positives; 25 percent of mostly true stories were marked as fakes.
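Those two figures measure different things. Read one way, the 88 percent is the share of fake stories the software caught, while the 25 percent is the share of mostly true stories it wrongly flagged. The Python sketch below works through the arithmetic with made-up counts chosen only to reproduce those rates; the article does not report the size of the test batch, and the function name `detection_rates` is hypothetical.

```python
def detection_rates(fakes_caught, fakes_total, true_flagged, true_total):
    """Summarize a detector's hits and false alarms.

    All four inputs are story counts. The counts used below are
    hypothetical; they are not taken from the study.
    """
    flagged = fakes_caught + true_flagged
    return {
        # Share of fake stories that were flagged.
        "fake_detection_rate": fakes_caught / fakes_total,
        # Share of mostly true stories that were flagged by mistake.
        "false_positive_rate": true_flagged / true_total,
        # Share of all flagged stories that really were fake.
        "precision": fakes_caught / flagged if flagged else 0.0,
    }


# Hypothetical batch: 100 fake stories and 100 mostly true stories.
rates = detection_rates(fakes_caught=88, fakes_total=100,
                        true_flagged=25, true_total=100)
```

The precision figure depends heavily on the mix of stories. If legitimate stories far outnumber fakes, as they may in a real news feed, the same 25 percent false positive rate would bury the genuine catches under false alarms.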
Now Risius is using the technology for a slightly different purpose: a news app that recommends verified stories to readers — but from the opposing point of view of their stated political position. This, he reasoned, would prod people to break out of their ideological bubbles.
“We can recommend high-quality articles from the other side,” Risius said.
Ramy Baly, a postdoctoral student at MIT, is also developing software to identify trustworthy news sources on the Internet. Baly is a native of Syria and believes fake news stories have helped fuel the ferocity of his country’s civil war.
“It has contributed to magnify the impact of what’s going on,” he said. “It’s made things worse. It’s not only in Syria. It’s everywhere in the world.”
Baly’s software would flag sites that traffic in fakery, and he is also developing techniques to measure a site’s political slant, so visitors can know going in to expect a left- or right-wing perspective.
Baly has pinpointed a host of subtle indicators to suggest a news site might not be on the level. For instance, the software flags whether the site is listed in Wikipedia. Well-established publishers, whether Mother Jones on the left or National Review on the right, will usually be memorialized in Wikipedia; sites without such listings may be honest upstarts, of course, but they might also be rumor-mongers.
Baly’s software also analyzes the Twitter account connected to the news source: How old is the Twitter account? How many people follow it? How many retweet its messages? Older Twitter accounts with large user bases are considered more trustworthy than relative newcomers.
The Web address, or URL, for the story can also be a giveaway. Genuine media sites usually have relatively clean and simple addresses, while fly-by-night websites often have long addresses full of unusual characters.
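The address check is the easiest of these signals to sketch in code. The Python example below is a rough illustration, not Baly's software: the function name `url_features` and the choice of which characters count as "unusual" are assumptions made here, and the example addresses are invented.

```python
import re
from urllib.parse import urlparse


def url_features(url):
    """Extract simple indicators of how 'clean' a web address looks.

    Assumption: "unusual" means anything other than letters, digits,
    dots, slashes, and hyphens in the host name and path.
    """
    parsed = urlparse(url)
    host = parsed.netloc
    unusual = re.findall(r"[^A-Za-z0-9./\-]", host + parsed.path)
    return {
        "length": len(url),
        "host_digit_count": sum(ch.isdigit() for ch in host),
        "host_hyphen_count": host.count("-"),
        "unusual_char_count": len(unusual),
    }


clean = url_features("https://example.com/news/story")
messy = url_features("http://real-news-24.example.net/a_b%20c!")
```

A long or cluttered address is only a hint. The other signals described above, such as the Wikipedia listing and the age and following of the Twitter account, would have to be looked up from those services and weighed alongside it.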
After training the software on a database of about 2,000 stories, Baly said it could identify fake news sites with 65 percent accuracy and accurately gauge the ideological slant of a story in about 70 percent of the cases.
Like Risius, Baly hopes to create a consumer news app that would direct users to reliable news sources from every point on the political compass.
Rada Mihalcea, director of Michigan’s artificial intelligence lab, has been working for years on automated systems to detect deception in many different forms. In 2015 she demonstrated software that could detect lying by analyzing videos of humans talking. Mihalcea claimed that her software could spot lies with 75 percent accuracy. Now she’s using similar techniques to pinpoint phony Internet news.
Whether in speech or writing, liars tend to use similar techniques, according to Mihalcea. “Liars will more often use certainty as a way of making up for the lie,” she said, so false posts would have words that denote extreme certainty, such as “absolutely” and “always.”
“Another thing we look for is readability, or how complex the text is,” Mihalcea said. “In fake news, sentences tend to be simpler.” She thinks that’s because lying takes extra mental effort, so liars ease the strain by using fewer words.
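Both cues can be measured roughly with a short script. The Python sketch below is illustrative only and is not Mihalcea's system: the word list beyond "absolutely" and "always" is an assumption, average sentence length stands in for a real readability score, and the function name `certainty_and_readability` is invented here.

```python
import re

# "absolutely" and "always" come from the article; the rest are
# assumed examples of words that signal extreme certainty.
CERTAINTY_WORDS = {"absolutely", "always", "never", "definitely", "certainly"}


def certainty_and_readability(text):
    """Measure certainty-word usage and average sentence length."""
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[a-z']+", text.lower())
    certain = sum(w in CERTAINTY_WORDS for w in words)
    return {
        "certainty_word_rate": certain / len(words) if words else 0.0,
        "avg_sentence_length": len(words) / len(sentences) if sentences else 0.0,
    }


scores = certainty_and_readability("This is absolutely true. It always works.")
```

A high certainty rate or a run of short sentences does not make a story false. These numbers are inputs that a trained model would weigh, not verdicts.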
Mihalcea said her system identifies fake news with 76 percent accuracy. She eventually hopes to produce an add-on to Internet browsers that would flag incoming stories full of short sentences and absolute certainty, but devoid of fact.
These tools won’t be ready this November and perhaps not in 2020 either. It took years to build good spam filters, and the fake news police are just getting started. Even if they succeed, fake news will never die, because there are always plenty of credulous people looking to have their prejudices confirmed. But for those of us who’d rather not be lied to, help is on the way.
Hiawatha Bray can be reached at firstname.lastname@example.org. Follow him on Twitter @GlobeTechLab.