Date: 2024-01-24

Over the weekend, unknown scammers sent automated phone calls to New Hampshire voters, impersonating President Biden and urging them not to vote in Tuesday’s primary election. Some of those who got the calls contacted former New Hampshire Democratic chairperson Kathy Sullivan, who in turn contacted the New Hampshire Department of Justice, which has launched an investigation.

The fake Joe Biden robocalls to New Hampshire voters are a worrisome reminder that artificial intelligence systems have given political scammers a potent tool for mass deception. For now, the best defense against such dirty tricks is a healthy skepticism, as AI technologists warn that audio “deepfakes” are easy to create and very difficult to debunk.

The investigators will have to get lucky to find the perpetrators. These days, anybody can create a realistic audio deepfake of a person’s voice using inexpensive online services from companies like Parrot AI and ElevenLabs.

In addition, much of the underlying technology is available as free open-source software. James Glass, a senior research scientist at the Computer Science and Artificial Intelligence Laboratory at the Massachusetts Institute of Technology, said this free software hasn’t yet been turned into a ready-to-use voice cloning program. But Glass added that “enough code is out there that … Joe Random Person could go out there and cobble together their own software.”

With the genie out of the bottle, restricting deepfake software is a hopeless cause, said Matt Mittelsteadt, research fellow at the Mercatus Center at George Mason University. “The first response I hear is, oh, let’s regulate it,” said Mittelsteadt. “What people never get to is, how?”

The problem is global. An audio clip that implicated a candidate in attempted vote fraud during the February 2023 Nigerian presidential election was in fact an AI-generated fake, according to analysts who studied the recording. A similar thing happened to a presidential candidate in Slovakia in September.

But it’s not always possible to be sure an audio clip has been faked. Last April, an Indian politician denounced as fake a recording in which he seemed to accuse his own party of financial corruption. In this case, audio experts have not been able to either verify or debunk the recording.

This uncertainty means that a politician confronted with an apparently damning audio clip can simply declare it a fake. Donald Trump has already used this tactic by falsely claiming that a YouTube video of his verbal blunders was fabricated by his enemies.

MIT’s Glass said that audio deepfakes will continue to improve, making it difficult for software to reliably detect them. “That’s always going to be a moving target,” he said. Glass said it might be easier to develop a way to attach digital certificates to legitimate audio and video recordings. That way, a smartphone could confirm that an incoming call really was from the Biden campaign.

Vandana Janeja, a professor of information systems at the University of Maryland, Baltimore County, is more optimistic about spotting audio deepfakes. She’s trained an AI system to look for audio cues that deepfake programs rarely get right. She’s identified five specific cues: the overall quality of the audio, the breathing sounds made by the speaker, the pauses between words, the pitch of the voice, and the “puffing” sound that people make when pronouncing certain consonants.

“Those are very unique to humans,” Janeja said. By monitoring these cues, Janeja said she’s trained an AI system to identify deepfakes with 88 percent accuracy.

But even with improved deepfake detection, it’s likely that many phony messages will get through before someone sounds the alarm. “There’s a lot of people, at least this year and the next couple of years, who are likely to be deceived, because they’ve never experienced anything like this,” said Mittelsteadt. But over time, he said, consumers will become more familiar with fake voice messages and less inclined to believe them.

Jennifer Huddleston, technology policy research fellow at the libertarian-leaning Cato Institute, notes that new technologies have often posed similar problems. For example, the invention of photo editing software like Photoshop taught people to be more skeptical of visual evidence, and nearly everybody learned to ignore spam email scams. Huddleston said the rise of audio and video deepfakes is an opportunity to teach that lesson again. “We want people to really consider the information they’re taking in,” she said, “consider the sources it’s coming from and make that analysis for themselves.”

Janeja is working on this end of the problem as well. She’s conducting experiments in which she plays deepfake audio clips for people and teaches them to recognize the subtle clues that it’s really a machine talking. Janeja said that it works. “We saw that people started listening better,” she said.

Janeja believes that training people to hear the difference between real and fake audio is an essential part of the solution. “If you take humans out of the loop,” she said, “you’re just going to have an arms race with algorithms.”

Hiawatha Bray can be reached at hiawatha.bray@globe.com. Follow him @GlobeTechLab.