First you see the Oval Office with flags and microphones. Then the choppy video, typical of 1960s TV broadcasts, zooms in on a stone-faced Richard Nixon as he looks into the camera and informs the American public that astronauts Neil Armstrong and Buzz Aldrin have been stranded on the moon.
“Fate has ordained that the men who went to the moon to explore in peace,” Nixon says, “will stay on the moon to rest in peace.”
Watching it is eerie — because it never happened.
In this alternate history of humankind’s first venture to the moon, reimagined by a team of researchers, journalists, and artists at the Massachusetts Institute of Technology, there were no metaphorical “giant leaps.” Instead, the lunar surface itself had become “forever mankind,” per the poetic moon-landing contingency speech drafted by William Safire some 50 years ago, which, of course, was never needed.
The group of MIT creatives, which includes me, took actual video of the former president and altered it, making it look as though he’d given a televised speech in 1969 that he never actually delivered.
The video is part of “In the Event of a Moon Disaster,” an award-winning art installation directed by Francesca Panetta and Halsey Burgund. The piece combines manipulated montage drawn from NASA’s historic 1969 footage with AI-generated videos (colloquially known as deepfakes) to craft a narrative about what could have been. It starts with the announcement of Apollo 11, then the countdown and the blastoff into space, before something goes wrong and disaster strikes, a very real possibility for the seminal mission and the astronauts involved.
“We were hoping to create a connection with the astronauts who in our alternative history become stranded on the moon, and give the viewer an emotional journey, so that when they do hear Nixon finally announce it, it’s not only like, ‘Oh, I know this isn’t real,’” says Burgund.
The film is meant to raise awareness of the power of fake videos — and in turn, fake narratives — and the technology currently at the frontiers of synthetic media and at the center of many a conversation about truth and disinformation.
It is a meditation not just on technology, however, but on the concept of history itself.
“We felt that if we wanted to do something about deepfakes that was artistic, creative, but also educational, this elegy was a brilliant starting point,” says Panetta, who is the XR creative director in the MIT Center for Advanced Virtuality, where the project now lives.
But the question of how to make a deepfake without creating misinformation became central to the creative endeavor. The group drew inspiration from scholars and researchers in the Cambridge and Boston area who were already engaged in discourse over digital forgeries, misinformation, and the nature of truth.
Their research later informed the literature that was distributed with the film, sponsored by the Mozilla Foundation, when it premiered at the International Documentary Film Festival Amsterdam in November.
In Amsterdam, the film played inside a physical installation that mimicked a 1960s-era living room, with three screens that included a vintage television set. The full montage represented an array of disinformation techniques: not just deepfakery, but also static and visual effects such as speeding up footage, reversing it, and closing in on the astronauts’ faces.
Believability was key. “If you’ve never seen a deepfake, it really takes for you to see one to realize how realistic they can be,” Panetta says. “As journalists and artists, we wanted to make something that really moves people, sticks with them, [and] that’s memorable.”
She and Burgund agree that if someone who experienced the installation later recalls the believability of the piece and as a result uses more caution interpreting a video in their Facebook feed, the project will have been successful.
To make the deepfake real, the team worked with Canny AI, an Israeli company that does Video Dialogue Replacement, and Respeecher, a Ukrainian startup specializing in speech-to-speech synthetic voice production. But as they got to work, even with the right technology at their disposal, it quickly became clear that creating a deepfake wasn’t going to be easy.
First, it wasn’t one AI but two. One built a model for the visuals, making Richard Nixon look like he was in the Oval Office reading the speech, with his lips mouthing words he never spoke. A second built the synthetic Nixon voice, so that the words emanating from his perfectly moving lips would sound and feel like Nixon was actually delivering them himself, according to the directors.
For the Nixon voice, the team had to produce a large set of training data for the AI to use to generate the speech. Lewis D. Wheeler, the voice actor hired by the team, spent several days in the studio painstakingly building the foundation for the audio and video models that were later used by Respeecher to create Nixon’s synthetic voice and by Canny AI to map the mouth movements and layer them onto the original Nixon footage.
Wheeler listened to thousands of short clips of Nixon and repeated phrase after phrase until he got the rhythm and cadence right, then recorded 20 takes of the full contingency speech.
Dmytro Bielievtsov, Respeecher’s chief technology officer, tells me that the clean data for the target speech enabled them to create good parallel datasets to train the AI models.
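For readers curious about what a “parallel dataset” might look like in practice, here is a minimal sketch in Python. It is not Respeecher’s code or method. It simply assumes that each short recording comes with a transcript, and it pairs an actor clip with a Nixon clip whenever both contain the same words. The `Clip` class, the file paths, and the sample phrases are all hypothetical placeholders invented for illustration.

```python
from dataclasses import dataclass
import re


@dataclass(frozen=True)
class Clip:
    """One short recording: who spoke, where the audio lives, what was said."""
    speaker: str
    path: str        # hypothetical file path, for illustration only
    transcript: str  # placeholder text, not a real quotation


def normalize(text: str) -> str:
    """Lowercase and strip punctuation so matching ignores formatting."""
    return " ".join(re.sub(r"[^a-z0-9' ]+", " ", text.lower()).split())


def build_parallel_pairs(source_clips, target_clips):
    """Pair each source-speaker clip with a target-speaker clip of the same words.

    Returns a list of (source, target) tuples. Source clips with no
    counterpart are skipped, since they cannot serve as parallel examples.
    """
    by_text = {}
    for clip in target_clips:
        # Keep the first target clip seen for each phrase.
        by_text.setdefault(normalize(clip.transcript), clip)

    pairs = []
    for clip in source_clips:
        match = by_text.get(normalize(clip.transcript))
        if match is not None:
            pairs.append((clip, match))
    return pairs


# Made-up sample data: two target (Nixon) clips and two source (actor) clips.
nixon = [
    Clip("nixon", "nixon/0001.wav", "First sample phrase."),
    Clip("nixon", "nixon/0002.wav", "Second sample phrase."),
]
actor = [
    Clip("actor", "actor/0001.wav", "first sample phrase"),
    Clip("actor", "actor/0002.wav", "A phrase with no counterpart"),
]

pairs = build_parallel_pairs(actor, nixon)
for src, tgt in pairs:
    print(src.path, "->", tgt.path)
```

In this toy version only the first actor clip finds a partner. A real pipeline would also have to align the audio in time and check its quality, steps that this sketch leaves out entirely.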
Omer Ben Ami, one of Canny’s co-founders, explains that their AI receives the target video to be manipulated (in this case, Nixon’s resignation speech) and a source video of a voice actor delivering the new dialogue (here, Wheeler’s best takes). The AI then learns how to create each frame of the target video with facial expressions and lip sync that match those in the source video. “Imagine for example a visual effects artist going frame by frame in the target video moving the pixels around to create a similar expression to a corresponding frame in the source video,” Ben Ami says. “This is a somewhat similar process but of course very fast and realistic.”
A visual effects artist would need weeks or months to produce a manipulation that accurate; the AI turns one out in less than 24 hours.
Despite the heavy technical components of the film, Panetta and Burgund say that “In the Event of a Moon Disaster” is first and foremost an art piece.
“One of the amazing things about art is that it can hit you on a level that is different than an intellectual one, or a scientific one,” Burgund says.
The team will next release a digital version of the project, including the full film and literature on the influence of deepfakes and their place in public discourse, through an open web portal.
Panetta says it will require refining their approach. “Misinformation is a really sensitive topic,” she says. “We want to get it right and we want to be responsible. We also want to do it creatively.”
Pakinam Amer is an award-winning journalist, a former Knight Science Journalism fellow, and a research affiliate of the Center for Advanced Virtuality at MIT. Send comments about this story to firstname.lastname@example.org.