Can a Cambridge startup help to tame the wild frontier of video game voice chat?
Modulate hopes to apply artificial intelligence technology to help understand what’s happening when players in a game talk to one another, differentiating between friendly trash talk and outbursts that cross the line into abuse. The company also believes it can sound the alarm when adult predators are trying to establish inappropriate relationships with children.
And Modulate just banked $30 million from a group of investors, including Boston-based Hyperplane Venture Capital, to further develop and market its monitoring technology.
Over the last two decades, live voice conversations have been introduced into a growing number of video games as a way for players to socialize. But sometimes, the conversations in those spheres can turn abusive, racist, or predatory — and there have been few tools for monitoring them. A recent report from the Anti-Defamation League, a nonprofit that battles bias and hate speech, found that 60 percent of young people and 83 percent of adults said they’d experienced harassment in online multiplayer games. The survey found significant increases in identity-based harassment of women, Black people, and Asian Americans.
But few people who encounter toxic speech and abuse when playing a game will file a report about it with the moderators who are charged with enforcing the game’s rules, says Mike Pappas, chief executive and cofounder of Modulate. “And if you look at child predation, the kid doesn’t know what’s happening, so who is going to report that?” Pappas asks.
Modulate plans to sell its technology to game publishers so that moderators can get a better understanding, in real time, of where problems may be popping up across dozens or hundreds of live voice chats that accompany game play. Already, the company is working with customers such as PokerStars VR, a virtual reality poker game, and Rec Room, an online platform that lets people build and share their own video games. Modulate has 27 employees, and Pappas says he expects it to grow to 35 or 40 by the end of the year.
Modulate’s system, called ToxMod, develops “a deep understanding of the conversation that’s going on,” says cofounder and chief technology officer Carter Huffman. “Not only do we have to transcribe the audio” — using software, not humans — “but we also need the context,” Huffman says. That can come from the intonation of the speakers, the rhythm of their speech, or how people react. “Do other people laugh, or do they sound angry or uncomfortable? Is the speaker angry?” Huffman says. The system also can estimate the age and gender of speakers based on their voice. “Malicious intent may go up if people perceive you as feminine, and abuse may go up,” Huffman says. Phrases that may be harmless when spoken between adults may have a different motivation when spoken by an adult seeking to “groom” a child for an inappropriate relationship.
Modulate highlights for a game’s human moderators the most severe violations of the game’s guidelines and allows them to decide on the consequences after they examine the situation: They may send the player a warning, mute them temporarily, or ban them from the game. The ToxMod system “can escalate something that’s happening live in less than a minute,” Huffman says. “So the moderator can actually intervene as the situation is escalating.”
Pappas and Huffman acknowledge that some players may not love the idea of an artificially intelligent system eavesdropping on their chats with friends. “ToxMod isn’t defining new kinds of bad behavior,” Pappas explains. “It gets customized to the game’s existing code of conduct. If the code says, ‘no racist hate speech,’ people were uttering that hate speech in the voice chat, but there were no consequences, or there was no consistency in how the code was enforced.” And when consequences are doled out, Pappas adds, “this is not just the AI blithely making decisions on its own. Moderators are still heavily in the loop.”
Pappas and Huffman met as undergraduates studying physics at MIT; they started the company in 2017 and initially focused on enabling gamers to create their own customized synthetic voices — either to alter or conceal their identities or to sound like a celebrity or fantasy character. The company raised $2 million in 2019 to develop that product. But over the last year, Modulate decided to change course.
“As we brought this technology to the game studios, the thing we heard was, ‘Absolutely our players want to sound like Morgan Freeman or whomever,’ but games have become so social,” Huffman says. “The studios said, ‘People don’t feel safe in voice chat, so they’re not participating. They’re not getting the social experience.’” In tackling the problem of keeping voice chats safe, Pappas says, Modulate was not just enhancing the fun of playing games, “but creating a more inclusive and accessible online experience for people.” Pappas says that Modulate is working with several big game studios as customers that he can’t yet divulge.
“Growing a thick skin and dealing with it has been seen as a rite of passage in the gaming community,” Pappas says. “You’re getting harassed all the time — just deal with it. But studios and platforms are realizing that that’s not an acceptable answer.”
And new regulations in the United States and abroad could make game companies more accountable for what happens in their digital realms. In June, Vice President Kamala Harris launched a new task force to study online harassment and abuse, and the British Parliament has been debating an online safety bill with provisions that seek to protect minors online.
“The game industry needs companies looking to solve these problems, and you’ll probably see more artificial intelligence come into the market to help moderators do their job,” says Tim Loew, executive director of MassDigi, a trade association for the video game industry in Massachusetts. Loew says it will take Modulate time to run tests with big game companies “before the big publishers commit to deploying their system at scale, but my guess is they’re already doing that.” It’s also possible that some of the big game studios, such as Microsoft and Activision Blizzard, may opt to develop their own approaches to monitoring voice chat and flagging bad actors.
“Anything that can improve online experience — and make it safer — will result in a big improvement in the lives of gaming consumers,” says Jon Radoff, a veteran gaming industry entrepreneur and chief executive of Beamable, a company that sells tools to game developers. That, he adds, “ought to propel the industry beyond its traditional core gamer audience.”