Kendall Sq. visionaries

Speech-tech brains behind Amazon’s Echo came from Cambridge

The voice recognition technology in Amazon’s Echo is a key focus of the company’s Cambridge office. (AP)

From an ‘old fart’ to millennials, this collection of doers, thinkers, and visionaries could help to shape the neighborhood, and the world, for years to come.

After years of working on speech and language technology, Rohit Prasad was accustomed to ideas that were ahead of their time or too late to catch on. The concept for Amazon’s voice-controlled, Internet-connected Echo was a different story.

“My eyes lit up,” Prasad said. “The fact that this was not a sideshow on a smartphone was extremely exciting for me, personally. And technologically as well.”

Today, the Echo is a critically acclaimed hit for Amazon. The canister-shaped device can turn on a smart TV, play your favorite album, or fetch the weather report, all by listening to your voice. And Amazon plans to add even more functions to its growing list.

The high-tech brains that make it work come from Prasad’s team. He runs Amazon’s speech science efforts from the company’s Kendall Square office, overseeing the scientists, engineers, and data specialists who make the Echo something you can talk to.

It’s a pretty hefty technological task. The Echo doesn’t send any information to Amazon’s online services until it hears its “wake word,” which is usually the name of its artificial intelligence assistant, Alexa.

“It’s a needle-in-a-haystack problem,” Prasad said. “There’s a lot of media speech coming out of the television and the radio.” NPR fans found this out firsthand earlier this year when a story about the Echo reportedly began firing up the devices of some “Weekend Edition” listeners.

Once the switch is turned on, things get even more interesting. “If you’re saying, ‘Play music by the Rolling Stones,’ there’s a lot of intelligence that is required to understand that. Especially if people refer to the Rolling Stones as ‘the Stones,’ ” Prasad said. “Who are you talking about? Which song do I play?”

Prasad, a veteran of speech-tech stalwart BBN Technologies, said this region’s historically strong speech and language-tech sectors have Echo well positioned.

“The combination of processing power, data, and the technology maturing all made it the right time to pull something like this off,” he said. “And of course, the people.”


Curt Woodward can be reached at curt.woodward@globe.com. Follow him on Twitter @curtwoodward.