Innovation Economy

After the Nuance deal, what’s next for voice recognition tech?

The Burlington company was a pioneer in a field that still has plenty of room for innovation.

Jim and Janet Baker, cofounders of the speech recognition pioneer Dragon Systems, whose technology later became part of Nuance Communications. (Photo: Janet Baker)

Microsoft’s nearly $20 billion deal to acquire Nuance Communications, the speech technology giant based in Burlington, feels like the end of a chapter for Boston’s tech scene.

And this particular chapter is a long one — it began 50 years ago, in 1971.

That was the year that two grad students, Jim and Janet Baker, met in May, decided what they wanted to do with their lives, and got married in October. “We wanted to find an audacious goal that was interesting, and multidisciplinary, and something we could achieve in our lifetimes,” Janet Baker says. “And if we did, it would have a positive impact on society.” The goal? Building software that would be able to transcribe natural, conversational human speech.


Some of that software eventually wound up being acquired by Nuance, and serving as building blocks of what Nuance sold to Apple to help it launch its Siri voice-driven personal assistant. Nuance not only bought some of the technology that the Bakers created at their company, Newton-based Dragon Systems, but it also acquired other local companies, including SpeechWorks, eScription, VoiceSignal Technologies, and Vlingo. (Nuance, formerly known as ScanSoft, even got its name after acquiring a Silicon Valley company in 2005.)

Nuance, which today has about 825 employees in Massachusetts, was an acquisition machine. The company “has more intellectual property around speech than anybody on earth,” says Stu Patterson, who was chief executive of SpeechWorks. And its alumni were often hired by companies such as Google and Amazon to help build up their speech capabilities. For instance, Jeff Adams, founding manager of Amazon’s Alexa team, spent eight years at Nuance improving the Dragon NaturallySpeaking dictation product.

It was also a lawsuit machine, defending that trove of patents in the courts. An ongoing lawsuit with a company based in Cyprus, Omilia Natural Language Solutions, asserts that Nuance has filed at least 17 lawsuits over the last decade, some of them after the company in question refused Nuance’s acquisition offer. Nuance alleges patent infringement, and Omilia calls Nuance’s behavior anticompetitive.


Ron Croen was chief executive of the Silicon Valley company, Nuance, that gave the Massachusetts Nuance its name in 2005. Croen formed that company in 1994 with three researchers from SRI International, a nonprofit research and development group. For decades, Croen says, companies like Microsoft and IBM have poured millions of dollars into trying to get software to better recognize human speech. But turning those lab breakthroughs into marketplace products “has never been obvious to the Microsofts or the IBMs,” says Croen. “They just haven’t been good at it.”

One of Nuance’s first customers under Croen’s leadership was the brokerage Charles Schwab, which enabled its customers to call up a phone number, say a ticker symbol or mutual fund name, and get an instant quote. After Croen’s company was acquired, the chief executive of that “new Nuance,” Paul Ricci, continued to focus on applying the company’s technology to actual business problems — like freeing brokers from looking up stock quotes or enabling doctors to dictate their notes from a patient visit — as opposed to honing technology for technology’s sake.

“Microsoft has already been a very front-running technology developer — their labs and their people are world class,” Croen says. “They publish papers, and they announce the results of their speech recognition accuracy. It turns out, that’s not necessarily a business.” But helping medical facilities and call centers save money by operating more efficiently is, Croen observes. Getting access to that big customer base is a key goal of the Microsoft acquisition.


With that acquisition, it’s likely that most of Nuance’s talent will stay put here in Massachusetts, but will work under the umbrella of one of the Left Coast tech giants. Apple has a speech-focused research group in Cambridge, and Amazon has developed much of its Alexa technology here as well.

Those companies “have huge resources,” says Michael Phillips, that they can apply to continuing to improve speech recognition and understanding, and offer it essentially for free through their smartphones and intelligent speakers, like the Alexa. Those resources are “not just smart people, but it’s access to huge amounts of data and huge amounts of computational power,” he says. “We’ve entered the stage now where people like Google and Amazon are just going to win on the core technology.” Phillips is chief executive of Sense, a Cambridge home energy analysis startup, and he was acquired into Nuance twice — once while working for SpeechWorks, and once while working for Vlingo.

Does the end of this particular chapter of Boston tech history mean that there are no more opportunities for startups to create important new products that use spoken language in a more sophisticated way? Absolutely not, says venture capitalist Bob Davoli. “There’s a lot of room for innovation,” he says, especially in connecting what you say to artificial intelligence and process automation that can understand, for instance, what words to put into various fields in a form or database. Technology may eventually lead to the end of call center agents, Davoli suggests. Among his investments is Interactions Corp., a Franklin company that sells its technology to customers such as Citi, Shutterfly, and Constant Contact. Other Massachusetts startups, including CallMiner and ChipBrain, try to understand a person’s emotional state and provide insights on how to respond to that person’s complaints, or how to sell that person a product.


But Steve Chambers, a former Nuance marketing leader, says he does “wonder whether VCs will invest in any startup’s artificial intelligence or voice technology, other than that which will tactically appeal to the big five tech companies” as a potential acquisition target. Chambers also says he has “concern” for the job security of friends working at Nuance, following the Microsoft acquisition.

“When it comes to higher-level understanding and reasoning, it’s still early days,” says Vlad Sejnoha, a partner at Boston-based Glasswing Ventures, an investment firm, and a former chief technology officer at Nuance. Patterson talks about the industry’s next big step being “conversational AI,” or artificial intelligence. His current startup, LifePod Solutions, sells smart speakers that can initiate conversations rather than waiting to be asked something. One application: reminding seniors about a medical appointment or a pill they’re supposed to take before bedtime.

But we’ve now arrived at a moment where you can talk to your car, your phone, your laptop, and the Alexa speaker on your kitchen counter. Thinking back to 1971, Janet Baker, now a mentor and research affiliate at MIT’s Media Lab, says, “We met our long-term goal, not just to make it work, but make it something that a huge number of people use as a part of daily life.”


Huge deals like Microsoft’s acquisition of Nuance are announced with a press release at a predetermined day and time. But sometimes, they’re a few decades in the making.

Follow Scott Kirsner @ScottKirsner.