Business & Tech

Scott Kirsner | INNOVATION ECONOMY

Local startups need to make noise about voice computing

Cambridge startup Woobo is designing this voice-recognition toy to be "fun and educational."

“Alexa, where were you born?”

“I was designed and built by Amazon. They’re based in Seattle.”

That’s the answer you get when you ask an Amazon Echo device about its roots — even though a good chunk of the software it uses to decode what you’re asking was crafted in Cambridge.


Siri and the forthcoming Apple HomePod device hail from Silicon Valley, of course, as does the Google Home speaker.

Once more, it feels like a gold rush is happening on the West Coast related to development of these chatty, artificially intelligent personas. Here in Massachusetts, we knew it was coming. We built some of the earliest successful speech-recognition software (Dragon NaturallySpeaking, 1990), launched the first intelligent phone assistant (Wildfire, 1994), worked at the cutting edge of AI, and prototyped robots that would not just do manual labor but also exhibit friendly personalities.

As this new gold rush proceeds, however, we seem to be left trying to sell mining equipment by mail order. That can lead to decent success, but rarely a giant business. You’re simply too far from the action to understand what’s happening and what the needs are.

We saw it happen a decade ago, after the iPhone was launched. (Apple’s market value at the time was about $94 billion.) Lots of startups in Boston knew the iPhone was going to be important. Some started building apps for it, like Runkeeper, which let you use the iPhone to keep track of your workouts. Others created consultancies like Intrepid Pursuits to build iPhone apps for clients like Newbury Comics or WBUR.

Runkeeper employed about 35 people last year, when it was acquired by the shoemaker Asics for $85 million — not too shabby.


Intrepid Pursuits was acquired by the consulting firm Accenture this month for an undisclosed amount; it had grown to about 150 employees.

But Apple, the company that controlled the iPhone “platform,” in techspeak, grew from 21,600 employees in 2007, the year the iPhone debuted, to 116,000 employees last year. And its market value is now about $750 billion — more than any other company’s.

At a demo showcase held at District Hall in South Boston last week, everyone seemed excited about the potential of voice-driven technology. A local Amazon employee, Robert McCauley, talked about how easy it is to build games and other apps for the Echo device. He also plugged a new device, the Echo Show, which includes a screen for conducting video chats with friends or seeing who’s standing at your front door.

The showcase also included entrepreneurs like Scott Cohen, of a Concord consulting firm that is creating its own intelligent persona, named Jaxon, to help retailers conduct conversations with prospective customers, answer questions, and, ideally, close the deal.

Boston-based Vesper Technologies was showing low-power microphones that could be built into all sorts of devices — from smartphones to speakers to trash cans — founder Matt Crowley explained, so that you can control them with your voice. (“Trash can, how soon do you need to be emptied?”)


Crowley predicted that “hearables” — small devices built into an earpiece — “will be the next big product after the Echo,” a way to effectively integrate these artificially intelligent personas into your head, sans surgery.

The Cambridge startup Woobo is designing a toy to be “fun and educational,” said Susana Zhang. “You learn and play at the same time.”

I asked Woobo how many planets there are in the solar system; its reply was “More than one.” Not inaccurate, but not that educational. It was better at comedy that would crack up a 4-year-old. Woobo’s favorite food, the furry creature explained, is “blueberry peanut butter baloney ice cream.”

The Echo can’t yet offer automated reminders or speak without being spoken to, but a startup called LifePod aims to build technology around the Echo to make that possible. The goal is to create a “virtual caregiver” for seniors that can remind someone to take a lunchtime dose of meds, ask if they’ve popped their pills, and then e-mail a report to a relative or an actual human caregiver at the end of the day.

Boston-based Mylestone invites you to upload photos of a birthday party, trip, or other life event and turns them into a spoken story that can be “performed” by your Amazon Echo. “Capturing memories is too much work,” said Mylestone founder Dave Balter.

The stories are today written by humans who look at the photos, but Balter suggested that was merely a temporary kink in the process.

After the demo night, I rang up Joel Evans, a cofounder of the Waltham software development firm Mobiquity, and a co-organizer of the event. I shared my concern that we’re sitting on the sidelines of an exciting baseball game, making a few bucks selling peanuts while the players and teams are raking in millions. He’s more optimistic.

“I came away feeling pretty good,” Evans said. “There’s some serious technology being baked in Massachusetts, but maybe we’re just not hyping it enough.”

Maybe I’m like King Canute yelling at the waves. It’s possible that Google, Amazon, and Apple have gotten so rich and so big that they’re destined to dominate every emerging technology that matters, and it’s silly to wish otherwise.

Or maybe voice computing is new enough, and as it’s integrated into enough new products — earpieces or coffee makers or trash cans — there will be an opportunity to build big new companies in Boston and elsewhere.

One startup I’ve been following is Jibo, which is designing a countertop robot with a screen and solid conversational skills. But it has been working on its product for about 4½ years, and the release date has been pushed back several times. (Spokeswoman Nancy Dussault Smith said it will be out later this year.) It will also be priced at about $800, more than three times the price of Amazon’s new Echo Show device.

And then there’s Bose, the Framingham maker of speakers and headphones. MIT owns a majority of its shares, though they carry no voting rights. Why has Bose, famously a haven for engineers who want to work on bleeding-edge stuff, been so slow to realize that voice control and artificial intelligence are shaking up its business?

“We don’t currently offer a product that has an integrated personal assistant or voice control,” spokeswoman Joanne Berthiaume said. But, she added, many of the company’s existing speakers can be plugged into an Amazon Echo or Amazon Dot to add voice control.

“There are other things on the horizon, too,” Berthiaume said. “But we’re keeping those things confidential.”

And I’m keeping my fingers crossed that they’ll be big.

Follow Scott Kirsner on Twitter @ScottKirsner.