IBM Corp. and the Massachusetts Institute of Technology have teamed up to tackle a major challenge facing computers: teaching machines to recognize images and sounds as people do, and react in useful ways.
On Tuesday the two esteemed organizations announced a multiyear research program in artificial intelligence, the IBM-MIT Laboratory for Brain-inspired Multimedia Machine Comprehension.
Neither IBM nor MIT revealed how much will be spent on the program or how many years it will run. But scientists from both institutions said they’re up against a formidable technical challenge.
“We think of people and machines working together to solve problems,” said Guru Banavar, chief science officer for cognitive computing at IBM. But before machines can do that, Banavar said, they must “be able to watch people, understand what they’re doing, and make some decisions about how to help them.”
For instance, a home video camera might someday be able to recognize at a glance that an ailing parent had fallen down and needed help. The camera could transmit the information to other family members or the police. The research could also enable machines to learn from other machines. A robot on an assembly line could learn a new task simply by watching another robot perform it or even by viewing a YouTube video.
Learning by watching and listening comes naturally to humans. So IBM will work with MIT’s department of brain and cognitive sciences to find ways to teach this innate human ability to computers and robots.
“There’s been a lot of advances in recognizing objects,” said department head James DiCarlo, “but there’s been much less progress in predicting actions or even recognizing actions. . . . These are all cutting-edge problems.”
While Banavar wants to emulate the human brain’s ability to recognize and respond to events, DiCarlo noted that scientists still don’t know how our own brains manage it. “As we can build systems that can do that,” said DiCarlo, “we learn ways our own brains might be doing that.”