In the online, big data world, it’s important to be able to separate the wheat from the chaff. This is true when it comes to refining search results and culling a Twitter feed, and it’s true with photographs, too. When you search for pictures of a friend, you’re probably less interested in images of that person in the crowd at someone else’s wedding, and more interested in pictures in which they’re the main event.
Photo sites don’t know how to make that kind of distinction yet, but an innovation out of Virginia Tech, recently posted to arXiv.org, might change that. It’s an algorithm (of course) that takes advantage of all sorts of social and technological cues to figure out who really matters in an image.
“We have the ability to look at a scene and, just by coding what people are doing, how people are looking at each other, we can get a sense of the important actors,” says Dhruv Batra, a professor of electrical and computer engineering and creator of the program, along with graduate student and lead designer Clint Solomon Mathialagan and Andrew Gallagher, an engineer at Google.
The program looks at 45 aspects of an image in order to rank the people in the image in terms of importance. These include some obvious factors. A person in the center of an image ranks higher, as does a person whose face is in focus, while a person whose face is partially obstructed ranks lower. They also include more subtle factors, like the directions in which people are looking. “If you look at a scene you can tell there is an important person because everyone else is looking at that person,” says Batra.
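To make the idea concrete, here is a minimal sketch in Python of how a handful of such cues could be folded into a single ranking. It is not the researchers' program: the article does not say how the 45 features are combined, so the four features below, the `Person` fields, the equal weights, and the simple weighted sum are all assumptions made for illustration.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Person:
    # All fields are hypothetical stand-ins for cues the article mentions.
    name: str
    center_x: float    # face center, 0..1 across the image width
    center_y: float    # face center, 0..1 down the image height
    sharpness: float   # 0 (blurry) .. 1 (in focus)
    occlusion: float   # fraction of the face that is hidden, 0..1
    looked_at_by: int  # how many other people are looking at this person


# Assumed equal weights; a real system would learn these from data.
WEIGHTS = {"centrality": 1.0, "sharpness": 1.0, "visibility": 1.0, "gaze": 1.0}


def importance(p: Person, n_people: int) -> float:
    """Score one person: higher means more likely to be the main subject."""
    # Distance from the image center, scaled so a corner counts as 1.0.
    dist = hypot(p.center_x - 0.5, p.center_y - 0.5) / hypot(0.5, 0.5)
    centrality = 1.0 - dist
    visibility = 1.0 - p.occlusion
    # Share of the other people in the photo who are looking at this person.
    gaze = p.looked_at_by / max(n_people - 1, 1)
    return (
        WEIGHTS["centrality"] * centrality
        + WEIGHTS["sharpness"] * p.sharpness
        + WEIGHTS["visibility"] * visibility
        + WEIGHTS["gaze"] * gaze
    )


def rank_people(people: list[Person]) -> list[Person]:
    """Return the people sorted from most to least important."""
    n = len(people)
    return sorted(people, key=lambda p: importance(p, n), reverse=True)


# Made-up example: a graduate in the middle, a parent nearby, a blurry bystander.
people = [
    Person("bystander", 0.9, 0.5, sharpness=0.3, occlusion=0.5, looked_at_by=0),
    Person("graduate", 0.5, 0.45, sharpness=0.9, occlusion=0.0, looked_at_by=2),
    Person("parent", 0.3, 0.5, sharpness=0.8, occlusion=0.1, looked_at_by=0),
]
ranking = [p.name for p in rank_people(people)]
print(ranking)
```

In this invented scene the centered, sharply focused graduate, whom both other people are looking at, comes out on top, and the half-hidden, out-of-focus bystander at the edge of the frame comes last.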
The program still has limitations. In one test image, it identified a graduate’s tie as a person’s face, though, to the algorithm’s credit, it determined the “tie-person” to be of low importance. The engineers hope to make the program more discerning by adding in the ability to detect social cues (A white dress means bride! The person holding the scissors at a ribbon-cutting ceremony is important!), and by making it more attuned to a photographer’s intent (photographers following the “rule of thirds,” for instance, place their subjects slightly off-center).
In addition to organizing images by importance, the engineers anticipate their invention could be useful for auto-cropping and for creating automatic descriptions of images. “You would like the algorithm to focus on the important activity,” says Batra. “You don’t want the algorithm to say there is a man in the background with gray hair.”
Of course, any ranking system can be exploited, which creates one more potential application for the program — as a hyper self-aware party game in which people vie over the course of an evening to see who can rank as the most important person in the most pictures.
Kevin Hartnett is a writer in South Carolina. He can be reached at firstname.lastname@example.org.
Correction: An earlier version of this story incorrectly referred to Dhruv Batra’s title. He is a professor of electrical and computer engineering.