
Navigating the grocery aisles with AI

Can you detect which foods on supermarket shelves are more processed than others? Fortunately, an algorithm can.

Walk into any supermarket, and much of what you find in the center aisles will have a shelf life of six months to a year. Most of these items will be ultra-processed, but unless you’re a food scientist, you’ll have a hard time telling the difference between a salad dressing made from natural products and one basically assembled in a lab. Yet the difference does matter: Ultra-processed food is usually loaded with fat, starch, salt, or sugar, and multiple studies indicate that eating too much of it is killing us.

I am a data scientist, and my lab has spent the last five years curating an extensive database of food composition. Our goal has been to help consumers make informed choices about how healthy or unhealthy any given product is. Such choices are hard to make when you’re in the grocery store, because food packaging does not necessarily reveal the degree of processing any given product has undergone.


So my colleagues and I have developed a tool that uses AI to figure it out.

Why is this a puzzle that calls for the help of artificial intelligence? Sure, completely unprocessed foods are easy to spot: Think fresh fruits and vegetables, raw meat and fish, or bags of beans. But most food in a supermarket is not like that. It’s made from substances extracted from foods, such as fats, starches, and sugars, combined with additional ingredients, like salt, artificial colors, and preservatives, to tick the boxes in our brains that will make us crave more.

Some ultra-processed food makes no claim to be otherwise — squeeze cheese, marshmallow fluff, energy drinks — but others shape-shift and are deceptively unrecognizable. A good example is most store-bought orange juice. It does start out with squeezed whole oranges. Yet before it hits the shelves its water is evaporated, its juice concentrated, and its pulp separated, a process in which much of its flavor and nutritional content is lost. When it is later reconstructed, often hundreds of miles from where the oranges were collected, it needs an array of preservatives and flavor chemicals to ensure that the final product resembles what it started out as.


While ultra-processing lengthens shelf life and reduces the cost of food, eating it at the rate we do is not without consequences. A 2019 study published in BMJ, formerly the British Medical Journal, found that small increases in the percentage of ultra-processed food in people’s diets were associated with increases in cancer, depression, cardiovascular disease, and obesity. We don’t fully understand why consuming ultra-processed food would be bad for us. But we do know that such food contains nutrients in concentrations that our metabolism, which evolved with natural food, may have difficulty handling.

Such studies determine the level of processed food in individual diets by relying on a database developed by Brazilian researchers. But that database, known as NOVA, isn’t publicly available, and it doesn’t track individual products in stores; it tracks the degree of processing for a few thousand categories of food. So it wouldn’t be useful to you if you wanted to know which of two similar items was less processed.

And even if you know how to interpret nutritional labels on food packages, these labels don’t necessarily tell you just how extensively the juice or frozen French fries or chicken soup has been processed. No single nutritional component, like sugar or salt content, reveals this. So Giulia Menichetti and Babak Ravandi, data scientists in my lab at Northeastern University, Dariush Mozaffarian of the Dorothy R. Friedman School of Nutrition Science and Policy at Tufts University, and I decided to rely on artificial intelligence.


We trained an AI-based algorithm on the NOVA database so that it learned the patterns of nutritional information associated with ultra-processed food and could apply that knowledge to a much larger array of items in grocery stores. It learned to rate food from 0 to 1, with a score close to 0 for whole foods, a small fractional number for food that has been simply processed, and a value close to 1 for ultra-processed food.
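For readers who want a feel for what "learning patterns of nutritional information" can look like, here is a minimal sketch. It is not our actual model: it assumes a plain logistic regression, three made-up nutrient features, and a handful of invented training rows, all chosen only to show how labeled examples can be turned into a score between 0 and 1. The names `TRAIN`, `train`, and `processing_score` are hypothetical.

```python
import math

# Hypothetical toy rows: (sugar in g, sodium in units of 100 mg, fiber in g) per 100 g.
# Label 1 = ultra-processed category, 0 = not. Every number here is invented.
TRAIN = [
    ((2.0, 0.1, 3.0), 0),
    ((4.0, 0.0, 2.5), 0),
    ((1.0, 0.5, 6.0), 0),
    ((30.0, 4.0, 0.5), 1),
    ((25.0, 6.0, 1.0), 1),
    ((40.0, 3.0, 0.0), 1),
]


def sigmoid(z):
    """Numerically stable logistic function; output is always in (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(rows, epochs=2000, lr=0.01):
    """Fit weights and a bias with plain stochastic gradient descent."""
    weights = [0.0] * len(rows[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in rows:
            z = sum(w * x for w, x in zip(weights, features)) + bias
            error = sigmoid(z) - label  # gradient of the log loss w.r.t. z
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias


def processing_score(model, features):
    """Return a score between 0 (whole food) and 1 (ultra-processed)."""
    weights, bias = model
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


model = train(TRAIN)
sugary_snack = processing_score(model, (35.0, 5.0, 0.2))  # invented product
plain_grain = processing_score(model, (3.0, 0.2, 4.0))    # invented product
print(f"sugary snack: {sugary_snack:.3f}, plain grain: {plain_grain:.3f}")
```

The point of the sketch is the shape of the approach: the model never sees an ingredient list or a factory, only numbers from a nutrition label, and it still learns to place new items on a 0-to-1 scale.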

Take for example Pepperidge Farm Farmhouse bread, whose name evokes open fields of grain and cozy farm kitchens. It didn’t fool our algorithm, which assigned it a processing score of 0.997. Indeed, a closer inspection of the fine print on its nutritional label reveals the presence of resistant corn starch, soluble corn fiber, and oat fiber, which all require extensive processing in their extraction.

And that’s kind of crazy. A baker can mix flour, yeast, salt, and water and make a delicious loaf from four simple raw ingredients. For example, Manna Organics multi-grain bread, which is composed of whole wheat kernels, barley, and rice, and lacks additives, added salt, oil, and even yeast, got a score of 0.314, indicating that it is merely processed in a simple sense of the term — it’s baked.


You probably don’t need AI to tell you which of these breads is highly processed and which one is not. But in the breakfast cereal aisle, would you pick Whole Foods’ 365 brand of Fruit and Nut Muesli or Cascadian Farm Organic Honey Nut O’s, labeled as “always whole grain oat and barley”? According to our AI system you should opt for the Whole Foods cereal, which scores 0.5 while the Cascadian Farm cereal is at 0.82.

Even when the food in question is obviously processed — as is the case with boxed macaroni and cheese, the ubiquitous school-kid favorite — we can still find a range in processing scores. One type of Cauliflower Mac & Cheese got a score of 0.65, while the very healthy-sounding Annie’s Organic Grassfed Mac And Cheese had a score of 0.97.

We’ve had the AI determine processing scores for 50,000 products carried by three major US grocery stores, and it probably can offer less-processed alternatives for most foods you buy. Would you prefer a snack bar with a score of 0.269 over a neighbor that came in at 0.997? The next time you compile your grocery list, go to and run it by our algorithm.
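Once every product has a score, suggesting a less-processed alternative is mostly a matter of sorting. The sketch below uses only the scores quoted in this article; the category labels, the data layout, and the function name `less_processed_alternatives` are assumptions made for illustration, not a description of how our tool is built.

```python
# Processing scores quoted in the article (0 = whole food, 1 = ultra-processed).
# The "category" labels and this list-of-dicts layout are illustrative assumptions.
PRODUCTS = [
    {"name": "Pepperidge Farm Farmhouse bread", "category": "bread", "score": 0.997},
    {"name": "Manna Organics multi-grain bread", "category": "bread", "score": 0.314},
    {"name": "Whole Foods 365 Fruit and Nut Muesli", "category": "cereal", "score": 0.5},
    {"name": "Cascadian Farm Organic Honey Nut O's", "category": "cereal", "score": 0.82},
    {"name": "Cauliflower Mac & Cheese", "category": "mac and cheese", "score": 0.65},
    {"name": "Annie's Organic Grassfed Mac And Cheese", "category": "mac and cheese", "score": 0.97},
]


def less_processed_alternatives(product_name, products):
    """Return same-category products with a lower score, least processed first."""
    by_name = {p["name"]: p for p in products}
    target = by_name[product_name]  # raises KeyError for an unknown product
    candidates = [
        p for p in products
        if p["category"] == target["category"] and p["score"] < target["score"]
    ]
    return sorted(candidates, key=lambda p: p["score"])


for item in ["Pepperidge Farm Farmhouse bread", "Cascadian Farm Organic Honey Nut O's"]:
    names = [p["name"] for p in less_processed_alternatives(item, PRODUCTS)]
    print(item, "->", names)
```

A product that already has the lowest score in its category simply gets an empty list back.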

Albert-Laszlo Barabasi is a professor of network science at Northeastern University and the author of “The Formula: The Universal Laws of Success.” Follow him on Twitter @barabasi.