There is big business in making sense of massive amounts of information generated on the Web, from online sales to Facebook posts, and Massachusetts is fast becoming a hub for what is deemed the next big trend in technology: analyzing data.
The state is home to more than 100 companies that focus on what is known as “big data’’ - the ability to quickly dissect and understand the flood of digitized information. The firms range from small Cambridge start-ups like Hadapt Inc. to EMC Corp., one of the area’s largest technology employers.
Employment in this sector is expected to more than double over the next six years, adding an estimated 15,000 jobs, according to a report to be released today by the Mass Technology Leadership Council.
“We have an opportunity for Massachusetts to be a real center of gravity around big data,’’ said Tom Hopcroft, the council’s chief executive. “In Massachusetts, we have decades of really hardcore database expertise going back to the 1960s, and a lot of this knowledge has been retained here.’’
Bluefin Labs Inc. is one of the companies expected to grow. The Cambridge start-up is trying to make sense of a big chunk of social media-generated data by analyzing 3 billion online comments about television every month.
The idea is to measure the collective mood about everything from commercials to reality TV to political debates in a way traditional polling or television ratings cannot. For instance, Bluefin used social media comments to gauge viewer attitudes toward Republican presidential candidates during a recent debate.
“Literally, on our screens, we can see those comments in social media in real time,’’ said Tom Thai, Bluefin’s vice president of business development. “The really cool thing about social data is that you can see more than just volume,’’ he said. “What were people saying about their shows? What were they saying about their competitors?’’
In addition to the state’s 100 big data companies, the report notes, another 20 start-ups have recently emerged to compete for a piece of the data and analytics marketplace, which McKinsey Global Institute recently estimated was worth $64 billion. In the past several years, local tech companies have been acquired for their data analytics know-how by the likes of IBM, Oracle, and Hewlett-Packard for price tags that have reached into the billions.
Michael Stonebraker, adjunct professor at the Massachusetts Institute of Technology and one of the area’s information-crunching pioneers, said there has long been a deep talent pool in data management in the state. But new technology has moved the field beyond traditional database systems from Oracle or IBM.
The growth is being driven not only by the digital information collected via the Web, but also by advancements that allow computers and software engineers to analyze all sorts of unstructured data moving swiftly and growing fast. The information comes every second from sensors capturing health data, GPS systems and location software that record whereabouts, and ATMs that record financial transactions.
“Everywhere you look, people are drowning in big data,’’ said Stonebraker, who has helped start seven companies, including Paradigm4 Inc., VoltDB Inc., and Vertica, which was bought last year by Hewlett-Packard.
Ninety percent of the data stored on hard drives, on Internet servers, or in big databases has been collected in just the past two years, according to IBM.
“Think of information flows that would swamp a traditional database structure,’’ said Stephen O’Leary, managing director of Aeris Partners, a mergers-and-acquisitions consulting firm, and a cochairman of Mass TLC. “We are in an era in which social media is generating a tremendous amount of data.’’
Big data’s potential benefits for marketers are driving much of the boom. Venture capital firms invested almost $1 billion in data companies from 2008 through 2010, according to 451 Research. In November, Accel Partners, an early investor in Facebook, announced a $100 million fund for big data start-ups.
“Data has always been at the heart of every enterprise’s ability to compete,’’ said Accel’s Ping Li, who manages the fund. With the ability to break down and explore larger data sets than ever before, and often in real time, the possibilities are just beginning to be understood. “Imagine if you can analyze your sales trends for the past 10 years,’’ he said.
What Echo Nest Corp., a Somerville company, is doing with big data technology is figuring out what types of songs people will like, based on their musical preferences. It has broken down 35 million songs into 1,000 different parts - bars, beats, tempo - to create a music intelligence engine to power music streaming services like Spotify.
“We hope to be the data layer for music,’’ said Jim Lucchese, Echo Nest’s chief executive. “Our job is to help our customers understand music consumption and music content. And that’s a big data problem.’’
One of the most significant aspects of big data is how it applies to science, health care, and drug discovery, said Marilyn Matz, chief executive of Paradigm4, a Waltham big data company. For instance, she said, big data companies are helping life sciences researchers mine genetic sequencing information to spot abnormalities or find patterns in all the bits of data collected from medical equipment.
Big data’s explosion is also driving demand for software engineers with the talent to understand the complex algorithms these companies use. Sixty-five percent of data professionals expect a deficit in expertise in the field over the next five years, according to a December report from EMC Corp.
As with every technology trend, from blogging to social media to cloud computing, there is a degree of hype surrounding big data, said Boris Evelson, a technology analyst at Forrester Research in Cambridge.
“At the end of the day,’’ he said, “on the left side it’s all bits and bytes, in the middle it’s a fuzzy cloud, and on the right it’s the businessman trying to make sense of it all.’’
Michael B. Farrell can be reached at firstname.lastname@example.org.