There is this question of how one defines “big data.” The answer is that there really is no way to define it. I think that’s true of pretty much any emerging concept. When you think about some of the things that did become obsolete, like the Yellow Pages and the White Pages, to some extent they represent big data. I think big data has always existed, but the ability to interrogate massive data sets has not. And a huge amount of the Internet, and services like Google, are built on the ability to interrogate large data sets in real time.
The goal [of my research partner and coauthor Jean-Baptiste Michel and me] had always been to study culture. But there were no tools that allowed you to ask basic questions like “How often did people talk about democracy?” and “When did people get interested in feminism?” At a certain moment, when the Google Books data became sufficiently large, when we had books in sufficient quantities, we started to see statistically significant counts for different concepts, like separation of church and state. And all of a sudden it became possible to have serious quantitative, statistical conversations about where concepts came from. Once you could do it, it was impossible to stop.
[Michel and I] had the experience of working on this stuff and figuring this stuff out. It was a big adventure and incredibly exciting and fun. We wanted to tell the story of our big data adventure, if you will, and to highlight the extent to which it has become possible for a couple of random people to do amazing things because of the power of the information infrastructure that is emerging around us. We felt this was a story that would benefit from being told.
— As told to Rachel Deahl (Interview has been edited and condensed.)
FOR MORE: Aiden’s book, Uncharted: Big Data as a Lens on Human Culture, comes out December 26.