The latest artificial intelligence systems can create original works of art or write high-quality essays without human assistance. But are they smart enough to make sense of cutting-edge scientific research?
A pair of former college football players turned entrepreneurs say they’ve cracked the code, with a new search engine designed to make complex academic papers accessible to anybody.
Their Boston company, Consensus, uses artificial intelligence to scour millions of academic papers and identify the key ideas contained in each of them. The site features a Google-like search service where users can type in questions like, “do masks reduce the spread of COVID?” In seconds, the screen displays a list of scientific papers on the subject, along with a sentence summarizing the key findings of each one.
“We really built this for people like ourselves,” said Christian Salem, who played backup quarterback for Northwestern University in suburban Chicago from 2012 to 2016 and graduated with a degree in economics. “We are technology professionals who secretly wish that we were scientists . . . but don’t have the time or attention span to read scientific papers.”
Cofounder Eric Olson, a native of Sudbury, played right tackle at Northwestern and earned a master’s degree in predictive analytics. Olson later worked as a data scientist at Boston-based sports betting company DraftKings, while Salem became a product manager for the National Football League.
Olson and Salem aren’t the first to try their hand at a science search engine. Since 2015, Seattle’s Allen Institute for AI has operated Semantic Scholar, a website that summarizes the contents of academic research papers.
And in mid-November, tech titan Meta, the parent of Facebook, launched a public beta version of a tool called Galactica that had been trained to summarize 48 million academic papers. But Galactica was shut down after just two days because its results were often meaningless gibberish.
“At launch, we outlined the limitations that come with large language models such as Galactica, including the potential for it to generate inaccurate and unreliable output,” Meta said in a statement. “Given the propensity of large language models such as Galactica to generate text that may appear authentic, but is inaccurate . . . we chose to remove the demo from public availability.”
Carl Bergstrom, a professor of biology at the University of Washington who gave it a try, said the Galactica AI would often crank out answers that seemed believable but were dead wrong. “If you give it something it doesn’t know,” Bergstrom said, “it just makes up stuff out of whole cloth.”
Instead of trying to summarize academic papers, Consensus simply identifies and highlights key findings. Bergstrom said the results are far more useful. “I really, really, really dislike Galactica,” he said, “and I like this.”
Olson and Salem had been thinking about a scientific search engine for years. But they got serious about it when they saw how readily a traditional search engine like Google could spread false information. The problem, Olson said, is that Google’s algorithm tends to favor the most popular Internet sources, not the most trustworthy.
“Searching for what the experts think . . . is really, really, really difficult,” Olson said. “Google is just not designed to do this for us.”
Meanwhile, Olson and Salem realized that AI systems had become far more powerful in recent years and were now capable of understanding words and phrases in their larger contexts. This convinced them that an AI could be taught to break down a document into its key sections and display only the most important parts.
In 2021, the two men hired three software engineers and raised $1.3 million in funding. The biggest chunk came from Winklevoss Capital, a company founded by Tyler and Cameron Winklevoss, best known for their bitter legal feud with Meta chief executive Mark Zuckerberg over the founding of Facebook.
Olson and Salem began with a preexisting AI model that had already been trained on scientific documents written by humans. Next, they hired scientists to read and annotate about 100,000 academic papers in various fields. These papers and the scientists’ notes were then used to teach the AI to recognize specific features found in all academic papers, especially the parts that summarize authors’ conclusions.
The Consensus search engine is linked to a database of 200 million academic papers in the public domain. When a user asks a question, Consensus takes about five seconds to reply with a long list of excerpts from scientific journal articles. The system doesn’t seek to provide simple answers, but to give the user a spectrum of academic research on the topic. Ask Consensus “is nuclear power safe?” and it serves up citations from multiple papers, some supportive of nuclear power and others more skeptical. But all the citations come from reputable academic publications, rather than from opinionated amateurs.
For now, Consensus can’t access many millions of academic papers locked behind paywalls by subscription-only publishers. Salem said Consensus plans to work with the publishers to negotiate a solution. In addition, the Biden administration in August announced that all research papers produced with federal funding must be made accessible to the public starting in 2025.
Since Consensus opened for public use in September, about 15,000 people have logged on. Salem said the service gets lots of queries from health and fitness buffs, from parents in search of child-rearing advice, and from students looking for homework help. Salem and Olson plan to offer a premium version that will provide more detail about each paper, such as information about the organizations that funded the research. But for now, Olson said, “we’re completely free, and we’d like to keep part of the product free forever.”
Hiawatha Bray can be reached at hiawatha.bray@globe.com. Follow him on Twitter @GlobeTechLab.