Less than seven years before ChatGPT emerged as the fastest-growing app ever, Nvidia chief executive Jensen Huang gave one of the first public demonstrations of the technology that would come to be known as generative artificial intelligence.
It was April 2016, and as Huang showed off Nvidia’s program, which could create realistic images in the style of Romantic-era oil paintings from a simple text prompt, he sought to share the credit.
“This is from Yann LeCun’s laboratory,” Huang said, mentioning the pioneering New York University AI researcher who also worked for Facebook (now Meta). “They’ve made incredible progress and the results are actually quite surprising.” The keynote was the annual debut of new products from Nvidia, the giant chipmaker that recently hit $1 trillion in market value, thanks to the AI boom.
But back in Boston’s Leather District, in a startup office above the wine bar Les Zygomates, a group of former students from Needham’s Olin College of Engineering watching the event online were stunned. The “incredible progress” had been made not by LeCun’s lab, but by two members of the students’ startup, called Indico. Alec Radford and Luke Metz were the primary authors of the research on teaching a computer to create original work by learning from a large collection of prior works. A member of LeCun’s team, Soumith Chintala, had acted as a mentor in getting the work published.
The oversight stung — a reminder to Indico’s young staff that while the Boston area was once the cradle of AI research, it had lost the lead, surpassed by tech giants on the West Coast and academic labs in other parts of the country.
“It just gutted us,” said Slater Victoroff, cofounder of Indico and Radford’s close friend from Olin. “The idea that Boston could produce anything of value in the space was so unbelievable that people could not accept it.”
Shortly after the demo, Radford decamped to the Bay Area to work for OpenAI, the Microsoft-backed company behind ChatGPT. At the same time, Metz moved west to work for Google, then jumped to OpenAI last year. The Nvidia slight “was a really big reason why Alec left,” Victoroff said. “Because even when you’re doing the best work in the world, you can’t stay here.”
It’s a familiar lament, and also a fact: Some of the region’s top tech minds leave and build big companies elsewhere. Today, the biggest breakthroughs in AI are happening in Silicon Valley, with groups in Toronto, Montreal, and New York also contributing. Despite New England’s legacy of inventing the very term “artificial intelligence,” institutions around Boston have not been credited with any of the major technologies fueling what could be the next trillion-dollar industry.
Radford and Metz did get their start here, emerging from tiny Olin College, which has only about 400 students and gets nowhere near the recognition of its tech peers at institutions such as Harvard and MIT.
The two did not comment for this story, despite multiple inquiries from the Globe. Nvidia declined to comment, and a Meta spokeswoman referred questions to Chintala, who currently is an engineering fellow at the company.
Chintala has tried to clear up the misunderstanding about the demo from the start. “The technology DCGAN was largely developed by Indico, with me helping as an advisor in the process,” he said in an e-mail to the Globe. DCGAN, short for “deep convolutional generative adversarial networks,” refers to the AI model Radford and Metz developed that can train a program on a body of unlabeled data and create original works.
Graham Brooks, a partner at Boston VC firm .406 Ventures, which backed Indico, has a different explanation for why Radford went west. And it has to do with money and resources.
Radford’s research — along with that of others in the field — required building ever more complicated AI models that needed to be trained on millions and even billions of documents, Brooks said. (OpenAI’s GPT-2 model was trained on 8 million Web pages and GPT-3 on more than 400 billion words.) That required the kind of computing power and resources only available to the West Coast tech giants.
“The path he was heading down was one that was going to require billions of dollars to pay for training and core research,” Brooks said.
While Indico continued to focus on crafting data analytics software for insurance companies and other corporate customers, Radford was now free to pursue more fundamental ideas, which eventually led to the development of ChatGPT. Joining OpenAI was “a role kind of similar to joining a graduate program,” he explained in a 2016 interview for his Dallas high school’s magazine.
Radford was the lead author on the company’s 2018 paper that laid the groundwork for the GPT models — it included the phrase “generative pre-training,” the “GPT” in ChatGPT. Radford has been listed as a coauthor on at least 25 other papers and presentations in Google Scholar, the online search engine for academic papers, since he joined OpenAI.
OpenAI president and cofounder Greg Brockman has credited Radford for the initial breakthrough behind ChatGPT. “We really liked Alec so we were very supportive of him to do whatever he wanted,” Brockman said during a talk at the tech and music conference South by Southwest this year.
Radford, Victoroff, and Metz met back in 2011 at Olin College, which is known for letting its students pursue their passions in science and engineering. Radford, an Eagle Scout who grew up in Texas, and Victoroff, from Northern California, in particular were obsessed with machine learning, a process by which computers churn through reams of data to recognize patterns, draw conclusions, and even create original works.
“They were definitely each people who came in with some pretty clear skills and interests,” Olin professor Lynn Andrea Stein recalled. “At Olin, a lot of what we do is allow students to customize their paths.”
Both night owls, Radford and Victoroff started hanging out after 3 a.m. and bonding over pineapple and onion pizza. Alongside their school work, they entered competitions on a website called Kaggle. The site, now owned by Google, offers challenges such as writing software to predict students’ test performance based on their video game results, or turning two-dimensional images into three-dimensional images.
In 2012, a groundbreaking development in AI caught their attention. At the time, much of the field was focused on building systems in which researchers wrote and tweaked the underlying code of apps themselves. An emerging approach, known as deep learning, used multiple layers of neural networks — software pathways inspired by the design of the human brain — to make sense of data such as images. That was an improvement on earlier neural networks that relied on using more computing power to crunch larger sets of data. In September of 2012, a deep learning system known as AlexNet from a team at the University of Toronto started winning image-identifying competitions.
“Alec showed me that, hey, deep learning has actually reached a critical point,” Victoroff said. “So there we were, two kids in a dorm room pounding away on deep learning.”
Their success at the online competitions started to attract corporate customers looking for help, so the pair formed Indico in their dorm room with two other students, Diana Yuan and Madison May. They won backing from venture capital firm General Catalyst’s Rough Draft program and the Techstars Boston startup accelerator program, dropped out of Olin, and moved to an office in downtown Boston. Metz joined in 2015, after he graduated.
Radford and Victoroff liked to work deep into the night, and the group strung hammocks from wooden columns along the long office hallway. Indico created its own deep-learning language models and used them to help customers such as insurance companies analyze underwriting results.
Victoroff, who was less interested in pure research and loved living in Boston, stayed with Indico, which is now headquartered near City Hall. He’s also formed a startup called Mythica to use generative AI in video games.
To be sure, other important breakthroughs contributed to the development of ChatGPT. A 2017 paper by Google researchers offered a major advance in simplifying the way machine learning models were trained. And it was Yann LeCun, together with Yoshua Bengio from the University of Montreal and Geoffrey Hinton from the University of Toronto, who won a Turing Award, sometimes called the Nobel Prize of computer science, in 2018 for their foundational work on machine learning.
But the contributions of the former Olin College students should not be overlooked. Nor should what their experience says about Boston’s place in the AI world.
“It helped that they were smart, determined, and at a place like Olin,” Stein said. “Luck is a huge piece of it too. They were in the right place at the right time.”