Ideas | Swathi Meenakshi Sadagopan

Artificial intelligence’s diversity problem


Artificial intelligence is taking on all sorts of tasks that would have been hard to imagine just a few years ago: drafting legal documents, deploying police officers, even writing news articles.

But there are limits to what AI can do. If it’s going to be truly effective, not to mention fair, it’s got to have a steady human hand — a programmer who can spot problems and make the necessary adjustments.

Making those adjustments, though, isn’t just a matter of technical proficiency. Sometimes, it requires cultural competency. A forthcoming paper by Carnegie Mellon PhD student Kenneth Holstein and others makes the point.

Holstein writes about a system, designed to recognize online images and generate captions, that did a good job identifying celebrities from some countries and a not-so-good job picking out famous people from others.


“It sounds easy to just say, ‘Oh, just add some more images in there,’” said an anonymously quoted member of the team that built the system. “But. . . there’s no person on the team that actually knows what all of [these celebrities] look like. . . If I noticed that there’s some celebrity from Taiwan that doesn’t have enough images in there, I actually don’t know what they look like to go and fix that . . . But, Beyoncé, I know what she looks like.”

Much of the criticism leveled at artificial intelligence focuses on the bias that’s too often baked into its data: say, the racially motivated police stops that are fed into a machine, which then prescribes a heavier police presence in minority neighborhoods, which begets more questionable stops.

But the Holstein paper points to another, less appreciated problem: a lack of diversity in the AI teams building and overseeing systems.

Anyone watching the field closely can see it has a diversity problem. Take the controversy, last year, over the name of the annual machine learning conference on Neural Information Processing Systems.

Its unfortunate acronym, NIPS, had long invited crude jokes. The 2017 edition of the conference included T-shirts with the slogan “My NIPS are NP-hard.” There were gags about the pre-conference event Transformationally Intelligent Technologies Symposium, or TITS for short. And there was an on-stage joke about sexual assault by a member of the Imposteriors, a now-defunct band of well-known statisticians.


Learning about this on Twitter, Kristian Lum, a statistician investigating bias in machine learning at the Human Rights Data Analysis Group, wrote a blog post recounting her own experience with harassment by powerful men in the field.

The post sparked online campaigns and calls for a more inclusive environment from a number of prominent academics. Eventually, the conference was renamed NeurIPS. And there was a new focus on an open secret in the world of academic tech: gender-based harassment.

According to a report published in June 2018 by the National Academies of Sciences, Engineering and Medicine, 50 per cent of women studying science face harassment. Fewer than 18 per cent of NeurIPS conference attendees and, more broadly, only 13.5 per cent of machine learning engineers are women.

Rediet Abebe, a Cornell University PhD student who has been active in the push to diversify AI, says broader representation could head off unintended bias in AI. “[In AI applications] discrimination rarely happens through malice,” says Abebe. “A lot of the time it happens through neglect.”

There have been some signs of improvement. Hal Daumé III, a natural language processing researcher at Microsoft Research, was a vocal critic of NIPS in 2017. When asked to be one of the inaugural diversity and inclusion chairs in January 2018, he took the job along with Duke assistant professor of statistical science Katherine Heller. Together, they have worked on a stronger code of conduct, arranged for childcare support at the conference venue, printed pronoun stickers for attendees, and shared tips on social media about promoting inclusive behavior.


Several grassroots groups fostering inclusivity have emerged, too, including Black in AI, co-founded by Abebe and Timnit Gebru, a Google AI research scientist. From its beginnings as an email list, Black in AI has grown to more than 1,000 members in under two years and serves to amplify the research conducted by black researchers through a workshop held after the main NeurIPS conference is over.

“I didn’t fit into some people’s idea of what an engineer looks like,” says Deborah Raji, an undergraduate student at the University of Toronto researching bias in commercial facial recognition systems including the one offered by Amazon. “The support and encouragement I’ve received from this group has given me the courage to not only stay, but thrive and become unapologetic in my difference.”

A lack of diversity is often written off as a “pipeline problem” — not enough female applicants or people from other underrepresented groups. Abebe recounts asking the admissions committee at Cornell University why there was only one other black student in her computer science graduate program. “We don’t get enough applications from black students,” was the response she got.

This idea was disputed, though, in a report from the National Academies of Sciences. It says organizational climate can enable the kind of harassment that drives women and minorities away, beginning at the undergraduate level and continuing through graduate programs and research positions.


In 2017, Abebe mentored 15 graduate school applicants and co-launched a program within the Black in AI community to match more than 200 students with over 80 mentors. In 2018, the number of black applicants to Cornell’s computer science department tripled from the previous year, thanks in part to this effort.

Still, some activists say there are AI systems that even the most diverse machine learning team can’t fix. University of Washington PhD student and transgender researcher Os Keyes points to Automatic Gender Recognition (AGR), which displays targeted advertisements based on gender inferred from physical appearance.

“Infrastructure [like these apps],” says Keyes, “doesn’t just reflect [societal] values — they perpetuate them. If you have a facial recognition app that determines gender, then shows ads [such as pretty dresses for women and cars for men] it is also reinforcing that gender is a binary thing, and can be physiologically inferred.”

Keyes says some transgender people use AGR, built into many popular photography apps, to test if they “pass” for the gender they identify with and to then tailor their appearance.

But the AI community, Keyes says, must be attuned to its limitations: “Technology impacts reality. Inclusive and diverse hiring is necessary, but it is also insufficient.”


Swathi Meenakshi Sadagopan is an electrical engineer and data analyst at Deloitte Canada. Currently, she is a Munk Fellow in Journalism at the University of Toronto.