It will be ironic if a high-profile lawsuit against Harvard ends up overthrowing race-conscious admissions at colleges across the United States. Americans have been fighting over the issue for more than half a century, but the decisive legal blow may come just as the technology to make affirmative action more effective is finally at hand.
Technologies such as data mining and machine learning have made us all more visible. Computer algorithms can infer intimate details about a person, from race to political orientation to personality characteristics, by looking at a plethora of other details in that person’s online behavior. This kind of technology is already changing the way colleges market themselves to prospective students. Recruiters can target underrepresented niches — say, middle-class Latino athletes who like classical music — with greater precision than ever before.
Someday soon, data science could help colleges sort through applications as well. Schools could make decisions based not only on the information students provide on their applications, but also on rich data profiles that bring in a student’s socioeconomic background and a host of other variables.
“The technology is there if you sort it out the right way,” said former Boston University communications professor and veteran media and marketing analyst John Carroll, who gives talks on targeted marketing. “If Facebook can do it, then I think MIT can do it.”
Like other selective schools, Harvard tries to admit a class of students who are more representative of the overall population than a class chosen by GPAs and test scores alone would be. But the legal landscape for affirmative action has been murky: the Supreme Court has held that admissions officers can use a student’s race as a factor, but not the deciding factor in admissions; and while quotas are forbidden, highly selective universities have been free to evaluate applicants in a more holistic way.
When deluged with tens of thousands of applications, admissions officers may struggle to take in the big picture all at once. Fatefully, Harvard asked application readers to break applicants down according to a small number of traits, from academic achievement to athletic potential. The plaintiffs accuse the school of holding down the number of Asian-American students, not least by consistently giving them lower ratings for their subjective personal qualities.
Critics of the current system might not be reassured if human admissions officers instead outsourced admissions decisions to powerful computer algorithms. But computers are more adept than humans at making yes-or-no, up-or-down decisions based on dozens or hundreds of variables at once — and could provide a way for universities to build diversity without considering race directly.
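To make that concrete: here is a minimal sketch, with synthetic data and invented variables rather than any school’s actual system, of how a single statistical model can weigh dozens of applicant variables at once:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-in for past admissions records: 1,000 applicants with
# 50 variables each (grades, test scores, essay ratings, and so on),
# plus a 0/1 label for whether each was admitted.
X_past = rng.normal(size=(1000, 50))
y_past = (X_past @ rng.normal(size=50) > 0).astype(int)

# Fit a model that weighs all 50 variables simultaneously.
model = LogisticRegression(max_iter=1000).fit(X_past, y_past)

# Score a new pool of 20,000 applicants in one pass: each applicant
# gets an admit probability computed from every variable at once.
X_new = rng.normal(size=(20000, 50))
admit_scores = model.predict_proba(X_new)[:, 1]
print(admit_scores[:5])
```

The point of the sketch is only that the weighing happens in one pass over every variable for every file, something no human reader can do for tens of thousands of applications.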
***
Like any other business, colleges have started using tracking programs to follow their customers around the Internet. Click on a link for admissions information on a university website, and you may find yourself getting e-mails promoting the school’s financial aid programs.
Even just signing up to take the SAT can flood a student’s virtual and analog inboxes with promotional materials from universities that buy contact names from testing companies. “In many ways data mining shifts the admissions process from a push dynamic to a pull dynamic,” Carroll said.
Universities can also target students of color in their marketing, rather than waiting for them to apply. Facebook, for example, offers advertisers the option to reach specific users Facebook has sorted into what it calls “racial affinity clusters.”
Dipayan Ghosh, a former privacy and public policy adviser at Facebook, says the company sorts users into these clusters based on a mountain of data comprising a user’s interests, patterns of online behavior, and social connections. “It’s based on everything,” he said.
The data Facebook gathers is not always perfect — Ghosh is not African-American, though Facebook thought he was. But it’s “more accurate than you might think,” he said. And data brokers such as Experian or Oracle have access to even more complex webs of data about individuals than Facebook does — they can build a detailed dossier on an individual based on their online history — and all that information is for sale. “You can infer race with a very high confidence if you know the set of factors that a data broker might know,” Ghosh said.
That means that not only could colleges tailor ads to students of color, they could even find out an applicant’s race without ever asking the applicant to disclose that information — by asking data brokers instead.
There’s already a long history, of course, of colleges trying to diversify their student bodies without asking about race. In California, for example, where race-based affirmative action was outlawed in 1996, schools spent years trying to find alternatives. They tried using other categories — such as a student’s socioeconomic status, neighborhood, or parental education level — that were associated with race, hoping that if they admitted more students who were disadvantaged in these areas, they’d also end up with more students of color.
“It’s a good idea, and it doesn’t work,” said Gary Orfield, co-director of UCLA’s Civil Rights Project.
Over the last two decades, Orfield said, “UCLA spent hundreds of millions of dollars trying to do everything they could think of” as an alternative to race-based affirmative action, including recruiting economically disadvantaged students and accepting the top 10 percent of students from underperforming high schools. “And nothing has worked very well,” Orfield said. “Race is not the same as anything else.”
UCLA’s efforts “made a difference,” Orfield said, “but not nearly enough of a difference.”
Natasha Warikoo, an associate professor who studies affirmative action at Harvard’s School of Education, agreed. “Ultimately,” she said, “most of the research suggests that the best way to achieve racial diversity is — surprise! — to consider race.”
OK, so maybe looking at just a few factors that tend to correlate with race is too crude — like looking at a low-res image. What if the picture is more detailed? Mark C. Long, a professor of public policy at the University of Washington, tried looking at 195 characteristics of a group of 10th graders who had been surveyed for a US Department of Education study. He found he was able to predict a student’s minority status with 82 percent accuracy (the most reliable indicator was the race of a student’s three best friends).
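For the technically inclined, a rough sketch of that kind of prediction, using synthetic stand-in data rather than Long’s actual survey responses (and a generic logistic-regression model, not necessarily his method), might look like this:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n_students, n_features = 5000, 195  # Long examined 195 survey characteristics

# Synthetic stand-in for survey responses (in the real study, these included
# items like the race of a student's three best friends).
X = rng.normal(size=(n_students, n_features))
y = (X @ rng.normal(size=n_features) > 0).astype(int)  # 1 = minority status

# Train on part of the data, then check accuracy on held-out students.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(f"held-out accuracy: {accuracy_score(y_test, clf.predict(X_test)):.2f}")
# Long's 82 percent figure comes from his real data, not from this toy model.
```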
“To ward off the adverse effects of using an imperfect predictor of race, the university could seek to obtain additional information on students to help predict their [racial] status,” he wrote. “The university may want to . . . follow the path of private businesses that try to predict the characteristics of their customers.”
Moreover, schools could combine such inferred racial information with other indicators of socioeconomic status to identify the most disadvantaged students.
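How might that combination work? One hypothetical approach, with weights invented purely for illustration and not drawn from any real admissions office, is a composite “disadvantage index”:

```python
import numpy as np

def disadvantage_index(p_minority, family_income, parental_education_years,
                       neighborhood_poverty_rate):
    """Higher index = more disadvantaged. All inputs are per-applicant arrays.

    The weights and scaling constants are hypothetical, chosen only to
    illustrate how an inferred race probability could be blended with
    socioeconomic indicators.
    """
    income_score = 1 - np.clip(family_income / 150_000, 0, 1)
    education_score = 1 - np.clip(parental_education_years / 20, 0, 1)
    return (0.4 * p_minority                 # inferred, e.g. by a model like Long's
            + 0.3 * income_score
            + 0.2 * neighborhood_poverty_rate
            + 0.1 * education_score)

# One applicant: 80% inferred minority probability, $35,000 family income,
# parents with 12 years of schooling, 25% neighborhood poverty rate.
print(disadvantage_index(np.array([0.8]), np.array([35_000]),
                         np.array([12]), np.array([0.25])))  # ~[0.64]
```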
***
Even so, there are still a few big problems. First of all, Long notes, buying such data from brokers is costly, and may be prohibitively so. But there are also ethical, and even legal, issues.
Warikoo said students just shouldn’t have to assume that everything they do online will factor into getting into college. “Young people are constantly on social media,” she said. “If you’re telling me everything is going to be watched — I think that’s dangerous and harmful.”
But a lot also depends on how broadly race-based affirmative action is forbidden. Would Harvard be prohibited only from taking race into account when choosing among applicants, or also from making any special overtures to encourage members of underrepresented minorities to apply? If schools are forbidden to ask applicants about their race, will they also be forbidden to infer that information from other data?
If colleges just aren’t allowed to ask about race, using this kind of data as a “proxy” for race would presumably be kosher. But if universities are forbidden from considering race in admissions altogether, the law might also forbid considering proxy information that correlates with race.
“You’re at the mercy of the judicial system in terms of whether they view this as a legit alternative,” said Carroll.
The irony is that this kind of proxy-information strategy has been used before in American history — by racist government officials, to get around laws forbidding racial discrimination. From the 1890s to the 1960s, for example, state governments across the South used literacy tests to block African-Americans from voting, knowing that they were less likely to have had access to a basic education — a kind of primitive algorithm that went unchallenged for too long.
“I think it’s often easier to maintain seemingly race-neutral policies that disadvantage already disadvantaged groups than the opposite,” Warikoo said. “It’s because of who is in power and who makes decisions. That is how power works.”
That, at least, is not likely to change — no matter how smart our algorithms get.
S.I. Rosenbaum can be reached at si@arrr.net. Follow her on Twitter @sirosenbaum.