In a refrain that feels almost entirely too familiar by now: Generative AI is repeating the biases of its makers.
A new investigation from Bloomberg found that OpenAI’s generative AI technology, specifically GPT 3.5, displayed preferences for certain races in questions about hiring. The implication is that recruiting and human resources professionals who are increasingly incorporating generative AI-based tools into their automated hiring workflows (LinkedIn’s new Gen AI assistant, for example) may be promulgating racism. Again, sounds familiar.
The publication used a common and fairly simple experiment: feeding fictitious names and resumes into AI recruiting software to see just how quickly the system displayed racial bias. Studies like these have been used for years to spot both human and algorithmic bias among professionals and recruiters.
“Reporters used voter and census data to derive names that are demographically distinct — meaning they are associated with Americans of a particular race or ethnicity at least 90 percent of the time — and randomly assigned them to equally-qualified resumes,” the investigation explains. “When asked to rank those resumes 1,000 times, GPT 3.5 — the most broadly-used version of the model — favored names from some demographics more often than others, to an extent that would fail benchmarks used to assess job discrimination against protected groups.”
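The quoted passage does not name the benchmarks it refers to. One widely used yardstick in U.S. employment law is the "four-fifths rule," under which a group is flagged when its selection rate falls below 80 percent of the most-selected group's rate. Assuming that rule, a check on how often each group's resume was ranked first might look like the sketch below. The function names, group labels, and counts are all made up for illustration and are not Bloomberg's data or code.

```python
def impact_ratios(top_counts, trials):
    """Return each group's top-rank rate divided by the highest group's rate."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    rates = {group: count / trials for group, count in top_counts.items()}
    best = max(rates.values())
    if best == 0:
        raise ValueError("no group was ever ranked first")
    return {group: rate / best for group, rate in rates.items()}


def flag_adverse_impact(top_counts, trials, threshold=0.8):
    """List groups whose ratio falls below the threshold (0.8 = four-fifths)."""
    ratios = impact_ratios(top_counts, trials)
    return sorted(group for group, ratio in ratios.items() if ratio < threshold)


# Hypothetical counts: how many times out of 1,000 rankings each group's
# resume came out on top. These numbers are invented for this example.
example_counts = {"group_a": 300, "group_b": 270, "group_c": 230, "group_d": 200}

flagged = flag_adverse_impact(example_counts, trials=1000)
print(flagged)
```

With these invented counts, the two least-selected groups fall below the 80 percent line relative to the most-selected group, which is the kind of gap the rule is meant to surface.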
The experiment sorted names into four racial and ethnic categories (White, Hispanic, Black, and Asian) and two gender categories (male and female), and submitted them for four different job openings. ChatGPT consistently placed “female names” into roles historically aligned with higher numbers of women employees, such as HR roles, and chose Black women candidates 36 percent less frequently for technical roles like software engineer.
ChatGPT also ranked equally qualified resumes unequally across the jobs, skewing rankings depending on gender and race. In a statement to Bloomberg, OpenAI said this doesn’t reflect how most clients incorporate its software in practice, noting that many businesses fine-tune responses to mitigate bias. Bloomberg’s investigation also consulted 33 AI researchers, recruiters, computer scientists, lawyers, and other experts to provide context for the results.
The report isn’t revolutionary among the years of work by advocates and researchers who warn against the ethical debt of AI reliance, but it’s a powerful reminder of the dangers of widespread generative AI adoption without due attention. As just a few major players dominate the market, and thus the software and data building our smart assistants and algorithms, the pathways for diversity narrow. As Mashable’s Cecily Mauran reported in an examination of the internet’s AI monolith, incestuous AI development (or building models that are no longer trained on human input but other AI models) leads to a decline in quality, reliability, and, most importantly, diversity.
And, as watchdogs like AI Now argue, “humans in the loop” might not be able to help.