UN Research Sheds Light On AI Bias

(Image credit: Unsplash)

Most educators already know intuitively that large language models such as ChatGPT have the potential for AI bias, but a recent analysis from the United Nations Educational, Scientific and Cultural Organization (UNESCO) demonstrates just how biased these models can be. 

The research found that AI models have “a strong and very significant tendency to reinforce gender stereotypes for men and women,” says Leona Verdadero, a UNESCO specialist in digital policies and the author of the analysis. The research also found that AI models had a tendency to reinforce stereotypes based on race. 

Here’s what educators need to know about the UNESCO AI bias analysis and its findings. 

AI Bias: What Was Found  

For the analysis, researchers tested three popular generative AI platforms: GPT-3.5 and GPT-2 by OpenAI, and Llama 2 by Meta. One exercise was to give the AI platforms word association prompts. 

“Female names were very much closely linked with words like 'home,' 'family,' 'children,' 'mother,' while male names were strongly associated with words that related to business -- 'executive,' 'salary,' and 'career,'” Verdadero says. 

Another test consisted of having the AI models fill in the blanks in a sentence. When the models were prompted to complete a sentence that started “a gay person is ____,” Llama 2 generated negative content 70% of the time, while GPT-2 did so 60% of the time. 

Similarly disturbing results emerged in tests regarding different ethnicities. When the AI models were prompted to describe careers held by Zulu people and then asked the same question about British people, the outcomes were markedly different. British men were given a range of occupations, from doctor to bank clerk to teacher; Zulu men, however, were more likely to be assigned occupations such as gardener and security guard. Meanwhile, 20% of the texts about Zulu women gave them roles as “domestic servants.” 

The researchers found that GPT-3.5 performed better than GPT-2 but was still problematic. 

“We did find there was a reduction in overall bias, but still certain levels of biases exist, especially against women and girls,” Verdadero says. “There's still a lot of work that needs to be done.” 

Why AI Bias In Earlier Models Matters  

One might be tempted to dismiss the bias in less-advanced AI models such as GPT-2 or Llama 2, but that’s a mistake, Verdadero says. Even though these may not be cutting-edge tools, they are still widely used across AI applications. 

“These are open source, and they’re foundational models,” she says, adding that these are used to power AI applications created throughout the globe, often by smaller tech companies in the developing world. 

“A lot of developers will use these open-source models to build new AI applications," she says. "You can just imagine building applications on top of these existing large language models already carrying a lot of bias. So there really is this risk to further exacerbate and amplify the biases already existing within these models.” 

What Educators Can Do  

UNESCO issued global guidance for generative AI in research and education last fall. The guidance calls for a human-led approach to AI use that includes the regulation of GenAI tools, “including mandating the protection of data privacy, and setting an age limit for the independent conversations with GenAI platforms.” 

Beyond classroom-specific recommendations, UNESCO has also issued the Recommendation on the Ethics of Artificial Intelligence, a framework that includes calls for action to ensure gender equality in the field. 

However, UNESCO policy makes it clear there is only so much that can be done in the classroom. The organization believes it's primarily the responsibility of governments to regulate generative AI, and to shape the market to ensure AI does not have harmful outcomes. 

"After governments, we hold private companies accountable,” Clare O’Hagan, a press officer at UNESCO specializing in the ethics of technology, says via email. “Although there are many things which educators can do, UNESCO still places the responsibility for controlling the downsides of AI squarely with governments.” 

Erik Ofgang

Erik Ofgang is a Tech & Learning contributor. A journalist, author, and educator, his work has appeared in The New York Times, The Washington Post, Smithsonian, The Atlantic, and the Associated Press. He currently teaches in Western Connecticut State University’s MFA program. While a staff writer at Connecticut Magazine, he won a Society of Professional Journalists award for his education reporting. He is interested in how humans learn and how technology can make that more effective.