AI data streams.
AI models are trained on biased data: will AI ever be unbiased?
As a user, I have lately been wondering about the fairness of the responses that come from conversational artificial intelligence apps. These AI models can exhibit biases that might be perceived as racist, since they are trained on text data scraped from the internet and other sources. That data inevitably reflects the biases present in the real world, including racism. If the training data contains racial, gender, and other biases, the machine learning system might pick them up and perpetuate them in its responses.
Developers should curate more diversified data for training large language models (LLMs). By including data from various regions, languages, demographics and cultures, they can create AI algorithms that rely on a more balanced training data set.
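What a “more balanced training data set” can mean in practice is easiest to see in code. Below is a minimal Python sketch, assuming a hypothetical corpus of documents tagged with metadata such as language or region; production LLM pipelines use far more sophisticated sampling and weighting strategies.

```python
import random
from collections import defaultdict

def balance_by_group(documents, key="language", seed=0):
    """Naively rebalance a corpus so that every group (language, region,
    demographic...) contributes the same number of documents."""
    random.seed(seed)
    groups = defaultdict(list)
    for doc in documents:
        groups[doc[key]].append(doc)

    target = min(len(docs) for docs in groups.values())  # size of the smallest group
    balanced = []
    for docs in groups.values():
        balanced.extend(random.sample(docs, target))      # downsample larger groups
    random.shuffle(balanced)
    return balanced

# Hypothetical corpus: each document carries metadata about its origin.
corpus = [
    {"text": "First document", "language": "en"},
    {"text": "Second document", "language": "en"},
    {"text": "Deuxième document", "language": "fr"},
    {"text": "Hati ya kwanza", "language": "sw"},
]
training_set = balance_by_group(corpus, key="language")  # one document per language here
```

Downsampling is only one crude option; teams can also upsample underrepresented sources or reweight them during training, and each choice comes with its own trade-offs.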
AI algorithms need to incorporate data from a wider, more diversified range of sources and perspectives. Today, they still struggle with tasks that require a nuanced understanding of context and social issues. In fact, when it comes to sensitive topics, the model might not fully grasp the relevant historical and cultural factors.
By transparently explaining how LLMs arrive at their outputs, developers can help users identify potential flaws in the reasoning process and ultimately reduce bias.
Google's first Photos algorithm was biased too
Prabhakar Raghavan on pausing Gemini's Imagen.
After the pausing of Gemini's image generation of people, a new feature built on Imagen (now Imagen 2, its improved version), and the media and social media storm that followed, Prabhakar Raghavan, Senior Vice President at Google, released a statement on February 23rd. He points out what went wrong and caused ahistorical portrayals of people that ignored their context. “First, our tuning to ensure that Gemini showed a range of people failed to account for cases that should clearly not show a range. And second, over time, the model became way more cautious than we intended and refused to answer certain prompts entirely — wrongly interpreting some very anodyne prompts as sensitive.”
The unfair outcomes show that AI algorithms are inherently biased
The importance of using fairer and more diversified data to train LLMs emerges here. Mitigating AI bias is a multidisciplinary process in which social scientists and computational linguists play a vital role. There certainly are significant social and cultural dimensions of data and technology to consider. By feeding the model prompts related to sensitive topics, scholars can assess the fairness of AI systems.
Researchers are creating methods to evaluate bias in LLMs. Fabio Motoki, a lecturer at the University of East Anglia in Norwich, England, and his team asked ChatGPT to impersonate someone from a given side of the political spectrum and compared the answers with its default responses. Their tests show a strong political bias in ChatGPT, which appears sympathetic to left-leaning political ideology. Motoki warns: “There’s a danger of eroding public trust or maybe even influencing election results.”
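To illustrate the general idea of such evaluations, here is a simplified Python sketch, not Motoki's actual protocol: the same questions are asked with and without an imposed persona, and the default answers are compared against each persona's answers. The `ask_model` callable, the questions, and the agreement metric are all invented placeholders.

```python
QUESTIONS = [
    "Should the minimum wage be raised? Answer Agree or Disagree.",
    "Should taxes on large corporations be increased? Answer Agree or Disagree.",
]

PERSONAS = {
    "left": "Answer as a strong supporter of left-wing policies.",
    "right": "Answer as a strong supporter of right-wing policies.",
}

def agreement_with_default(ask_model):
    """ask_model: a callable that sends a prompt to the chatbot under test
    and returns its reply as a string (placeholder for a real API call)."""
    default_answers = [ask_model(q) for q in QUESTIONS]
    scores = {}
    for side, instruction in PERSONAS.items():
        persona_answers = [ask_model(f"{instruction}\n{q}") for q in QUESTIONS]
        matches = sum(d.strip().lower() == p.strip().lower()
                      for d, p in zip(default_answers, persona_answers))
        scores[side] = matches / len(QUESTIONS)
    return scores

# If scores["left"] is consistently higher than scores["right"] over many
# questions and repeated runs, the default persona leans left under this
# (very crude) agreement metric.
```

Careful studies repeat each question many times to account for randomness in the model's answers and rely on established questionnaires rather than ad-hoc items.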
Gemini output revealing socio-economic profiling.
Algorithmic bias reflects human biases
Users discovered that asking ChatGPT to ignore its guardrails¹ resulted in responses with questionable, and sometimes overtly discriminatory, language.
Participation on the internet, which reflects systemic inequalities across the world’s population, does not favour the representation of minority data. The presence of biased algorithms means that existing inequalities may be amplified, potentially with disastrous consequences.
Current practices create a feedback loop that reduces the impact of data from underrepresented populations and overly aligns with existing power structures. This bias-amplification phenomenon stems from the nature of the data sets that LLMs are fed, for example traditional media or the internet and social media. AI simply replicates the biases present in those datasets.
Fine-tuning methods could be used to retrain LLMs, with careful curation practices to identify the right data to capture re-framings. Yet data curation biases, just like algorithmic biases, can still arise from the personal biases of their creators.
Making the model give more weight to minority data is itself a political choice.
Gemini output about data diversification.
The sources of bias in AI: learning algorithm bias & training data bias
The main difference between cognitive and data bias lies in the source. Cognitive bias is inherent to the learning algorithm, whereas data bias is a direct product of the information the AI is trained on.
Conversational AI systems tend to prefer certain information during training, depending on how the algorithms are designed. Under confirmation bias, the system might favour information that confirms historical, hegemonic patterns in the data: an AI trained on articles primarily written by men might overemphasise male perspectives in its outputs. We can also find anchoring bias (also called latent bias) in machine learning models, where the first information encountered on a topic carries undeserved weight. This would be the case of an AI trained on customer service interactions where complaints are frequent: negativity might prevail in the responses of the deployed AI system and eventually drive away customers.
Unlike unconscious biases in humans, which can be unintentional, data bias (also known as selection bias) is a direct result of the training data. With limited or unbalanced datasets, the responses can be distorted. In particular, limited data that lack diversity in voices, backgrounds, or experiences will invariably produce a limited understanding of the world, which can lead to stereotypes or insensitive responses towards certain groups in society. With unbalanced datasets, the AI model might also struggle to handle nuanced critiques or negative experiences.
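As a toy illustration of selection bias, consider counting how often a tiny corpus mentions different groups. The snippet below (plain Python, with invented example sentences) uses gendered pronouns as a crude proxy; it is nowhere near a real fairness audit, but it shows how skew in the data is visible before any model is trained.

```python
from collections import Counter

def term_frequencies(texts, terms=("he", "she")):
    """Count occurrences of a few indicator terms across a corpus."""
    counts = Counter({t: 0 for t in terms})
    for text in texts:
        for token in text.lower().split():
            token = token.strip(".,!?")
            if token in terms:
                counts[token] += 1
    return counts

corpus = [
    "He led the project and he presented the results.",
    "She reviewed the code.",
    "He approved the budget.",
]
print(term_frequencies(corpus))  # Counter({'he': 3, 'she': 1}): the sample skews male
```

A model trained on such a corpus has no way of knowing that the skew is an artefact of sampling rather than a fact about the world.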
Gemini output on AI bias.
Independent research on how to fix biases in AI and machine learning algorithms
The work of social scientists and computational linguists can help identify outputs that reproduce historical inequalities. While we wish for more representative and inclusive responses from large language models, Chan Park, a researcher at Carnegie Mellon University in Pittsburgh, suggests that AI responses are only going to grow more polarised. She envisages a vicious cycle in which an increasing share of the information on the internet is generated by bots: as chatbots are used more, the data they produce will be fed back into new chatbots.
With RLHF (Reinforcement Learning from Human Feedback), language models can use human feedback, which comes in the form of direct feedback from human testers and users on specific outputs a model has generated, for reinforcement learning purposes. Many of these models have built-in features for users to report biased outputs. Still, these interactions might themselves be biased (interaction bias).
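For readers curious what “learning from human feedback” looks like numerically, here is a minimal PyTorch sketch of the pairwise reward-modelling objective commonly used in RLHF pipelines. The reward model, feature vectors and batch size are all invented for illustration; real systems are far more elaborate.

```python
import torch
import torch.nn.functional as F

def preference_loss(reward_model, chosen_batch, rejected_batch):
    """Pairwise (Bradley-Terry style) objective: push the reward of the
    human-preferred response above the reward of the rejected one."""
    r_chosen = reward_model(chosen_batch)      # shape: (batch,)
    r_rejected = reward_model(rejected_batch)  # shape: (batch,)
    return -F.logsigmoid(r_chosen - r_rejected).mean()

# Toy usage with a stand-in "reward model" on random feature vectors.
linear = torch.nn.Linear(16, 1)
reward_model = lambda x: linear(x).squeeze(-1)
chosen, rejected = torch.randn(8, 16), torch.randn(8, 16)
loss = preference_loss(reward_model, chosen, rejected)
loss.backward()  # gradients flow into the reward model's parameters
```

The trained reward model then serves as the reward signal for a reinforcement-learning step that nudges the chatbot towards responses humans rated more highly; user reports of biased outputs can feed into exactly this kind of preference data.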
Reinforcement learning (RL) is a type of machine learning in which an intelligent agent learns to make decisions from reward signals; in RLHF those signals are derived from human preferences, so the model can respond more sensitively across different contexts. Yet the idea of AI with agency, able to act without oversight, could have harmful consequences.
Yoshua Bengio — Full Professor at Université de Montréal, Founder and Scientific Director of Mila - Quebec AI Institute, co-Director of CIFAR Learning in Machines & Brains programme and Scientific Director of IVADO — warns against the risks of “unwittingly designing potentially rogue AIs, because of the problem of AI alignment (the mismatch between the true intentions of humans and the AI’s understanding and behaviour) and the competitive pressures in our society that would favour more powerful and more autonomous AI systems.” He adds that good regulatory policy is required to minimise the probability of these harmful scenarios as AI becomes more powerful.
Paul Christiano quit OpenAI to found the Alignment Research Center (ARC) and pursue an approach called “eliciting latent knowledge” (ELK), which aims to force AI models to reveal everything they “know” about a situation, even when they might normally be incentivised to lie or hide information.
Beth Barnes, after brief periods at Google DeepMind and OpenAI, now leads METR (formerly ARC Evals). She conducts model evaluations, pressure-testing large language models from OpenAI and Anthropic for dangerous capabilities, such as acquiring resources, creating copies of themselves, and adapting to novel challenges they encounter in the wild.
But again, how does the research of linguists and social scientists really influence the work of developers?
What users can do to fight such biases
Beyond providing human feedback through the like/dislike buttons, as users we can encourage data diversity in AI by contacting policymakers. The idea is to start by familiarising ourselves with existing and proposed policies around AI development in our region, and eventually to write to our local, state, and national representatives expressing our concerns. We can suggest what we would like our representatives to do, such as supporting legislation that promotes responsible AI development or holding hearings on the issue.
Gemini output on current affairs.
For example, in Luxembourg, we can reach out to the National Commission for Data Protection (CNPD). The CNPD is an independent body through which we can file a complaint or an enquiry. Alternatively, we can engage with our city council members (conseillers communaux). Volunteering with organisations that fight for responsible AI development and data fairness is another option.
We can also document our experiences and share them with media outlets. Still, even submitting well-researched comments to the responsible institutions feels like a drop in the ocean that will go unnoticed.
¹ Guardrails: The guardrails of a specific AI model define its terms of governance, development, deployment and use.