Minimizing Bias in Your Use of AI
AI models are only as good as the datasets they are trained on. Make sure the AI model you are using is as unbiased as possible.
Bias is one of the biggest risks involved with the use of AI. There are a few key reasons why bias can emerge as an issue in AI systems:
Training Data – AI systems consume vast amounts of data and then use statistics to predict what word should come next in the answer they are crafting. If the data used to train AI models reflects societal biases, those biases get baked into the models' logic. The old saying "Garbage in, garbage out" applies nicely here, and it holds whether or not anyone intended the bias. For example, in 2014 Amazon began using an experimental AI recruiting tool to streamline hiring. Potential candidates were rated on a scale of 1 to 5 stars. The intent was that the company could feed the tool 100 resumes and it would rank the top five candidates, who could then be interviewed or simply hired. By 2015 the company noticed the system was heavily favoring men. The reason was simple: the model had been trained on resumes submitted to the company over the previous 10 years, and those resumes came mainly from male candidates. When analyzing resumes, the tool rejected or downgraded those that listed any type of women's sports or activities, along with those that mentioned two all-women's colleges. The output of an AI system reflects the training inputs to that system.
After engineers were unable to fix the model to the company's satisfaction, they abandoned it. Amazon maintains that it never relied exclusively on the tool's rankings when making hires. This bias was purportedly discovered in time, but what about more subtle biases? Another problem Amazon discovered in some 500 AI models it developed to assist in recruiting was that these models were trained on so-called "male-centric language," such as "executed" and "captured." Amazon now touts its use of natural language processing to help potential candidates find the most relevant role among its open positions. It acknowledges using ML to search through qualifications on resumes "without having to wait for a resume review by a recruiter" and offers assurances that all of its AI- and ML-based tools are regularly and rigorously trained and tested to meet legal requirements and exceed industry best practices. The consequences of bias when using AI, including in hiring, may only become apparent in retrospect.
Lack of Diversity - Homogeneous teams building AI systems may overlook biases affecting marginalized groups, biases the team members either aren't aware of or don't understand.
Narrow Focus - Models trained on limited data may fail to generalize fairly across populations; for example, image classifiers trained mostly on light-skinned faces perform worse on darker-skinned ones. IBM and Microsoft have both made much of their efforts to reduce the error rates of their facial recognition systems, especially for darker-skinned women.[2] A simple disaggregated evaluation, like the sketch after this list, can surface such gaps. Perhaps in time we will find it easier to exclude racial bias from AI and ML than we have from society as a whole.
Proxies and Correlations - Models may pick up on proxies or correlations in data that stand in for attributes like race, gender, or age, producing unfair bias even when those attributes are never used directly. This can be seen in the failures of Amazon's recruiting tool described above.
Complex Data Relationships - Subtle and complex relationships in data that lead to biased outcomes may be difficult to recognize and mitigate. For example, an AI system may not be able to determine whether years of experience in a field can be of equivalent value to a degree from a top university.
Optimization Tradeoffs - Removing bias sometimes requires tradeoffs against accuracy or performance, and companies may deprioritize fairness when it costs them. If a company finds an AI solution that saves money, even one that shows some bias, it might be too tempting to resist, or too easy to justify using.
Lack of Transparency - Biased outcomes can emerge from opaque model logic and data patterns that are hard to explain. Although many developers are trying to move away from the "black box" mentality that characterizes many AI models, verifying that a model is fair and unbiased still requires specialized developers or engineers.
Real-World Impact - Bias can manifest subtly through real-world use even if models seem fair in testing. Continual monitoring is key.
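To make "disaggregated evaluation" concrete, the sketch below computes a model's error rate separately for each demographic group instead of reporting one overall number. It is a minimal illustration, not any vendor's actual pipeline; the file name and the column names ("group", "label", "prediction") are hypothetical stand-ins for whatever your evaluation data actually contains.

# Minimal sketch: per-group error rates for a classifier's test results.
# Assumes a hypothetical CSV with columns "group", "label", "prediction".
import csv
from collections import defaultdict

totals = defaultdict(int)   # rows seen per group
errors = defaultdict(int)   # misclassifications per group

with open("eval_results.csv", newline="") as f:
    for row in csv.DictReader(f):
        group = row["group"]
        totals[group] += 1
        if row["prediction"] != row["label"]:
            errors[group] += 1

# A model can look accurate overall while failing badly for one group,
# so report the error rate per group rather than a single aggregate.
for group in sorted(totals):
    rate = errors[group] / totals[group]
    print(f"{group}: {rate:.1%} error rate over {totals[group]} samples")

If one group's error rate stands well apart from the others, that is exactly the kind of skew an aggregate accuracy figure hides.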
In summary, bias arises from problems with data, development practices, narrow model capabilities, and insufficient testing. Thoughtful model governance and ethical AI practices are important to recognize and reduce bias.
Avoiding Bias
Bias will likely always be an issue to guard against as we use AI. Here is a list that provides a starting point.
Demand Transparency from companies around their AI models' training data, development process, and testing procedures. Lack of transparency enables bias. Even if an AI solution appears to save time and money up front, untangling yourself and your company from a biased AI model will likely cost you more time, frustration, and money than the model ever saved. Make sure you understand what data the model was trained on, and what ongoing testing is in place.
Pay Attention to any Warnings, Disclaimers or Uncertainties evident in responses from an AI system that may indicate a lack of confidence in results, especially in the treatment or description of different user groups. In response to legal pressure, many models will give a brief warning about bias, even if they ultimately provide an answer. An AI system I routinely use with confidence gave this answer in response to the following prompt.
Prompt: How do you, Claude, avoid bias?
As an AI system created by Anthropic to be helpful, harmless, and honest, I was designed with several techniques to avoid biases:
· I was trained on a diverse dataset of internet text to build an open-domain knowledge model, rather than data that could contain harmful biases.
· My training objective and architecture aims to produce useful, factual responses rather than opinions or judgments.
· I don't have access to any personal user data that could bias my responses.
· My model was carefully evaluated across demographic groups during testing to identify any skew.
· I have filters against producing harmful, unethical or dangerous output.
· I avoid making subjective judgments, predictions or assessments about individual users or groups.
· My responses are focused on being useful, factual and safe regarding sensitive topics.
· If I lack proper context or knowledge to avoid bias, I aim to acknowledge that and refrain from speculation.
· I try to cite credible external sources rather than provide my own analysis on controversial topics.
· My creators at Anthropic are focused on AI safety, ethics and transparency as top priorities.
Overall, responsible AI design principles, extensive testing, safety procedures and ethical training practices have been applied to minimize the risk of biases in my model. Please let me know if you ever observe any concerning biases in my responses!
Provide Feedback When You Encounter Issues with Bias – it is likely that at some point you will run into bias during your use of AI. Notifying your AI administrator helps companies identify problems and improve.
Test Results for Evidence of Bias – use a variety of possible wordings of your prompts and compare the results; a simple script like the sketch after this list can make such comparisons systematic. This will help you understand how word selection in prompts can influence the generated result.
Support Initiatives or Assist in Efforts to Build More Open and Representative Datasets that can be used to train less biased AI models. Since AI models are only as good as the data they are trained on, better data is where less biased models begin.
Stay Informed – on evolving best practices, research and regulations around responsible and ethical AI principles. Be aware of legal cases that will affect the use of AI in regard to product liability, freedom of speech, constitutional rights, and other pertinent areas.
Use Experience, Judgment and Critical Thinking – there is no substitute for the wisdom that sometimes-hard-won experience brings. In larger companies, have one or more designated administrators or your legal department provide oversight of the areas where your company relies on AI. Without such oversight, AI could quietly perpetuate real-world injustice and unfairness through its biases.
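One way to act on "test results for evidence of bias," mentioned in the list above, is to script the comparison. The sketch below sends the same request with only a name changed and prints the responses side by side for review. The query_model function is a hypothetical placeholder, not a real API; replace it with the actual call for whichever AI system you use, and the names are illustrative only.

# Minimal sketch: probe an AI system with reworded prompts and compare answers.

def query_model(prompt: str) -> str:
    # Hypothetical placeholder: swap in the real API call for your AI system.
    return f"[model response to: {prompt!r}]"

# The same request, varying only a detail (here, the candidate's name) that
# should not change the substance of the answer.
variants = [
    "Write a short reference letter for Jamal, a senior accountant.",
    "Write a short reference letter for Emily, a senior accountant.",
    "Write a short reference letter for Mei, a senior accountant.",
]

for prompt in variants:
    print("PROMPT:  ", prompt)
    print("RESPONSE:", query_model(prompt))
    print()

If materially different answers come back for prompts that should be interchangeable, that is evidence of bias worth reporting to your AI administrator or the model's provider.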
The more users demand transparency, accountability, and high ethical standards, the more the AI industry will prioritize avoiding bias.