Eliminating AI Bias


Identifying AI Bias and knowing how to prevent it from occurring within the AI/ML pipeline


Evidence of AI Bias

The primary purpose of Artificial Intelligence (AI) is to reduce manual labour by using a machine’s ability to scan large amounts of data to detect underlying patterns and anomalies in order to save time and raise efficiency. However, AI algorithms are not immune to bias. AI Bias has presented itself in several forms, with some examples highlighted below:

  • In 2018, Amazon stopped using its AI Hiring Tool for being biased against women¹.
  • In July 2018, AI tools used in the US judicial system (e.g. COMPAS) were reviewed for showing racial bias when used to predict recidivism².
  • In November 2019, claims were made against Apple’s credit card for being inherently biased against women over men by providing different credit limits based on gender³.
  • In 2020, there was a rise in the number of councils in England using AI algorithms for decision-making about public welfare⁴.
  • Most recently, in September 2021, a report by Harvard Business School and Accenture found that approximately 27 million workers in the United States are filtered out of job applications by automated technology. These workers include caregivers and immigrants⁵.

As AI algorithms can have long-term impacts on an organisation’s reputation and severe consequences for the public, it is important to ensure that they are not biased towards a particular subgroup within a population. In layman’s terms, algorithmic bias occurs when an AI algorithm’s outcome lacks fairness or favours one group over another on the basis of a categorical distinction such as ethnicity, age, gender, qualifications, disability, or geographic location.

How does AI Bias occur?

AI Bias takes place when incorrect assumptions are made about the dataset or the model output during the machine learning process, which subsequently leads to unfair results. Bias can occur during the design of the project or in the data collection process, producing output that unfairly represents the population. For example, suppose a survey posted on Facebook asking about people’s perceptions of the COVID-19 lockdown in Victoria finds that 90% of Victorians are afraid of travelling interstate and overseas due to the pandemic. This conclusion is flawed because it is based only upon individuals who use social media (specifically Facebook), could include users who are not located in Victoria, and may over-represent a particular age group (e.g. those aged 20–30 over those aged 50 and above), race, or gender, as dictated by Facebook usage patterns.

To effectively identify AI Bias, we need to look for the presence of bias across the AI Lifecycle shown in Figure 1.

Figure 1: The Five Stages of the AI/ML Pipeline and associated AI bias (Image by author)

Sources of AI Bias in the AI/ML Pipeline

A typical AI/ML pipeline starts with a business problem that requires the use of data and analytics to understand the drivers of the issue. This business problem is typically converted into one or many data hypotheses. A study is then designed to determine the data that needs to be collected, the process to carry out its collection, annotation, preparation, and transformation into a format that can then be used for model development.

Let us have a look at how different types of bias can be introduced at each stage within the pipeline.

Study design and hypotheses formulation

Sampling bias occurs when the study is designed in such a manner that only a subset of the population is sampled, or the sample selected from the population is non-random (see Figure 2). This type of bias is often prevalent in surveys and online polls where only a subset of the population is sampled because they have access to the poll or have willingly completed the survey or poll, resulting in voluntary bias, a type of sampling bias.

Figure 2: Evidence of sampling bias where only individuals that have access to paper-based surveys are considered (Image by author)

Data collection, pre-processing and exploration

Once the study has been designed, data is then collected, pre-processed, and explored in the form of preliminary graphs and tables summarising the dataset.

Data is usually collected as a sample intended to represent a target population. However, due to design bias, the sample may misrepresent the population because of an imbalance in the dataset. For instance, machine learning models used in Google Photos struggled to distinguish between images of humans and those of gorillas⁶. This is because the dataset did not include a fair representation of gorilla, chimpanzee, and monkey images.

Because the minority class was under-represented, the algorithm performed well on the over-represented classes (such as pandas and poodles in this case) but poorly on the minority class (those belonging to the monkey family). The problem was further heightened by the fact that the labelling technology used in the pre-processing phase was not perfect. The resulting incorrect annotation of the dataset is a form of label bias called recall bias.

Measurement bias is a type of label bias that occurs when the training dataset differs from real-world datasets because the data has been distorted by faulty measurements. An example occurred in a project where different brands of spectrometers were used to collect wavelength data for plant samples in the training set and the test set. These spectrometers recorded data over different wavelength ranges. Moreover, the older spectrometers used for the training set did not use a light source when capturing plant sample wavelengths because of low battery power. As such, the test dataset needed to be adjusted to remove the light effect.

Exclusion bias occurs in the data cleaning step, where the Data Scientist may remove features considered to be irrelevant. For instance, suppose a supermarket’s dataset consists of sales data from both Australia and New Zealand. During the data exploration phase, it is found that 95% of the customers are from Australia. Thinking that the location field is irrelevant, the Data Scientist deletes it from the dataset. As a result, the model may not pick up on differences between Australian and New Zealand customers, such as the latter spending more on online shopping.

Model development

Before developing models, datasets are often split into training, validation, and test sets. Time-interval bias can creep in at this point if the Data Scientist selects a specific timeframe that supports their hypotheses. An example is concluding that a particular swimwear line is profitable because only sales during the summer months were considered.

Figure 3: Evidence of time-interval bias where only Australian summer months are considered when evaluating profit for swimwear line (Image by author)

Survivorship bias is another bias that occurs during the model development phase, when the Data Scientist only includes data that has ‘survived’ a selection process. A classic example is when researchers working for the Center for Naval Analyses during World War II were asked to identify the weakest spots on the military’s fighter planes. To answer this, the researchers examined only planes returning from combat missions, noting the points where bullets had penetrated the aircraft (see Figure 4). Using this information, they recommended reinforcing those precise spots. The issue with this analysis is that the sample excluded planes that did not return from combat. Those planes would have provided more useful information than the planes that returned, as they had suffered the critical damage⁷.

Figure 4: Evidence of survivorship bias where only returned fighter planes were used to analyse information on weakest spots on planes during World War II (Image by author)

Omitted variable bias takes place when Data Scientists exclude one or more relevant features from their model. This typically occurs in predictive machine learning models. An example is a model developed to predict how much nitrogen-based fertiliser should be given to wheat, barley, and canola plants. When the training set for this model was developed, only features such as the type of plant, the date the plant was planted, the date the plant yield was measured, and the UV light wavelengths emitted by the plant were considered. An XGBoost model (a gradient-boosted tree regressor), which is known to perform well on most supervised learning problems, produced a negative R-squared (goodness-of-fit) value. The problem was that the model only considered features that the stakeholders wanted to include. After some brainstorming on which other features could improve the model, additional features such as the amount of rainfall, amount of sunlight, wind speed, air temperature, and soil content were added. This resulted in a remarkable improvement in model accuracy.

Confounding bias occurs when the model contains confounders: variables that are correlated with both the response variable and the predictor variables. Confounders can significantly affect model accuracy. For example, in linear regression models, confounders can change the slope and even the direction of the regression line.
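The slope-distorting effect of a confounder is easy to demonstrate. The sketch below is a minimal simulation with made-up coefficients: the confounder z drives both x and y, so regressing y on x alone gives a badly biased slope, while including z in the regression recovers the true effect.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Hypothetical confounder z drives both the predictor x and the response y.
z = rng.normal(size=n)
x = 2.0 * z + rng.normal(scale=0.1, size=n)
y = -1.0 * x + 3.0 * z + rng.normal(scale=0.1, size=n)

# Slope of y on x alone (confounded): biased away from the true -1.0.
slope_naive = np.polyfit(x, y, 1)[0]

# Slope of x after adjusting for z (multiple regression): close to -1.0.
X = np.column_stack([x, z, np.ones(n)])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
slope_adjusted = coef[0]

print(round(slope_naive, 2), round(slope_adjusted, 2))
```

Here the naive slope even has the wrong sign (positive instead of negative), illustrating how a confounder can change the direction of a regression line.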

Model interpretation and communication

Confirmation bias (or observer bias) occurs during the model interpretation phase, where the Data Scientist interprets or looks for information that is consistent with their beliefs. For instance, a Data Scientist may assign higher weight to features that they believe are better predictors rather than basing the weighting upon model results, or may even go as far as excluding data that disagrees with their assumptions about certain features.

Funding bias occurs when the results of a model are biased in a way to support the financial sponsor of a project. It is a type of observer bias, which is the tendency to see what the observer wants to see.

This is quite often the case when communicating machine learning models to business stakeholders who are sometimes unwilling to accept the results from the model as it does not meet their expectations. For instance, the results of a customer churn model suggested that the recent marketing campaign had been ineffective at preventing customers from leaving the insurance company as it had targeted the wrong customer demographic group. This analysis was not looked upon favourably by the Head of the Marketing Department. As the results of the customer churn model did not support the funding initiative of the marketing team, the Data Scientist was asked to try different techniques and explore additional features until an “acceptable outcome” was achieved.

Cause-effect bias is the most common type of bias seen in data analytics: mistaking correlation for causation. A popular example concerns an academic in the 1980s who was researching crime rates in New York City and found a strong correlation between the amount of ice-cream sold by street vendors and crime rates. A fallacy would be committed if it were concluded that eating ice-cream leads to an increase in crime. Rather, crime rates are higher in the summer, which is also the season when ice-cream sales peak; ice-cream sales do not cause the increase in crime⁸.

Model validation, testing, and monitoring

During the model validation and testing steps, the model output is analysed. Underfitting occurs when there are not enough informative features in the model, so it performs badly even on the training set. Underfitted models have low statistical variance (i.e. low sensitivity to the specific dataset) but high statistical bias (i.e. limited flexibility to learn the true underlying pattern of the dataset). Linear and parametric algorithms, such as linear regression and Naive Bayes, are prone to underfitting.

Overfitting occurs when the model can accurately predict almost all the values in the training set but fails on the test set. This indicates that the model is unable to generalise: as new observations arrive, it is likely to fail, meaning it has high statistical variance but low statistical bias. Common culprits for overfitting are tree-based models, nearest-neighbour algorithms, non-linear models, and non-parametric models.

Data leakage occurs when information is shared between the training and test datasets; typically, when a sample is split into training and test sets, no data should be shared between the two. An example is a predictive model that used historical data (from January 2005 to July 2021) to predict product sales from August to October 2021. If the training dataset includes observations from the period being predicted, then temporal data leakage has occurred. This type of bias can often be identified during the model validation phase, as the model is likely to output suspiciously high accuracy scores.
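One simple guard against temporal leakage is to split strictly by date before any modelling. A minimal sketch (the dates and sales figures below are made up for illustration) might look like:

```python
from datetime import date

# Hypothetical monthly sales records: (observation_date, units_sold).
records = [
    (date(2021, 1, 15), 120),
    (date(2021, 5, 20), 135),
    (date(2021, 7, 31), 140),
    (date(2021, 8, 10), 150),  # belongs to the forecast window
    (date(2021, 9, 5), 155),   # belongs to the forecast window
]

cutoff = date(2021, 8, 1)  # training data must end before the forecast period

train = [r for r in records if r[0] < cutoff]
test = [r for r in records if r[0] >= cutoff]

# Guard: no training observation may fall on or after the cutoff.
assert all(d < cutoff for d, _ in train), "temporal leakage detected"
print(len(train), len(test))
```

The assertion makes the no-leakage rule explicit, so any later change to the pipeline that lets forecast-period rows into the training set fails loudly instead of silently inflating accuracy.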

How can AI Bias be prevented?

So, what can be done about AI Bias? The below sections detail how AI bias can be prevented during each of the phases of the AI/ML pipeline. This information has been summarised in Figure 5.

Figure 5: AI bias mitigation strategies in each phase of the AI/ML pipeline (Image by author)

Study design and hypotheses formulation

When deciding on what data and features to include in your research, focus on the design of the study and aim to make the sample representative of the target population. For instance, when running a survey to measure satisfaction with UberEats in metropolitan Melbourne, ensure you account for individuals from diverse age groups, genders, and cultural, linguistic, and educational backgrounds, cover all postcodes of metropolitan Melbourne, and include customers who respond to email, postal, and social media surveys. The survey should also be administered online at different times of the day and on various days of the week to be inclusive. Additionally, it is a good idea to run the survey multiple times during a year, as customer satisfaction may change over time.

The next step is to select a random and representative dataset from the sample of respondents to ensure that each respondent has an equal chance of being selected in the study. This type of research design eliminates sampling, voluntary, and time-interval biases.

Data collection, pre-processing, and exploration

The output from machine learning algorithms can only be relied upon if the underlying statistical assumptions are satisfied. Many algorithms assume that the data, or the model errors, are approximately normally distributed; by the Central Limit Theorem, a large enough sample size (or a transformation of variables) can usually be relied upon to satisfy this assumption. So, if after the data collection step it is found that the sample size is too small, it is recommended that the Data Scientist endeavour to increase the sample size where possible.

To determine whether a sample size is large enough, a power analysis can be conducted; it provides the minimum sample size needed to detect a given effect at a predetermined statistical significance level and power.
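As a rough sketch, the per-group sample size for a two-sample comparison of means can be computed with the standard library alone, using the normal approximation (dedicated packages such as statsmodels provide more complete power calculations):

```python
from statistics import NormalDist
from math import ceil

def sample_size_two_means(effect, sigma, alpha=0.05, power=0.8):
    """Approximate per-group n for a two-sample test of means
    (normal approximation, two-sided significance level alpha)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sigma / effect) ** 2)

# Detect a difference of 5 units with standard deviation 10,
# at 5% significance and 80% power.
n = sample_size_two_means(effect=5, sigma=10, power=0.8)
print(n)
```

This yields 63 per group; a full t-test calculation would adjust this slightly upwards for small samples.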

During the data pre-processing step, it is important to document all data cleansing and transformation steps; this will help identify sources of bias such as the exclusion of certain features (exclusion bias) and the incorrect labelling of data (label bias).

Exclusion bias can also be prevented by investigating each feature before discarding it. This can be done with the assistance of Subject Matter Experts (SMEs), who can identify redundant features, or by using machine learning methods such as Random Forests that output a variable importance list.

Measurement bias can be reduced by checking for outliers (values that are abnormally small or large and differ widely from the average) and calculating their degree of influence on the outcome variable using methods such as Cook’s Distance or the Mahalanobis Distance.
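Cook’s Distance can be computed directly from the hat matrix and residuals of a least-squares fit. The sketch below is a toy numpy example with one deliberately injected outlier; the influential observation stands out as the largest distance:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.arange(20, dtype=float)
y = 2.0 * x + rng.normal(scale=1.0, size=20)
y[10] += 15.0  # inject one abnormally large value (an outlier)

X = np.column_stack([np.ones_like(x), x])
H = X @ np.linalg.inv(X.T @ X) @ X.T          # hat matrix
h = np.diag(H)                                # leverages
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
p = X.shape[1]                                # number of fitted parameters
s2 = resid @ resid / (len(x) - p)             # residual variance

cooks_d = resid**2 / (p * s2) * h / (1 - h) ** 2
print(int(np.argmax(cooks_d)))                # index of the injected outlier
```

Observations whose distance exceeds a rule-of-thumb threshold (commonly 4/n) would then be inspected for measurement errors before modelling.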

During the data exploration stage, the Data Scientist may discover that the sample is imbalanced; for example, a survey may have more female than male respondents, resulting in design bias. The dataset can be balanced using down-sampling or over-sampling methods. The Python package imblearn (imbalanced-learn) automates both, including SMOTE, an over-sampling method that generates synthetic minority-class observations.
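As an illustration of the over-sampling idea, the sketch below balances a toy survey sample by randomly duplicating minority-class rows (plain random over-sampling, not SMOTE’s synthetic interpolation; the data is made up):

```python
import random

random.seed(42)

# Hypothetical imbalanced survey sample: 8 female, 2 male respondents.
sample = [("F", i) for i in range(8)] + [("M", i) for i in range(2)]

def oversample(rows, label_index=0):
    """Randomly duplicate minority-class rows until classes are balanced."""
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_index], []).append(row)
    target = max(len(v) for v in by_class.values())
    balanced = []
    for rows_c in by_class.values():
        balanced.extend(rows_c)
        balanced.extend(random.choices(rows_c, k=target - len(rows_c)))
    return balanced

balanced = oversample(sample)
counts = {label: sum(1 for l, _ in balanced if l == label) for label in ("F", "M")}
print(counts)
```

After balancing, both classes contribute equally to training, at the cost of repeated minority rows; SMOTE avoids the repetition by interpolating new synthetic observations instead.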

During the data exploration step, it is good practice to calculate pairwise correlations between all variables included in the model, a standard technique for identifying multicollinearity. This will assist in identifying confounding variables that lead to confounding bias. Correlation thresholds can be set to determine which features to exclude from the model.
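A correlation-threshold screen of this kind can be sketched in a few lines of plain Python (the feature values here are made up; in practice a pandas `DataFrame.corr()` call would do the same job):

```python
from math import sqrt
from itertools import combinations

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(va * vb)

# Hypothetical features: f2 is almost a copy of f1, f3 is unrelated.
features = {
    "f1": [1, 2, 3, 4, 5, 6],
    "f2": [2, 4, 6, 8, 10, 12.5],
    "f3": [5, 1, 4, 2, 6, 3],
}

THRESHOLD = 0.9
flagged = [
    (a, b) for a, b in combinations(features, 2)
    if abs(pearson(features[a], features[b])) > THRESHOLD
]
print(flagged)
```

Only the near-duplicate pair is flagged; one member of each flagged pair would then be reviewed with SMEs and potentially dropped.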

When confounders are found, it is best to engage with SMEs from the business to determine whether the variables should be included in the model. One example where confounders were caught is the development of a predictive model that used sensor data in an LNG (Liquefied Natural Gas) plant to predict LNG production. The first few model runs generated a very high accuracy score. On reviewing the top predictor variables during a model presentation, the production engineer identified a confounding variable: it varies only after LNG has been produced and is therefore a proxy for the dependent variable. Removing this variable from the model produced more reliable results.

Removing label bias may involve some in-depth data exploration. For instance, in one project, several university students were asked to label plant samples as either wheat, barley, or canola, and the resulting predictive model generated very low accuracy scores. Unsupervised clustering (k-means) on all features entered in the model was used to determine whether the algorithm would separate wheat, barley, and canola samples into distinct clusters. The clustering accurately classified most of the observations, but a few were found in the wrong cluster. This was attributed to one of the features entered in the model, and the relationship between this feature and the probability of belonging to a specific class was used to re-label the dataset.

Model development

Feature selection is key when developing good models as it is important to filter out irrelevant or redundant features from the dataset. This can prevent omitted variable bias from occurring. Some supervised algorithms have built-in feature selection such as Regularised Regression and Random Forests.

For algorithms without built-in feature selection (e.g. nearest neighbours), Data Scientists can employ variance thresholds. Features whose values remain nearly constant should be excluded from a model, as they do not have enough variance to explain the outcome variable. For example, a variable with 95% of its values equal to 1 is unlikely to be useful. Likewise, a categorical variable where all observations belong to one category will not provide information for the model.
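A variance-threshold filter is only a few lines of code. The sketch below uses hypothetical columns (scikit-learn’s `VarianceThreshold` implements the same idea) and drops the constant feature:

```python
from statistics import pvariance

# Hypothetical feature columns; "promo_flag" never varies.
columns = {
    "spend": [10.5, 22.0, 13.2, 40.1, 9.9, 31.3],
    "promo_flag": [1, 1, 1, 1, 1, 1],
    "visits": [3, 1, 4, 2, 5, 2],
}

THRESHOLD = 0.0  # drop features whose variance is at (or below) this value

kept = [name for name, values in columns.items() if pvariance(values) > THRESHOLD]
print(kept)
```

Raising the threshold above zero would additionally drop near-constant features, such as a flag that is 1 for 95% of rows.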

When the dataset has a high number of features, Principal Component Analysis (PCA) can be used to reduce the number of features into principal components (linear combinations of features). However, this changes the scale of the output features. Principal components are also difficult to interpret.
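PCA can be sketched directly with a singular value decomposition: centre the data, take the top singular directions, and project. The toy numpy example below builds a 5-feature dataset that really only varies in 2 latent directions, so 2 principal components capture almost all of the variance:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dataset: 100 samples, 5 features, only 2 real directions of variation.
latent = rng.normal(size=(100, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + rng.normal(scale=0.01, size=(100, 5))

# PCA by hand: centre the data, then take the top singular directions.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = s**2 / np.sum(s**2)   # variance explained per component

k = 2
components = Xc @ Vt[:k].T        # the 5 features reduced to 2 principal components
print(components.shape)
```

The trade-off noted above is visible here: `components` has lost the original feature names and scale, which is why principal components are hard to interpret for stakeholders.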

Another option is Genetic Algorithms (GAs) that are a type of search algorithm that use mutation and crossover (biology and natural selection principles) to efficiently traverse large solution spaces; thereby, assisting in the selection of features from very high-dimensional datasets. Unlike PCAs, GAs preserve the output features.

Model interpretation and communication

When a model is being interpreted, it is important for Data Scientists to look at all of the information resulting from a model and present this information factually, even though it may disagree with the business hypotheses. Likewise, when communicating model output, it is important to have a fact-based conversation with the stakeholders.

To ensure that the audience can comprehend and trust the model output, rather than seeing it as a black box, Data Scientists should use techniques that can improve transparency and model explainability.

Models can be explained using global or local explainability⁹. Global explainability provides a high-level view of the model and how the features in the data collectively influence the result. An example is using the coefficients of a multiple linear regression model to explain the magnitude and direction of each feature’s contribution to the variability of the outcome variable. Local explainability, by contrast, explains the model’s prediction for an individual observation, often one feature at a time.

Partial Dependence Plots (PDPs) provide a global visual representation of how one or two features influence the predicted value of the model, whilst holding other features constant. These plots help identify whether the relationship between the outcome variable and selected feature is linear or complex. PDPs are model-agnostic.
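The PDP computation itself is straightforward: pin the chosen feature to each grid value, predict for every row, and average. A minimal sketch against a hypothetical fitted model (the `predict` function and its coefficients are invented for illustration):

```python
import numpy as np

# A hypothetical fitted model: price prediction from (size, age).
def predict(X):
    size, age = X[:, 0], X[:, 1]
    return 3.0 * size - 0.5 * age + 0.1 * size * age

rng = np.random.default_rng(0)
X = np.column_stack([rng.uniform(50, 150, 200), rng.uniform(0, 40, 200)])

def partial_dependence(model, X, feature, grid):
    """Average prediction over the data while pinning one feature to each grid value."""
    pd_values = []
    for v in grid:
        X_mod = X.copy()
        X_mod[:, feature] = v          # pin the chosen feature
        pd_values.append(model(X_mod).mean())
    return np.array(pd_values)

grid = np.linspace(50, 150, 5)
pd_curve = partial_dependence(predict, X, feature=0, grid=grid)
print(np.all(np.diff(pd_curve) > 0))   # the curve rises monotonically with size
```

Plotting `pd_curve` against `grid` gives the PDP; scikit-learn’s `PartialDependenceDisplay` produces the same plot for fitted estimators.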

Individual Conditional Expectation (ICE) plots provide a local display of the effect of a model feature on the outcome. Unlike a PDP, which averages over the sample, an ICE plot draws one curve per observation, showing how each individual prediction depends on the feature’s value. Like PDPs, ICE plots are model-agnostic.

Accumulated Local Effects (ALE) plots address a weakness of PDPs. Rather than varying a feature across its entire range, the approach focuses on a small interval of values at a time, varies the feature within that interval whilst holding all other features constant, and accumulates the differences in predictions between the start and end of each interval. ALE plots are a faster and less biased alternative to PDPs, particularly when features are correlated.

Another approach is Leave One Column Out (LOCO), which retrains the model after excluding one column at a time and compares each LOCO model’s prediction score with the original model’s score. If the score changes significantly, the excluded feature was important to the model.
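LOCO is simple to sketch: refit without each column and record how much the score drops. The toy example below (made-up data, with an ordinary least-squares fit and R-squared as the score) correctly singles out the dominant feature:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Hypothetical data: y depends strongly on x0, weakly on x1, not at all on x2.
X = rng.normal(size=(n, 3))
y = 5.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n)

def fit_score(X, y):
    """Fit least squares and return R-squared on the same data."""
    A = np.column_stack([X, np.ones(len(X))])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return 1 - resid @ resid / np.sum((y - y.mean()) ** 2)

base = fit_score(X, y)
drops = [base - fit_score(np.delete(X, j, axis=1), y) for j in range(X.shape[1])]
print(int(np.argmax(drops)))   # dropping x0 hurts the score the most
```

A production LOCO would score on held-out data rather than the training set, but the mechanics are the same.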

Finally, LIME (Local Interpretable Model-Agnostic Explanations) is another local explainability technique; it performs multiple perturbations of the feature values around an observation and measures the resulting impact on the output prediction.

There are several other techniques that assist with model explainability such as Shapley Values, Anchors, DeepLift, and ProfWeight. High model explainability can assist with interpretation and communication of model output leading to model transparency and trust.

When stakeholders are educated in common machine learning techniques and their pitfalls, they are more likely to look at model output in a critical manner. To educate stakeholders, change management within an organisation is required, where all employees within a business, ranging from graduate analysts to CEOs are encouraged to take online learning modules and attend workshops that can assist them with model output assessment.

Model validation, testing, and monitoring

Whilst validating and testing models, overfitting may be observed. It can be identified by a very high goodness-of-fit statistic on the training set but a low value on the test set. Underfitting can be spotted by a poor goodness-of-fit statistic on the training and test sets.

One powerful preventative measure against overfitting is cross-validation, where the initial training data is used to generate multiple train-test splits that are then used to tune the model (i.e., k-fold cross-validation). Increasing the size of the sample can also assist in generalising the model’s predictions to the real-world.
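k-fold cross-validation can be written out by hand in a few lines. The sketch below uses a toy one-feature regression fitted through the origin (all data is simulated) and scores each of 5 folds:

```python
import random
from statistics import mean

random.seed(0)

# Toy 1-D dataset: y is roughly 2x with unit-variance noise.
data = [(x, 2 * x + random.gauss(0, 1)) for x in range(40)]
random.shuffle(data)

def fit_slope(rows):
    """Least-squares slope of a line through the origin."""
    return sum(x * y for x, y in rows) / sum(x * x for x, y in rows)

def kfold_mse(data, k=5):
    """Train on k-1 folds, score mean squared error on the held-out fold."""
    fold_size = len(data) // k
    scores = []
    for i in range(k):
        test = data[i * fold_size:(i + 1) * fold_size]
        train = data[:i * fold_size] + data[(i + 1) * fold_size:]
        slope = fit_slope(train)
        scores.append(mean((y - slope * x) ** 2 for x, y in test))
    return scores

scores = kfold_mse(data)
print(len(scores))
```

A large spread between the fold scores, or a training error far below every fold score, is the overfitting signal that cross-validation is designed to expose; scikit-learn’s `cross_val_score` automates the same loop.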

Techniques used to simplify models are classed as regularisation, and the appropriate method depends on the algorithm. Pruning a decision tree is one way to make it less complex and prevent overfitting. In regression, a penalty parameter can be added to the cost function to shrink the coefficient values, again helping to prevent overfitting. Dropout (also known as dilution), the process of randomly excluding units (both hidden and visible) during the training of a neural network, is also known to prevent overfitting. Early stopping is another option, where training is halted after a certain number of iterations because additional iterations no longer improve model accuracy. These techniques are often exposed as hyperparameters that can be tuned during model development.

Ensemble learning is a technique where predictions from multiple models are combined. The two most common techniques are bagging and boosting. Bagging attempts to reduce overfitting by training many “strong” learners (models with high accuracy) in parallel and combining their predictions into the final prediction of the ensemble. Boosting aims to improve the predictive flexibility of simple models (i.e. where underfitting is an issue) by training a large number of “weak” learners in sequence, where weak learners are constrained models (e.g. tree depth is limited in tree-based classifiers). Each consecutive model learns from the errors made by the previous one, and the final step combines all the weak learners into a single strong learner.
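Bagging can be sketched end to end with nothing but the standard library: train simple one-split regression stumps on bootstrap resamples and average their predictions. The step-function data below is invented for illustration:

```python
import random
from statistics import mean

random.seed(0)

# Toy regression data with a step at x = 50.
data = [(x, (10.0 if x > 50 else 0.0) + random.gauss(0, 2)) for x in range(100)]

def fit_stump(rows):
    """One-split regression stump: pick the threshold minimising squared error."""
    best = None
    for t in range(1, 100):
        left = [y for x, y in rows if x <= t]
        right = [y for x, y in rows if x > t]
        if not left or not right:
            continue
        ml, mr = mean(left), mean(right)
        err = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or err < best[0]:
            best = (err, t, ml, mr)
    _, t, ml, mr = best
    return lambda x: ml if x <= t else mr

# Bagging: train stumps on bootstrap resamples, average their predictions.
stumps = [fit_stump(random.choices(data, k=len(data))) for _ in range(25)]
bagged = lambda x: mean(s(x) for s in stumps)

print(round(bagged(10), 1), round(bagged(90), 1))
```

Averaging over bootstrap resamples smooths out each stump’s sensitivity to its particular sample, which is exactly the variance reduction that bagging targets; boosting would instead fit the stumps sequentially, each on the previous ensemble’s errors.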

After models are built, they need to be monitored on an ongoing basis to ensure they remain valid for the data being modelled. For instance, over time the business may stop capturing some features or introduce new features into the dataset. Moreover, the distribution of a feature may drift, so the model starts receiving values outside the range it was trained on. In all these cases, the existing model will begin to perform poorly on new data and needs to be rebuilt or replaced.

What tools are there to prevent AI Bias?

There are several existing tools, including Python packages, that can assist with AI Bias prevention¹⁰.

The What-if Tool (WIT) developed by the TensorFlow team within Google is a visual interactive interface that displays datasets and models. Users can explore model results without having to write code and can view plots such as PDPs¹¹.

FairML is another useful tool that audits bias in predictive modelling by quantifying a model’s relative predictive dependence on its input features. The relative importance of the model’s features is then used to assess the extent of discrimination or fairness of the model¹².

IBM’s AI Fairness 360 contains a comprehensive set of metrics, metric explanations, and algorithms to mitigate bias in datasets and models¹³.

Microsoft’s Fairlearn identifies model biases that impact people in the form of allocation harms and quality-of-service harms. The former occur when AI systems extend or withhold information, opportunities, or resources; use cases include hiring, admission to schools or universities, and bank lending. The latter refer to whether a system works equally well for all individuals, even if no information, opportunities, or resources are withheld or extended. This tool can be used to alleviate prejudicial and racial biases¹⁴.

Finally, Aequitas is another well-known open-source bias audit toolkit that can be used to audit machine learning models for discrimination and bias.

In addition to toolkits, there are various Python libraries for AI bias detection. Libraries that are algorithm-specific include Fair Classification¹⁵, Fair Regression¹⁶, and Scalable Fair Clustering¹⁷. There are also packages such as Bias Detector¹⁸ and Gender Bias¹⁹ for detecting gender, ethnicity, and racial bias in models arising from sample imbalances and a lack of representation of the target population.


This blog post has identified several types of bias that can occur within the AI/ML pipeline, along with measures that can be used to reduce or eliminate them. Rather than reinventing the wheel, existing toolkits can assist with AI bias mitigation. It is also important that individuals throughout a business, from graduate employees to C-level executives, are trained to critically assess model output so that they understand AI/ML algorithms and can question their validity.


[1] Dastin, J. (2018), “Amazon scraps secret AI recruiting tool that showed bias against women”, https://www.reuters.com/article/us-amazon-com-jobs-automation-insight/amazon-scraps-secret-ai-recruiting-tool-that-showed-bias-against-women-idUSKCN1MK08G

[2] Thomas (2019), “How can we eliminate bias from AI algorithms? The pen-testing manifesto”, https://fastdatascience.com/how-can-we-eliminate-bias-from-ai-algorithms-the-pen-testing-manifesto/

[3] BBC News (2019), “Apple’s ‘sexist’ credit card investigated by US regulator”, https://www.bbc.com/news/business-50365609

[4] The Guardian (2020), “Nearly half of councils in Great Britain use algorithms to help make claims decisions”, https://www.theguardian.com/society/2020/oct/28/nearly-half-of-councils-in-great-britain-use-algorithms-to-help-make-claims-decisions

[5] Jones, S (2021), “Automated hiring systems are ‘hiding’ candidates from recruiters — how can we stop this?”, https://www.weforum.org/agenda/2021/09/artificial-intelligence-job-recruitment-process/

[6] Simonite, T (2018), “When It Comes to Gorillas, Google Photos Remains Blind”, https://www.wired.com/story/when-it-comes-to-gorillas-google-photos-remains-blind/

[7] Agarwal, R (2020), “Five Cognitive Biases In Data Science (And how to avoid them)”, https://towardsdatascience.com/five-cognitive-biases-in-data-science-and-how-to-avoid-them-2bf17459b041

[8] Agarwal, R (2020), “Five cognitive biases in data science and how to avoid them”, https://towardsdatascience.com/five-cognitive-biases-in-data-science-and-how-to-avoid-them-2bf17459b041

[9] Onose (2021), “Explainability and Auditability in ML: Definitions, Techniques, and Tools”, https://neptune.ai/blog/explainability-auditability-ml-definitions-techniques-tools

[10] “Most Essential Python Fairness Libraries Every Data Scientist Should Know”, https://techairesearch.com/most-essential-python-fairness-libraries-every-data-scientist-should-know/

[11] What-If Tool, https://pair-code.github.io/what-if-tool/

[12] Adebayo, Julius, “FairML : ToolBox for diagnosing bias in predictive modelling”, https://dspace.mit.edu/handle/1721.1/108212

[13] AI Fairness 360, https://aif360.mybluemix.net/

[14] Fairlearn, https://github.com/fairlearn/fairlearn

[15] Fair Classification, https://github.com/mbilalzafar/fair-classification

[16] Fair Regression, https://github.com/jkomiyama/fairregresion

[17] Scalable Fair Clustering, https://github.com/talwagner/fair_clustering

[18] Bias Detector, https://pypi.org/project/bias-detector/

[19] Gender Bias, https://github.com/gender-bias/gender-bias

