Originally posted on HBR on October 30, 2020. Written by Josh Feast.


AI has long been enabling innovation, with both big and small impacts. From AI-generated music, to enhancing the remote fan experience at the U.S. Open, to managing coronavirus patients in hospitals, it seems like the future is limitless. But, in the last few months, organizations from all sectors have been met with the realities of both Covid-19 and increasing anxiety over social justice issues, which has led to a reckoning within companies about the areas where more innovation and better processes are required. In the AI industry, specifically, organizations need to embrace their role in ensuring a fairer and less-biased world.

It’s been well-established that machine learning models and AI systems can be inherently biased, some more than others — a result most commonly attributed to the data being used to train and develop them. In fact, researchers have been working on ways to address and mitigate bias for years. And as the industry looks forward, it’s vital to shine a light on the various approaches and techniques that will help create more just and accurate models.

Bias mitigation is a fairly technical process, in which different techniques can be deployed depending on the stage in the machine learning pipeline: pre-processing (preparing the data before building and training models), in-processing (modifications to algorithms during the training phase), and post-processing (adjusting a model’s predictions after training is complete). Each offers a unique opportunity to reduce underlying bias and create a technology that is honest and fair to all. Leaders must make it a priority to take a closer look at the models and techniques for addressing bias in each of these stages to identify how best to implement the models across their technology.


First, we need to address the training data. This data is used to develop machine learning models, and is often where the underlying bias seeps in. Bias can be introduced by the selection or sampling of the training data itself. This may involve unintentionally excluding certain groups, so that when the resulting model gets applied to these groups, the accuracy tends to be lower than it is for the groups that were included in the training data. Additionally, training data usually requires labels used to “teach” the machine learning model during training. These labels often come from humans, which of course risks the introduction of bias. For label data in particular, it is crucial that the human labelers span a diverse range of demographics, so that unconscious biases don’t creep in.

Counterfactual fairness is one technique scientists use to ensure that outcomes are the same in both the actual world and in a “counterfactual world,” where individuals belong to a completely different demographic. A great example of where this is of value is in university admissions. Let’s say William from Los Angeles, who is white, and Barack from Chicago, who is African American, have similar GPAs and test scores. Does the model reach the same decision for each applicant if their demographic information is swapped?
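
The swap described above can be sketched as a simple test. This is a simplified attribute-swap check, not a full counterfactual fairness analysis: it changes only the sensitive attribute and ignores the fact that other features may themselves be shaped by it. The function name `counterfactual_flips`, the two toy models, and the GPA cutoffs are all hypothetical and exist only for illustration.

```python
from typing import Callable, Dict, List

Applicant = Dict[str, object]


def counterfactual_flips(
    model: Callable[[Applicant], int],
    applicants: List[Applicant],
    attribute: str,
    values: List[str],
) -> List[Applicant]:
    """Return applicants whose decision changes when only `attribute` is swapped."""
    flipped = []
    for applicant in applicants:
        original = model(applicant)
        for value in values:
            if value == applicant[attribute]:
                continue
            # Copy the applicant and change nothing but the sensitive attribute.
            counterfactual = {**applicant, attribute: value}
            if model(counterfactual) != original:
                flipped.append(applicant)
                break
    return flipped


# Hypothetical toy model that (wrongly) uses race to set the admission bar.
def biased_model(a: Applicant) -> int:
    bar = 3.5 if a["race"] == "white" else 3.7
    return int(a["gpa"] >= bar)


# Hypothetical toy model that ignores race entirely.
def fair_model(a: Applicant) -> int:
    return int(a["gpa"] >= 3.5)


applicants = [
    {"name": "William", "race": "white", "gpa": 3.6},
    {"name": "Barack", "race": "African American", "gpa": 3.6},
]
races = ["white", "African American"]

biased_flips = counterfactual_flips(biased_model, applicants, "race", races)
fair_flips = counterfactual_flips(fair_model, applicants, "race", races)
```

Any applicant returned by the check is one for whom the decision depended on the swapped attribute, which is a signal to investigate the model before it is deployed.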

When predicting outcomes or making decisions, such as who gets the final university acceptance letter of the year, the training data and resulting models should be carefully vetted and tested before being fully implemented. It is especially important to assess variance in performance across sensitive factors like race and gender.


When training a machine learning model, in-processing techniques offer opportunities to encourage fairness, for example by using regularization to tackle bias.

Adversarial training techniques can be applied to mitigate bias, where the machine learning model is jointly trained to simultaneously minimize errors in the primary objective (e.g., confirming or rejecting university admissions) while also penalizing the ability of another part of the model to predict some sensitive category (e.g., race).
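
To make the joint objective concrete, here is a minimal sketch of the quantity the main model would try to minimize in such a setup: its own task loss minus a weighted version of the adversary's loss. It only computes the objective and does not train anything. The function names and the weight `lam` are assumptions made for illustration, not part of any particular library.

```python
import math
from typing import Sequence


def binary_cross_entropy(labels: Sequence[int], probs: Sequence[float]) -> float:
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    eps = 1e-12  # keeps log() away from zero
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


def predictor_objective(
    task_labels: Sequence[int],
    task_probs: Sequence[float],
    sensitive_labels: Sequence[int],
    adversary_probs: Sequence[float],
    lam: float = 1.0,  # hypothetical weight on the fairness term
) -> float:
    """Task loss minus lam times the adversary's loss.

    The adversary tries to predict the sensitive category from the model's
    internal representation. The main model is rewarded (lower objective)
    when the adversary does badly, i.e. when its loss is high.
    """
    task_loss = binary_cross_entropy(task_labels, task_probs)
    adversary_loss = binary_cross_entropy(sensitive_labels, adversary_probs)
    return task_loss - lam * adversary_loss


task_y, task_p = [1, 0], [0.9, 0.1]
sensitive = [1, 0]

# The adversary recovers the sensitive category confidently: information leaks.
leaky = predictor_objective(task_y, task_p, sensitive, [0.9, 0.1])
# The adversary can do no better than a coin flip: nothing leaks.
blind = predictor_objective(task_y, task_p, sensitive, [0.5, 0.5])
```

With identical task performance, the objective is lower when the adversary is reduced to guessing, so training pushes the model toward representations that carry less information about the sensitive category.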

My company recently conducted research on de-biasing approaches for examining gender bias in speech emotion recognition. Our research found that fairer, more consistent model accuracy can be achieved by applying a simple de-biasing training technique. We compared a state-of-the-art adversarial training approach with an approach that used no de-biasing. Without any de-biasing, we found that emotional activation model accuracy is consistently lower for female than for male audio samples. However, by applying a simple modification to the error term during the model training, we were able to effectively mitigate this bias while maintaining good overall model accuracy.


Post-processing is a final safeguard that can be used to protect against bias. One technique, in particular, has gained popularity: Reject Option-Based Classification. This process assumes that discrimination is most likely to happen when models are least certain of a prediction. The technique identifies the “low confidence region” and rejects the predictions that fall inside it, which reduces bias in the final decisions and allows you to avoid making potentially problematic predictions. Also, by monitoring the volume of rejected inferences, engineers and scientists can be alerted to changes in the characteristics of the data seen in production and to new bias risks.
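
A minimal sketch of the rejection idea as described above: predictions whose probability falls too close to the decision threshold are withheld rather than returned, and the share of withheld predictions is tracked. This is a simplified abstain-only variant. The function names, the 0.5 threshold, and the 0.1 margin are assumptions chosen for illustration.

```python
from typing import List, Optional


def classify_with_reject(
    probs: List[float],
    threshold: float = 0.5,  # hypothetical decision threshold
    margin: float = 0.1,     # hypothetical half-width of the low confidence region
) -> List[Optional[int]]:
    """Return 1 or 0 for confident predictions and None for rejected ones."""
    decisions: List[Optional[int]] = []
    for p in probs:
        if abs(p - threshold) < margin:
            # Too close to the threshold: withhold the prediction.
            decisions.append(None)
        else:
            decisions.append(int(p >= threshold))
    return decisions


def rejection_rate(decisions: List[Optional[int]]) -> float:
    """Fraction of predictions that were rejected."""
    if not decisions:
        return 0.0
    return sum(d is None for d in decisions) / len(decisions)


decisions = classify_with_reject([0.95, 0.55, 0.45, 0.05])
rate = rejection_rate(decisions)
```

Rejected cases can be routed to a human reviewer, and a rising rejection rate in production is a prompt to check whether the incoming data has drifted away from what the model was trained on.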

The Road to Fairer AI

It is imperative that modern machine learning technology is developed in a manner that deliberately mitigates bias. Doing this effectively won’t happen overnight, but raising awareness of the presence of bias, being honest about the issues at hand, and striving for better results will be fundamental to growing the technology. As I wrote a year ago, the causes of, and solutions to, AI bias are not black and white. Even “fairness” itself must be quantified to help mitigate the effects of unwanted bias.
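
As one example of what quantifying fairness can look like, the sketch below computes a demographic parity difference: the gap between the highest and lowest rates of positive predictions across groups. It is only one of several possible fairness measures and is shown here as an illustration, not as the measure to use. The function names and the sample data are hypothetical.

```python
from typing import List, Sequence


def positive_rate(predictions: Sequence[int], groups: Sequence[str], group: str) -> float:
    """Share of positive (1) predictions among members of `group`."""
    selected: List[int] = [p for p, g in zip(predictions, groups) if g == group]
    return sum(selected) / len(selected)


def demographic_parity_difference(predictions: Sequence[int], groups: Sequence[str]) -> float:
    """Largest gap in positive prediction rate between any two groups.

    A value of 0.0 means every group receives positive predictions at the
    same rate; larger values mean a larger disparity.
    """
    rates = [positive_rate(predictions, groups, g) for g in set(groups)]
    return max(rates) - min(rates)


# Hypothetical predictions for two groups of four people each.
predictions = [1, 1, 0, 1, 0, 0, 0, 1]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

gap = demographic_parity_difference(predictions, groups)
```

Here group "a" receives a positive prediction three times out of four and group "b" once out of four, so the gap is 0.5.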

As we navigate the lasting effects of the pandemic and social unrest, mitigating AI bias will continue to become more important. Here are several ways to get your own organization to focus on creating fairer AI:

  • Ensure that training samples include diversity to avoid racial, gender, ethnic, and age discrimination.
  • Whether labeling audio samples or generic data, it is critical to ensure that each sample receives multiple annotations from different human annotators and that those annotators come from diverse backgrounds.
  • Measure accuracy levels separately for different demographic categories to see whether any group is being treated unfairly.
  • Consider collecting more training data from sensitive groups that you are concerned may be at risk of bias — such as different gender variants, racial or ethnic groups, age categories, etc. — and apply de-biasing techniques to penalize errors.
  • Regularly audit (using both automatic and manual techniques) production models for accuracy and fairness, and regularly retrain/refresh those models using newly available data.
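
The recommendation to measure accuracy separately for each demographic category can be sketched in a few lines. The function name `accuracy_by_group` and the sample labels are hypothetical. In practice the inputs would come from a held-out evaluation set or from audited production traffic.

```python
from collections import defaultdict
from typing import Dict, Sequence


def accuracy_by_group(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    groups: Sequence[str],
) -> Dict[str, float]:
    """Return the fraction of correct predictions for each group."""
    correct: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}


# Hypothetical evaluation data: three samples per group.
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 1, 0, 0, 0]
groups = ["male", "male", "male", "female", "female", "female"]

report = accuracy_by_group(y_true, y_pred, groups)
```

A large gap between groups in a report like this is the kind of signal that should trigger collecting more data for the weaker group or applying one of the de-biasing techniques discussed earlier.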

Ultimately, there is no way to completely eliminate AI bias, but it’s the industry’s responsibility to collaborate and help mitigate its presence in future technology. With AI playing an increasingly important role in our lives, and with so much promise for future innovation, it is necessary that we acknowledge and address prejudice in our technology, as well as in our society.
