A common adage in computational science is ‘garbage in, garbage out’. This means that whatever is put into computational systems is reflected back in the outputs of those systems. So, if you feed AI models with ‘garbage’, ‘garbage’ is what they will produce. Similarly, if you feed biased information into AI models, they will produce biased results.
For example, imagine that you are using an AI algorithm to help you select candidates for a job interview. The algorithm has been trained on real-world data, and it narrows your candidate pool using information about who has historically held similar positions; in Western cultures, such roles have traditionally been held by white men. The AI algorithm therefore ‘thinks’ that white men must be the ideal candidates for the role, and narrows your interview pool to people who match this profile. You naively accept the algorithm’s recommendations, and thus propagate the bias that structures our society.
Unfortunately, this isn’t a far-fetched scenario. Indeed, in 2018 it emerged that Amazon had been using a candidate-selection algorithm that was inadvertently biased towards men. The algorithm had been trained on the CVs of past applicants, and because men dominate the tech industry, the algorithm taught itself that men were better suited to work at Amazon.
This is a big problem with AI and one that we touched upon in our previous What the Tech article, “Strengths, safeguards and ‘Shadow AI’”. AI is trained on real-world data, and real-world data holds the signature of all the prejudices that are baked into our societal structures. Moreover, implicit biases held by those creating AI algorithms can also introduce bias to the system.
This doesn’t necessarily mean that the output of AI algorithms is ‘garbage’, but it does mean that we need to be extremely careful in how we use AI algorithms, be cognisant of their flaws, and keep working towards minimising bias.
The first step in tackling bias in AI is to identify its source. Is it the technology itself, the society in which the technology is developed, or the people building the technology?
Technology reflects the society in which it is built, and the bias in AI algorithms merely mirrors the decisions made by the people who build them. Technology itself is not responsible for its outputs.
Instead, it is the responsibility of AI scientists to be aware of bias, and to actively work to make AI a fairer and more representative technology. But how can we do this?
Some of the first people to raise the alarm about bias in AI algorithms were from minority groups. For instance, Joy Buolamwini realised that the facial recognition algorithm she was working with did not recognise her face. On further investigation, she found that the algorithm was much worse at correctly characterising darker-skinned faces than others: the error rate for darker-skinned women was as high as 34.7%, while the error rate for lighter-skinned men was just 0.8%. She published these findings in the 2018 ‘Gender Shades’ paper, calling for increased demographic and phenotypic accountability in AI.
While it is by no means the responsibility of minority groups to dissect AI algorithms and diagnose their flaws, it is of vital importance that people of all races, genders, ages and backgrounds have a seat at the AI design and development table. Without diverse perspectives, AI risks becoming a tool that propagates the current power structures of the society it is built in.
In 2019, New York University’s AI Now Institute reported that over 80% of AI professors are men, and that women make up only 15% of AI researchers at Facebook and 10% at Google. A lack of diversity like this increases the chances of demographic bias slipping into AI systems unnoticed. We must ask ourselves: what barriers do minority groups face when trying to enter the field of AI, and how can we work to deconstruct them? It is the responsibility of AI companies to address this, and to create diversity in their teams.
Removing existing bias
Having a diverse team will help to reduce bias being introduced into a system, but how can we attempt to remove bias that already exists? Some methods include making training data more representative, using synthetic data to train algorithms, and adjusting algorithmic parameters.
Making training data less biased
Bias can occur when algorithms are trained on data that has unbalanced representation of groups. When groups are underrepresented, AI algorithms are less accurate at understanding the properties of those groups. This can lead to high error rates for specific groups, as in the facial recognition example discussed earlier, where darker-skinned women were mis-labelled more than forty times as frequently as lighter-skinned men.
One way to tackle this is to ‘oversample’ underrepresented groups in the training data. Such oversampling can be used to ensure that the number of datapoints for each group is balanced, thus helping to remove bias.
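As a concrete illustration, oversampling can be as simple as resampling the smaller groups with replacement until every group matches the size of the largest. The sketch below is a deliberately minimal, hypothetical example (the `oversample` function and the toy dataset are our own illustration, not a real library API); production pipelines would typically use purpose-built tools instead.

```python
import random

def oversample(records, group_key):
    """Balance a dataset by resampling underrepresented groups.

    records: list of dicts; group_key: the field that defines the group.
    Returns a new list in which every group has as many datapoints as
    the largest group (extra points are drawn with replacement).
    """
    groups = {}
    for r in records:
        groups.setdefault(r[group_key], []).append(r)
    target = max(len(members) for members in groups.values())
    balanced = []
    for members in groups.values():
        balanced.extend(members)
        # draw extra samples with replacement until the group reaches `target`
        balanced.extend(random.choices(members, k=target - len(members)))
    return balanced

# hypothetical, deliberately skewed training set: 90 vs 10 datapoints
data = [{"group": "A"}] * 90 + [{"group": "B"}] * 10
balanced = oversample(data, "group")
counts = {g: sum(1 for r in balanced if r["group"] == g) for g in ("A", "B")}
print(counts)  # both groups now have 90 datapoints
```

Note that oversampling only duplicates existing datapoints; it balances group sizes but cannot add genuinely new information about the underrepresented group.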
Another method to remove bias from training data is to actively work to make training databases more diverse. This is the aim of the AIM AHEAD project, which is working to create more representative health databases.
Using synthetic data

Synthetic data is not ‘fake data’ per se. It is data that is generated to hold all the properties of real data, but is not produced through actual observation. Synthetic data can also be useful when certain groups are underrepresented in training data, because, rather than oversampling existing points, newly generated synthetic datapoints can be used to balance the data.
Critically evaluating algorithmic decisions

Sometimes, decisions made during the design and development of an algorithm, beyond any biases inherent to the training data, can result in bias. This is why it is so important to critically evaluate an algorithm at every stage: not just the training data itself, but also decisions such as what sort of data is included, and whether and how algorithmic bias has been tested.

One example of the importance of such evaluation comes from an AI algorithm used in American healthcare, designed to flag patients who needed more involved medical care. To train the model, the developers chose the amount of money spent on healthcare as an indicator (or ‘proxy’) of medical need, assuming that spending and need are highly correlated. What was not taken into account is that healthcare spending correlates not only with medical need, but also with income. People of colour are more likely to have lower incomes than other groups, and, because of a lack of trust in the system, are also less likely to visit doctors. For these groups, healthcare spending is therefore a poor measure of medical need, and the algorithm systematically underestimated people of colour’s need for care.

If the proxy had been more thoroughly critiqued, perhaps this could have been avoided. What’s more, the developers did not discover the issue themselves, indicating that more thorough testing for algorithmic bias should have been done before the algorithm was deployed and used for over 200 million people. Every algorithmic decision, from the choice of proxies to the way bias is tested, must be critically evaluated, and, harking back to our earlier point, such evaluation is ideally done by a diverse team with diverse perspectives.
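The kind of audit that could have caught the proxy problem can be sketched in a few lines: compare, for each group, how often genuinely high-need patients would be flagged for extra care by a spending-based cut-off. Everything below, including the `flag_rate` function, the patient records and the thresholds, is made up purely for illustration and is not the real healthcare algorithm.

```python
def flag_rate(patients, group, threshold):
    """Fraction of genuinely high-need patients in `group` whose
    spending would get them flagged for extra care."""
    high_need = [p for p in patients if p["group"] == group and p["need"] >= 8]
    flagged = [p for p in high_need if p["spending"] >= threshold]
    return len(flagged) / len(high_need)

# hypothetical patients: group B has the same medical need as group A,
# but spends less on healthcare, so a spending proxy under-serves them
patients = (
    [{"group": "A", "need": 9, "spending": 9000}] * 40
    + [{"group": "A", "need": 9, "spending": 3000}] * 10
    + [{"group": "B", "need": 9, "spending": 9000}] * 10
    + [{"group": "B", "need": 9, "spending": 3000}] * 40
)

threshold = 5000  # spending-based cut-off used to allocate extra care
print(flag_rate(patients, "A", threshold))  # 0.8
print(flag_rate(patients, "B", threshold))  # 0.2
```

A gap this large between groups with identical need is exactly the signal a pre-deployment bias test should surface, prompting developers to question the proxy itself.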
How to work towards a fairer future
AI is as biased as the society in which it is built. We can work towards creating a fairer and more representative AI technology, but ultimately we must stay cognisant of the fact that as long as human society contains discrimination and biases, the AI technology that emerges from that society will be flawed. Thus, while we must tackle bias in our technology, we must not stop tackling bias in our society, too. At Trilateral, we take both responsibilities seriously. If you would like to find out more, please get in touch.