Conducting a Bias Audit on Automated Employment Decision Tools

Reading Time: 5 minutes

Author:

Dr Zachary J. Goldberg, Ethics Innovation Manager

Date: 2 March 2023

Every employer has to do it: recruiting and hiring new employees. Although it’s a sincere pleasure to welcome qualified, friendly people as new colleagues, winnowing through a heap of candidates can be a tedious process filled with long hours poring over resumés and conducting interviews. Even worse, companies sometimes hire the wrong person, leading to wasted administrative onboarding effort, extra HR reviews, additional quality control checks by managers, and, ultimately, the burdensome steps of dismissing an employee.

In what is surely a relief to most HR departments, automated employment decision tools (AEDTs) can automate most of the steps in the recruiting and hiring process, with several potential benefits. AEDTs can help HR teams improve the quality and objectivity of recruitment, attract the right candidates and receive fewer irrelevant applications, avoid overlooking qualified candidates, fill vacant positions faster, and enhance the candidate experience and employer brand. These gains in the efficiency and accuracy of recruitment and hiring make AEDTs a boon for organizations and job candidates alike.

However, alongside these potential benefits lies a significant risk. Biased algorithms can exclude underrepresented, minority, and marginalized groups or individuals from consideration in hiring. Bias becomes embedded in an algorithm when the model is trained on historical data that reflects past discriminatory decisions, policies, and procedures. For example, if from 1970 to 2000 a company primarily hired men or disproportionately promoted men to senior positions, and an algorithm is trained on a dataset of the company’s past hires and promotions to predict what makes a successful candidate, then the algorithm “learns” that being a man is a qualification for hiring or promotion. More insidiously, datasets can include proxies for protected characteristics: language, demography, and even postcodes can reflect racial, ethnic, and gender differences. Hiring and recruitment algorithms can “learn” to attach predictive value to these proxy characteristics, thereby producing biased outputs. Basing employment decisions on biased AI tools further entrenches social injustices, reduces the pool of qualified candidates, and puts companies at risk of non-compliance with government regulations, including the Equal Employment Opportunity Act, the Americans with Disabilities Act, and a growing number of regulations appearing at the state and city levels across the U.S.

One of the most impactful new regulations is a New York City law regulating AI tools used in recruitment and hiring, which enters into force on April 15, 2023. Local Law 144 will prohibit employers from using AEDTs to evaluate candidates for jobs or employees for promotion unless (1) an independent third party has audited the tool for bias within one year of the tool’s use, (2) information about the audit is made publicly available, and (3) employees and job candidates have been informed about the use of the tool. Adding some clarification to the law, the New York City Department of Consumer and Worker Protection (DCWP) published proposed rules stating that (4) an independent auditor is a person or group not involved in using or developing an AEDT who is responsible for conducting a bias audit of that AEDT, and (5) at a minimum the audit must include (a) a calculation of the selection rate for each race, ethnicity, and sex category, and (b) a comparison of each selection rate to that of the most selected category to determine an “impact ratio” and ensure there is no adverse impact on a particular group.

The Equal Employment Opportunity Commission (EEOC) defines adverse impact as “a substantially different rate of selection in hiring, promotion, or other employment decision which works to the disadvantage of a race, sex, or ethnic group.” Further, the EEOC defines which characteristics are protected from discrimination: “Employees and former employees are protected from employment discrimination based on race, color, religion, sex (including pregnancy, sexual orientation, or gender identity), national origin, age, disability or genetic information (including family medical history)”.
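To make the DCWP’s minimum audit contents concrete, the sketch below computes selection rates and impact ratios from a hypothetical table of screening outcomes. The toy data, the column names, and the use of the EEOC’s “four-fifths” rule of thumb as a flagging threshold are illustrative assumptions, not requirements spelled out in Local Law 144.

```python
import pandas as pd

# Hypothetical audit data: one row per candidate screened by the AEDT.
# Column names ("sex_category", "selected") are illustrative, not mandated.
candidates = pd.DataFrame({
    "sex_category": ["male", "male", "male", "male", "female", "female", "female", "female"],
    "selected":     [1,      1,      0,      1,      1,        0,        0,        0],
})

# Selection rate per category: the share of each group that was selected.
selection_rates = candidates.groupby("sex_category")["selected"].mean()

# Impact ratio: each group's selection rate divided by the rate of the
# most selected category, per the DCWP's proposed minimum audit contents.
impact_ratios = selection_rates / selection_rates.max()

print(selection_rates)
print(impact_ratios)

# The EEOC's "four-fifths" rule of thumb treats ratios below 0.8 as a signal
# of possible adverse impact; Local Law 144 itself does not fix a threshold.
flagged = impact_ratios[impact_ratios < 0.8]
print("Categories flagged for possible adverse impact:", list(flagged.index))
```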

Conducting a bias audit to ensure that an AI hiring tool does not have an adverse impact on any of these protected characteristics requires a two-pronged approach that merges data science with ethics and social science.  

First, the data science perspective: Assessing bias in an AI tool means assessing not just the algorithm itself but also the data used to train and run it. Several steps are necessary for a successful bias audit of recruiting and hiring tools.

  1.  Assess whether any special categories (e.g., vulnerable populations, sensitive data such as medical data or biometrics, etc.) are likely to be involved in the collection, processing or use of data.  
  1. Examine whether an algorithm’s input data is appropriate (e.g., if a company’s dataset of successful hires or promotions consists mostly of men, it is not an appropriate dataset). To understand and mitigate these biases, data scientists can produce summaries and visualizations of the data grouped by characteristics that may carry bias (a minimal sketch of such a grouped summary appears after this list).
  1. Examine correlations among diverse data points to check that selected features are aligned with the aim of the algorithm and not with other features (e.g., proxies for protected characteristics). To do so, it is essential to randomly split the data into at least two sets. One set (the training data) is used to explore the patterns in the data and train the model; the second set (the testing data) is kept aside until the end to assess the model’s performance on completely unseen data. Any quantitative measurement of the algorithm’s performance must be calculated on testing data that was never used during model development.
  1. Review each state the system has been in to determine or predict what the system would have done at time t and, whenever possible, determine which training data was used. 
  1. Examine the “gross” or “surface” properties of the acquired data (such as format and quantity), and evaluate whether the data satisfies the relevant requirements. 
  1. Pay special attention to data mining questions that concern patterns in the data (e.g., distribution of key attributes, relationships between pairs of attributes, properties of significant sub-populations, simple statistical analyses), through queries, visualization, and reporting techniques.  
  1. Examine the quality of the data, including completeness, correctness, and missing variables. 
  1. Assess the accuracy of the algorithm, including inaccuracies arising from bias. Common metrics for this purpose include the false positive rate, false negative rate, sensitivity, specificity, and F1 score (the second sketch after this list illustrates these metrics computed per group on held-out test data).
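
The following is a minimal sketch of the kind of grouped summary and visualization mentioned in step 2 (and the simple statistical profiling in step 6). The file name and column names ("historical_hires.csv", "sex", "race", "hired") are assumptions for illustration, not a prescribed schema.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical historical hiring data; the file and column names are illustrative.
hires = pd.read_csv("historical_hires.csv")  # assumed columns: "sex", "race", "hired"

# Profile key attributes and significant sub-populations (counts and hire rates).
for attribute in ["sex", "race"]:
    summary = hires.groupby(attribute)["hired"].agg(["count", "mean"])
    print(f"\nHire rate by {attribute}:\n{summary}")

    # Visualize the grouped hire rates to surface imbalances in the input data.
    summary["mean"].plot(kind="bar", title=f"Hire rate by {attribute}")
    plt.ylabel("Proportion hired")
    plt.tight_layout()
    plt.savefig(f"hire_rate_by_{attribute}.png")
    plt.close()
```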
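Steps 3 and 8 come together when the model’s error rates are measured on held-out test data and broken down by group. The sketch below assumes a scikit-learn-style workflow and hypothetical file and column names ("screening_features.csv", "hired", "sex_category"); it is an outline of the measurement, not a complete audit.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, f1_score

# Hypothetical screening data; features are assumed already numeric/encoded.
data = pd.read_csv("screening_features.csv")
X = data.drop(columns=["hired", "sex_category"])
y = data["hired"]
groups = data["sex_category"]

# Keep the test set completely unseen during model development (step 3).
X_train, X_test, y_train, y_test, _, groups_test = train_test_split(
    X, y, groups, test_size=0.3, random_state=42, stratify=y
)

# A stand-in model; the audited AEDT would be evaluated in its place.
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_pred = model.predict(X_test)

# Per-group error metrics on the held-out data (step 8).
for group in groups_test.unique():
    mask = (groups_test == group).to_numpy()
    tn, fp, fn, tp = confusion_matrix(y_test[mask], y_pred[mask], labels=[0, 1]).ravel()
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")  # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")  # true negative rate
    fpr = fp / (fp + tn) if (fp + tn) else float("nan")
    fnr = fn / (fn + tp) if (fn + tp) else float("nan")
    f1 = f1_score(y_test[mask], y_pred[mask], zero_division=0)
    print(f"{group}: sensitivity={sensitivity:.2f}, specificity={specificity:.2f}, "
          f"FPR={fpr:.2f}, FNR={fnr:.2f}, F1={f1:.2f}")
```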

Next, the ethics and social science perspective: While definitions and statistical measures of bias are absolutely necessary, they cannot capture the nuances of the social contexts into which an AI system is deployed. To complete the bias audit, the data science steps are merged with ethics and social science methods: data selection and fair algorithmic design are coupled with an ongoing ethical need to understand the historical and social contexts into which the recruitment and hiring systems are deployed. The steps include:

  1. Understand which proxies for protected characteristics could arise in the context of recruitment and hiring (a rough proxy-screening sketch follows this list).
  1. Assess whether ethical criteria to mitigate bias are considered in the modelling stage, and whether the selection of the particular algorithmic model(s) (e.g., neural network training with backpropagation, or decision-tree building in Python) is evaluated relative to these ethical criteria.
  1. Examine the role of human judgment in the decision-making process to assess whether any operational biases could emerge with the use of the tool.   
  1. Perform an ethics assessment on the results. Possible outcomes are that bias has been mitigated in a satisfactory way, that further development is needed, or that specific restrictions on deployment and use need to be in place.   
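
One practical starting point for step 1 is to check how strongly each candidate feature is associated with a protected characteristic; features whose values sharply separate the protected groups are candidate proxies. The sketch below is a simplified illustration under assumed file and column names ("applicant_features.csv", "sex_category", "postcode", etc.); a real proxy analysis also needs proper association measures (e.g., Cramér’s V) and historical and domain context.

```python
import pandas as pd

# Hypothetical applicant data; the file and column names are illustrative only.
applicants = pd.read_csv("applicant_features.csv")
protected = "sex_category"  # assumed protected-attribute column
features = ["postcode", "university", "referral_source"]  # assumed categorical features

# Crude proxy screen: how much does the group composition shift across the
# values of each feature? Large shifts suggest the feature may encode the
# protected characteristic and deserves closer ethical and statistical review.
for feature in features:
    composition = (
        applicants.groupby(feature)[protected]
        .value_counts(normalize=True)
        .unstack(fill_value=0)
    )
    spread = (composition.max() - composition.min()).max()
    print(f"{feature}: largest composition gap across values = {spread:.2f}")
```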

Conclusion: 

With the widespread use of AI tools for recruitment and hiring, coupled with the frequent, and often hidden, ways that bias can arise in such tools, companies face a difficult challenge in ensuring compliance. Failing to do so brings sanctions, limits the pool of qualified candidates, and contributes to social injustice. Identifying and mitigating bias in such tools requires a method that merges data science with ethics and social science. A bias audit conducted in this way not only achieves regulatory compliance but also allows employers to reap the benefits of AEDTs while avoiding risk.
