Data augmentation for fairness-aware machine learning: Preventing algorithmic bias in law enforcement systems


Researchers and practitioners in the fairness community have highlighted the ethical and legal challenges of using biased datasets in data-driven systems, with algorithmic bias being a major concern. Despite the rapidly growing body of literature on fairness in algorithmic decision-making, there remains a paucity of fairness scholarship on machine learning algorithms for the real-time detection of crime. This contribution presents an approach for fairness-aware machine learning to mitigate the algorithmic bias / discrimination issues posed by the reliance on biased data when building law enforcement technology. Our analysis is based on RWF-2000, which has served as the basis for violent activity recognition tasks in data-driven law enforcement projects. We reveal issues of overrepresentation of minority subjects in violence situations that limit the external validity of the dataset for real-time crime detection systems and propose data augmentation techniques to rebalance the dataset.

The experiments on real world data show the potential to create more balanced datasets by synthetically generated samples, thus mitigating bias and discrimination concerns in law enforcement applications.


Pastaltzidis, I., Dimitriou, N., Quezada-Tavarez, K., Aidinlis, S., Marquenie, T., Gurzawska, A., & Tzovaras, D.

Date Published

June 21, 2022

Let's discuss your career