Methods for Anonymisation in Sociotechnical Contexts (MASC)

Project overview

Through Trilateral Research’s work with law enforcement, statutory agencies, civil society organisations, and NGOs, it has become clear that effective anonymisation tools that go beyond traditional redaction techniques are needed to facilitate data sharing. Data sharing can provide economic and social benefits, allowing organisations to combine their datasets for a common goal. Resistance to data sharing can also present its own risks. For example, being unable to access data about children at risk can prevent practitioners from taking steps to safeguard them.

Allowing another organisation to access personal data can also present privacy risks. Applying text analytics, such as regular expressions and named entity recognition, to identify and remove personal data can reduce these risks substantially. However, it can be challenging to identify all types of personal data, especially indirect identifiers such as a person’s age, occupation, or place of residence, which often need to be combined with other data to identify an individual. Anonymisation can be considered successful where the individuals represented in the data are no longer reasonably likely to be identifiable.
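As a rough illustration of the regular-expression approach mentioned above, the following Python sketch replaces two kinds of direct identifier with labels and keeps a record of what was replaced. It is not the MASC tool or its method: the pattern set, the label names, and the `redact` function are hypothetical and chosen only for this example. Patterns like these would not catch indirect identifiers such as age or occupation, which would need techniques like named entity recognition and contextual review.

```python
import re

# Hypothetical, illustrative patterns for two direct identifiers.
# Real-world patterns would need to be broader and tested against real data.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    # Assumes a UK-style number: +44 or a leading 0, then 4 digits and 6 digits.
    "PHONE": re.compile(r"(?<!\w)(?:\+44\s?|0)\d{4}\s?\d{6}(?!\w)"),
}


def redact(text):
    """Replace each match with its label and record what was replaced."""
    findings = []
    for label, pattern in PATTERNS.items():
        def _substitute(match, label=label):
            # Keep the label and original text so a reviewer can see
            # what was removed and which rule removed it.
            findings.append((label, match.group(0)))
            return f"[{label}]"

        text = pattern.sub(_substitute, text)
    return text, findings


if __name__ == "__main__":
    redacted, found = redact("Contact jane.doe@example.org or 07700 900123.")
    print(redacted)
    print(found)
```

Returning the list of findings alongside the redacted text is one simple way to let a reviewer check each replacement before the text is shared.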

This project combines Trilateral’s extensive experience in privacy, ethics, social science, data science and engineering to develop a novel tool for anonymising personal and sensitive data in unstructured text. A key focus will be on using subject-matter knowledge of the public security context to capture both direct and indirect identifiers and recognise potential bias risks in the anonymisation process. Another key focus will be on building the tool to be explainable for end-users, so they can understand how and why personal data is recognised for anonymisation. This ‘human-in-the-loop’ approach will allow end-users to have complete control over a powerful anonymisation tool that can be adapted to their context. 

To ensure our tool is as useful as possible for end-users, we will validate our work through workshops with our partners in the civil society sector, Everyone’s Invited and Causeway, as well as representatives from law enforcement agencies.

Expected outcomes

  • A tool that can remove sufficient personal data from unstructured text to meet the standard of anonymisation in data protection legislation. 
  • Capability for the end-user to act as a ‘human-in-the-loop’. 
  • Methods for identifying personal data and possible bias risks in an anonymisation process. 
  • Approaches that allow for the methods and anonymisation techniques to be adapted for different contexts.

Applications and benefits

In MASC, we will be working with end-users from civil society organisations and law enforcement to: 

  • Understand current processes for redaction and anonymisation, and how they can be improved.
  • Understand how both direct and indirect personal data present differently across distinct contexts, and how they can be best identified in particular contexts. 
  • Enable effective anonymisation of unstructured text collected in the context of public security. 

Law enforcement and public safety are key areas where anonymisation can be advantageous. Personal data about victims, survivors, suspects, witnesses, and persons convicted of crimes are highly sensitive. Being able to successfully anonymise these data would minimise risks associated with data sharing.

Get in touch

If you’d like to find out more about this project, or our work in this critical area, contact Dr Joshua Hughes, Research Manager and Cluster Lead for Law Enforcement and Community Safeguarding (Joshua.hughes@trilateralresearch.com).

