The European Data Protection Supervisor issues Guidelines on Generative AI and Personal Data for EU Institutions

Reading Time: 5 minutes

Author:  

Benjamin Daley | Data Protection Advisor

Date: 24 June 2024

The European Data Protection Supervisor (EDPS) published its Guidelines on generative Artificial Intelligence and personal data for EU institutions, bodies, offices and agencies (EUIs) on 3 June 2024. These Guidelines (the ‘Orientations’) focus on Regulation (EU) 2018/1725 (EUDPR) and fall within the EDPS’s role as a data protection supervisory authority, not its role as AI Supervisor of the EUIs under the AI Act. The Guidelines are a noteworthy addition to commentary on Generative AI (GenAI) and its General Data Protection Regulation (GDPR) implications, including our previous post on the topic. 

The Orientations are the first step towards more detailed future guidance by the EDPS on GenAI. They serve as a timely reminder for EUIs of their obligations under the EUDPR, and for all readers of the organisational measures that support regulatory compliance throughout the GenAI life cycle. In this fast-moving space, we recommend that readers monitor future EDPS and EDPB press releases. This article outlines the key data protection requirements for EUIs considering the implementation of GenAI solutions.  

Preliminary questions 

What is GenAI and can EUIs use it? 

The EDPS Orientations define GenAI as ‘a subset of AI that uses specialised machine learning models designed to produce a wide and general variety of outputs capable of a range of tasks and applications, such as generating text, image or audio.’ Large Language Models (LLMs), such as those underpinning OpenAI’s ChatGPT, have quickly become the most popular example of such foundation models. However, business-oriented applications, including pre-trained models, may also be utilised. 

The EDPS confirms that ‘there is no obstacle in principle to develop, deploy and use generative AI systems in the provision of public services,’ noting that ‘EUIs must consider carefully when and how to use generative AI responsibly and beneficially for public good… in accordance with the applicable legal frameworks’. EUIs should therefore take a trustworthy, responsible, risk-based approach to determine whether a GenAI system is predominantly beneficial before progressing to train or deploy the solution. 

Data protection considerations for use of GenAI  

Does GenAI process personal data? 

GenAI may process personal data at various stages of its life cycle, which the EDPS Orientations identify as inception, training, evaluation, deployment, and monitoring. To summarise EUDPR Article 3: if an individual can be identified, including through pseudonymised data or other indirect identifiers, then personal data is being processed and the EUDPR applies to the use of GenAI. 

EUIs should therefore consider the personal data threshold at the outset and monitor developments throughout the life cycle. This includes effective due diligence on any third party’s claims of anonymisation, obtaining guarantees that it meets sufficient technical standards, particularly given the EDPS’s caution against the use of web-scraping techniques. 

How can EUIs ensure compliance with data protection regulations when using GenAI? 

  • DPIAs: The principle of data protection by design and by default applies throughout the GenAI life cycle – highlighted by the EDPS as ‘starting from the inception stage’. EUIs can thereby satisfy the requirement to conduct a Data Protection Impact Assessment (DPIA) for high-risk processing at the earliest opportunity. This exercise will notably consider, and demonstrate accountability for, the following key areas: 
  • Lawfulness and the right to information: the processing of personal data for GenAI is lawful if one of the EUDPR Article 5 legal grounds is met, and special categories of personal data may only be processed where an Article 10 exception applies. Although consent may apply in some circumstances, the legal requirements for valid consent must still be satisfied – including data subjects’ right to withdraw their consent. This creates complications for GenAI activities if a subject’s personal data has become commingled within training sets over time. 

EUDPR Article 14 provides data subjects with a right to information about the processing of their personal data, which EUIs should support through effective and timely public notices. By providing detail on how, when, and why an EUI is using, or intends to use, GenAI, EU Institutions can build public trust and understanding through transparency. This information should be reviewed on a regular basis, or sooner for any updates that will alter how subjects’ personal data is processed. Individuals potentially affected by the proposed GenAI activity, or their representatives, may also be invited to express their views, allowing EUIs to document feedback and the corresponding actions. 

  • Data minimisation: GenAI design and deployment should ensure that the processing of personal data is limited and does not exceed what is necessary for the purposes of processing. This is particularly relevant to training datasets – placing an emphasis on high-quality data, through effective curation with accompanying documentation, supplemented by training materials for staff on dataset structure and maintenance. 
  • Data accuracy: EUIs should ensure that data remains accurate throughout the life cycle, implementing data protection by design and by default to provide effective human oversight. This allows EUIs to use validation sets during training, and to monitor outputs thereafter, providing assurance that the model is not drawing adverse inferences against specific demographics or producing inaccurate or false outputs.
  • Data security: The EDPS highlights that GenAI poses new and heightened security risks, including through the new transmission channels in which it operates, with specific vulnerabilities identified as ‘model inversion attacks, prompt injection, (and) jailbreaks’. ‘Red teaming’ is identified in the EDPS Orientations as one method of discovering unknown privacy and confidentiality vulnerabilities, including confirming that ‘Retrieval Augmented Generation’ is not unwittingly leaking subjects’ personal data or EUIs’ confidential information from the GenAI knowledge base.
  • Exercise of individual rights and automated decisions: Data subjects’ rights to rectification, erasure, and objection to processing pose unique challenges when using GenAI, as LLMs such as the one behind ChatGPT process data as numerical “vectors” through “word embeddings” rather than as text. Furthermore, the EDPS Orientations acknowledge that ‘the exercise of certain rights, such as the right to erasure, may have an impact on the effectiveness of the model.’ However, individual rights can still be supported through effective dataset management, allowing traceability for EUIs to furnish individuals with relevant records where possible. 

Furthermore, GenAI tools may be used to assist with decision-making through automated means, including the profiling or assessment of individuals. EUDPR Article 24 provides subjects with the right not to be subject to solely automated decision-making, unless they have entered into a relevant contract, provided their consent, or such activities are authorised by EU law with suitable safeguards in place. Under each of these bases, subjects retain the right to obtain human intervention, express their opinion, and contest a decision. The EDPS therefore advises that, before deploying a GenAI tool, EUIs consider the weight and influence of any GenAI-generated information in final decision-making processes, particularly for vulnerable groups such as children, and ensure that subjects’ right to obtain human intervention can be exercised. 

Recommendations for EUIs considering the use of GenAI  

In light of the above, we suggest the following actions for EUIs considering GenAI solutions:  

  • Involve all relevant EUI functions in a timely manner, including the Data Protection Officer (DPO), legal service, IT service, and the Local Informatics Security Officer (LISO), as varied expertise and viewpoints strengthen this process. 
  • Define the use cases and scope of the proposed GenAI model, to establish practical applications and identify associated data protection risks, during the inception stage.  
  • Conduct a documented analysis of data protection considerations for the proposed use of GenAI, including a DPIA for high risk activities, ensuring that data protection principles are integrated into the training stage and throughout. 
  • Provide individuals with concise, transparent, and easily accessible information on the use of GenAI and whether their personal data will be processed. This may include the categories of personal data being processed and the privacy safeguards implemented, such as pseudonymisation or full anonymisation, particularly if this differs between life cycle stages. Privacy frameworks should also be in place to support subjects’ rights, where applicable, through specific procedures for traceable dataset management.  
  • Ensure fair outcomes arise from the use of GenAI. EUIs should be vigilant that biases are not introduced to their GenAI projects through the common sources identified by the EDPS: ‘existing patterns in the training data, lack of information on the affected population and/or methodological errors’. Operational safeguards can support this, ensuring regular monitoring of EUIs’ use of GenAI and mitigating human users’ automation and confirmation biases.  

Trilateral’s DCS team includes EUI specialists with significant expertise in navigating the EUDPR and linked regulatory requirements. We demystify the risks posed by GenAI and emerging technologies and guide a range of organisations to high levels of compliance maturity. Trilateral also has a dedicated ethics, human rights and emerging technologies team who work to tackle these significant sociotechnological issues. If you would like to discuss your requirements, please contact us. 
