On December 18, 2024, the European Data Protection Board (EDPB) published its much-anticipated Opinion on the processing of personal data in the context of AI models in light of the EU General Data Protection Regulation (GDPR).

This Opinion is pivotal for companies developing and deploying AI models. It confirms that legitimate interest can serve as a valid legal basis for training and deploying AI models, as long as AI companies implement appropriate privacy safeguards. While the EDPB offers examples of such safeguards, it refrains from prescribing specific measures. The Opinion also clarifies the conditions under which AI models may be deemed anonymous and the potential legal consequences of using models developed in breach of the GDPR.

Background

  • What triggered the Opinion? The GDPR allows any EU data protection authority (DPA) to request that the EDPB issue an opinion on a matter of general relevance in the EU or one affecting more than one EU country. In September 2024, the Irish Data Protection Commission made such a request, seeking the EDPB’s guidance on the processing of personal data in the training and deployment of AI models. The EDPB issued its Opinion in response to this request.
  • What is the main issue? In the EU, companies can only process personal data if they can rely on one of the legal grounds provided in the GDPR. However, only one of these legal grounds, the so-called “legitimate interests” basis, offers a workable solution for AI developers, which need very large amounts of data to train their models. Whether AI developers can rely on this legal basis has been debated in the EU, but the Opinion puts an end to that debate and clearly confirms that the answer is “yes,” subject to safeguards.
  • Is the Opinion binding? While the Opinion is not binding on companies, it does bind DPAs, as the EDPB can require compliance with it through the GDPR dispute resolution mechanism. Consequently, DPAs typically align their decisions with EDPB opinions, and companies that depart from them face increased regulatory risk.
  • What is the scope of the Opinion? The scope of the Opinion is deliberately narrow. It does not address certain key topics relevant to AI, such as the processing of sensitive data (e.g., political opinions, religious beliefs, sexual orientation), automated decision-making, data protection impact assessments, or privacy by design.

Key Takeaways

With AI technologies advancing rapidly, the Opinion offers a valuable framework to help companies navigate compliance and align their practices with GDPR requirements.

Below are the main takeaways of the Opinion.

  • Legitimate Interest Is a Valid Legal Basis for AI Model Training and Deployment. The central takeaway of the Opinion is its recognition that the development and deployment of AI models can rely on the legitimate interest legal basis under the GDPR. However, the EDPB states that, to rely on this legal basis, companies must carry out and document a case-by-case legitimate interest assessment in three steps:
    1. Purpose test. As a first step, companies need to assess whether the interest they invoke is legitimate (i.e., lawful, clearly and precisely articulated, and real and present). In that context, the EDPB recognized that “AI technologies create many opportunities and benefits across a wide range of sectors and social activities” and offered examples of legitimate interests, such as “developing the service of a conversational agent to assist users.”
    2. Necessity test. As a second step, companies need to assess whether processing personal data will allow them to achieve the desired goal, and to confirm that this goal cannot be achieved with less personal data or with none at all. For instance, companies need to assess whether they could train their AI models on anonymized or synthetic data instead of personal data.
    3. Balancing test. As a third step, companies need to confirm that their legitimate interests are not overridden by the interests or rights of the individuals whose data is processed. In doing so, companies should take into account (i) any advantages or positive effects for individuals, (ii) any possible risks for individuals (e.g., bias or discrimination), and (iii) whether individuals are aware of, or can reasonably expect, such processing. The latter depends in part on whether the personal data is publicly available and on the AI company’s transparency efforts.

      The EDPB provided a non-exhaustive and non-prescriptive list of measures that can help tip the balance in favor of companies, including various technical measures, measures that facilitate the exercise of individuals’ rights, and enhanced transparency measures that go beyond disclosures in a privacy policy. At the training stage, these include measures aimed at minimizing the amount of personal data processed or the risk of identifying individuals, such as respecting robots.txt or ai.txt files (a minimal crawler sketch illustrating this measure appears after this list), pseudonymization, masking, and filtering. At the deployment stage, these include measures aimed at preventing the storage, regurgitation, or generation of personal data, especially in the context of generative AI models (such as output filters), digital watermarking of AI-generated content, and giving individuals the ability to have their personal data erased from model output.
    The EDPB also opined that the lack of a legal basis for AI model training may affect the lawfulness of the model’s subsequent deployment. In particular, the EDPB suggested that companies deploying AI models should assess whether those models were trained lawfully, unless the models are anonymous (see below).
  • Some AI Models Are Anonymous, Others Are Not. In its Opinion, the EDPB also addressed another (more factual) question that has been subject to divergent views in the EU: whether an AI model trained with personal data should be considered anonymous once training is complete. The EDPB did not take a firm stance on this issue, considering instead that it must be assessed on a case-by-case basis. According to the EDPB, an AI model is anonymous (and thus no longer subject to the GDPR) if there is an insignificant likelihood of either (i) direct (including probabilistic) extraction of personal data used to train the model or (ii) obtaining such personal data, intentionally or not, from queries.

    The EDPB provided a non-exhaustive and non-prescriptive list of factors to be considered when assessing the anonymity of an AI model, including steps to limit the collection of personal data or to pseudonymize or filter it before training begins; privacy-preserving techniques during model training (e.g., differential privacy); measures to prevent the model from including personal data in the output; effective engineering governance and document-based audits; and robust tests covering widely known attacks, including structured testing against attribute and membership inference, exfiltration, regurgitation of training data, model inversion, and reconstruction attacks. A simplified membership-inference test is sketched below.
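
By way of illustration only, the following Python sketch shows one way a data-collection pipeline could honor robots.txt directives, one of the training-stage measures mentioned above. The Opinion does not prescribe any implementation; the crawler name and URL below are hypothetical placeholders.

```python
# Illustrative sketch only: check robots.txt before adding a page to a
# training corpus. "example-ai-crawler" and the URL are hypothetical.
from urllib import robotparser
from urllib.parse import urlparse

USER_AGENT = "example-ai-crawler"  # hypothetical crawler identifier

def may_collect(url: str) -> bool:
    """Return True only if the site's robots.txt permits fetching `url`."""
    parts = urlparse(url)
    parser = robotparser.RobotFileParser()
    parser.set_url(f"{parts.scheme}://{parts.netloc}/robots.txt")
    try:
        parser.read()  # fetch and parse the site's robots.txt
    except OSError:
        return False  # conservative default: skip pages we cannot check
    return parser.can_fetch(USER_AGENT, url)

if may_collect("https://example.com/articles/some-page"):
    pass  # fetch the page and add it to the training corpus
```

A comparable check could consult an ai.txt file where a site publishes one; the point is simply that opt-out signals are read and enforced before collection.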
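
In the same spirit, the sketch below illustrates the kind of membership-inference test the EDPB lists among “robust tests.” It estimates how well a model’s per-example loss separates training members from non-members; model_loss is a hypothetical stand-in, and real anonymity audits rely on substantially stronger attack suites.

```python
# Illustrative sketch of a loss-based membership-inference test, one of the
# widely known attacks the Opinion expects models to be tested against.
# `model_loss` is a hypothetical callable returning the model's loss on a
# single example; production audits use far stronger attacks than this.
import numpy as np

def membership_leakage_score(model_loss, members, non_members) -> float:
    """Probability that a random training member receives a lower loss than
    a random non-member. A score near 0.5 suggests the loss carries little
    membership signal; a score near 1.0 indicates membership leakage."""
    member_losses = np.array([model_loss(x) for x in members])
    non_member_losses = np.array([model_loss(x) for x in non_members])
    # Pairwise comparison: equivalent to the AUC of a loss-threshold attack.
    return float((member_losses[:, None] < non_member_losses[None, :]).mean())
```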

What Should AI Companies Do Now?

The EDPB Opinion underscores the importance of proactive and accountable data protection practices for companies developing and deploying AI models. To align with the EDPB guidance and mitigate regulatory risk, AI companies could consider:

  • Preparing a Thorough Legitimate Interest Assessment following the three-step framework outlined in the Opinion (purpose test, necessity test, and balancing test) and documenting each step comprehensively. This documentation will be critical in demonstrating compliance during audits or regulatory inquiries.
  • Adopting Appropriate Privacy Safeguards during both the training and deployment phases, including techniques such as pseudonymization and data minimization, as well as measures to prevent regurgitation or unauthorized use of personal data in outputs. AI companies should also consider providing clear, accessible, and user-friendly information about data processing activities related to AI.
  • Assessing Current Practices. AI companies should consider reviewing whether their existing AI models were trained lawfully, particularly if personal data was used. Companies licensing AI models should conduct adequate due diligence on the training methods to verify compliance with GDPR standards.
  • Monitoring and Adapting to Evolving Standards. AI companies should expect and monitor further regulatory guidance and case law on these topics and adapt their practices as necessary. The fast-paced evolution of AI technologies means that privacy and compliance strategies must remain flexible and forward-looking; hopefully, regulators will interpret the rules flexibly to allow for responsible innovation.

Conclusion

The EDPB Opinion reinforces the EU’s commitment to fostering responsible AI development while upholding individuals’ privacy in accordance with the GDPR. Although the Opinion is not binding on companies, it offers valuable insights into regulatory expectations that DPAs are likely to follow. Companies developing or deploying AI should consider treating this Opinion as a roadmap to navigate the complex intersection of innovation and data protection. These obligations apply in addition to those introduced by the EU Artificial Intelligence Act (see here and here).

If you have any questions regarding the EDPB Opinion, the GDPR, or the EU Artificial Intelligence Act, please contact Cédric Burton, Laura De Boel, Yann Padova, or Nikolaos Theodorakis from Wilson Sonsini’s privacy and cybersecurity practice.

Wilson Sonsini’s AI Working Group assists clients with AI-related matters. Please contact Laura De Boel, Maneesha Mithal, Manja Sachet, or Scott McKinney for more information.

Rossana Fol, Roberto Yunquera Sehwani, Michael Kern, and Karol Piwonski contributed to the preparation of this post.