30.04.2026
Photo credit: LEGAL 2026 workshop website – https://legal2026.mobileds.de/
Artificial intelligence depends on data. The more it learns, the better it performs. But when that data includes personal or sensitive information, progress quickly collides with fundamental questions of privacy. In natural language processing (NLP), this tension is especially visible. Text data often contains traces of identities, contexts, and relationships, making it both valuable and difficult to use responsibly. Techniques such as anonymization and pseudonymization promise a solution. But how reliable are they, really?
Challenging assumptions about privacy in AI
At the Joint Workshop on Legal and Ethical Issues in Human Language Technologies (LEGAL 2026) and Computational Approaches to Language Data Pseudonymization, Anonymization, De-identification, and Data Privacy (CALD-pseudo 2026), Prof. Ivan Habernal addresses this question head-on in his invited talk:
Privacy and anonymization in NLP: Are we barking up the wrong tree?
Taking place on May 12, 2026, in Palma de Mallorca, Spain, as part of the international LREC 2026 conference, the workshop brings together perspectives from computer science, law, and ethics.
Rather than offering simple answers, Habernal’s talk takes a step back. It questions whether current approaches to privacy in AI are built on solid foundations. What do we actually mean by privacy or anonymization? How realistic are current assumptions about privacy attacks? And are we focusing on the right problems, or overlooking more fundamental issues?
By combining empirical insights with theoretical reflection, the talk highlights a key challenge: debates about privacy in AI are often shaped by strong claims, but not always by clearly defined concepts.
Why this debate matters
These questions are not abstract. They directly affect how AI systems are developed and deployed in practice. If privacy risks are overstated, innovation may be unnecessarily constrained. If they are underestimated, sensitive information may be exposed. In both cases, the consequences reach beyond research, affecting regulation, public trust, and the responsible use of technology.
Workshops like LEGAL 2026 reflect the growing need to address these tensions in an interdisciplinary way. Technical solutions alone are not enough. They must be aligned with legal frameworks such as the GDPR, the Data Act, or the AI Act, and with a broader understanding of societal expectations.
Recognition on an international stage
The invitation to speak at this workshop highlights Ivan Habernal’s role in shaping discussions at the intersection of NLP, privacy, and law. He is Professor of Trustworthy Human Language Technologies at Ruhr University Bochum, where he leads the Trustworthy Human Language Technologies (TrustHLT) group at the Faculty of Computer Science. He is also a member of the Research Center Trustworthy Data Science and Security (RC Trust), which brings together interdisciplinary research on reliable and responsible AI systems.
His work focuses on privacy-preserving methods in NLP and on legal language technologies, precisely the themes at the core of the workshop.
A broader contribution to the field
In addition to the invited talk, Ivan Habernal is co-author of a research paper presented at the workshop: Towards Robust Evaluation for Privacy QA Systems, a collaboration with researchers from Ruhr University Bochum (Erion Çano, RC Trust), Fraunhofer IIS, and FIZ Karlsruhe – Leibniz Institute for Information Infrastructure.
While details of the work will be presented at the conference, the contribution underlines his research group’s broader engagement in advancing how privacy is studied and evaluated in language technologies.
Patrick Wilking