The Shifting Privacy Left Podcast

S3E14: 'Why We Need Fairness Enhancing Technologies Rather Than PETs' with Gianclaudio Malgieri (Brussels Privacy Hub)

June 25, 2024 Debra J. Farber / Gianclaudio Malgieri Season 3 Episode 14

Today, I chat with Gianclaudio Malgieri, an expert in privacy, data protection, AI regulation, EU law, and human rights. Gianclaudio is an Associate Professor of Law at Leiden University, the Co-director of the Brussels Privacy Hub, Associate Editor of the Computer Law & Security Review, and co-author of the paper "The Unfair Side of Privacy Enhancing Technologies: Addressing the Trade-offs Between PETs and Fairness". In our conversation, we explore this paper and why privacy-enhancing technologies (PETs) are essential but not enough on their own to address digital policy challenges.

Gianclaudio explains why PETs alone are insufficient solutions for data protection and discusses the obstacles to achieving fairness in data processing – including bias, discrimination, social injustice, and market power imbalances. We discuss data alteration techniques such as anonymization, pseudonymization, synthetic data, and differential privacy in relation to GDPR compliance. Plus, Gianclaudio highlights the issues of representation for minorities in differential privacy and stresses the importance of involving these groups in identifying bias and assessing AI technologies. We also touch on the need for ongoing research on PETs to address these challenges and share our perspectives on the future of this research.

Topics Covered: 

  • What inspired Gianclaudio to research fairness and PETs
  • How PETs are about power and control
  • The legal / GDPR and computer science perspectives on 'fairness'
  • How fairness relates to discrimination, social injustices, and market power imbalances 
  • How data obfuscation techniques relate to AI / ML 
  • How well the use of anonymization, pseudonymization, and synthetic data techniques addresses data protection challenges under the GDPR
  • How the use of differential privacy techniques may lead to unfairness 
  • Whether the use of encrypted data processing tools and federated and distributed analytics achieves fairness 
  • 3 main PET shortcomings and how to overcome them: 1) bias discovery; 2) harms to people belonging to protected groups and individuals' autonomy; and 3) market imbalances
  • Areas that warrant more research and investigation 


Resources Mentioned:

  • Read the paper: "The Unfair Side of Privacy Enhancing Technologies: Addressing the Trade-offs Between PETs and Fairness"

Guest Info: 


Privado.ai
Privacy assurance at the speed of product development. Get instant visibility w/ privacy code scans.

TRU Staffing Partners
Top privacy talent - when you need it, where you need it.

Disclaimer: This post contains affiliate links. If you make a purchase, I may receive a commission at no extra cost to you.

Copyright © 2022 - 2024 Principled LLC. All rights reserved.

Transcript

Gianclaudio Malgieri:

We all, as a scientific and technological and policy community, should think bigger about privacy-enhancing technologies and shift from PETs to FETs, so fairness-enhancing technologies, which is not so difficult to reach. We just need to think a bit more critically and a bit more broadly. And what are the real goals? The real goals are protecting the most impacted groups, the most marginalized groups, in the digital environments.

Debra J Farber:

Hello, I am Debra J Farber. Welcome to The Shifting Privacy Left Podcast, where we talk about embedding privacy by design and default into the engineering function to prevent privacy harms to humans and to prevent dystopia. Each week, we'll bring you unique discussions with global privacy technologists and innovators working at the bleeding edge of privacy research and emerging technologies, standards, business models and ecosystems. Welcome everyone to the Shifting Privacy Left podcast. I'm your host and resident privacy guru, Debra J Farber.

Debra J Farber:

Today, I'm delighted to welcome my next guest, Gianclaudio Malgieri, Associate Professor of Law and Technology at Leiden University; Co-Director of the Brussels Privacy Hub; Associate Editor of the Computer Law and Security Review; author of the book Vulnerability and Data Protection Law; and expert in privacy, data protection, AI regulation, law and technology, EU law, and human rights. Today, we're going to be discussing his recently co-authored paper on the unfair side of privacy-enhancing technologies.

Debra J Farber:

Welcome, Gianclaudio!

Gianclaudio Malgieri:

Thank you, Debra, I'm very happy to be here.

Debra J Farber:

Excellent. Well, I know we're going to have a really lively conversation, but before we get into your paper, I'd love for you to tell us a little bit about some of your academic work with the Brussels Privacy Hub, and you know how you got into privacy to begin with.

Gianclaudio Malgieri:

Yeah, the Brussels Privacy Hub is a special think tank based at the Vrije Universiteit Brussel, and the position of the hub in Brussels is really helpful for engaging with policy making, academia, and all the countries in Europe that are very active in terms of academic efforts on privacy technology, regulation and policy. But I'm also, as you said, working full-time in Leiden, which is the oldest Dutch university and has one of the biggest law faculties in the Netherlands, so I'm trying to exploit these links. But I'm also Italian, so I'm connecting with the Mediterranean tradition on privacy and data protection, and also trying to build bridges with the US, because I'm part of the Privacy Law Scholars Conference in the States. The hub is trying to push on several different aspects, doing research but also trying to push for important debates. Just to make some examples of our activity.

Gianclaudio Malgieri:

I would like to mention three main topics that we are exploring now. The first is impact assessments and fundamental rights, so how technologies can be considered from an impact assessment perspective. In particular, we looked at the Artificial Intelligence Act in the European Union, which was very important for the impact assessment discussion. The Brussels Privacy Hub pioneered a letter signed by 160 university professors to push the legislators to add a solid fundamental rights impact assessment in the final text of the European Union law on AI and we were successful. It's now there. So this is just one of the three main things we do.

Gianclaudio Malgieri:

A second example is vulnerability and data protection, which is also, as you said, the name of my book. We founded, within the hub, a group called VULNERA. It's a research network and dissemination platform where we try to focus on vulnerability, on different vulnerabilities in different situations and in different groups. And the last, but not least, part of our research and activity is about data governance and data transfers, which is due to the tradition of the Brussels Privacy Hub, which was founded by Paul De Hert and Christopher Kuner. Chris Kuner, in particular, was one of the great scholars on the topic of data transfers and data governance.

Debra J Farber:

Oh wow, I learned a lot. I had no idea about some of the things that had transpired that kind of led to the Brussels Privacy Hub, and it makes sense that you're located in Brussels to have that kind of effect on policy in the EU. You recently co-authored this article, and the title is "The Unfair Side of Privacy Enhancing Technologies: Addressing the Trade-offs Between PETs and Fairness." Maybe you could tell us what inspired you all to write on this topic of fairness and PETs.

Gianclaudio Malgieri:

Yeah, sure. I think what inspired me and my two co-authors is mostly the narrative, the, how to say, distorted narrative about privacy-enhancing technologies that big industrial lobbies are pushing for in the EU, the US, and in general in the global discussion on technology regulation. We had a lot of emphasis on the importance and benefits of privacy-enhancing technologies. In the last years we had important initiatives about privacy-enhancing technologies and a lot of explanation of their importance, a lot of marketing, a lot of advertisement on how great privacy-enhancing technologies are, and we could agree to a certain extent. The problem is that the narrative is incomplete because, as maybe we will say later, fairness and privacy and data protection go far beyond anonymization and pseudonymization. It's about power control, power management and also mitigation of power imbalance in the digital landscape. So it's not just about not identifying individuals; it's also about controlling and managing power imbalances with big dominant platforms. And so, for us, the main trigger was: privacy-enhancing technologies are important, but they are not the solution for the digital policy challenges. It's not the solution we are looking for.

Debra J Farber:

So is it that it's not part of the solution, or is it not sufficient and we need additional areas to fill those gaps?

Gianclaudio Malgieri:

Yeah, of course they can help, but there are several problems. They can help in general to reduce the amount of data and so also to comply with some of the important principles in data protection law, both in Europe and the US; for example, purpose limitation and data minimization, of course. But I would like to explore the two parts of your question: first, why they're not sufficient, and second, why they can also be somehow detrimental, at least in their policy impact. For the first part, why they're not sufficient: as I tried to explain a few minutes ago, privacy and data protection are about power control. I can manipulate people, I can nudge people, and I can harm people online in their digital life, even if I cannot explicitly single them out, even if I cannot identify people. They're not sufficient because there's a whole harm problem that is not entirely solved just by anonymization, pseudonymization, federated learning, synthetic data and so on, and the problem of just pushing on privacy-enhancing technologies is that we are losing, we are missing, the main part, which is competition and power. I would like to explain this in a few sentences. Basically, what's happening with privacy-enhancing technologies is that big companies with great computational capabilities and huge amounts of training data are the companies best placed to practice and to implement privacy-enhancing technologies.

Gianclaudio Malgieri:

They will also have legal benefits from it, because if they can even anonymize their data processing, they might escape from most of the GDPR, so General Data Protection Regulation, duties. And it's a paradox that the biggest companies will be the ones that are not accountable under the GDPR, because they will be able to anonymize or pseudonymize, etc. At the same time, smaller companies that don't have the computational power, the policy power, the money to develop these privacy-enhancing technologies will be the ones most challenged by GDPR rules, so by data protection rules. In other terms, privacy-enhancing technologies are not the solution because they will also create a distortion effect on the markets, where the less harmful actors, like small and medium enterprises, will be the ones that still need to comply with the law, while the biggest players will probably be partially exempted from the rules.

Gianclaudio Malgieri:

And maybe just one final thing; we will explain it more. The use of mass privacy-enhancing technologies might also be detrimental to some of the main values and principles in data governance and data regulation, which are diversity and fairness considered in a broad sense. For example, and we will explain it later, synthetic data or differential privacy tend not to consider minority groups, and this is problematic for diversity and bias detection.

Debra J Farber:

Just fascinating. I mean, you know, I've been such a champion of shifting left into the product and development life cycles, and ways to do data minimization include privacy-enhancing technologies. But if you look at it as a monolith, as just one big thing that maybe takes organizations outside of being covered by regulations, then you kind of miss the forest for the trees: that potentially it can be abused, or monopolistic power could become abusive by using these technologies. So I'm really excited to dive in. If you don't mind, tell us how you and your team approached this topic in your paper, and then we'll dive into the specifics.

Gianclaudio Malgieri:

Sure. So in this paper we tried to address the topic of the unfair side of PETs from two perspectives: the legal one and the computer science one. From the legal perspective, we address mostly the concept of fairness in its evolution and development, starting from the law, so from the General Data Protection Regulation, from fair information practices, from consumer protection definitions in Europe and beyond Europe. The two legal authors, so me and my great co-author Alessandra Calvi from the Vrije Universiteit Brussel, who was also the main driver behind the paper, tried to analyze the concept of fairness and how fairness has been developed. First we have fairness as diversity, so fairness as non-discrimination, which is the most accepted meaning that computer scientists seem to adopt when they mention fairness. But there's also a concept of fairness related to power imbalance, power control and imbalance mitigation, which is a concept that has been growing a lot in consumer protection and now also data protection.

Gianclaudio Malgieri:

I wrote a paper about the concept of fairness in the GDPR four years ago, and the conclusion, from a linguistic analysis of fairness in many different systems of legislation, in many different countries and legal frameworks, was fairness as loyalty and fairness as equality of arms, so power control. In parallel, the technical co-author, Professor Dimitris Kotzinos from CY Cergy Paris Université, analyzed with us the different privacy-enhancing technologies, looking at their limits also from the perspective of fairness that we tried to develop in legal terms. So it was kind of a dialogue between different disciplines, trying to understand first fairness and, second, how PETs are not really fairness-friendly, let's say.

Debra J Farber:

Fascinating. At first glance the concept of fairness seems kind of straightforward to most people, but your paper really highlights that the concept can mean different things to engineers versus sociologists, you know, with potential fairness problems that include, like you said, bias, discrimination, social injustices and market power imbalances. I know you talked a little bit about it already, but can you unpack maybe each of those fairness problems and how they link to privacy?

Gianclaudio Malgieri:

The link with privacy is both in the law and in logical reasoning, as a consequence of the concept of privacy. So in the law we have fairness as one of the principles of data protection. I'm focusing mostly on European Union law because we know that in the States, for example, we don't have a federal law on privacy and data protection. But in the European Union, the fundamental rights to privacy and data protection are mentioned in Article 8 of the European Charter of Fundamental Rights, and that article refers to fairness. So there is a logical link between fairness and privacy that the legislators identified even several years ago, because the article I'm referring to in the Charter is from 2000, so 24 years old. And also the GDPR, the General Data Protection Regulation, has an explicit reference to fairness in the guiding principles of data protection. As you said, and as I said before, there are different declinations, different interpretations of fairness. We have fairness as bias mitigation, fairness as the fight against discrimination, fairness as equality against social injustices, so not just equality but equity, and fairness as market power imbalance mitigation. I think all of these interpretations are correct and they do not contradict each other. They respond to the same challenge, which is mitigating the harms that algorithms and data technologies can produce. Fairness is kind of a safeguard against these harms. And also, fairness, if you allow me to leave legal concepts aside, is mostly an ethical concept, because fairness is not a concept that lawyers can clearly define. Indeed, as you said, you asked me to unpack it, and it's not easy to unpack. But I can say that bias, for example, and discrimination are inherent in data processing, because, of course, the effect of data processing is that there might be incomplete or not diverse enough datasets that can produce unfair conclusions and unfair automated decisions. But what about social injustice? Social injustice is a consequence of this. If I process data in a way that is incomplete and doesn't take into account minorities, marginalized groups, people at the margins, social and economic minorities, I will be processing data and taking decisions that will be unfair, and we have a lot of examples.

Gianclaudio Malgieri:

I am in the Netherlands now. In the Netherlands, we had a lot of scandals involving social injustices based on inaccurate and unfair data processing in public administration. There was a scandal about child benefits, but we don't have time to address this now; just to say this is important. And the other part, just to conclude: fairness as market power imbalance mitigation is also connected to data processing. Why? Because the big power imbalance that we observe between individuals and companies and big techs in the digital environment is based on the huge amount of data that big techs can process about us. I can just mention, very briefly and simply, Shoshana Zuboff's work, The Age of Surveillance Capitalism. Basically, what we observe now is that capitalism is based on data and surveillance, behavioral surveillance, exactly. Data protection is the tool to look at power imbalance, because data is power.

Debra J Farber:

Again, so fascinating. In the United States, we talk about privacy, but we often don't talk about data protection as a whole, whereas in the EU, privacy is a piece of the data protection mandate, with privacy being an enshrined right. I think a lot of these big tech companies that you reference are run by people, and have employees, who are not thinking in terms of data protection, not thinking larger than "how do I make sure that this person has control over their own choices about how their data is used?" Right? It is really great to hear from you this reminder to think larger about societal impacts, the socio-technical understanding of fairness. I especially wanted to also mention that in the EU, the AI Act has a requirement around fairness as well, which kind of leads me into the next question. Let's dive into some of the analysis of the paper.

Debra J Farber:

But the first section was on PETs for machine learning and AI, and then you know, how does that relate to fairness? So let's first talk about data obfuscation. That would be anonymization, pseudonymization, synthetic data, differential privacy, each of which builds upon the concept of data alteration. How are they, as a group, relevant as solutions, privacy-enhancing solutions for AI and machine learning needs? And then maybe we could go through them more specifically in my next question.

Gianclaudio Malgieri:

Sure. So I think you addressed the main point. Data obfuscation has been considered one of the most important privacy-preserving practices for AI-driven technologies. You mentioned anonymization, pseudonymization, synthetic data and differential privacy. They are different, but of course they respond to the same challenge, which is reducing the identifiability of single users, single individuals, single data subjects in the digital environment. But there is an overarching issue, which is that privacy harms are not just individual harms; they can be collective harms.

Gianclaudio Malgieri:

Privacy harms, not just in Europe but also in wonderful scholarship in the States, have been identified as harms not just to my private life, my personal life in my toilet or in my bedroom, but also to my work life, and to democracy and freedom of speech in connection with my informational freedom. So, just to say, anonymizing, pseudonymizing, obfuscating data, etc. is maybe not the solution to collective harms to privacy, because even if I cannot identify you, I can identify your group, or I can identify the best ways to target you or to limit your freedoms in connection with your digital life. So even if I don't exactly know your data, your personal data, your identifiable data, I can still target you. This is something I think is most relevant for this discussion about anonymization, synthetic data, etc. Something else I wanted to say, and I already mentioned it before, is that if we focus, for example, on synthetic data and differential privacy, these are very different practices, because the first, synthetic data, is based on, to simplify, a reproduction of a dataset, so it's not based on real individual data. But synthetic data, as a lot of computer scientists have already identified, tends to ignore minority groups, tends not to look at minorities and outliers, and this is also true for differential privacy.
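
[Editor's note: to make the minority-representation concern concrete, here is a minimal, illustrative Python sketch that is not from the paper or the episode. It fits a deliberately naive synthetic-data generator (a single Gaussian) to a toy population with a small minority cluster and shows the minority shrinking in the synthetic sample; all numbers are made up.]

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy population: a 95% majority cluster and a 5% minority cluster
# along one numeric attribute.
majority = rng.normal(loc=50.0, scale=5.0, size=950)
minority = rng.normal(loc=80.0, scale=3.0, size=50)
real = np.concatenate([majority, minority])

# Naive synthetic-data generator: fit one Gaussian to the whole population
# and sample from it. Real generators are more sophisticated, but low-density
# groups are still the hardest part of the distribution to reproduce.
synthetic = rng.normal(loc=real.mean(), scale=real.std(), size=real.size)

def minority_share(data):
    # Share of records that fall in the minority's value range (>= 70).
    return float((data >= 70).mean())

print(f"minority share in real data:      {minority_share(real):.3f}")
print(f"minority share in synthetic data: {minority_share(synthetic):.3f}")
```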

Gianclaudio Malgieri:

Differential privacy is something else. Differential privacy is looking at aggregated data and making analyses on the aggregation. But the statistical aggregation, in order to protect privacy and to limit re-identification of single individuals in the aggregation, needs to delete outliers, needs not to consider the upper and lower outliers, so it cannot consider different groups. It needs to look at the average. So this is the main problem, right? Data obfuscation tends to reduce all of humanity, or all the datasets, to an average person, and this doesn't help to mitigate biases or to represent society. If we have to take a decision, even a democratic decision, based on AI, and we cannot really know what the single groups and the different minorities and outliers in the group are, because we cannot identify them and we don't want to re-identify them, we might have problems of representation, problems tied mostly to the collective harms of privacy and data protection. I hope this answered your question. Of course it's not easy to answer in a few sentences.
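
[Editor's note: a minimal, illustrative Python sketch, not from the paper or the episode, showing the averaging effect described above with the simplest differential-privacy mechanism, a Laplace-noised count. The same noise that is negligible for a large group can swamp a small one; the group sizes and the epsilon value are made up.]

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical group sizes in a dataset.
true_counts = {"majority group": 10_000, "minority group": 25}

epsilon = 0.1        # privacy budget (smaller = more privacy, more noise)
sensitivity = 1.0    # one person changes a count by at most 1
scale = sensitivity / epsilon

trials = 1_000
for group, count in true_counts.items():
    # Epsilon-DP count: add Laplace noise scaled to sensitivity / epsilon.
    noisy = count + rng.laplace(loc=0.0, scale=scale, size=trials)
    mean_rel_error = np.mean(np.abs(noisy - count)) / count
    print(f"{group}: true count = {count}, "
          f"mean relative error of noisy count = {mean_rel_error:.1%}")
```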

Debra J Farber:

Yes, no, that was really helpful. Let's go through some of those data obfuscation PETs, maybe briefly explain their intended benefit, maybe from a GDPR perspective, and then, if there's anything specific about each one of them that ties to fairness, that'd be helpful to understand the context around that. But if it's already the summation you just gave us, I don't want you to repeat yourself, so just let me know. But let's start with anonymization.

Gianclaudio Malgieri:

Anonymization is, you know, a bit of an illusion.

Gianclaudio Malgieri:

We know it's very hard to anonymize data if we still want to use the data, right? And then, of course, it depends on what the purpose of our data processing activity is.

Gianclaudio Malgieri:

But in general, in the GDPR, so in European Union data protection law, it is very hard to reach the anonymization level. There is a big discussion about what anonymization is, because the GDPR seems to take a risk-based approach, while the guidelines of the European Data Protection Board, which actually date back to the previous entity, the entity before the Data Protection Board was founded, so the Article 29 Working Party opinion, generally refer to anonymization as a zero-risk-of-identification approach. So, basically, if there's even a minimum risk of identifying someone, it's not anonymous. Of course it's impossible to reach that level and that standard, right? Because in today's data processing environment it's very easy to identify someone based on some proxies, based on a lot of aggregated data that we can use to infer who a specific individual is. We know there's a lot of scholarship on that. Let's just say that anonymization is a theoretical concept but not a practical one, if you agree.
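
[Editor's note: a minimal, illustrative Python sketch, not from the paper or the episode, of the classic linkage attack behind the point that anonymization is "a bit of an illusion": a dataset stripped of names can be re-identified by joining it with auxiliary data on quasi-identifiers. All records are invented.]

```python
import pandas as pd

# A "de-identified" dataset: direct identifiers removed, but quasi-identifiers
# (ZIP code, birth year, gender) remain.
deidentified = pd.DataFrame({
    "zip":        ["02138", "02138", "90210"],
    "birth_year": [1965,    1991,    1980],
    "gender":     ["F",     "M",     "F"],
    "diagnosis":  ["asthma", "flu",  "diabetes"],
})

# A public auxiliary dataset (think of a voter roll) linking names to the
# same quasi-identifiers.
public = pd.DataFrame({
    "name":       ["Alice Example", "Bob Example"],
    "zip":        ["02138",         "02138"],
    "birth_year": [1965,            1991],
    "gender":     ["F",             "M"],
})

# Joining on the quasi-identifiers re-attaches names to "anonymous" records.
reidentified = public.merge(deidentified, on=["zip", "birth_year", "gender"])
print(reidentified[["name", "diagnosis"]])
```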

Debra J Farber:

Yeah, no, in fact, it is kind of fascinating, because it is one of the few techniques that's written into the GDPR and yet it is not that effective, because you could combine a bunch of datasets that can re-identify people. So anonymization techniques can easily be broken these days (not all of them, and not all of them easily), but it is not the panacea that many corporations thought it might be to take companies out of the regulation. What about pseudonymization? So things like tokenization, masking, generalization and other techniques.

Gianclaudio Malgieri:

Pseudonymization is much easier to beat, because pseudonymization doesn't mean that we cannot identify individuals anymore. Pseudonymization means that we protect data in a way that makes privacy attacks less harmful. Why? Because the information that identifies people in a dataset is kept separate from the dataset itself. At least this is the GDPR definition, so the European Union definition, of pseudonymization. There is a legal difference, and a legal implication, depending on whether we have anonymization or pseudonymization. If we apply anonymization, which I said is very hard in practice, the GDPR, so European data protection law, does not apply at all, and also United States laws, like the state laws, for example in Colorado, Washington, Virginia, the different laws that we have in the States, wouldn't apply, because anonymization doesn't allow us to identify people.

Gianclaudio Malgieri:

For pseudonymization, the situation is more complex, because the GDPR applies. So even if we pseudonymize data through tokenization or masking, etc., we should still comply with GDPR rules. So pseudonymization doesn't solve the compliance problem. But if pseudonymization is in place, the data controllers, the companies that decide how to use data and why, can prove that they protected the data, and this is helpful for daily compliance. So if the regulator wants to check compliance, they can always say: yes, I applied a good protection, which is pseudonymization. Of course it depends on which kind of pseudonymization. Just to summarize: in case of anonymization, we are out of the GDPR; in case of pseudonymization, we still need to apply the rules of the GDPR, but we have some safeguards in place that will protect us from a regulator's perspective.
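
[Editor's note: a minimal, illustrative Python sketch, not from the paper or the episode, of pseudonymization via keyed tokenization. The secret key plays the role of the "additional information" that, under the GDPR definition, must be kept separately; whoever holds it can re-link tokens to people, which is why the data remains personal data and the GDPR still applies. The key and the record are placeholders.]

```python
import hmac
import hashlib

# The "additional information": a secret key that must be stored separately
# from the pseudonymized dataset (e.g., in a key-management system).
SECRET_KEY = b"stored-separately-from-the-dataset"  # placeholder value

def pseudonymize(identifier: str) -> str:
    # Keyed tokenization: the same identifier always maps to the same token,
    # so analytics still work. Re-linking a token to a person requires the
    # secret key (to re-compute tokens for known identifiers) or a mapping
    # table held by whoever runs this function.
    return hmac.new(SECRET_KEY, identifier.encode("utf-8"),
                    hashlib.sha256).hexdigest()[:16]

record = {"email": "jane.doe@example.com", "purchase": "book", "amount": 12.50}
pseudonymized = {**record, "email": pseudonymize(record["email"])}
print(pseudonymized)
```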

Debra J Farber:

And then what about synthetic data?

Gianclaudio Malgieri:

Yeah, synthetic data. Well, it really depends on the purposes of our data processing. We can say that synthetic data is a form of, let's say, data obfuscation that might be very useful if we want to train algorithms without using personal data, personally identifiable data. So synthetic data is a form, we can say, of data minimization that is very useful for, for example, reducing the legal risks, and so the possible sanctions, if we do data scraping. You know, most training systems for AI are based on scraping data from social media, from big databases; it's basically the download or the processing of huge amounts of publicly available data on Facebook, Instagram, Twitter, Google, whatever. Synthetic data might be a solution to avoid the harms produced by scraping, but these are not harms to individuals; they're mostly harms to business interests, and also privacy harms. And it really depends on how we process,

Gianclaudio Malgieri:

what the purpose of this synthetic data is. I think there's no single definition of synthetic data from a legal perspective.

Debra J Farber:

Yeah, that makes sense. It's a relatively newly designated, you know, privacy-enhancing technology, so I don't think it made it into the regulation. And then the last one for that subheading would be differential privacy, and then, if you want to also link it back to fairness, that'd be helpful.

Gianclaudio Malgieri:

Yeah. So, as I already said, differential privacy is a very problematic practice because, in a sense, it greatly reduces the risks of identification. So this is good in terms of the traditional view of privacy, right, the computer science view of privacy, privacy as non-identification. But, as I said before, differential privacy is mostly based on aggregated analysis of data. The aggregation of data can be useful for companies because, for example, they don't need to identify individuals. Sometimes, if I just need to understand how effective my marketing activity on social media was, I can just consider differential privacy aggregation. So, basically, I just analyze how my behavioral advertising was translated into some benefits or time spent online by my users. I don't really need to identify individuals for that.

Gianclaudio Malgieri:

The problem is that if differential privacy, as I already said, is considered an anonymization technique, it might exclude the full application of data protection rules, which has anti-competitive consequences in the digital market, in particular against smaller enterprises. And, on the other hand, in order to reduce identifiability, differential privacy needs to cut the outliers. And so, as I was saying, differential privacy might be problematic for the representation of minorities and marginalized groups. A disclaimer that I am trying to add, and I emphasize now, is that all these technologies cannot be considered in silos. So we are speaking a bit transversally now, but it really depends on what the specific business application of these technologies is. So my statements might be very different if we consider one aspect or another, one application or another, one case study or another.

Debra J Farber:

That makes a lot of sense. No, definitely. And then the paper also goes into detail on encrypted data processing tools, as well as federated and distributed analytics. And you know, in the interest of time, and instead of going through each of those specifically, do you want to make any connections for the audience between those privacy-enhancing technology categories and fairness, and what you found in your research?

Gianclaudio Malgieri:

Sure, yeah. I mean, I think an important aspect of the paper, as you also suggest, is that we do not say that PETs should be avoided. There are some benefits in privacy-enhancing technologies. We just say that they should be considered one of the possible safeguards in place, together with many others. So for encrypted data, which in legal terms is also considered a form of enhanced pseudonymization, we suggest that privacy-enhancing technologies are a good safeguard. We just say that the whole fairness discussion, as I said before, in terms of bias detection, diversity, representation, power mitigation, is not addressed by, for example, encryption.
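
[Editor's note: a minimal, illustrative Python sketch, not from the paper or the episode, of federated averaging on toy data. Raw data never leaves the simulated clients, which is the privacy benefit; but the averaged model still ends up closest to the clients whose data dominates, which echoes the fairness point, because an outlier client is served worst. All data and the toy objective are invented.]

```python
import numpy as np

rng = np.random.default_rng(0)

# Three simulated clients: two with similar data, one outlier client.
clients = [rng.normal(loc=mu, scale=1.0, size=100) for mu in (2.0, 2.5, 10.0)]

def local_update(data, model):
    # One local step of a toy objective: each client pulls the shared model
    # halfway toward the mean of its own data.
    return model - 0.5 * (model - data.mean())

model = 0.0
for _ in range(20):
    updates = [local_update(data, model) for data in clients]
    model = float(np.mean(updates))  # the server only sees averaged updates

print(f"shared federated model: {model:.2f}")
print("client data means:     ", [round(float(d.mean()), 2) for d in clients])
# The shared model settles near the average of the client means, far from the
# outlier client's data, even though no raw data was ever centralized.
```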

Debra J Farber:

Awesome. Thank you for that. So we kind of just went through an exploration of specific groups of privacy-enhancing technologies, but now I want to turn to some of the technical and regulatory solutions that address some of these PET shortcomings. Your team lists three main shortcomings when it comes to PETs, and again, you've alluded to these, but I'll restate them: bias discovery; harms to people belonging to protected groups and to individuals' autonomy; and market imbalances. What technical and regulatory solutions do you propose to address each of these shortcomings? First, let's start with PETs and bias discovery.

Gianclaudio Malgieri:

We are not sure that we can really propose immediately applicable solutions. But of course, I think, as I said before, privacy-enhancing technologies should not be the sole safeguards in place. So for bias discovery, there's a lot that we can do. First of all, we shouldn't always look for automated solutions. I think this is important, also coming from a legal scholar like me, as a message: automation is not always the solution to automation problems. If some problems are inherent in automation, the solution might just be different, like social or business solutions, et cetera. I will try to explain better.

Gianclaudio Malgieri:

For bias discovery, for example, one of the most interesting ongoing discussions is the involvement of impacted groups in the assessment of a technology, in the assessment of harms and in the assessment of the impacts of technologies on fundamental rights. If we need to discover biases in AI, which now are also problems for generative AI, for example hallucinations or misalignment, et cetera, we need impacted groups to stand up and to help the AI developers identify gaps and issues. Basically, what I'm saying here is that we should look at business models, not just the technologies. We should look at how different business models address solutions and decisions, how these decisions can be modified and improved, and how we can empower impacted groups. I don't think we will ever have an automated bias discovery solution, but of course there are very good bias discovery solutions that might benefit from participatory approaches, from the participation of impacted individuals in the impact assessment.

Debra J Farber:

Fascinating. What about people belonging to protected groups? You know, that is a shortcoming that was highlighted with PETs, that they don't appropriately address those marginalized or protected groups. Would you suggest a similar technical and regulatory solution as you just did with bias discovery, or is there something else?

Gianclaudio Malgieri:

Yeah, I mean, as I said before, the biggest problem with impacted groups is that they are underrepresented and they are the most impacted groups, so the groups that suffer the most adverse impacts from technology applications. So there's a problem here, which is a problem of democratic participation, but also a problem of decision-making and fairness in practice. One of the solutions is indeed participation, multi-stakeholder participation. I'm just publishing now, I mean next month, a co-author and I will publish an article about stakeholder participation. The co-author is Margot Kaminski from Colorado Law School and the journal is the Yale Journal of Law and Technology.

Gianclaudio Malgieri:

We are trying to discuss how privacy governance, so data governance and AI governance, can be improved by multi-stakeholder participation, in particular for people belonging to protected groups. There's a problem, of course, and the problem is how to define these groups. Should we just rely on non-discrimination laws defining protected groups, or should we rely on something else? This is an ongoing discussion. We don't have time now to address this, but we can, of course, start from the most vocal and most visible groups impacted by technologies. I can give three or four examples: children, older people, racialized communities, victims of gender-based violence, LGBTI+ communities, and I could go on. But we could start from these groups and look at how, together with privacy-enhancing technologies, the diversity of these groups could be considered. So, just to be very practical, we apply privacy-enhancing technologies, for example, in a business model, but then we check the impact with impacted groups. So, basically, we put the privacy-enhancing technologies' effects into a bigger and broader multi-stakeholder decision-making process where impacted groups' representatives can express their views.

Debra J Farber:

That's awesome. I really look forward to reading that paper when it comes out. In addition to the paper we're discussing today, which I will include in our show notes, I will also update the show notes to include a link to your future paper once you publish it. The last, but not least, area where there's a shortcoming would be individual autonomy and market imbalances. Talk to us a little bit about what potential solutions to this shortcoming would be.

Gianclaudio Malgieri:

Yeah, of course, as I said before, we cannot discuss privacy-enhancing technologies in general; we should always look at how single privacy-enhancing technology practices are affecting some of the fairness components in practice. But what I might say is that market imbalance should be regulated not just for privacy in the narrow sense; we should consider a lot of different obligations that can reduce market dominance. I will give a simple example. In the European Union, two years ago, the Digital Markets Act was approved. The Digital Markets Act is an important power-rebalancing tool, imposing a lot of duties in terms of competition law, fair access to data and also consent to data processing. So, referring also to the individual autonomy that you mentioned, the DMA, the Digital Markets Act, is an important tool that complements privacy and data protection.

Gianclaudio Malgieri:

Just to say, privacy-enhancing technologies are a great tool that should be complemented by specific rules in terms of market control. This is clear, for example, in reducing abusive practices that can happen when big techs manipulate individuals or exploit dependencies, because this is another problem. I didn't mention that term so far, but dependency is the problem that we really want to address. We depend on social media, we depend on big techs, we depend on social giants, and this dependency is the real power imbalance problem. So states should take a position against these dependencies, either imposing rules and fundamental rights enforcement duties on big techs or prohibiting some abusive practices.

Debra J Farber:

It's a lot to think about. I'm not sure there's the political will to make it happen, but we'll see if we can get a federal law that embodies all of that. What were some of your team's conclusions at the end of writing this, and where might there be some areas where you might want to do some more research, or more research is needed?

Gianclaudio Malgieri:

Sure, I think a lot of research is still needed. Just to make some examples, we couldn't go deeper on each single privacy-enhancing technology in practice, and also we should look at how, for example, generative AI is altering the discussion. Our paper didn't consider generative AI challenges, but of course this is perhaps chapter two of our activity: how can privacy-enhancing technologies help, or not help, with hallucinations and misalignment in generative AI systems, where fairness is a big problem? Because we know that hallucination and misalignment can produce discrimination in generative AI; for example, chatbots or image search engines can produce stereotypes, can induce harms. So of course, these are some of the areas that we need to investigate in the future, and it's just part of the problem.

Debra J Farber:

I think that really sums up a lot of what's needed.

Debra J Farber:

In fact, I'll be on the lookout for some working groups or standards or just more research coming out on the topic. I think one of the things I've been thinking about either doing myself, or I'm kind of surprised I haven't seen much out on the market around it, is a listing of all of the privacy-enhancing technologies based on different use cases, but also based on the privacy guarantee that the organization wants to ensure by using a PET, and then working backwards to see which PET or set of PETs would get that job done. But this conversation has really made me think: we need to think broader than just "can we do the thing, can we achieve this end goal?" and instead broaden it to also include "are we being fair to the individual and to society generally, the group of individuals?" So, really a lot to think about. Thank you so much for your time today. Are there any words of wisdom that you'd like to leave the audience with before we close today?

Gianclaudio Malgieri:

I think we all, as a scientific and technological and policy community, should think bigger about privacy-enhancing technologies and shift from PETs to FETs, so fairness-enhancing technologies, which is not so difficult to reach. We just need to think a bit more critically and a bit more broadly. And what are the real goals? The real goals are protecting the most impacted groups, the most marginalized groups, in the digital environments.

Debra J Farber:

What a great idea, really elevating it beyond just privacy to meet fairness. So you'll meet a lot of goals there, right? Including, especially, if you apply it to AI. Well, thank you so much, Gianclaudio. Thank you for joining us today on the Shifting Privacy Left podcast. Until next Tuesday, everyone, when we'll be back with engaging content and another great guest. Thank you so much. Bye-bye. Thanks for joining us this week on Shifting Privacy Left. Make sure to visit our website, shiftingprivacyleft.com, where you can subscribe to updates so you'll never miss a show. While you're at it, if you found this episode valuable, go ahead and share it with a friend. And if you're an engineer who cares passionately about privacy, check out Privado, the developer-friendly privacy platform and sponsor of this show. To learn more, go to privado.ai. Be sure to tune in next Tuesday for a new episode. Bye for now.

Chapter Markers:

  • Shifting Privacy Left Podcast
  • Unpacking Fairness and Data Protection
  • Navigating Privacy-Enhancing Technology Shortcomings
  • Modernizing Privacy With Fairness Technology
