The Shifting Privacy Left Podcast

S2E8: Leveraging Federated Learning for Input Privacy with Victor Platt

February 28, 2023 Debra J Farber / Victor Platt Season 2 Episode 8

Victor Platt is a Senior AI Security and Privacy Strategist who previously served as Head of Security and Privacy for a privacy tech company. Victor was formerly a founding member of the Risk AI Team with Omnia AI, Deloitte’s artificial intelligence practice in Canada. He joins today to discuss privacy enhancing technologies (PETs) that are shaping industries around the world, with a focus on federated learning.

Thank you to our sponsor, Privado, the developer-friendly privacy platform

Victor views PETs as functional requirements and says they shouldn’t be buried in your design document as nonfunctional obligations. In his work, he has found key gaps where organizations were only doing “security for security’s sake.” Rather, he believes organizations should be thinking about it at the forefront. Not only that, we should all be getting excited about it because we all have a stake in privacy.

With federated learning, you have the tools available to train ML models on large data sets with precision at scale without risking user privacy. In this conversation, Victor demystifies what federated learning is, describes the two different types (at the edge and across data silos), and explains how it works and how it compares to traditional machine learning. We deep dive into how an organization knows when to use federated learning, with specific advice for developers and data scientists as they implement it in their organizations.

Topics Covered:

  • What 'federated learning' is and how it compares to traditional machine learning
  • When an organization should use vertical federated learning vs horizontal federated learning, or instead a hybrid version
  • A key challenge in 'transfer learning': knowing whether two data sets are related to each other and techniques to overcome this, like 'private set intersection'
  • How the future of technology will be underpinned by a 'constellation of PETs' 
  • The distinction between 'input privacy' vs. 'output privacy'
  • Different kinds of federated learning with use case examples
  • Where the responsibility for adding PETs lies within an organization
  • The key barriers to adopting federated learning and other PETs within different industries and use cases
  • How to move the needle on data privacy when it comes to legislation and regulation


Copyright © 2022 - 2023 Principled LLC. All rights reserved.

Debra Farber  0:00 
Hello, I am Debra J. Farber. Welcome to The Shifting Privacy Left Podcast, where we talk about embedding privacy by design and default into the engineering function to prevent privacy harms to humans, and to prevent dystopia. Each week we'll bring you unique discussions with global privacy technologists and innovators working at the bleeding-edge of privacy research and emerging technologies, standards, business models and ecosystems.

Debra Farber  0:27 
Today, I'm delighted to welcome my next guest, Victor Platt, a senior AI security and privacy strategist. Victor previously served as Head of Security and Privacy for a privacy tech company, and was a founding member of the Risk AI team with Omnia AI, Deloitte's artificial intelligence practice. Today, we're going to discuss PETs with a focus on federated learning.

Debra Farber  0:56 
Welcome, Victor.

Victor Platt  0:57 
Thanks, Debra. Great to be here.

Debra Farber  0:59 
I am so excited to talk about something as geeky as PETs, and really just learning about kind of how they work, right, like demystifying...I know, federated learning is something I've heard a lot about, but I have to admit, I haven't done a deep dive into that space and so, you know, looking forward to our conversation today. Just to start, why don't you give us a brief intro to your background. How did you end up working in privacy enhancing technologies, and maybe tell us about your journey, and then what privacy problems you focus on?

Victor Platt  1:32 
Yeah, absolutely. Thank you, Debra. So, I'll frame it as how I describe myself now and then what brought me to where I am today. First, I consider myself a kind of translation layer, and that's threefold: between privacy, law, and engineering; and then also translating engineering and data science into different types of security and privacy controls. It was a bit of a circuitous journey that brought me here. In undergrad I studied philosophy, started to think a bit more about cybersecurity and the political aspects of that, and then went to grad school to learn a little bit more about the technology. I started my career in government, and then joined Deloitte. I got really interested in artificial intelligence and the security aspects of that, at which point I moved over to our Omnia AI group within Deloitte Canada and really started to focus full-time on security and AI, both helping organizations capture the value of this technology and navigate the different risks that AI can pose in modern organizations. What I noticed as I shifted from a security focus to more of a privacy focus is two key gaps. I started to realize that security was often being done for security's sake, and in most cases, the why of security is privacy. That's how I made the link. And then, as I delved more into the privacy side, there seemed to be a huge gap in the technological understanding of how we can implement privacy by design within systems. So those are the two gaps that I try to fill in my current focus.

Debra Farber  3:28 
Thanks for that. That's awesome. I love to hear about your journey. I love that you were studying philosophy and then ended up doing federated learning. You know, it's emblematic of a shift left mindset, which I share. You know, I had an English degree undergrad - business minor but an English degree really. And now I'm, you know, also deep in this stuff trying to get companies to understand how to put privacy and embed it into their organization at the outset so you don't have this huge compliance burden and trying to put a box of privacy on at the end, which doesn't work - you can't really do that. So, I wonder if our messaging and our ability to communicate - because I also consider myself a...I've never heard that before, but it's a "translation layer" between privacy law and engineering. It's kind of how I summed up a lot of my career of doing operational privacy work as I have a law degree.

Victor Platt  4:22 
Absolutely. I think maybe we can call ourselves the APIs for privacy.

Debra Farber  4:29 
Oh, my gosh, I really love that. I don't know what I'm going to do with that, but I'm definitely going to tell other people that we're APIs for privacy, so...I don't know, it conjures up a good understanding of, I guess, what we really do within organizations. Like, "Oh, okay, that's what you do." Okay, why don't we get more into some substance. Given your most recent role, why don't we start talking about federated learning. Federated learning is used at the edge as well as across data silos. So, can you explain exactly what federated learning is and how it compares to traditional machine learning?

Victor Platt  5:10 
Certainly. Federated learning is often best understood, as you mentioned, by comparing it to traditional machine learning. In traditional machine learning, you gather a bunch of data, you centralize it, and then you execute algorithms on top of it to train machine learning models. So the key point there is that all of the data is in one place; you can think about it as a big database full of columns and rows. With federated learning, it's a little bit different: the different datasets can remain where they live, so you don't have to actually do that centralization step. Essentially, what you do is train small models on each of those datasets, and then, with some fancy math involved, basically add those models together to create what we would call a global model. So those are the key facts about federated learning.
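Illustration (not part of the conversation): the "train small models locally, then add them together" idea can be sketched as weighted model averaging, often called federated averaging. This is a toy under stated assumptions, not any vendor's implementation: the model is a single weight w fitting y = w * x, and the names local_train, federated_average, federated_training and the sample data are all hypothetical. Only model weights and sample counts leave each client; the raw (x, y) pairs never do.

```python
def local_train(w, data, lr=0.01, epochs=20):
    """Run a few gradient-descent steps on one client's private data (model: y = w * x)."""
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_average(client_weights, client_sizes):
    """Combine local models into a global one, weighting each by its sample count."""
    total = sum(client_sizes)
    return sum(w * n for w, n in zip(client_weights, client_sizes)) / total


def federated_training(clients, rounds=10):
    """Each round: send the global model out, train locally, average the results."""
    global_w = 0.0
    for _ in range(rounds):
        local_ws = [local_train(global_w, data) for data in clients]
        global_w = federated_average(local_ws, [len(d) for d in clients])
    return global_w


# Two hypothetical clients whose private data both follow y = 3 * x.
clients = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(3.0, 9.0), (4.0, 12.0), (5.0, 15.0)],
]
model = federated_training(clients)
print(round(model, 3))
```

Weighting by sample count is one common design choice, so that a client with more data pulls the global model harder; real systems add secure aggregation, client sampling, and far larger models.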

Victor Platt  6:04 
I think there's a bunch of different types of federated learning, both in terms of the context in which they're applied and in terms of the nature of the federation, and I'll expand on both of those in order. So, you mentioned federated learning at the edge. This is probably the one that people are most familiar with; almost all of us, if we have either an iPhone or an Android device, use this every day. Essentially, what these companies do, so Apple and Google and others, is train little tiny machine learning models on your smartphone, send those back to the mothership - those models, not the actual data - add those all together, and then send a bigger model back to you. The key place where this is used is text prediction. Whenever you're typing on your phone, you'll get predictions as to what the next word will be. That's essentially what they're doing. They're not taking all of your messages, sending them back to the mothership, so to speak, reading them, and then creating models there. They're doing it in a federated manner. So that's federated learning at the edge. We're actually seeing some really cool innovation on that side, where we're broadening our conception of what the edge means. You can start to think about similar capabilities on any kind of smart device. This could be sensors in factories, or your refrigerator, should it be a smart, connected device. And this is where we start to get into conversations around the Internet of Things and the different capabilities we have to embed machine learning on those types of devices. That's on the edge side.

Victor Platt  7:57 
On the cross-silo side, this is much more about enabling organizations to collaborate on data without having to share data with each other. The logic is the same: we're training small machine learning models on disparate or decentralized datasets, and then essentially adding them together. For a use case here, and I have worked on this in the past, let's say two hospitals from anywhere in the world are both trying to train a machine learning model that helps predict a very rare disease. Neither of them has enough data points, like clinical records, in and of itself to train a performant machine learning model. But together they do. Now, given healthcare regulations and privacy regulations, sharing data directly with each other is often not feasible. So what a number of innovative healthcare organizations are doing is implementing a federated learning layer on top, so that they can each train local machine learning models, add them together to create the global model, and both use that global model without ever having shared the actual patient information. And this is important work. If you have a rare disease, and doctors need help predicting or identifying it, this can have real impacts in terms of health care outcomes, and really save lives.

Victor Platt  9:30 
Now, I mentioned there's another distinction to be made. What we've talked about so far is the operational context, so edge versus silos. The other one is the types of things you can do with federated learning: what is the nature of the federation? We call this vertical federation or horizontal federation, and you can think about it in terms of how a dataset is partitioned. In horizontal federated learning, you basically have a big long dataset where rows 1 to 100 are owned by one organization or one device, and rows 101 to 200 are owned by another, and you're basically just putting those two datasets together through the modeling process to create a global model. Vertical federation, on the other hand, takes a column-wise partition. So columns A to C are owned by one organization and columns D through E, for example, are owned by another, but they both have all the same rows. And this allows you to do some really interesting things. There's also federated transfer learning, which is basically a mix of both, but I think that's probably a rabbit hole we don't need to go down today, Debra.
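Illustration (not part of the conversation): the row-wise versus column-wise split can be made concrete with a tiny table. This sketch assumes a hypothetical four-row patient table and hypothetical helper names (horizontal_partition, vertical_partition); it only shows who would hold what, not how training happens.

```python
# A hypothetical "complete" table that no single party actually holds.
table = [
    {"patient_id": "p1", "age": 54, "scan_score": 0.91, "diagnosis": 1},
    {"patient_id": "p2", "age": 37, "scan_score": 0.12, "diagnosis": 0},
    {"patient_id": "p3", "age": 61, "scan_score": 0.78, "diagnosis": 1},
    {"patient_id": "p4", "age": 45, "scan_score": 0.30, "diagnosis": 0},
]


def horizontal_partition(rows, split_at):
    """Horizontal federation: same columns everywhere, different rows per party."""
    return rows[:split_at], rows[split_at:]


def vertical_partition(rows, key, columns_a, columns_b):
    """Vertical federation: same rows (matched by key), different columns per party."""
    part_a = [{key: r[key], **{c: r[c] for c in columns_a}} for r in rows]
    part_b = [{key: r[key], **{c: r[c] for c in columns_b}} for r in rows]
    return part_a, part_b


# Two parties each hold complete records for different patients.
hospital_a, hospital_b = horizontal_partition(table, 2)

# One party holds the features, the other holds the labels, for the same patients.
features, labels = vertical_partition(
    table, "patient_id", ["age", "scan_score"], ["diagnosis"]
)

print(hospital_a[0])
print(features[0], labels[0])
```

In the vertical case the shared key column is what lets the two parties line up rows, which is why matching records across parties becomes its own problem.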

Debra Farber  10:48 
Yeah, yeah. Well, so how about we instead reframe that as when does an org use vertical versus horizontal and then when would they use that hybrid that you just mentioned?

Victor Platt  11:00 
Sure. So for example, I talked about two hospitals that want to predict a rare disease. They both have the same kind of input data. Let's say they both have CT scans for 5,000 patients, and they have labeled those with a doctor's best guess, or confirmed cases, of this rare disease. They would both be trying to predict the incidence of that rare disease. That's where you would use something like horizontal federated learning, where you have essentially two of the same-looking datasets trying to predict the same thing, and you're basically just adding rows to an existing dataset. That's conceptually what's happening.

Victor Platt  11:45 
Now, a more interesting scenario, which is where vertical federated learning comes in, is where, just to continue with this example, one hospital has the images, whereas the other hospital has the incidence of this rare disease for the same patients, but they can't bring them together. That's a case where you would use vertical federated learning, where the columns that each party holds are different while the rows are the same.

Debra Farber  12:18 
That makes sense. So how do you know that you're referring to the same data in a row? Maybe it's horizontal federation? Like, if it's two different datasets, but you can't join them, how do you know that the datasets are relevant to one another?

Victor Platt  12:36 
Yeah, so that's actually a really hard question to answer, and I'll answer it first in the theoretical sense and then in terms of what actually happens in the real world. So, knowing whether two datasets are related to, or should be related to, each other is kind of the core problem of transfer learning - trying to get an AI model to be able to do things in various contexts. It can be a bit challenging, and there are many papers written on that.

Victor Platt  13:05 
Now, in practice, this is typically solved by a few techniques, and it depends on the level of privacy that you want to have. In the most basic sense, you just have a key. It could be an identifier like someone's name or their number. That's how you would map rows to rows. There are also much more interesting ones where you can actually preserve privacy, so that both parties, both hospitals, would never know the overlap of their patients and that sort of thing. This is where techniques like "private set intersection" come in, where you can basically blindly identify common rows, so the same person in each dataset, without either party actually knowing the magnitude of that overlap or the specifics of that overlap.

Debra Farber  13:54 
I just have more questions. So private set intersection, I have to admit is an area I don't know if I've even heard those words before.

Victor Platt  14:03 
It's cool stuff.

Debra Farber  14:05 
Yeah. Can you expound upon what that is a little more? And if you can, if it's not too difficult to explain just how it works?

Victor Platt  14:15 
Yeah. I can speak at a high level about it, it probably would lead us down a bit of a rabbit hole, but....

Debra Farber  14:22 
Sure, sure I understand

Victor Platt  14:24 
You're trying to identify, in a mostly blind way, whether a given person exists in both of, let's say, two datasets. If they do, then at the machine learning layer we're going to rank the rows in a way such that they align when those datasets are analyzed together to create, for example, a federated machine learning model. It's basically just a way to identify common rows in datasets while retaining as much privacy in the process as possible.
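Illustration (not part of the conversation): one classic way to build private set intersection is a Diffie-Hellman-style double masking, sketched below. This is a toy under loud assumptions: the modulus is far too small and unvetted for real use, both parties are simulated in one function, both are assumed to follow the protocol honestly, and the names hash_to_group, mask and private_set_intersection are hypothetical. In this basic variant one party does learn which of its own identifiers overlap; hiding even that, as described in the episode, needs more advanced protocols.

```python
import hashlib
import secrets

# Toy modulus: 2**127 - 1 is prime, but it is NOT a safe choice for production.
P = 2**127 - 1


def hash_to_group(identifier: str) -> int:
    """Map an identifier to a number modulo P via SHA-256."""
    digest = hashlib.sha256(identifier.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % P


def mask(values, secret):
    """Raise every value to a secret exponent modulo P."""
    return [pow(v, secret, P) for v in values]


def private_set_intersection(ids_a, ids_b):
    """Simulate both parties; returns the ids from ids_a that party B also holds."""
    key_a = secrets.randbelow(P - 3) + 2  # known only to party A
    key_b = secrets.randbelow(P - 3) + 2  # known only to party B

    # Step 1: each party masks its own hashed ids and sends them across.
    a_once = mask([hash_to_group(i) for i in ids_a], key_a)
    b_once = mask([hash_to_group(i) for i in ids_b], key_b)

    # Step 2: each party applies its own key to what it received.
    a_twice = mask(a_once, key_b)       # computed by B, returned to A in order
    b_twice = set(mask(b_once, key_a))  # computed by A

    # Step 3: exponentiation commutes, so equal ids yield equal double masks.
    return [i for i, m in zip(ids_a, a_twice) if m in b_twice]


common = private_set_intersection(["p1", "p2", "p3"], ["p3", "p4", "p1"])
print(common)
```

Neither party sees the other's raw identifiers: only values masked with a key the receiver does not know ever cross the wire.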

Debra Farber  15:03 
Okay, that makes sense, that's really helpful. Definitely gonna go learn a little more about private set intersection. Okay, so how does one know when to use federated learning? There's multiple privacy enhancing technologies, which we'll talk a little bit about later. But you know, when do you use this particular tool in your toolbox?

Victor Platt  15:22 
Yeah, so there are two paths, in my experience, as to when you should consider federated learning for your use case. The first one is very similar to the example that I talked about earlier - two hospitals trying to do the same thing, they have the same kinds of datasets, that sort of thing. That's when you have a well-defined machine learning problem and you know the data you need. You just don't have access to it. And so, you create a partnership with some other organization or with some sort of endpoint device, and you go in saying, "We know what we're trying to do. We just don't have the data. We don't have access to the data." Federated learning can enable you to overcome some of the barriers to getting access to that data because, at the end of the day, you don't need access to the data. So, it makes a lot of Chief Security and Privacy Officers feel much better about this type of partnership. It's also, which we can get into later, one of a set of technologies that are starting to be recognized by regulators as good things and that can change some of the traditional conceptions of how data can be used and shared.

Victor Platt  16:39 
And then the second path is totally net new use cases that would never have been possible without something like federated learning, and this is where the different types of federated learning sometimes matter. This is where something like vertical federated learning tends to be a little more interesting. So, let's say a credit card company and a bank have two sets of information that are both used to predict fraud. Neither of them can share with the other because of privacy legislation and internal policies. Basically, the credit card company would have the fraud labels, so "this actual transaction was fraud," versus the bank, which just has its list of transactions. They could create new machine learning models based on both of those datasets that can't be shared, and that actually benefits both. That's typically where we see the most impactful use cases: where different organizations with different valuable data have interests that align on some particular outcome. In this case, the example that I gave, that would be fraud.

Debra Farber  18:04 
So Victor, how do you see the landscape for other PETs shaping up, like differential privacy, homomorphic encryption, you know, multi-party computation? What does the market look like for that and, you know, what are you excited about?

Victor Platt  18:20 
Yeah, absolutely. So, I'm extremely excited about that spectrum of PETs that you mentioned. I think it's important to disaggregate them because they do different jobs, and they are at different levels of maturity, just in terms of how easy it is to implement them and how reliable, but also feasible, they are in practice. The reason that I'm so excited is that I think the future of technology is going to be underpinned by a constellation of privacy enhancing technologies. In most use cases, you're never going to use just one, because they do different things. For example, federated learning is a great way to keep datasets themselves private from different parties or different people that you want to be able to collaborate with. This is what we would call "input privacy." So you don't actually have to release control of the dataset that you are in charge of or that you are the custodian for. What it doesn't do a good job at is "output privacy." So, can we learn things about the underlying data from the global machine learning models that were trained, either through direct access to those models or by querying them? Protecting against that is what we would call "output privacy."

Victor Platt  19:43 
And in order to protect output privacy in a lot of cases, especially with particularly sensitive underlying data, you would need to add a different privacy enhancing technology. One of the most common ones applied in coordination with federated learning is differential privacy; the combination would actually give you both input privacy and output privacy, and that will be key to unlocking the most important and impactful use cases, in my mind. Similarly, each of the other PETs that we talk about, things like homomorphic encryption and secure computation, has its own strengths and weaknesses, whether it's focused on input privacy, output privacy, or other kinds of privacy objectives, as well as particular tasks and contexts in which it's quite useful. And as I mentioned, I think the future is that we're going to see a constellation of these privacy enhancing technologies unlock new and really interesting use cases. They're going to be combined in different and unique ways depending on the specifics of the use case we're trying to achieve.
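Illustration (not part of the conversation): a common pattern for layering differential privacy on top of federated learning is to clip each client's model update to a maximum norm and add Gaussian noise to the average before releasing it. The sketch below shows only that mechanical step, under assumptions: the names clip and private_average, the noise_multiplier parameter, and the sample updates are hypothetical, and the code does not compute what privacy guarantee a given noise level actually buys.

```python
import math
import random


def clip(update, max_norm):
    """Scale an update down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(v * v for v in update))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [v * scale for v in update]


def private_average(updates, max_norm, noise_multiplier, rng):
    """Average clipped client updates and add Gaussian noise to each coordinate."""
    clipped = [clip(u, max_norm) for u in updates]
    n = len(clipped)
    dim = len(clipped[0])
    # Clipping bounds any one client's influence on the average by max_norm / n,
    # so the noise is scaled relative to that bound.
    sigma = noise_multiplier * max_norm / n
    return [
        sum(u[j] for u in clipped) / n + rng.gauss(0.0, sigma)
        for j in range(dim)
    ]


rng = random.Random(0)
updates = [[3.0, 4.0], [0.0, 0.0]]  # hypothetical per-client model updates

noiseless = private_average(updates, max_norm=1.0, noise_multiplier=0.0, rng=rng)
noisy = private_average(updates, max_norm=1.0, noise_multiplier=1.0, rng=rng)
print(noiseless, noisy)
```

Clipping is what makes the noise meaningful: without a bound on each client's contribution, no fixed amount of noise could mask an arbitrarily large update.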

Debra Farber  20:58 
Fascinating. So where do you think the responsibility for adding these privacy enhancing technologies lies within an organization? Especially when we're talking about input and outputs? Is that the data scientists? Is it developers? You know, who are the people within an organization who should be looking into this technology on behalf of their org?

Victor Platt  21:23 
Yeah, so I think, as with most things privacy and security, this is definitely a team effort. The short answer to your question, Debra, is it's everyone. Now, I think that means different things to different groups within an organization. So for example, I view the privacy team's role as two-fold. The first is to communicate and build awareness around the art of the possible. So make sure that your engineering teams and data science teams are aware of these technologies, when they're useful, when they're not, those types of things. And then the second is what we would more traditionally see as the privacy team's role, which is reviewing solutions, making sure they're compliant with laws, and doing your standard privacy operations work.

Victor Platt  22:14 
From an engineering perspective, I think these are different patterns and tool sets that can be implemented in the course of regular development work. As long as engineers are aware of those and have easy access to them, it's incumbent upon engineering teams to also think about how we can enhance privacy through the quality of our designs.

Victor Platt  22:42 
On the data science side, it probably means something a little bit different. For them it's really, "How does this impact the performance of my machine learning model?" or, for example, "the data that's available to me?" I think things like data minimization are a hard conversation to have with data scientists, and I'm speaking from personal experience. And so I propose a reframe for interacting and communicating with data science teams. It's not, "Oh, well, we can only ask for this data because that's what it's going to be used for, and that's all we need, and we're required under GDPR to minimize the data that we collect on individuals," for example. Rather, the reframe is, "We have the tools to get you access to way more data if we do it in these privacy-enhancing ways." And I've found that gets folks pretty excited. As long as the barriers to usage of those PETs are reasonable, oftentimes the value of the new data that we have access to is enough of a carrot to get people to start thinking about PETs and adopting them in practice.

Debra Farber  24:02 
That makes sense. So what are some of the key barriers to adoption of federated learning and other PETs within different industries?

Victor Platt  24:14 
Yeah, for sure. And I think they're pretty common across industries. What I would say is, the skill level needed to understand and implement different PETs, and this goes for almost all of them, is quite high. It's not something that's in every engineer's or data scientist's mind. So, there's some upskilling, I think, that needs to happen. And then there's the maturity of the technology itself. A lot of these tools are currently open source. They are libraries that you can use and build on in implementing your solutions, but because they're not as mature, there's no...well, there are a few, and I was involved in at least one of them, but the commercial solutions for a lot of these privacy enhancing techniques or technologies are still emerging. So it's early. I see a lot of activity happening there, which will start to overcome some of these skill barriers as well as technology maturity barriers.

Victor Platt  25:19 
And I think the last and most important point, especially for folks who think about regulations and laws with respect to engineering privacy-forward solutions, is regulatory ambiguity. Thinking about machine learning and privacy enhancing technologies introduces a bunch of concepts and ways of thinking about data and privacy that just aren't reflected in the current swath of privacy legislation that we have in the world today. Even something as simple as a machine learning model: can it be considered personal information? Should we be regulating it as data, or is it a special class of data? How do we therefore start to think through how we can share machine learning models? Can we sell them? If they're considered personal information, are we therefore selling personal information? So there are a lot of really interesting questions that come up when you start to think about machine learning and PETs, especially because they're just different ways of framing privacy problems that we've thought about for 20 - 30 years in a specific way, and they challenge us to think outside that framework. And yeah, just questions like, okay, if we apply differential privacy, what value of epsilon, which is kind of the parameter that sets the level of noise you add in differential privacy, actually meets the bar of anonymization or de-identification in law?
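Illustration (not part of the conversation): for readers unfamiliar with epsilon, the standard textbook definition is given below. The symbols M, D, D', S, f and Δf are notation introduced here, not terms used in the episode. The definition is a mathematical bound and does not by itself say which value of epsilon would satisfy any legal standard.

```latex
% A randomized mechanism M is \varepsilon-differentially private if, for all
% datasets D and D' that differ in one individual's record and for every set
% of possible outputs S:
\[
  \Pr[\,M(D) \in S\,] \;\le\; e^{\varepsilon} \, \Pr[\,M(D') \in S\,]
\]

% One classic way to achieve this for a numeric query f is the Laplace
% mechanism, which adds noise scaled to the query's sensitivity \Delta f:
\[
  M(D) = f(D) + \mathrm{Lap}\!\left(\frac{\Delta f}{\varepsilon}\right),
  \qquad
  \Delta f = \max_{D, D'} \lvert f(D) - f(D') \rvert
\]
```

A smaller epsilon forces the two output distributions closer together, which means more noise and a stronger guarantee; a larger epsilon means less noise and a weaker one.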

Debra Farber  26:58 
Yeah, that makes sense, especially, you know, I'm thinking HIPAA, for instance. You were talking a lot about the health care context earlier on, and, you know, the statutory de-identification requirement is really that you get a statistician to go and certify that you can't really re-identify individuals in the dataset. Right? That's one of the two ways. The other is to just drop all the personally identifiable information, or the PHI, the protected health information, from your tables, right, which is pretty restrictive. There isn't anything in between those two. So, it sounds to me like we'd have to update HIPAA, for instance, in the United States in order to move forward with using PETs.

Victor Platt  27:40 

Debra Farber  27:41 
Do you have any kind of ideas as to how to move the needle, or how to make the argument, to update some of these laws so that we can get political will to do so?

Victor Platt  27:52 
Yeah, I think there are basically a couple of fundamental questions that we as an industry need to answer; and once we agree and collaborate on getting to those answers, a lot of legal reasoning becomes unlocked. Right? What I'm seeing in industry, and in my interactions with regulators from around the globe, particularly in Europe, is that we need to start thinking about machine learning models, for example, which are essentially just encodings of data, as data. If we agree that that is the case, or we treat a model as, like, an aggregation, for example, under CCPA, or something like that, and we have confidence that regulators agree, we can start to apply the same types of reasoning we have in the past to those artifacts themselves. And that would be my main contribution to this discussion: let's answer some fundamental questions as an industry and as a society, and then apply the reasoning and the precedent that we've built up over time to these new and emerging artifacts and privacy considerations.

Debra Farber  29:13 
Yeah, that makes sense. I'm definitely going to be following along to see how that moves. I'm not a pessimist, but I've watched regulations and hoped that there'd be some agreement around something as basic as data breach notification laws, you know. We can't have one for the entire country; we can't have a federal law because every state, you know, started to implement their own and no one could agree upon what would be in the federal law, and now we're starting to see the same thing happen with privacy law. I don't think there's going to be a federal privacy law in the United States, you know, a comprehensive omnibus federal privacy law that supplants the state ones like CCPA and such, for a long time. There are just too many things that have nothing to do with privacy that are keeping that law from moving forward. So, I do have more optimism about demonstrating that there are assurances: if you do X, Y happens, right? And then, therefore, we can prove Y is happening. That level of transparency and, again, assurance is something you can test to, something that can preserve privacy and unlock data value for companies. I feel like the political will would be there to unlock that value, right? And I can't imagine anyone who would be fighting against the use of privacy enhancing technologies, especially since it's just going to increase ROI, right? So you've got the business lobby, which is going to be for it. It's going to help preserve privacy for individuals. So, you're gonna have pretty much everyone on board, I think. So, now the question is how and when. I do know that the White House has looked at, you know, a push for privacy enhancing technologies and has asked for contributions when they put out their initial paper; I know we at The Rise of Privacy Tech responded to that. So, you know, it's government and it moves slower than you'd like. But I am optimistic, as I think you are, that, you know, this is going to be embedded by design into, you know, technology from now on as we build it.

Victor Platt  31:31 
Absolutely. And I would agree. And my hope for legislation is equally optimistic, but I'm not holding my breath. I think a couple of kind of shining lights to call out are the supplementary guidance outside of hard law that comes out. So, a couple you just mentioned: the White House has been doing work on this, NIST is doing work on this, and there was just a big competition by both the American and British governments to develop federated learning solutions that protect input and output privacy through PETs, which is great stuff. And then the other one that I would call out would be the ICO in the UK, which has recently published a whole kind of book on how to implement, when to implement, and how to think about PETs from a regulatory perspective. So, there's stuff happening; it just takes a long time for it to work its way into law. And so as practitioners, we have to make decisions with limited information, and I think the fact that these regulators are certainly talking about it can start to give us a little bit more confidence to articulate positions, implement them in the real world, and then hope to not get slapped on the wrist later. I think that kind of trepidation is holding a lot of companies back.

Debra Farber  33:00 
Is that holding back federated learning specifically, or PETs generally?

Victor Platt  33:05 
PETs generally. I think federated learning, as well as things like differential privacy, are the ones that I've had the most conversations about. And certainly, in my previous role talking to customers, they're like, "Okay, yeah, we get how this protects privacy. We believe you. We're on board. We're not confident that the regulator does. What evidence do you have that the regulator has the same perspective as you?" And that's really kind of a key blocker that I've experienced in trying to push adoption of PETs, particularly federated learning and differential privacy, in various industries. It's the lack of confidence that what we all think is right will also be agreed upon by the regulator. And these are highly regulated industries, so I certainly understand that. And I think that's why we need to look both to hard law and to the different things that regulators around the globe are saying and where they're focusing their energies.

Debra Farber  34:14 
Yeah, that is fascinating. And, you know, I wonder, are there standards in development because maybe that can move a little faster and gain consensus across industries around some PETs, you know, move faster than law itself and provide, I don't know, maybe the consensus of constraints around using the technology or what is needed in order to, I don't know, implement it or systematize it into organizations? Is there anything out there like that?

Victor Platt  34:46 
For sure. There is. I'll highlight a few and then talk about a couple of caveats there. So, the first is I'm a member of IEEE, and there's a flurry of projects that are underway to start to do exactly what you just said. So, let's build global standards that we can all agree on through our collective wisdom: how do you deploy a secure and private federated learning system, for example. So, there's encouraging kind of stuff on the go there. The caveat there is that, just like regulations, these take a long time. There's a lot of parties involved. There's very rigorous processes, which are good, but they take a long time. So, I wouldn't expect to see anything like a truly global standard for a number of years. And then, the kind of second component is, because these are essentially general-purpose technologies, the context in which they're implemented really matters for answering questions around: is this secure? Is it private? And so I think we're gonna have some work to do to create something that's detailed enough to be prescriptive, while not so detailed that it just doesn't apply outside of very specific circumstances.

Debra Farber  36:06 
Got it. So do you have any specific advice for developers and data scientists as they implement privacy by design into their organizations?

Victor Platt  36:16 
Yeah, I think I'm going to shamelessly plug the title of your podcast: shift it left. We should be having this discussion and thinking about how to make sure that our solutions are private on the same level as what this is trying to achieve from a business perspective. I view privacy requirements as functional requirements in this day and age; they're not kind of buried in your design doc as nonfunctional requirements. And so really thinking about it at the forefront; and, to also get excited about it, because we all have a stake in privacy and in kind of underpinning the future of technology that respects humans at the end of the day. One great kind of resource that I like to direct people towards is an organization out of the UK and at Oxford called OpenMined. They have a great set of free courses that can really help you kind of pattern match. Okay, this technology does this, why is that important? Where can it be used? And to really get excited about that, the kind of flagship course is called "Our Privacy Opportunity," and it's a really great way in for anyone, regardless of your background. Whether you're an engineer, a machine learning scientist, a privacy pro, or just a business leader who deals with sensitive information, go check that out. It's free and it's really interesting. And I find it does a good job of communicating both the core concepts and their application, but also the "so what." Like, why is this important? And what kind of value can we achieve by implementing these types of technologies?

Debra Farber  38:04 
I'm definitely gonna go check that out. And then, all the resources you just mentioned, I'm going to add in the show notes, so anybody can go check those out as well. And before we conclude, do you have anything else you want to plug to the audience today, any conferences, talks, research, or open source projects that you think might be helpful to the technical privacy community?

Victor Platt  38:25 
The big one that I wanted to mention was the OpenMined stuff. And then, anyone who's interested in diving deeper into these topics or learning more, feel free to reach out to me on LinkedIn.

Debra Farber  38:38 
Excellent. Well, Victor, thank you so much for joining us today on Shifting Privacy Left to talk about federated learning and privacy enhancing technologies that are shaping the industries around the world. And thank you for joining us today, everyone, until next Tuesday, when we'll be back with engaging content and another great guest.

Debra Farber  39:01 
Thanks for joining us this week on Shifting Privacy Left. Make sure to visit our website shifting where you can subscribe to updates so you'll never miss a show. While you're at it, if you found this episode valuable, go ahead and share it with a friend; and, if you're an engineer who cares passionately about privacy, check out Privado: the developer-friendly privacy platform and sponsor of the show. To learn more, go to Be sure to tune in next Tuesday for a new episode. Bye for now.

Victor's 'origin story' and how he got interested in PETs like 'federated learning'
Victor describes what 'federated learning' is and how it compares to traditional machine learning
Victor explains the use of federated learning 'at the edge' & example use cases
Victor explains federated learning 'across silos' & example use cases
Victor explains when an org would consider using vertical vs. horizontal federated learning or a hybrid of both
Victor outlines a key challenge in 'transfer learning': knowing whether two data sets are related to each other. He also explains how 'private set intersection' techniques can help blindly identify common rows between data sets
Victor describes the two paths to when you should consider federated learning for your use case
Victor shares his excitement for the future of technology, which he believes will be underpinned by a 'constellation of PETs'. He also distinguishes between PETs that assist with 'input privacy' vs. 'output privacy'
Victor shares his view on where the responsibility for leveraging PETs lies within an org
Victor describes key barriers to adoption of federated learning & other PETs within different industries
Victor shares how we can move the needle on updating laws that have antiquated approaches to anonymity / 'de-identification' (e.g. HIPAA) and to include the use of new PETs
Debra & Victor discuss standards in development related to PETs like federated learning, differential privacy & homomorphic encryption
Victor shares his advice for developers and data scientists as they implement privacy by design into their organizations. He recommends taking OpenMined's (free) course: 'Our Privacy Opportunity'
