The Shifting Privacy Left Podcast

S2E13: Diving Deep into Fully Homomorphic Encryption (FHE) with Kurt R. Rohloff (Duality Technologies)

April 04, 2023 Debra J. Farber / Kurt R. Rohloff Season 2 Episode 13

I am delighted to welcome this week’s guest, Kurt Rohloff. Kurt is the CTO and Co-Founder of Duality Technologies, a privacy tech company that enables organizations to leverage data across their ecosystem and generate joint insights for better business while preserving privacy. Kurt was also Co-Founder of the OpenFHE Homomorphic Encryption Software Library that enables practical and usable privacy and collaborative data analytics.

He's successfully led teams that are developing, transitioning, and applying first-in-the-world technology capabilities for both the Department of Defense as well as for commercial use. Kurt specializes in generating, developing, and commercializing innovative secure computing technologies with a focus on privacy and AI/ML at scale. In this episode, we discuss use cases for leveraging Fully Homomorphic Encryption (FHE) and other PETs.

In a previous episode, we spoke about federated learning; and in this episode, we learn how to achieve secure federated learning using fully homomorphic encryption (FHE) techniques.

Kurt has focused on and supported homomorphic encryption since it was first discovered, including running an implementation team on PROCEED, one of the seminal DARPA-funded projects in the field.

FHE, as opposed to other kinds of privacy technologies, is more general and malleable. As each organization has different needs when it comes to data collaboration, Duality Technologies offers three separate models for collaboration, which enable organizations to secure sensitive data while still allowing different types of sharing.

Topics Covered:

  • How companies can gain utility from a dataset while protecting the privacy of individuals or entities
  • How FHE helps with fraud prevention, secure investigations, real-world evidence & genome-wide association studies
  • Use cases for the three collaboration models Duality offers: Single Data Set, Horizontal Data Analysis, and Vertical Data Analysis
  • Comparison & trade-offs involved between federated learning and homomorphic encryption
  • Proliferation of FHE Standards
  • OpenFHE, the leading open source library for implementations of fully homomorphic encryption protocols


Copyright © 2022 - 2023 Principled LLC. All rights reserved.

Debra Farber  0:00 
Hello, I am Debra J. Farber. Welcome to The Shifting Privacy Left Podcast, where we talk about embedding privacy by design and default into the engineering function to prevent privacy harms to humans, and to prevent dystopia. Each week we'll bring you unique discussions with global privacy technologists and innovators working at the bleeding edge of privacy research and emerging technologies, standards, business models, and ecosystems.

Debra Farber  0:27 
Today, I'm delighted to welcome my next guest, Kurt R. Rohloff, CTO and Co-Founder of Duality Technologies, a privacy tech company that enables organizations and regulated industries to leverage data across their ecosystem to generate joint insights for better business while preserving privacy. Kurt has an impressive list of accomplishments. He co-founded the open source, OpenFHE homomorphic encryption software library that enables practical and usable privacy and collaborative data analytics. He's successfully led teams that are developing, transitioning, and applying first-in-the-world new technology capabilities for both the Department of Defense and commercial use. And, he specializes in generating, developing, and commercializing innovative, secure computing technologies with a focus on privacy and AI/ML at scale. Today, we'll be discussing the use cases for leveraging homomorphic encryption and other PETs.

Debra Farber  1:35 
Welcome, Kurt.

Kurt Rohloff  1:36 
Hi, Debra. Thank you for inviting me. I'm very happy to be here.

Debra Farber  1:40 
Excellent. I'm really excited. Last week, we learned a little bit about federated learning, and this week, I know we're going to be learning more about maybe how you could do secure federated learning using homomorphic encryption; and, you know, a few other really interesting applications. So, I want to set up the problems that homomorphic encryption solves for as a PET before we dive into what it is and how and when we deploy it. So, in most organizations, data remains largely underutilized, often restricted by privacy concerns, and a major conflict remains between gaining utility from a dataset and protecting the privacy of the individuals or entities. So, first question to you, what approach can companies take to overcome this challenge?

Kurt Rohloff  2:29 
Sure, that's a great question. We see very regularly that larger organizations, whether it's governments or companies or other kinds of entities, are typically sitting on piles and piles of data, typically, you know, in different silos; and there's a lot of value in, whether it's different departments in a company, being able to collaborate in ways they previously haven't been able to, or, for example, different companies or public-private partnerships between companies and government collaborating on data to unlock value. And, of course, I don't think I need to convince you that there's plenty of use cases for these; and I'm happy to dig into a few of them. You know, and I think you hit the nail on the head overall: there historically has been a big tradeoff between the use of data versus the privacy associated with data, where either data needs to be shared in the clear, historically, or not shared at all, and all the protections have been primarily legal protection through NDAs, data use agreements, things like that, where lawyers typically need to get involved to write sometimes rather onerous and heavy agreements overall.

Kurt Rohloff  3:39 
And as I'm sure you and your audience are aware, there's been an explosion of various kinds of privacy technologies over the past days, weeks, years, decades to help solve this problem overall. I've been particularly working on a set of use cases that are addressed with a technology called 'homomorphic encryption,' also called 'fully homomorphic encryption,' where the base capability is to take data, encrypt it, and then this data, if it's encrypted with these fully homomorphic encryption techniques (also called 'FHE'), can then be computed on while it's encrypted and further processed. And one of the really nice things about FHE, as opposed to other kinds of privacy technologies, is that it is quite general and quite malleable. You can often take datasets from multiple sources, encrypt them under different keys, and still run joint computations on this data, distribute the computations, distribute the data, and have all different kinds of interaction models and different kinds of computations that could be supported overall. And so, to put a finer point on it, you know, one of the reasons I've really focused this phase of my career, and Duality Technologies as a startup, on fully homomorphic encryption is this very general, very malleable set of functionalities enabled by FHE, or fully homomorphic encryption. So I'll pause there.
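To make the "compute on data while it stays encrypted" idea concrete, here is a minimal sketch using a toy Paillier-style scheme. Note the hedge: Paillier is only additively homomorphic, while the fully homomorphic schemes Kurt describes (as implemented in OpenFHE) also support multiplication and use entirely different lattice-based math; the tiny primes below are made up for illustration and are nowhere near secure.

```python
import math
import random

# Small demo primes -- NOT remotely secure. Real FHE schemes use large
# lattice parameters; this toy only shows the "compute on ciphertexts" idea.
p, q = 293, 433
n = p * q
n_sq = n * n
g = n + 1
lam = math.lcm(p - 1, q - 1)
# mu is the inverse of L(g^lam mod n^2), where L(x) = (x - 1) // n
mu = pow((pow(g, lam, n_sq) - 1) // n, -1, n)

def encrypt(m: int) -> int:
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:  # r must be coprime to n
        r = random.randrange(1, n)
    return (pow(g, m, n_sq) * pow(r, n, n_sq)) % n_sq

def decrypt(c: int) -> int:
    return ((pow(c, lam, n_sq) - 1) // n * mu) % n

# Homomorphic property: multiplying ciphertexts adds the hidden plaintexts,
# so a server can total encrypted values without ever seeing them.
a, b = encrypt(12), encrypt(30)
print(decrypt((a * b) % n_sq))  # 42
```

The point of the sketch is the last three lines: whoever holds only `a`, `b`, and `n_sq` can compute a ciphertext of the sum without the secret key, which is the core trick FHE generalizes to arbitrary computation.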

Debra Farber  5:19 
Awesome. No, thank you for that. I appreciate it. So, since each organization has different needs when it comes to data collaboration - it looks like Duality offers 3 separate models for collaboration, enabling an organization to secure sensitive data whilst still allowing different types of sharing. And, you list these three separate models on your website, so how about I list each one out, and then you tell us a little more about it?

Debra Farber  5:46 
The first one is analytics of a single data set. When would you use that model?

Kurt Rohloff  5:52 
Sure thing. There's a couple different use cases for this. This is what in this privacy space we often call 'the move to cloud' type use case where someone in an organization might be running their own server - and they run their own server specifically so they can have a high degree of control and protection over a set of data, but they don't necessarily want to have a cost associated with running their own server or even the liability associated with if the server goes down, do they have adequate backups, all these other kinds of more pragmatic issues associated with running one's own set of services. And so, an organization would want to move their dataset from on-prem to a cloud environment and obtain the associated cost reductions with that. And so, that's in some sense, kind of the canonical model that's been used for fully homomorphic encryption is to take data, encrypt it, throw it up on some sort of cloud service, and then use the magic of the homomorphic encryption to maintain the same set of services, computation capabilities, and analytics on this data set as what had been run for the data when it was on-prem and not encrypted.

Debra Farber  7:07 
Great. And then, the second model would be where there's analytics of a union of datasets, which you refer to as 'horizontal data analysis.'

Kurt Rohloff  7:18 
Right. Right, right, right. And we see this often in any number of use cases and domains, and I'm going to quickly look at the set of medical domains. We're running a set of projects funded by the NIH (National Institutes of Health) where they want to take patient data to better analyze things called 'rare diseases.' And so, rare diseases are rather problematic in that, you know - I think the formal definition, I could look it up, but it's something like, you know, rare diseases affect something like 1 in 100,000 people over a given year. And so, these kinds of diseases, a hospital might see - any kind of random hospital might see - a specific kind of rare disease maybe once over the lifetime of the hospital, you know, once every several decades. But the thing with these rare diseases is that if you look at all of the rare diseases over the entire population in the United States, or even the entire population in the world, rare diseases writ large are shockingly common, where they actually affect, you know, multiple percentages of the entire population, whether of the United States or the world. And so, the trouble with these rare diseases is that they might be eminently treatable if only researchers, clinicians, medical professionals could get adequately large datasets to treat these diseases.

Debra Farber  8:47 
And is that because the privacy challenge there is just even knowing that there is one disease at one hospital - you know, that's just such an identifying data point that you could track it back; it wouldn't be anonymized. You can figure out who it is if there's only, like, one person in a geographical area with that particular disease? Is that what makes it much more difficult to do, or is it that you need a lot more data about the general population to kind of filter out why they don't have the disease as well? Like, I'm just trying to understand the challenge there.

Kurt Rohloff  9:18 
Yeah, no, great question. And the really pragmatic question that's been driving this from an NIH perspective is that in order to have an adequately large data set to see, like, what are the symptoms, what are, you know, the genetic mutations associated... the leading indicators of disease, they would need to basically join data across many different hospitals across the United States in order to, you know, have a hope of coming up with treatments for rare diseases. And so, that's really the issue... not so much the re-identification issue that you alluded to, but really the issue of actually having adequate data to do any kind of reasonably scientifically accurate or scientifically trusted analysis of what the correlations of mutations or otherwise are with the prevalence of certain kinds of disease, to even hope to begin treatment studies. And so this is, for example, joining datasets of patients over many hospitals, whether it's all the hospitals in a state or all of a certain class of hospitals across the country. And this combining of patient data across these many hospitals is what we call the 'horizontal join,' where, you know, the simplistic version is aggregating data over many, many sources in a way such that no one individual at any one point is able to access the population of data, for example, across the United States.

Debra Farber  10:45 
Got it. That makes sense. And then, it seems that that's where they share the same kind of data schema, right? In the horizontal data analysis.

Kurt Rohloff  10:52 
Yep, basically right. Basically, think about a spreadsheet. You have all these spreadsheets that have the same columns, just over different people, and you're just adding more people to the spreadsheet overall.
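The spreadsheet picture above can be sketched in a few lines. The hospital names, column names, and disease codes are made up for illustration; in the actual deployment each hospital's rows would be encrypted under that hospital's key before pooling, whereas this only shows the plaintext shape of a horizontal join.

```python
# Horizontal join: every site shares one schema; pooling just adds rows.
hospital_a = [
    {"patient_id": "A1", "age": 34, "diagnosis": "RD-017"},
    {"patient_id": "A2", "age": 61, "diagnosis": "RD-017"},
]
hospital_b = [
    {"patient_id": "B9", "age": 47, "diagnosis": "RD-017"},
]

# Same columns, more people: a simple union of the row sets.
pooled = hospital_a + hospital_b
print(len(pooled))  # 3
```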

Debra Farber  11:04 
Got it. Okay.

Debra Farber  11:06 
And so then, the third model is analytics of joined datasets, aka 'vertical data analysis.'

Kurt Rohloff  11:13 
Right. And this is a study that we did, also for NIH, in partnership with Dana-Farber Cancer Institute in Boston and Harvard Medical School.

Debra Farber  11:22 
No relation to my last name.

Kurt Rohloff  11:25 
Yeah, so for this use case, in particular, you would have, for example, someone sequencing genetic data - a pile of patients would have their genetic data sequenced and, you know, just have it stored in someplace like The Broad Institute or 23andme, or someplace like that. And then, you would have a pile of clinical data, and this is data that doctors might take, like, you know, blood test data and things like that. And, if you have these datasets from two different research centers, with basically two different schemas, but there's commonality where there might be some patients in common between the two datasets, how do you join them in a way such that, for the patients common to both, their data basically lines up, so you can start to add columns to the data as opposed to adding more rows? And this is what we call the 'vertical join' problem, and this is prevalent not just in medical, but also in financial services and other kinds of related areas, too. So, you know, that has been kind of a very big growth area for us also.
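The vertical join, by contrast, widens rows rather than adding them. A plaintext sketch of the data shape, with patient IDs and fields made up for illustration (in the real systems Kurt describes, the matching happens on encrypted identifiers, so neither side learns which patients the other holds):

```python
# Vertical join: two schemas keyed by a shared patient ID; patients present
# in both datasets get their columns merged into one wider record.
genomic = {"P1": {"variant": "BRCA1"}, "P2": {"variant": "TP53"}}
clinical = {"P2": {"hdl": 52}, "P3": {"hdl": 61}}

joined = {
    pid: {**genomic[pid], **clinical[pid]}   # columns line up per patient
    for pid in genomic.keys() & clinical.keys()  # only the common patients
}
print(joined)  # {'P2': {'variant': 'TP53', 'hdl': 52}}
```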

Debra Farber  12:27 
Awesome. So, help me disambiguate in my mind. Last week, we learned about federated learning and that you could have vertical federated learning as well as horizontal federated learning. And here, I see the terms, you know, vertical and horizontal data analysis. How would you compare the terminology? Does it have to do with the fact that it's on an encrypted data set, or is it something else?

Kurt Rohloff  12:50 
That's an excellent question, Debra, and there's a few aspects to this. And this gets a little bit more into why I'm particularly a fan of homomorphic encryption, which is basically that the homomorphic encryption techniques do allow these kinds of analysis that I outlined. They also allow functionality for even training machine learning models and things like that. And the generality also comes in where the actual training and analysis computation happens. I've outlined a model where data is aggregated from multiple sources, and the aggregation could happen with a couple of different security models. It could happen that the data is aggregated where all the data is encrypted with the same key. It could happen that the data is aggregated and the enriched data set is encrypted with different keys. And so, for example, you know, Harvard Medical School could have one data set that they encrypted with their key, and only they can decrypt it; and Broad Institute or Dana-Farber could encrypt with their key set, and only they can decrypt that data. But these two encrypted datasets could be joined together and then computed upon also; and so, it allows this very fine-grained set of guarantees that data can't migrate from Dana-Farber to Harvard Medical and vice versa. And the computation doesn't have to be at a single point. The computation could be distributed across various nodes - for example, some at Dana-Farber, some at Harvard Medical School. And it gets very confusing because, in some sense, it's all the various options that we present for computation models that allow us this really broad sense of generality, in terms of how homomorphic encryption goes beyond some of the limitations of other kinds of PET operations, like federated learning. And of course, there's tradeoffs. I'm happy to talk about those too.

Debra Farber  14:45 
Yeah, that was my next question. I would assume the tradeoff is the type of data that you're able to extract or compute. Am I correct in that assumption? And then, when would you - besides the access control aspect of, like, who has the key and who has access to the raw data - why would you use secure federated learning on encrypted data rather than typical federated learning? Like, why use fully homomorphic encryption? And the second part of that is: if you're using it, what are the tradeoffs?

Kurt Rohloff  15:14 
So, one of the things that, you know... a bit of a truism in life is that 'there's no free lunch.' And, as we also like to say, "a privacy technologist is like a cook." You know, when you want to cook a meal, you use whatever tool you have in the kitchen to get it done. And the tradeoff between federated learning versus homomorphic encryption is... yes, homomorphic encryption has all these nice generality properties, in that you can, in theory, support general computation. Homomorphic crypto is quite good when you're dealing with highly structured, tabular data... you know, the kinds of things you could put into a spreadsheet and compute on, you know, over things that look like matrices and things like this.

Kurt Rohloff  15:56 
The other side of it is that federated learning-type techniques don't quite have the generality of homomorphic encryption, but what they do well, they do relatively quickly as compared to homomorphic encryption. So... and this is something that's known about homomorphic encryption in the broad: there are slowdowns associated with it, and it can be computationally very heavy. I don't think it's anywhere near as bad as it used to be, and I can talk about the history of where it is, where it's been, and where it's going. But, like I said, it does come with its own computation workloads; but you also get these extremely high levels of security with these cryptographic, mathematical guarantees, and it's even what we call 'post-quantum,' meaning resistant to quantum computing attacks. So, very, very, very high degrees of security and generality associated with it, but of course, it comes with the cost of computational load and speed and things like that.

Debra Farber  16:54 
That makes sense. I would love to hear, you know, how it started and how it's going, because the last time I really dived deep into this space, specifically around homomorphic encryption... it's been a while, and it was generally thought of as really great in the lab, but maybe not ready for primetime. I know that was a few years ago, so, like, I'm not talking about, you know, last month or anything.

Debra Farber  17:16 
Yeah, I'd love to understand your perspective.

Kurt Rohloff  17:18 
Yeah, no. So, I've been supporting homomorphic encryption since it was really first discovered, and I got involved in one of the seminal projects for this... it was a project that DARPA funded, called 'PROCEED,' where I was running one of the implementation teams. And, you know, when this first happened... the very, very first implementation of homomorphic encryption that I'm aware of was done by two gentlemen, phenomenal researchers, by the names of Craig Gentry and Shai Halevi. In their first implementation of fully homomorphic encryption, they took basically two bits, encrypted them, and then ran a bitwise AND operation with a special operation called 'bootstrapping'; and just this encrypted bitwise AND operation with bootstrapping took an entire half hour of wall-clock time, which is just a crazy, you know, computational load.

Kurt Rohloff  18:10 
And what PROCEED did - this DARPA program - was really to focus on accelerating that by many orders of magnitude. And over the course of four years in this DARPA PROCEED program where a lot of the real seminal work for homomorphic encryption was done both in theory, design, implementation, and so forth, we improved performance over 4 years by 6 orders of magnitude. So that's, you know, a million times faster over the course of 4 years, so faster than Moore's Law performance improvement. And at the end of the program, we were able to deploy what was then the cutting edge of homomorphic encryption on stock iPhone 5Ss - so 2013 timeframe, exactly 10 years ago.

Kurt Rohloff  18:58 
And what we were able to do was run encrypted 'voice over IP' operations. We were able to encrypt audio conversations like the one you and I are having right now on stock iPhone 5S's and then start doing teleconferencing with audio quality comparable to what we're hearing right now on this call, back in 2013, with homomorphic encryption. First of all, I think it's a wonderful data point: yeah, homomorphic encryption has a slowdown, but it's really pretty darn good to be able to do voice over IP on commercial data networks with no real perceptible lag, at least to a human ear. And that, like I said, was 10 years ago.

Kurt Rohloff  19:39 
Since then, it's been leaps and bounds, where, you know, about 4 or 5 years ago, we did this study for NIH where we were basically taking real-world workloads from Dana-Farber Cancer Institute and running operations for things like chi-squared analytics and whatnot, which are, you know, computations actually used for genome-wide association studies, and running them on the order of seconds and things like that. And we're currently in this commercial set of activities that we're doing at Duality, where we can start doing things like running encrypted queries on data, you know, hundreds of queries and responses on the order of seconds and things like this. And there's even, you know, the future beyond this: we're currently working on a project for the DOD that is running convolutional neural net training - you know, training of neural nets - on encrypted imagery data, and the vision is that we're going to be running at line-speed, with no slowdown at all, on encrypted imagery data.

Kurt Rohloff  20:46 
We've gone like tremendously faster, and there's a heck of a lot left to go. That's really just in the making right now.

Debra Farber  20:55 
That's fascinating. And what's driven the increase in speed? I mean, is it just a matter of hardware? Or is it software?

Kurt Rohloff  21:03 
It's a little bit of everything. You know, a lot of it came from just better crypto protocols, better math. A lot of it came from... a lot of homomorphic encryption is an encryption scheme, which means you have to generate a lot of high-entropy noise, a lot of random numbers - and, really, special kinds of random numbers with kind of unusual distributions - and we figured out how to implement random number generation for these special kinds of random numbers very, very, very quickly. And that was huge for us.
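For a flavor of those "unusual distributions": lattice-based encryption schemes commonly draw their error terms from a discrete Gaussian over the integers. The naive rejection sampler below is only a conceptual sketch with a made-up sigma; production samplers (such as those inside FHE libraries) are far faster and hardened against timing leaks.

```python
import math
import random

def sample_discrete_gaussian(sigma: float, tail: int = 10) -> int:
    """Naive rejection sampler for a discrete Gaussian over the integers.
    Draws a candidate uniformly in a truncated range, then accepts it with
    probability proportional to exp(-z^2 / (2 sigma^2))."""
    bound = int(math.ceil(tail * sigma))  # truncate far tails
    while True:
        z = random.randint(-bound, bound)
        if random.random() < math.exp(-z * z / (2 * sigma * sigma)):
            return z

samples = [sample_discrete_gaussian(3.2) for _ in range(10000)]
# The sample mean is close to 0 and the spread is close to sigma.
print(sum(samples) / len(samples))
```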

Kurt Rohloff  21:33 
And then there's a lot of software and algorithm design - how to parallelize. And the other side is understanding the workloads better. One thing that's really interesting about homomorphic encryption, and about privacy technologies in general, is they're basically like a fancy compute engine, where homomorphic encryption has an instruction set - you know, similar to the instruction set that you would have for, like, an Intel chip or something like that - but the instruction set is kind of unusual. It's something that looks like a vectorized addition and something that kind of looks like a vectorized convolution. And you need to use these unusual operations to do all the different operations that I talked about, like neural net training, logistic regression, chi-squared analytics, and so forth. And so, the mapping of how one takes existing workloads and puts them into things that would run on homomorphic encryption has been a huge deal overall. And understanding how to do that much, much better has been a huge push in further performance improvements associated with applied homomorphic encryption, and how we optimize workloads on top of that FHE.
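As a rough illustration of that unusual instruction set, here is a plaintext simulation of how a dot product maps onto slot-wise operations plus rotations - the rotate-and-add reduction used in SIMD-style homomorphic schemes. The function names are hypothetical stand-ins operating on plain lists rather than ciphertexts; in a real scheme each would be one homomorphic instruction.

```python
# Plaintext simulation of an FHE-style "instruction set": slot-wise ops
# plus cyclic rotations of the slot vector.
def simd_mul(a, b):   # slot-wise multiply (one instruction)
    return [x * y for x, y in zip(a, b)]

def simd_add(a, b):   # slot-wise add (one instruction)
    return [x + y for x, y in zip(a, b)]

def rotate(v, k):     # cyclic rotation of the slots (one instruction)
    return v[k:] + v[:k]

def dot_product(a, b):
    """Rotate-and-add reduction: log2(n) rotations sum all the slots.
    Assumes the slot count is a power of two."""
    acc = simd_mul(a, b)
    k = 1
    while k < len(acc):
        acc = simd_add(acc, rotate(acc, k))
        k *= 2
    return acc[0]  # every slot now holds the full sum

print(dot_product([1, 2, 3, 4], [5, 6, 7, 8]))  # 70
```

The design point this illustrates is exactly the "mapping workloads" problem Kurt mentions: a workload like a dot product has to be re-expressed in terms of this small, odd instruction set before it can run under encryption.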

Debra Farber  22:42 
Wow, yeah, and I appreciate the detailed answer, because I know there are plenty of listeners who are... you know, you're scratching their itch for information at a technical level. So, let's dive into some of the use cases for homomorphic encryption. I know that at Duality you focus on organizations that are highly regulated, and honestly, as an initial sector to go after, that's a pretty common one for enterprise, and I think it makes a lot of sense, since there's such a great need there. And I know you mentioned a little bit about, you know, a healthcare use case before.

Debra Farber  23:17 
I'm gonna list out some of the use case categories, and I'd love for you to expound upon each one if that's possible?

Kurt Rohloff  23:25 
Yeah, happy to. Sure.

Debra Farber  23:26 
Okay. So first, let's start with financial services. You say that it helps with fraud prevention, anti-money laundering, and trade financing. So how about we start with fraud prevention? Is it homomorphic encryption by itself, or in combination with some other PETs?

Kurt Rohloff  23:41 
Fraud is a many-headed beast. There's all different kinds of fraud, and there's all different kinds of fraudsters - you know, criminals that perform fraudulent activities. The very simplistic one is basically scammers who might send you an email or a text message or things like that - the proverbial 'Nigerian prince.' Typically, the scammers don't go after just one individual; they go after many different individuals to try to victimize them. And typically, you know, in order to kind of get around financial reporting laws, they maintain many bank accounts across many banks, sometimes all at the same address, sometimes all with the same phone number, and they really know no geographic boundaries. And so, what will happen is that these fraudsters will typically, you know, take money from someone - you know, the proverbial grandmother sitting at home getting, you know, scammed by someone - and financial crime investigators - whether it's within a bank or a credit card company or in law enforcement - will want to ask the banks information about where these accounts are based, you know, what addresses are associated with them, what phone numbers, and even a little bit beyond that: how many accounts are at this address? How many accounts are affiliated with a certain phone number?

Kurt Rohloff  25:11 
And so typically, what happens is that when law enforcement or a fraud investigator or something like that, you know, wants to run these investigations, these investigations, at least at banks, will span multiple banks. And so, the banks might coordinate to perform these investigations - both to fight fraud and also to reduce their financial losses to scammers. So, they want to collaborate on these kinds of things but don't necessarily want to reveal customer information - you know, legit customer information - to their notional competitors. So, what will happen is that a fraud investigator at a bank might run queries to colleagues at other banks to say, "Hey, you know, do you have any accounts at this address, or do you have any accounts at that address?" without wanting to reveal any information about legit accounts. And so, we have a set of capabilities that's used in financial services to fight financial crime in general that allows a fraud investigator to take a query - you know, a query about an address, a name, a phone number, things like that - encrypt that query, send the encrypted query off, potentially to a competing bank or even a different division in a bank - and I'll get into that in a little bit - and then, you know, get back results without revealing to the other bank or the division what the subject of the investigation is. And then, you know, if they get sufficient hits, then they have evidence to start, you know, operating in the clear and coordinating in the clear with other banks, other divisions, to basically de-bank - to remove the scam accounts from their books and reduce the ability of the scammers to perpetrate their crimes.
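A minimal sketch of the "match without revealing the query" idea, using Diffie-Hellman-style commutative blinding (a classic private set intersection trick). To be clear, this is a stand-in for illustration, not Duality's FHE-based protocol, and the keys, addresses, and modulus below are all made up and not secure.

```python
import hashlib
import random

# Toy commutative "blinding": raising to a secret exponent mod a prime.
# Since (x^a)^b == (x^b)^a, two parties can each apply their own key and
# compare double-blinded values without revealing the underlying items.
P = 2**127 - 1  # a Mersenne prime; real deployments use vetted parameters

def h(item: str) -> int:
    """Hash an item (e.g. an address) into the group."""
    return int.from_bytes(hashlib.sha256(item.encode()).digest(), "big") % P

def blind(x: int, key: int) -> int:
    return pow(x, key, P)

investigator_key = random.randrange(2, P - 1)
bank_key = random.randrange(2, P - 1)

# Investigator blinds the query; the bank blinds its account addresses.
query = blind(h("12 Elm St"), investigator_key)
bank_accounts = {blind(h(a), bank_key) for a in ["9 Oak Ave", "12 Elm St"]}

# Each side applies its key to the other's blinded values; matches line up
# only when the underlying items are equal.
double_query = blind(query, bank_key)
double_bank = {blind(c, investigator_key) for c in bank_accounts}
print(double_query in double_bank)  # True
```

The bank only ever sees a blinded query, so it learns a hit occurred without learning which address the investigator asked about; the investigator learns nothing about the bank's non-matching accounts.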

Debra Farber  26:55 
Awesome. And so then, the next category is government. I assume that there'd be a large number of use cases that you'd be working towards, but here you have secure investigations. So, is it that that's such a large area that the government is just, you know, prioritizing it over others, or is that really just, you know, kind of the best use case for fully homomorphic encryption?

Kurt Rohloff  27:18 
It's a great early use case.

Debra Farber  27:20 
So, there's probably going to be more?

Kurt Rohloff  27:22 
Oh, yeah.

Debra Farber  27:22 
Oh, okay.

Kurt Rohloff  27:24 
I'm very proud of the work that the team has been doing for DOD and DARPA over the years, and there have been a number of government motivations associated with this, including moving to the cloud and, you know, coordination across government entities to have better, for example, policy positions and grant-making activities. You know, the Census Bureau rather famously started using privacy technologies for privacy-protected analytics and things like this. But, you know, fraud analytics has been a pretty big deal for us overall, associated with secure investigations.

Kurt Rohloff  28:01 
And there's actually, like, a big, you know, financial motivation - like a tax efficiency kind of motivation - for this, where, you know, right now, if federal law enforcement, or just law enforcement in general, wants to run an investigation, typically they will have to go and get something like a warrant from a judge to run the investigation. And this will often take days, weeks, months, because of, you know, some of the privacy implications associated with asking questions about, you know, certain addresses and what information that might reveal and things like this. So, what we can enable law enforcement to do is reduce the legal burden associated with asking banks for information, so that they can basically move much more quickly and more cost-effectively. They don't necessarily have to involve as many lawyers; they don't necessarily have to spend as much time in front of a judge. And there's also a high degree of real operational security: when they send a query over to a bank, if there were organized crime insiders within the bank, those insiders won't necessarily get tipped off about what law enforcement is trying to ask about, so law enforcement can move more quickly and more skillfully against financial crime perpetrators. And so this has been, you know, a big win-win all the way around, in that, you know, less tax money goes to being spent on these investigations; the investigations move much, much more quickly; and the investigations are at much less risk of tipping off organized crime about their existence. So, you know, there's a multitude of benefits for using privacy technologies in these ways.

Debra Farber  29:51 
That's fascinating. And I think everybody loves when the government is saving our money.

Debra Farber  29:59 
As well as de-risking and respecting privacy, so, like, huge win. And then, the third category of use cases where you've deployed these technologies is 'real-world evidence'; and, I think we talked a little bit about healthcare before, but maybe in this context there's something more specific: 'genome-wide association studies.'

Kurt Rohloff  30:19 
Right, right. Real-world evidence, also called 'RWE,' is a really interesting one. And this also gets, again, to things like governmental efficiency. So, what happens is that when the FDA wants to approve a medication for use, they have the best interest of our citizens at heart, and they really only want to approve medications for public use if and only if they've been shown to be safe and effective for the broad population. And so, pharmaceutical companies and others spend a tremendous amount of time basically gathering evidence about the effectiveness and safety of medications. This results in the aggregation of a lot of really sensitive medical information, and a lot of really sensitive demographic information about individuals, too.

Kurt Rohloff  31:16 
And, it's no secret that once a medicine is approved, it's sometimes later found to potentially be effective against other kinds of diseases. We saw this, you know, in the early days of the pandemic, where there was a lot of trial and error to see what kinds of medications might be effective against COVID-19, for example. But, like I said, these processes are extremely time consuming because so much data has to be collected. And so, there's always been this really interesting question: "How can data be reused?" If data, for example, has been used to prove the efficacy and safety of a heart medication, can we also test whether the same medication might be useful for some generic, I don't know, dermatological disorder or something like this? Rather than going out and generating new data from people who might have a certain skin condition, we can start to reuse the data, using this notion of real-world evidence, in a privacy-protected manner to develop treatments for, you know, a skin condition much, much more quickly, and get an effective medication out to market much more quickly, for the betterment, well-being, and health of our fellow citizens.

Debra Farber  32:37 
That's compelling, especially as I'm dealing with an eczema flare-up as we speak. Yeah, I know, probably TMI, but, you know, I just think it's relevant. I don't think it's going into any random database right now. But those are really compelling examples. And so, it makes me wonder: are you seeing, like, a proliferation of standards across each of these verticals, you know, financial services, within the government at NIST, the healthcare space? Are there standards being developed so that there's a systemized way and a best practice for deploying these technologies?

Kurt Rohloff  33:15 
Uh, yeah, there has been. There actually has been quite a bit of movement lately. One thing that's really interesting about standards is, you know, first off, I'm gonna say things that might seem kind of obvious when I say them, but sometimes they're epiphanies for engineers who are kind of, like, down in the thick of things. You know, standards are really about how to enable interoperability and trust in technologies, that things will work and whatnot. And this is especially important for privacy technologies, because privacy technologies inherently are supposed to secure sensitive information. And if you want to secure sensitive information, the best way to do that is to use trusted technologies that have been vetted, whether it's through academic peer review or government review or whatnot.

Kurt Rohloff  34:04 
And there are kind of 3 aspects to that. For one, you know, at Duality, we only use encryption protocols that have gone through very intense peer review via the academic publishing process; we only use open source implementations of cryptography; and we only use security settings for the cryptography that have been vetted and moved into international standards. And these 3 things are just so fundamentally critical for trust and privacy of operations on very, very sensitive data. And the other side, of course, as I said, is interoperability. The notion of using privacy technologies is really about sharing and collaborating on data in some sort of fundamental way, which leads directly, in kind of a bright flashing line, to interoperability, and of course the need for standards for interoperability also.

Kurt Rohloff  34:56 
So, there have been a number of very healthy standardization processes ongoing. You know, probably the earliest one for homomorphic encryption has been an organization that we co-founded with teams from IBM and Intel and Microsoft, and that has since grown. We have a standards meeting every 6 months or so, and our next one is actually going to be in March in Seoul, South Korea, hosted by Samsung and Seoul National University. It's been a very, very healthy activity, with participation from large industry, from startups, from government, and so forth. And that really sparked a number of other efforts; there's been engaged interest from NIST, and NIST, as part of the Department of Commerce in the United States, is the national standards body for the United States, and they really only get involved in standardization processes...

Debra Farber  35:50 
Sorry to interrupt. So, when you say the United States, what we're talking about is the U.S. agencies that the U.S. government oversees. Just wanted to make that clear to the audience.

Kurt Rohloff  36:00 
Um, yes, and no, actually. I'll come back to it, actually, in a little second. You actually make a really nice point that's a little subtle about that, but I'll come back to that in a second.

Debra Farber  36:07 
Okay, great.

Kurt Rohloff  36:08 
But, you know, there's also been a push in ISO and IETF and ITU, which are 3 major standards organizations that drive international standards associated with privacy, security, and other kinds of things. And there was recently a draft ISO standard specifically for fully homomorphic encryption that came out of those activities. So, you know, really great stuff, because these are international standards bodies.

Kurt Rohloff  36:36 
You know, to come back to NIST, and I think it's great that you asked about this: NIST, the National Institute of Standards and Technology, is a component of the Department of Commerce in the United States. NIST basically defines the public standards for cryptographic protocols, notionally for the U.S. government, but NIST standards basically, for lack of a better term, rule commerce; NIST standards are basically taken as the gold standard internationally. They're taken as the gold standard even in areas where NIST doesn't necessarily have regulatory domain. And so, NIST is kind of perceived as the be-all and end-all of standards for highly, highly trusted technologies. They've recently put out things like post-quantum crypto standards, and they've started the process of engaging with privacy technologies such as homomorphic encryption, secure multiparty computation, and zero-knowledge proofs, and a few other things related to those, which is great, because, you know, NIST basically only gets involved if they think that there is real commerce behind something; and so it's a great indicator that the market is really picking up when an organization as big and as heavy as NIST pays close attention to something.

Debra Farber  37:55 
Absolutely. And you also make a great distinction there, a distinction with a difference that is important: they are not just the standards org. They are, you know, setting the standard and a best practice that is picked up not only in the U.S., but globally, as a standard to build towards. I guess what I was trying to get at is that it's almost like internal policy for all the departments, and the agencies within those departments: when NIST puts a standard in place, they're compelled to follow those standards as if it were a company mandate. Whereas it could be put in a contract with someone who's doing business with the government that you have to comply with them, or it could be put in private sector contracts that someone is conforming to NIST standards; but the requirement doesn't have the force of law, for the most part. I guess that was the distinction I was trying to make. But, you make an excellent point, too.

Kurt Rohloff  38:47 
No, point very well taken. You know, NIST in some sense...I forget the exact terminology that you used, but what's coming to mind, you know, is that they're not so much NIST standards, always. They're often NIST recommendations, like for privacy frameworks and things like that. So, you make an excellent point in those regards. Thank you.

Debra Farber  39:06 
Yes. In fact, at one point...I forget what it was, but when they came out with their guidance on privacy engineering, it ended up being more of a guide, you know, as opposed to a requirements-style framework or risk framework; and it seemed almost like the initial body of knowledge as they were building towards embedding those concepts, and evolving what privacy engineering means, into future frameworks and standards.

Kurt Rohloff  39:33 
For sure, for sure.

Debra Farber  39:34 
So you're right. There are always these different things they put out that have different forces of law, so to speak. Great. Well, so, just 2 more questions for you. The first is: I'd love to know about OpenFHE, and just a little more as to who led the efforts to build it, for what purpose, and then, you know, what are the benefits for others using OpenFHE?

Kurt Rohloff  39:57 
So, OpenFHE is really the leading open source library for implementations of fully homomorphic encryption protocols. It's a project that I've been involved with for the past decade, and I was one of the co-founding people behind it, along with my long-time research collaborator and partner, Yuri Polyakov, and David Cousins and several others. It really grew, spiritually and actually practically, out of this very early DARPA PROCEED program I referred to earlier. I've been very fortunate, and I think it's actually very, very fortunate for us in general, that, you know, DOD, and DARPA in particular, has been very interested in the propagation of privacy technologies. There has been a series of DARPA-funded efforts that has led to increasingly higher-quality, higher-performance, and more capable implementations of fully homomorphic encryption.

Kurt Rohloff  41:01 
The predecessor library was something called PALISADE, which we worked on for many years; it's also a homomorphic encryption library, an open-sourced one. A year or two ago, we decided to basically re-engineer it. In collaboration with the team from Intel and the team from Samsung, and also with some participation from MIT and a few others, we rebuilt PALISADE into OpenFHE as an open framework that supports all the major fully homomorphic encryption protocols. It's really intended to be a production-ready implementation of homomorphic encryption. One of the troubles with a lot of the prior open source implementations, and still, actually, you know, the vast majority of current open source homomorphic encryption implementations, is that the libraries are really more research projects, you know, driven by the intent to kind of write a few papers and things like that; not really hardened, production-ready libraries. What we want to do with PALISADE and OpenFHE, typically with a lot of very generous support we've been getting from our sponsors, is to build an open source library that is production-ready; that is performant; that is ready for use; and, in some sense, and I say this somewhat tongue-in-cheek, that is as boring as possible, because it just works. So OpenFHE is the genesis of that...or the result of that, excuse me: a general set of fully homomorphic encryption implementations that are standards-compliant; that can be tied in with hardware accelerators; and that have general high-level SDKs that conform to standards, so that they're basically just easy to drop in for privacy use cases as much as possible.
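[Editor's note: OpenFHE itself is a C++ library implementing lattice-based FHE schemes. As a minimal, self-contained illustration of the homomorphic property Kurt describes (computing on data while it stays encrypted), here is a toy Paillier-style additively homomorphic scheme in Python. This is not OpenFHE and not fully homomorphic (it supports only addition), and the tiny hard-coded primes are for demonstration only, never for real use.]

```python
import math
import random

def keygen(p=293, q=433):
    # Toy primes only; real deployments use primes of 1024+ bits.
    n = p * q
    lam = math.lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)              # modular inverse; valid since gcd(lam, n) == 1
    return (n,), (lam, mu, n)         # (public key, secret key)

def encrypt(pk, m):
    (n,) = pk
    n2 = n * n
    r = random.randrange(2, n)
    while math.gcd(r, n) != 1:        # r must be invertible mod n
        r = random.randrange(2, n)
    # With generator g = n + 1, g^m mod n^2 == 1 + m*n, so:
    return ((1 + m * n) * pow(r, n, n2)) % n2

def decrypt(sk, c):
    lam, mu, n = sk
    n2 = n * n
    L = (pow(c, lam, n2) - 1) // n    # the Paillier "L" function
    return (L * mu) % n

def eval_add(pk, c1, c2):
    # The homomorphic step: multiplying ciphertexts adds the plaintexts.
    (n,) = pk
    return (c1 * c2) % (n * n)

pk, sk = keygen()
c1, c2 = encrypt(pk, 20), encrypt(pk, 22)
c_sum = eval_add(pk, c1, c2)          # computed without ever decrypting
print(decrypt(sk, c_sum))             # 42
```

The key point for the collaboration scenarios discussed above: `eval_add` needs only the public key, so an untrusted party (a cloud, an aggregator) can compute on the data without ever seeing it. Fully homomorphic schemes like those in OpenFHE additionally support multiplication, enabling arbitrary computation.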

Debra Farber  42:45 
That's pretty awesome. What other resources would you recommend to privacy technologists and data scientists who want to learn more about this space and, you know, any training or fellowship opportunities or upcoming conferences or community groups to plug into?

Kurt Rohloff  43:00 
There's a lot. This is one thing that's actually kind of fun about the privacy community in general: there's a lot of stuff going on, and there are different ways of getting into the community. I'm very much in the applied space, for example, implementing privacy technologies and applying privacy technologies, and there are any number of open source projects for this. You know, there are all kinds of open source projects for secure multiparty computation and federated learning. OpenFHE is of course a big one for homomorphic encryption. And then there's also work on libraries that layer on top of libraries like OpenFHE for the integration of these capabilities.

Kurt Rohloff  43:40 
And so, for example, Google recently put out what they call a 'transpiler,' which takes C++ programs and compiles them to circuits that run on top of OpenFHE. This has been a real fun project for us to collaborate on with Google, too. You know, there are piles of academic work in the space associated with writing papers on cryptography. And, my partners and co-founders, Vinod Vaikuntanathan and Shafi Goldwasser, at MIT and Berkeley respectively, you know, have been very big in this space. And, you know, if you want to work in academic cryptography, math is a huge deal for that. I'm more of a systems person, and so my background is more in engineering and getting things built and making things run fast; and so, you know, experience with embedded systems is a huge deal for applied cryptography and implementing that stuff.
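[Editor's note: the transpiler Kurt mentions lowers C++ to gate-level boolean circuits that are then evaluated gate by gate on encrypted bits. As a toy sketch of that idea, here is a one-time-pad example in Python: XOR of masked bits is homomorphic "for free," while AND is not, which is precisely the gap that fully homomorphic schemes close. This is an illustrative sketch, not how the transpiler or OpenFHE actually works internally.]

```python
import secrets

def keygen(nbits):
    # One random mask bit per data bit (a one-time pad).
    return [secrets.randbits(1) for _ in range(nbits)]

def encrypt(key, bits):
    return [b ^ k for b, k in zip(bits, key)]

def decrypt(key, cbits):
    return [c ^ k for c, k in zip(cbits, key)]

def eval_xor(c1, c2):
    # An XOR "gate" evaluated directly on ciphertexts:
    # (a ^ k1) ^ (b ^ k2) == (a ^ b) ^ (k1 ^ k2),
    # so the result decrypts under the XOR of the two keys.
    return [x ^ y for x, y in zip(c1, c2)]

k1, k2 = keygen(4), keygen(4)
a, b = [1, 0, 1, 1], [0, 0, 1, 0]
ca, cb = encrypt(k1, a), encrypt(k2, b)

c_xor = eval_xor(ca, cb)                          # computed on encrypted bits
combined_key = [x ^ y for x, y in zip(k1, k2)]
print(decrypt(combined_key, c_xor))               # [1, 0, 0, 1] == a XOR b
```

Note that AND gates do not commute with the mask this way, so this toy cannot evaluate arbitrary circuits; schemes with bootstrapping (such as the binary-FHE schemes OpenFHE supports) can, which is what makes general program-to-circuit transpilation under encryption possible.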

Kurt Rohloff  44:35 
And then, even at the application layer, for the, you know, formulation of privacy tools from privacy technologies, there are plenty of events, like the PETs conferences and whatnot. One that's very near and dear to my heart is a workshop called WAHC, the Workshop on Applied Homomorphic Cryptography, that I've been running with friends and colleagues for the past 10 years now, give or take; and we're always affiliated with an academic conference called CCS, which is one of the leading cybersecurity conferences. So, that's a big one that always has a lot of good papers and things like that, too. Admittedly, it's more on the academic side, but we do talk quite a bit about implementations and software and how privacy software should and could work.

Debra Farber  45:22 
That's awesome. That's actually a fuller answer than I was expecting, which I think is going to be helpful for, you know, how do you plug into the space? Like, how do you, you know, maybe even change careers from one focus within privacy to another? And you listed some really great entry points. So, thank you for that, and thank you for your time today. This has really been a fascinating conversation.

Kurt Rohloff  45:43 
Debra, thank you very much. This has been a very fun conversation. And one of the nice things about having an unusual name like I do is that I'm very easy to find online; and, if any of your listeners want to reach out, I'm happy to chat.

Debra Farber  45:54 
That is right. And I will put your contact information in the show notes along with some of the many resources you listed. There's only so much space that they give me in the platform, but there's so much information here. So, I'll also say that in the transcript, I'll have hyperlinks to all of the resources mentioned here to make sure everyone can easily find them. Thank you, Kurt.

Kurt Rohloff  46:19 
Thank you.

Debra Farber  46:19 
We'll be back with engaging content and another great guest.

Debra Farber  46:25 
Thanks for joining us this week on Shifting Privacy Left. Make sure to visit our website where you can subscribe to updates so you'll never miss a show. While you're at it, if you've found this episode valuable, go ahead and share it with a friend. And, if you're an engineer who cares passionately about privacy, check out Privado: the developer-friendly privacy platform and sponsor of this show. To learn more, go to Be sure to tune in next Tuesday for a new episode. Bye for now.

How and why companies have been struggling to find ways to conduct analytics while preserving privacy
Introducing 'homomorphic encryption'
Duality Collaboration Model 1: Analytics of a single data set
Duality Collaboration Model 2: Analytics of a union of datasets (Horizontal Data Analysis)
Duality Collaboration Model 3: Analytics of join datasets (Vertical Data Analysis)
Comparing 'federated learning' to 'federated analysis' using FHE
Tradeoffs involved with FHE
Kurt describes his work with DARPA's PROCEED program using FHE a decade ago
FHE Use Case 1: financial services - fraud prevention
FHE Use Case 2: government - secure investigations
FHE Use Cases 3 & 4: real-world evidence and genome-wide association studies
Proliferation of FHE Standards
Kurt discusses OpenFHE, the leading open source library for implementations of fully homomorphic encryption protocols
Resources and conferences that Kurt recommends to privacy technologists and data scientists who want to learn more about this space
