The Shifting Privacy Left Podcast

S2E31: "Leveraging a Privacy Ontology to Scale Privacy Processes" with Steve Hickman (Epistimis)

October 10, 2023 | Debra J. Farber / Steve Hickman | Season 2, Episode 31

This week's guest is Steve Hickman, the founder of Epistimis, a privacy-first process design tooling startup that evaluates rules and enables the fixing of privacy issues before they ever take effect. In our conversation, we discuss: why the biggest impediment to protecting and respecting privacy within organizations is the lack of a common language; why we need a common Privacy Ontology in addition to a Privacy Taxonomy; Epistimis' ontological approach and how it leverages semantic modeling for privacy rules checking; and examples of how Epistimis' Privacy Design Process tooling complements privacy tech solutions on the market rather than competing with them.

Topics Covered:

  • How Steve's deep engineering background in aerospace, retail, and telecom, and then a short stint at Meta, led him to found Epistimis
  • Why it's been hard for companies to get privacy right at scale
  • How Epistimis leverages 'semantic modeling' for rule checking and how this helps to scale privacy as part of an ontological approach
  • The definition of a Privacy Ontology and Steve's belief that everyone should use one for common understanding at all levels of the business
  • Advice for designers, architects, and developers when it comes to creating and implementing a privacy ontology, taxonomies & semantic models
  • How to make a Privacy Ontology usable
  • How Epistimis' process design tooling works with discovery and mapping platforms like BigID & Secuvy.ai
  • How Epistimis' process design tooling works alongside a platform like Privado.ai, which scans a company's product code, surfaces privacy risks in the code, and detects processing activities for creating dynamic data maps
  • How Epistimis' process design tooling works with PrivacyCode, which has a library of privacy objects, agile privacy implementations (e.g., success criteria & sample code), and delivers metrics on how the privacy engineering process is going
  • Steve's call for collaborators who are interested in POCs and/or who can provide feedback on Epistimis' PbD process tooling
  • Steve's description of what's next on the Epistimis roadmap, including wargaming


Steve Hickman:

You can infer, for example, somebody's race just based on their zip code. If you're making decisions about who can get credit, the federal law on credit is that you can't use race as a criterion. But you might end up, completely incidentally, because of the ML, using things that are proxies for that; and so, if we have tools that can identify these risks in the models, now we can start to see: can we develop workable laws that actually achieve our goal of privacy, as opposed to what we're doing now? Because what we're doing now, particularly with the advances in ML, doesn't work anymore. It's not achieving its goal.

Debra J Farber:

Welcome everyone to Shifting Privacy Left. I'm your host and resident privacy guru, Debra J Farber. Today I'm delighted to welcome my next guest, Steve Hickman. He's the founder of Epistimis, a privacy-first process design tooling company that provides tools to check your process design as the privacy rules change, so you know if you need to fix your process before the rule changes go into effect. Today, we're going to talk about the need for having a "privacy ontology" in addition to a privacy taxonomy, and the types of privacy modeling tools that can solve current privacy scalability problems. Welcome, Steve.

Steve Hickman:

Thank you for having me.

Debra J Farber:

I am so glad you're here. I know we've had previous conversations and I've really been fascinated with your approach to thinking about the problems with scaling privacy today in large organizations. I know that, most recently, you worked at Meta and that you left to start Epistimis to scale privacy. Why don't we start off with a little bio from you? You have such an interesting background as an engineer. Just give us an overview of your background and how you came to focus on privacy process design rules at Epistimis today.

Steve Hickman:

Okay, sure. I have a confession to make. I've done so many different things, I have to read my own resume to remember them. So, I worked - it turns out that that matters, because I worked for quite a while as a consultant in a lot of different industries; and one of the things that comes out of that is this: you begin to see the same problem in many different ways. You know, besides aerospace, I worked in retail, helping Target set up their first website, and I worked in telecom for a number of years, and pollution control, and many other things. What happens is, when you see that many different things, you start to see the abstractions behind them; and this is where ontologies come in, because ontologies are about conceptual abstractions.

Steve Hickman:

I did a lot of different things in a lot of different industries. I ended up - actually just before Meta, I ended up in aerospace. I was working at Honeywell and I was the Technical Lead on a project for the U.S. military; and that actually ended up providing a lot of the technical foundation for what we're doing at Epistimis. Now, what does that mean? Well, two things. One, your tax dollars at work, so that's good, I guess; and the other one is, it's "military grade," if that matters to you. So after that, then I went to Meta. It was kind of interesting that I was being recruited by them and I did not have a Facebook account. Just so you know, I'm kind of a private person to begin with. I did not actually have a Facebook account until I had to get one to accept my job. But, I was being recruited, and it so happened that the person who was talking to me initially was the director of the privacy organization. I was just very frank with them. I said, "Here's where I'm at. I'm not really interested in working at Facebook unless it's on privacy, because privacy matters a lot to me." And part of that, in my career, my prior life, is just... I guess it's part of being an introvert.

Steve Hickman:

I'd also gone to law school while I was at Honeywell, and my focus there was on intellectual property. I'm not a practicing lawyer, but I had become aware of things from a legal point of view then; so when I was being recruited, I thought: if I'm going to work here, this is the organization I'm going to work in. So, I went and started working in their privacy organization, and all of the background that we have informs each step that we take in life. So, as I was looking at what was being done in the privacy organization, the problems that they were attempting to solve, all of this background of working in many different industries and dealing in abstractions and then working on this military project - which also had quite a bit of abstractions - and the tooling to support them, I looked and I said, "If you're going to solve this privacy problem, you really need all of this stuff." And so I spent some time there and tried to help them as best I could.

Steve Hickman:

Meta is a very large organization. I worked with a lot of people who are very smart. They have a lot of momentum. That momentum was not going in the direction that I thought was necessary to solve the problems that they needed to solve, and so I thought, "Well, I'll just go solve this on my own." And that's where Epistimis came from.

Debra J Farber:

Thank you. That really does help connect the dots, because you weren't doing privacy for many years. Right? But, bringing your entire, varied background to the privacy problem, it's amazing how you were able to do pattern recognition to understand the problems you were continuously seeing and then apply that to privacy. I know we're talking in generalities because we haven't gotten to the meat of the conversation yet, which is privacy ontologies; but before we get to the exciting solutions that you're working on related to privacy tooling, privacy by design, and engineering, let's unpack some of the major challenges that you've seen when it comes to scaling privacy. So, in your opinion, why has it been such a challenge for companies to get privacy right at scale?

Steve Hickman:

Fundamentally, I think the challenge comes from the fact that software developers like to write code first, and privacy (if it was thought of at all in the past) was an afterthought. As a result, there's also this other tendency that most software developers have, and that is that they don't bother to document their code. What happens then is that you lose all of the semantics, all of the thought process that was going on when the code was being written. A lot of that information gets lost once the code's written. That developer walks away. They're doing something else. It's not well-documented. When you look at that code and you're trying to figure out "what does this mean?", that becomes difficult. What you see - and this is what I saw at Meta, and you see the same thing with tools out there; Privado is one, and there are other companies that do this data mapping - is that companies will try to reverse engineer. They'll do code scans and try to reverse engineer and identify, "Here's what we think the semantics are."

Steve Hickman:

In Meta's particular case, it was a real challenge because they have multiple technologies. The front end's written in JavaScript. The back end's written in Python. There's some C++. Then they bought Instagram, and all of Instagram's stuff is written in Rust. And there's a little Java. When you're trying to do this analysis, not only do you have to scan the code and try to figure out what it means in one particular language; then you have to try to connect the dots: "Okay, this function is written in JavaScript, and then stuff gets written into a data store and it's read out in Python... and what concepts are those?"

Steve Hickman:

When you're trying to track a concept through this entire data flow model, it's difficult because of all these language and technology changes. They were putting a tremendous amount of work into this and trying to account for it while the code's being written. Accounting for it while the code's being written is better than trying to backtrack, but the conclusion that I came to was that it's really better just to do it right the first time. A phrase that I like to use with people is, "If you don't have time to do it right, how will you ever have time to do it over?"

Debra J Farber:

It makes a lot of sense, whether you're coming to market using VC funding or you want to build it right. You don't want to build it just for product-market fit, get out there, and then find out that you have to re-architect everything, because you didn't necessarily plan that into your product roadmap and it just becomes compounded technical debt that someone eventually has to address. Or, it's an uphill battle because everyone wants to work on revenue generation rather than cleaning up problems in the architecture that you created because you didn't think about things early on. So, I totally agree.

Steve Hickman:

Yes, and privacy has a unique set of problems in terms of technical debt, because there's not just standard technical debt (if there is such a thing), where you create a problem, you know the problem exists, and you know you need to fix it. The issue with privacy is that sometimes the problems are created not by you, but because the law changes and something that used to be okay is no longer okay. So, you can go through this whole process of doing the code scans and figuring it out and say, "Okay, I know what this data means, I've figured out the semantics, I've analyzed how this is flowing through my process, I know what the functions are, I know what the purpose is." You figure all that out, but then, if you change jurisdictions, or the law itself changes, you've got to do it again - assuming you even notice. So, there's this additional dimension that is beyond the kind of debt that you get from just having written sloppy code.

Debra J Farber:

Right, and sometimes it's like you don't even know where to start to clean it up. Right? I mean, just cataloging where the problems are is a big effort in and of itself. Then, you have to actually fix it; and so, one of the reasons that I am so excited to have you on the show is that in our previous personal conversations, I've heard you speak about privacy in a way that I had not heard from anyone else. It gave me a great 'aha' moment - kind of like forgetting about the technical stuff for a moment and just going back to more of a socio-technical "how do we approach problems?" kind of thing.

Debra J Farber:

And we had discussed why having a taxonomy for privacy in an organization is important, but insufficient for managing privacy at scale. You've stated that, first and foremost, an organization really needs to have a known and understood privacy ontology, and so this is where I really want to focus our conversation for most of the episode. We'll start with why. Why is a privacy ontology necessary for scaling privacy processes? And what is an ontology?

Steve Hickman:

Let's see, I'll start with the definition, because that's important. So, a taxonomy - most people are aware that it's a hierarchy of concepts; and, if you took biology in high school, it's like genus and species, that kind of thing: general, specific, and you can have as many layers as you want. And so, that's a good starting point. What you get with an ontology is more than that, because you get not just general and specific, but you also get relationships: whole-part relationships, or the role that a particular piece of information plays in a given context. So, you're able to work with these other kinds of relationships when you're dealing with various concepts, beyond just the general and specific that you get with a taxonomy. That's really key because, if you think about how software development has evolved: in the 1970s, we figured out that having data structures was a good idea. Prior to that, it was, "Here's a variable. I've got this one variable, I've got this other variable, and this other variable." And then maybe we said, "Okay, well, we want to group these variables together," and so we started giving them similar names, like first name and last name or something like that. But they were still not grouped. And then eventually we figured out we need something better. We need actual structures of data. So, an ontology is basically that. It's something that enables you to create these data structures and the relationships between them.

Steve Hickman:

The key distinction that you have with an ontology versus what you have in source code is that in an ontology, I don't care about the physical storage that I'm using. Let's pick money, the concept of money: I don't care if I'm storing that in a floating point or a string or an integer. What I care about is that it's money. In many cases, I don't even care if it's dollars or euros or pounds or whatever. I don't care about units; I just care about the concept. So, it's that conceptual level where the ontology lives; and you can have that not just for individual concepts, but for entire structures of concepts.
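To make that concrete, here is a minimal sketch, in Python, of what declaring concepts at this storage-independent level might look like. The class, field, and relationship names are purely illustrative assumptions and are not Epistimis' actual modeling syntax.

```python
from dataclasses import dataclass

# A concept is defined by what it means and how it relates to other concepts,
# not by how any program stores it (float, string, table column, ...).
@dataclass(frozen=True)
class Concept:
    name: str
    description: str = ""
    relations: tuple = ()   # named relationships to other concepts

money    = Concept("Money", "An amount of value; units and storage type don't matter at this level.")
city     = Concept("City")
region   = Concept("Region", "State or province; jurisdiction-neutral.")
postcode = Concept("PostalCode")

# A composite concept: an address is a whole whose parts are other concepts.
address = Concept(
    "PostalAddress",
    "A mailing address, defined by its parts rather than by any storage format.",
    relations=(("has-part", city), ("has-part", region), ("has-part", postcode)),
)

print([part.name for _, part in address.relations])  # ['City', 'Region', 'PostalCode']
```

Rules can then be written against concepts like these, never against the storage details of any particular codebase.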

Steve Hickman:

So, if you think about an address, we might say, "Okay, it's city, state, zip," if you're in the United States. Well, again, I don't necessarily care about the storage details; I just care about these underlying concepts. And by defining these at this conceptual level, now we've got something that gives us flexibility. It gives us a couple of things. One is, first of all, that's where the rules live. If you think about how the law is written, the law doesn't care about those details either... [Debra: Well, it depends on the law.]

Steve Hickman:

Yes, I mean, it does depend on the law. Some do. Yes, I agree; but in the cases where it doesn't, then you're not dragging along detail that's unnecessary - maybe I should word it that way. [Debra: That makes sense.] And so, there's that; and then you have the ability later on, if you want to do code generation, for example, to add that detail back in if you want to.

Steve Hickman:

But, if you work with things at a conceptual level, then you're not tied to specific technologies; so, in the case of a company the size of Meta, where you're using many different technologies, you don't run into this impedance mismatch that occurs because you're trying to keep track of things while you're switching technologies and you can't figure out, "Okay, does this field in this data structure written in JavaScript match this other field that's in a Python structure?" So you end up with a common language that can be used across all of these; and it also turns out that that becomes very helpful - again, going back to the rules - because you're not forcing your lawyers to learn how to code. They don't care about those code-level details either. What they care about is whether or not you're following the rules.

Debra J Farber:

So, by having an ontology, it's kind of bringing the business together so that all the various stakeholders have a common language, it sounds like, to talk about privacy at a more abstracted level before you get into the more technical applications of data and approaches and architecture and all that. Is that what you're saying? Is that how it's...?

Steve Hickman:

Yes, exactly, and having a common language is really, really important. Communication is difficult. I remember years ago I was in a conversation in some meetings for some project and there were, I think, eight people in the room. Four of us were software developers and the other four were mechanical engineers. This was for a manufacturing tool that we were trying to develop, and I came out after four hours trying to figure out if we'd actually communicated with each other. They were all engineers. It's just some of them were software and some were mechanical. You can certainly see that the further people get apart in terms of their backgrounds, the harder and harder communication is; and so, having a common language becomes really, really valuable. It has the highest level of endorsement. I'll put it that way. If you remember the story of the Tower of Babel, this is the whole issue that God gets irritated at the Babylonians about, in saying that if they speak the same language, nothing will be impossible to them; and so, I can't think of a more ringing endorsement.

Debra J Farber:

It's a good point, and it just shows how, even back then - taking the religious aspects out of it - how important it is societally to have a common ontology for so many things, for understanding and communication between different groups of people; and how, if all of a sudden everybody spoke different languages like in the Babel story, how hard it would be to scale things across society or societies. For sure. So, let's get into a little more - I don't want to say specifics, but some kind of use cases that help crystallize what we're talking about here. Talk to us about how Epistimis is using semantic modeling for rule checking and how this helps with scaling privacy as part of this ontology approach.

Steve Hickman:

Okay. There are two fundamental things that we're doing. 1) The ontology: identifying these semantic concepts for the data that's being used. And then the second part of that is, 2) once you've identified this, defining your process - what the data flow is through your process at this conceptual level - using these concepts that we've already agreed on. If you think about it, in a process, basically you have two things going on. You have some kind of function that has a purpose, and then you have some kind of intermediate storage for information where data is going to be at rest. So, you may receive data in; it may go directly into storage, or it may go into a function that has a purpose; and then eventually it gets stored; and then it may get read out again and processed for some different purpose; and it gets written out somewhere else; and then you can continue that read-write process cycle however many times you need. If you do this at a conceptual level, then all you care about is: conceptually, here's the data - the data structures, the relationships that I care about - that's going into a function; here's the purpose of this function; and here's the data that comes out.

Steve Hickman:

You don't really care about anything else, because then what you can do is take the rules - whether they're legislated or regulatory rules, whether they come out of your privacy policy, or maybe they're contractual, like you've gotten data from a third-party broker and so you've got some contractual limitations on your use of that data. Wherever the rules come from, now you can take those rules, encode them in the tool, and actually evaluate this process design that you've got against those rules; and you can see, "Am I breaking the rules? Am I good or not good?" You may end up with some situations where "it depends." Of course, that's the area that lawyers love.
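As a rough illustration of that kind of rule checking, here is a toy Python sketch in which a process is a list of steps, each declaring its purpose and the concepts it reads and writes, and rules are simply functions evaluated over the design. The step structure, the rule, and all names are hypothetical assumptions, not EMT's actual rule language.

```python
from dataclasses import dataclass

@dataclass
class Step:
    name: str
    purpose: str   # the declared purpose of this function
    reads: set     # concept names flowing in
    writes: set    # concept names flowing out / stored at rest

# One hypothetical rule: health data may only be processed for the "care" purpose.
def health_data_purpose_rule(step):
    if "HealthRecord" in step.reads and step.purpose != "care":
        return f"{step.name}: HealthRecord used for purpose '{step.purpose}'"
    return None

RULES = [health_data_purpose_rule]

def evaluate(process):
    """Run every rule against every step of the process design and collect violations."""
    return [msg for step in process for rule in RULES if (msg := rule(step))]

process = [
    Step("ingest",   purpose="care",      reads={"HealthRecord"}, writes={"HealthRecord"}),
    Step("adtarget", purpose="marketing", reads={"HealthRecord"}, writes={"AdSegment"}),
]

print(evaluate(process))
# ["adtarget: HealthRecord used for purpose 'marketing'"]
```

Because the rules are just data plus functions over a conceptual model, swapping in a new jurisdiction's ruleset or re-running the evaluation when a rule changes does not require touching the process design itself.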

Debra J Farber:

Well, I don't know if they love it, but it's the area they live in, for sure.

Steve Hickman:

It's certainly the area they live in. What happens is, if you're breaking the rules, that's an engineering problem. If it's an absolute rule, and you're doing something in your process design that breaks it, then the engineer just needs to go fix it. If it's one of these "it depends" areas, that's where you need to bring in the Legal team and say, "Okay, what do we need to do? Is this really a problem? Maybe the law is just a little bit vague. How do we address it?"

Steve Hickman:

The whole idea there, then, is that the tooling allows you to evaluate these rules against this process design - and the process design is all independent of specific code implementation. If the rules change, you can just update the rules and reevaluate as part of your standard build process. If you go into new jurisdictions and there are new sets of rules, you just add those in. You can be constantly up-to-date. In fact, you can do this evaluation before the rules actually take effect. If, for some reason, the rules have changed and what used to be okay is not, you're going to know that and you have time to fix your process before it becomes an issue. Does that make sense?

Debra J Farber:

It does. It's definitely part of the "shift left" mindset: de-risk earlier on, fix the problems in engineering to reduce your compliance burden later on. Absolutely - makes sense to me. Maybe talk about how this would fit into the agile development process. How would an organization... Is Epistimis good for any sized org? How does an organization fit this into their current agile processes?

Steve Hickman:

Okay, there are a couple of things. Yes, potentially, any sized organization could use this tooling. Now, having said that, very small organizations - I have one customer, for example; it's a one-person shop. She's a healthcare concierge. She's not a developer. She's never going to be a developer. For people like her, she's not going to learn how to use the tooling. Instead, I've worked with her. I'm doing the design work, basically on a consulting basis, based on her input, and saying, "Okay, here's your design; let's evaluate it." In her case, it's HIPAA: which HIPAA rules are you compliant with, that kind of thing, or is anything being violated?

Steve Hickman:

In very small business scenarios, we fully anticipate that we'll work with consulting partners. It could be large companies like Ernst & Young or PwC, or small ones... there's a ton of privacy consulting companies. You go to the IAPP tech vendor list, there's 400 companies. Many, many, many of those are consulting companies. I've been in conversations with a few and continue to do more of that, to work with them as consulting partners for very small businesses. Larger businesses, if you've got your own in-house developers, you're probably going to want to learn how to use the tooling. Sure, we can train you how to do it; we can get you kickstarted and let you take over. We can help you as much as you need and let you go on your own as much as you want. That's up to you. Very large companies like Meta could use this tooling. They're probably going to develop stuff in-house. I'm a realist.

Debra J Farber:

Yeah, the really large companies, almost all of them, take the perspective that "we are so unique, so big and unique in what we do, that there's going to be nothing off the shelf that we can just come in and customize appropriately. We'll need to just build our own." So I'm not surprised you feel that way; I would advise any company that it might be hard to ever sell to an enterprise like them. What advice do you have for designers, architects, and developers when it comes to creating and implementing a privacy ontology, taxonomy, and semantic model to get started in their orgs?

Steve Hickman:

Well, one of the things... going back to the whole "common language" point, which I think is very important... I actually think the IAPP would be a great organization to do this. We should have...

Debra J Farber:

The IAPP?

Steve Hickman:

Yes, correct. It would be a great organization to create and manage a privacy ontology. Now, the W3C has some stuff. There have been some attempts in this direction in the past, but really we should have a common language. It really makes sense that a standards organization should be leading that, so that not just Epistimis, but Epistimis, Privado, PrivacyCode, BigID, whoever these companies are - we should all be speaking the same language, because part of what that ends up enabling is tool interoperability. Also, the end user doesn't have to keep switching their mental model each time they're going from one tool to the next, because that would result in a lot of mental gear grinding. That's the first thing. Now, having said that, the Epistimis modeling tool, EMT (and the pun is intended, by the way), comes supplied with a base-level ontology that's pre-defined. The rules in GDPR and CCPA and things like that are defined against that base ontology. So, that's already available so that end users don't have to go and try and figure that out. One of the things... okay, this is going to probably get a little bit technical.

Steve Hickman:

When defining an ontology, there are a couple of things you need to be able to do in order to make sure that it's usable. 1) You need to accommodate the reality that sometimes people are going to call the same thing by different names. So, you need to support aliasing. That's just a practical thing. Whatever tool you have has to be able to do that. EMT can do that; since I knew that was an issue, I just built it in. That's one thing. 2) Another thing is that the ontology will be the union of all the different things that you need to represent, but not every user will need everything. You need to have a way for people to use only slices or subsets of the ontology for their particular application. That's also built in. Now, for example, in EMT,

Steve Hickman:

the way we do that is - if you're familiar with SQL for relational databases, there are select statements in there. You can say, "Select this field, this field, this field off of this table." The idea there, in databases, is that you get back actual data. You're querying a database; you get back actual data. We actually use the same syntax, but what we're doing is saying, "Here's the subset of the concepts from this conceptual data structure that I want to use in this particular function." You're not querying a database; you're just taking a slice off of your ontology. Does that make sense? [Debra: It does.] Okay. You need stuff like that because, like I said, the ontology has to be the union of all these concepts. You don't want to force people to actually use everything, because in many cases they don't need to.
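Here is a small illustration of that select-style slicing, assuming a Python stand-in for the model. The `select` helper and the `Person` fields are made up for the example and do not reflect EMT's actual syntax.

```python
# The full conceptual structure: the union of everything anyone might need.
PERSON = {
    "name": "Person",
    "fields": {"givenName", "familyName", "birthDate", "postalAddress",
               "email", "phoneNumber", "gender"},
}

def select(concept, *fields):
    """Take a named slice of a concept, SELECT-style: no data is queried;
    we are only narrowing which parts of the concept a function may use."""
    unknown = set(fields) - concept["fields"]
    if unknown:
        raise ValueError(f"not part of {concept['name']}: {unknown}")
    return {"name": concept["name"], "fields": set(fields)}

# A newsletter function only needs a small slice of Person.
newsletter_input = select(PERSON, "givenName", "email")
print(newsletter_input)  # {'name': 'Person', 'fields': {'givenName', 'email'}}
```

The point of the familiar SELECT shape is that each function in the process design declares only the slice of the ontology it actually touches, instead of dragging the whole union along.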

Debra J Farber:

So, it's also about making it simpler to make choices. You're only surfacing, at the higher level, the important choices that you need to make. Is that correct?

Steve Hickman:

Well... right. Yes, the fundamental challenge with tooling, particularly tooling at this abstract level, is ease of use. It's a mindset shift. Developers are used to writing code in JavaScript, Java, whatever; and now you're saying, "Okay, this is similar, but it's not identical." So you want to make the tooling use familiar paradigms where possible, but also identify all of the pain points, all of the places where the differences could cause them to stumble or be less productive than they could be. And that's actually true for any tool. I mean, this is not true just for this, but it's certainly true for this, because you're asking people to think at a more abstract level.

Steve Hickman:

Now, the payoff here - one of the nice things about generative AI is that we are not going to write code in the future like we have in the past. I don't know if you've looked at things like GitHub Copilot. There are a lot of different tools out there now where all you have to do is go into your favorite code editor and write a comment that says, "This function does X," and then GitHub Copilot will just generate the code for you that does what you just described. [Debra: That's pretty cool.] Yeah, it is pretty cool. But what that means is, as developers, we're not going to write code the way we used to in the past, so adopting tools that work at an abstract level, that work at a semantic level, is really where we're going to end up. So, EMT is developed with that in mind, so that you can work at this abstract level, and then, when it gets to the point of actual code, a lot of that code is just going to get generated.
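As a tiny illustration of that comment-first workflow: a developer might write only the comment below and let an assistant propose the body. The generated function shown here is illustrative, not the output of any particular tool.

```python
# This function masks an email address, keeping the first character of the
# local part and the full domain, e.g. "jdoe@example.com" -> "j***@example.com".
def mask_email(email: str) -> str:
    local, _, domain = email.partition("@")
    if not domain:
        return "***"
    return f"{local[:1]}***@{domain}"

print(mask_email("jdoe@example.com"))  # j***@example.com
```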

Debra J Farber:

So, you don't have to focus on that part because it'll just be auto-generated from the prompt.

Steve Hickman:

Right. Exactly, because that won't be where the fun is anymore.

Debra J Farber:

Makes sense. I really like use cases that help crystallize a larger idea, and so I'm going to mention some other privacy tech companies right now that are in the privacy engineering space, because I want to really get clear on where Epistimis' privacy-first tooling can fit in with other tools on the market, rather than replace them. What you're suggesting is not, you know, "buy Epistimis and then you don't need these other tools"; but I also think it's, at first glance, not clear, as we have all these new privacy engineering platforms coming to market that do different things. So I think it will help if we go through this exercise, where I mention a particular type of tool or platform out there and what it does, and you explain how you could wrap Epistimis' privacy-first tooling around these other tools. Does that sound good?

Steve Hickman:

Makes perfect sense.

Debra J Farber:

Okay. I think it'll make sense to the audience, too, as it helps to make clear how you can help with privacy and scaling it. First, I'd like to start with data discovery and mapping platforms like a BigID or a Secuvy. How does Epistimis' process design tooling work with discovery and mapping platforms?

Steve Hickman:

Okay, so for companies like that, in particular, those two companies have a lot more than just that. So, I want to talk about some of the other things that they do.

Debra J Farber:

Sure, absolutely. They do more than just data discovery and mapping. But, that's kind of how we're branding them, because once you do the discovery part and the mapping, you could do so many other privacy, security, and governance things.

Steve Hickman:

Right, right. So, those tools - or the tool suites from those companies - can be both input to EMT and part of a symbiotic cycle. For example, if you're doing data mapping using a data mapping tool from one of those companies, then that can be input into building your basic conceptual model in EMT, because any time you do this kind of mapping, at some point human beings have got to look at that output and say, "Yeah, that's correct," or "No, I need to adjust this," or whatever. And when you're doing that, you are effectively building your conceptual data model, which is the key piece that EMT uses. So, in that sense, those tools can provide input into EMT. Now, the other side of this is, if you look at those companies, they do other things like RoPAs and PIAs and stuff like that, and the output from EMT can then be fed into those tools, so that if you need evidence for a PIA, the results that you get from running the rules in EMT can be the evidence that you use for those PIAs. Or, in the case where you've got to have a process to handle consent management or RoPAs or something like that, EMT, because it's a process design tool - one of the things that it will do is detect, "Oh, you need to have the ability to produce RoPAs; you need to have consent management. I don't see it in your process anywhere."

Steve Hickman:

Well then, what you can do is - we'll have this; we don't have it yet - the idea is that you'll have a graphical editor to draw out your process design, and you'll be able to just drop widgets in: "Okay, I'm using BigID, so I'm going to use their Consent Manager, and I'm going to use their RoPA tool." You're just going to be able to drop that into your design and say, "Okay, here's my RoPA tool, it's BigID, we're going to wire it all up. Here's my Consent Manager," and you wire that up. Then what can happen is that EMT, as part of its output, can actually generate all the configuration information you need to configure those tools so that they are doing what they're supposed to do. Does that make sense?

Debra J Farber:

It does make sense, yeah. So, what about platforms like Privado, who is also our show sponsor, which scans a company's product code and then surfaces privacy risks in the code; and they also detect processing activities for creating dynamic data maps? How does Epistimis' process design tooling work along with a platform like that?

Steve Hickman:

Well, Privado, I think, is a great example of what I like to think of as insurance, and there's a very positive relationship between what you're doing with that code scanning and what EMT does, because one of the things that we know about human beings is that they don't always play by the rules. You can have a design tool. You might even generate code from your design tool. But you really should have a way to check the actual code that's out there to make sure it's actually following the rules that you thought you were following. For example, if you're just scanning your data stores, you can use something like Privado and a code scan to do an initial pass to create a semantic model and your basic process design.

Steve Hickman:

Okay, that would be used in EMT. So that part could be input into EMT, and then you can do all the rule evaluation in EMT. If you choose to, as you update things, you might want to standardize; or you might decide to switch technologies - you know, this was written in JavaScript and we're going to switch it over to Kotlin, or we're going to do this in TypeScript now instead of JavaScript, or whatever. If you want to regenerate code in a different technology that goes back out into your code base, well then you can have something like Privado there doing those checks on the actual code base again to say, "Okay, well, we verified this at the design level, but what did we actually get, and does it actually match the design?" Does that make sense?

Debra J Farber:

It does make sense. Thank you. And then, lastly, what about a company like Privacy Code, Michelle Dennedy's company, which transforms written policies into consumable tasks for developers and business teams? So, for instance, they have a library of privacy objects; implementations for agile privacy, like success criteria and sample code; and then they deliver meaningful metrics on how the privacy engineering process is going. So how would Epistimis' process design tooling work alongside a platform like Privacy Code?

Steve Hickman:

Okay. So, Privacy Code scans your privacy policy, generates all these tasks, and puts them out in JIRA: here's what you need to do and, basically, here are the rules you need to follow based on your privacy policy. So, what we want to do is actually turn those into executable rules in Epistimis, so that we can verify that you've actually done what Privacy Code told you you were supposed to do.

Steve Hickman:

So then, as you do your design, we've got those converted. In that sense, Privacy Code becomes input into EMT, and then we can evaluate those rules against this design you've built. And we can even use, for example, the sample code that they have - I mentioned earlier the BigID drag-and-drop idea. We can do a similar kind of thing with these privacy objects that Privacy Code has, so that you can insert those into your process wherever it's appropriate. And then, when or if you generate code, or if you're just evaluating against the model, you can say, "Okay, yes, this is what the rule was. Is the model up to snuff? Is it actually following these rules now?" So we can take that from just being something in JIRA to actually verifying that you've done what you were supposed to do.

Debra J Farber:

Got it, and that's so helpful, too, for teams. So, where are you in the current product development process? Are you seeking collaborators? Are you looking for POCs? Are you at the point where you're selling the product? You have the audience, you have the floor right now. Who do you want to collaborate with?

Steve Hickman:

We're very much interested in finding pilot customers, people interested in doing a proof of concept. You know, right now the initial focus is GDPR. Here we are close to the end of September '23. The goal is, by the end of October, to have a rough cut on Articles 1 through 45 of GDPR, which are really the only ones that are addressable with this approach. Then, once that's done, we'll start looking at U.S. state law. In terms of the kinds of rulesets that we support, the basic modeling is all there; we need to improve the documentation and improve usability.

Steve Hickman:

I'm very much interested in identifying people who want to give us feedback right now. The feedback is important. It's free to use; I just want your feedback. And we're interested in identifying potential consulting partners - people who are in the privacy consulting business and are looking for ways to help their clients. Because, if you're a privacy lawyer or you're a consulting company, you want to make sure that you're thorough. You don't want to drop the ball and somehow miss something, but you also want to focus your energy on the areas where your expertise is super valuable. So, the idea is that EMT can make sure that you're thorough. It can evaluate all the rules, make sure that you're not missing anything, and then you can focus on the areas that really matter. So, we're looking for people who are interested in those kinds of partnerships. I'm not going to lie, it's still rough. I mean, this is very early days, so it's sneak-peek time, but I'm very much interested in collaborations with people on multiple levels.

Debra J Farber:

Excellent. Well, I hope that you get multiple people ringing you up to talk about how we could make a privacy ontology, you know, eventually standardized, too. Right? I would think that at some point, whether you're leading that initiative or working with others in some organization that leads it, we can get an industry standard around a privacy ontology to really bridge that gap between privacy engineering and Legal / GRC - you know, the business in general. Tell us more. I know you've got some interesting privacy tools that you plan to ship in the future. Tell us a little bit about what's on the roadmap before we close for today.

Steve Hickman:

Okay. Well, the one thing that I'm excited about right now - I call it 'wargaming.' For those people who've read Daniel Solove's piece, "Data Is What Data Does" - I think it's just out in preprint right now - he really hits on a very important point there, and that is that the current approach to privacy law is fundamentally flawed.

Debra J Farber:

And he wrote the book on it. I took privacy law in 2004. It was the book that he wrote along with my professor, Paul Schwartz, and to this day it is the same - updated, of course - law book that is being taught in most law schools. So, just putting that out there to everybody who's listening here. Proceed with Dan Solove.

Steve Hickman:

Yes, I did ping him earlier today because I want to get his feedback on this, but we need a different approach; and so the foundation that EMT provides is not just about implementing what's currently the law. It's a generalized foundation where we should then be able to wargame new rules, new approaches to rules, and see what happens. In a different one of his papers, he talks about inference - he mentions the inference economy and the challenge that it's not just what you know about someone, but what you can infer about them because of advances in ML. And so, part of the issue here is: can we detect the risk to people in the models? Because we can see, well, you've got all these different pieces of data that are all flowing to the same place. We have statistical evidence that says, for example, that in the United States, if you know someone's birthday, zip code, and gender, you can identify the specific person 85% of the time with just those three pieces of information.
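Here is a minimal sketch, assuming a simple Python model of the data flow, of how a tool might flag exactly that convergence of quasi-identifiers. The risky combination list, the flow structure, and all names are illustrative assumptions, not Epistimis' implementation.

```python
# Known risky combinations of quasi-identifiers: if all of them flow into the
# same place, re-identification becomes likely even without a direct identifier.
RISKY_COMBINATIONS = [
    {"birthDate", "postalCode", "gender"},
]

# A data-flow model: which concepts reach each function or data store.
flows = {
    "recommendation_model": {"postalCode", "birthDate", "gender", "purchaseHistory"},
    "newsletter_sender":    {"givenName", "email"},
}

def inference_risks(flows):
    """Report every node in the data flow where a full risky combination converges."""
    return [
        (node, combo)
        for node, concepts in flows.items()
        for combo in RISKY_COMBINATIONS
        if combo <= concepts
    ]

for node, combo in inference_risks(flows):
    print(f"{node}: quasi-identifiers converge: {sorted(combo)}")
# recommendation_model: quasi-identifiers converge: ['birthDate', 'gender', 'postalCode']
```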

Steve Hickman:

If we can look at our models and say, "Okay, we've got these different pieces of data floating around, they all flow together; there's a risk occurring here that information will be inferred that was never consented to but could end up mattering" - and that could be something like being able to infer, for example, somebody's race just based on their zip code.

Steve Hickman:

If you're making decisions about who can get credit, the federal law on credit is that you can't use race as a criterion; but you might end up, completely incidentally, because of the ML, using things that are proxies for that. So, if we have tools that can identify these risks in the models, now we can start to see: "Can we develop workable laws that actually achieve our goal of privacy, as opposed to what we're doing now? Because what we're doing now, particularly with the advances in ML, doesn't work anymore. It's not achieving its goal." So the idea there is that EMT is being built on a fundamental foundation that will enable us to start doing this kind of wargaming and start detecting these kinds of patterns, and I'm very excited to see where that leads, because we really need to get out in front of this, if at all possible. Technology right now is stripping away our privacy extremely rapidly, and we need to figure out a way to catch up.

Debra J Farber:

Yeah, I think that makes a lot of sense, and it's akin to finding vulnerabilities in code. It's basically another type of red teaming; but in the AI space, it's about trying to test and make sure that your rules will not be broken. So I think that's really exciting. As many listeners here know, because I bring it up often whenever it's relevant, my fiancé is a hacker.

Debra J Farber:

So, this is like a constant conversation in our household, talking about addressing risks in code. We need to do it in AI for bias and for fairness, for figuring out how you make a trusted product. So, if you want people to trust your products in the future, and not believe they're eroding your privacy, being able to say that you're wargaming new rules, or at least running a type of test to make sure that products meet certain criteria before you ever ship them, I think is essential. So, kudos to you for looking forward and seeing where technology is going and then what we need to get in place quickly in order to be able to even make decisions across a business that involve Legal, Risk, and Engineering. Right?

Debra J Farber:

It's really hard to get those things matched when, in Legal, by design, a lot of things are left as generalities when they're defined. Like 'reasonable security' - what's that? How do you design for reasonable security? Right? What do those testing criteria look like? Being able to have a discussion where everyone's understanding and on the same page at a high level, I think, is essential. So, Steve, thank you for joining us today to talk about privacy ontologies, privacy process tooling, and the exciting work you're doing at Epistimis.

Steve Hickman:

Thank you. It's always fun.

Debra J Farber:

Yeah, indeed! Until next Tuesday, everyone, when we'll be back with engaging content and another great guest, or guests. Thanks for joining us this week on Shifting Privacy Left. Make sure to visit our website, shiftingprivacyleft.com, where you can subscribe to updates so you'll never miss a show. While you're at it, if you found this episode valuable, go ahead and share it with a friend. And, if you're an engineer who cares passionately about privacy, check out Privado, the developer-friendly privacy platform and sponsor of the show. To learn more, go to Privado.ai. Be sure to tune in next Tuesday for a new episode. Bye for now.

