The Shifting Privacy Left Podcast

S2E26: "Building Ethical Machines" with Reid Blackman, PhD (Virtue Consultants)

September 05, 2023 Debra J. Farber / Reid Blackman Season 2 Episode 26

This week, I welcome philosopher, author, & AI ethics expert, Reid Blackman, Ph.D., to discuss Ethical AI. Reid authored the book, "Ethical Machines," and is the CEO & Founder of Virtue Consultants, a digital ethical risk consultancy. His extensive background in philosophy & ethics, coupled with his engagement with orgs like AWS, U.S. Bank, the FBI, & NASA, offers a unique perspective on the challenges & misconceptions surrounding AI ethics.

In our conversation, we discuss 'passive privacy' & 'active privacy' and the need for individuals to exercise control over their data. Reid explains how the quest for data to train ML/AI models can lead to privacy violations, particularly for BigTech companies. We touch on many concepts in the AI space including: automated decision making vs. keeping "humans in the loop;" combating AI ethics fatigue; and advice for technical staff involved in AI product development. Reid stresses the importance of protecting privacy, educating users, & deciding whether to utilize external APIs or on-prem servers.

We end by highlighting his HBR article - "Generative AI-xiety" - and discuss the 4 primary areas of ethical concern for LLMs: 

  1. the hallucination problem; 
  2. the deliberation problem; 
  3. the sleazy salesperson problem; & 
  4. the problem of shared responsibility

Topics Covered:

  • What motivated Reid to write his book, "Ethical Machines"
  • The key differences between 'active privacy' & 'passive privacy'
  • Why engineering incentives to collect more data to train AI models, especially in big tech, pose challenges to data minimization
  • The importance of aligning privacy agendas with business priorities
  • Why what companies infer about people can be a privacy violation; what engineers should know about 'input privacy' when training AI models; and how that affects the output of inferred data
  • Automated decision making: when it's necessary to have a 'human in the loop'
  • Approaches for mitigating 'AI ethics fatigue'
  • The need to back up a company's stated 'values' with actions; and why there should always be 3 - 7 guardrails put in place for each stated value
  • The differences between 'Responsible AI' & 'Ethical AI,' and why companies seem reluctant to talk about ethics
  • Reid's article, "Generative AI-xiety," & the 4 main risks related to generative AI
  • Reid's advice for technical staff building products & services that leverage LLMs

Resources Mentioned:

Guest Info:



Privado.ai
Privacy assurance at the speed of product development. Get instant visibility w/ privacy code scans.

Shifting Privacy Left Media
Where privacy engineers gather, share, & learn


Reid Blackman:

LLMs don't deliberate. They don't weigh pros and cons. They don't give you advice based on reasons. What they're doing, in all cases, is predicting the next set of words that is maximally coherent with the words that came before it. It's a mathematical thing, right? So, when it gives you that explanation, it's not actually telling you the reason that it came up with the output that it gave you previously. It's more like a post facto explanation - an after-the-fact explanation, where it spits out words that would look like it explains what happened before (that coheres with the words that came before), but doesn't actually explain why it gave the output that it did. So, you might think that it's deliberating and then giving you the contents of that deliberation, when in fact it is not doing any of those things.

Debra Farber:

Welcome, everyone, to Shifting Privacy Left. I'm your host and resident privacy guru, Debra J. Farber. Today, I'm delighted to welcome my next guest, Reid Blackman, PhD. Reid is the author of the book, "Ethical Machines: Your Concise Guide to Totally Unbiased, Transparent, and Respectful AI," which was just released last year by Harvard Business Review Press. He's also the creator and host of the podcast, Ethical Machines, and Founder and CEO of Virtue, a digital ethical risk consultancy. He is also an adviser to the Canadian government on their federal AI regulations, was a founding member of EY's AI Advisory Board, and served as a Senior Adviser to the Deloitte AI Institute. His work, which includes advising and speaking to organizations including AWS, U.S. Bank, the FBI, NASA, and the World Economic Forum - all big organizations we've definitely heard of - has been profiled by The Wall Street Journal, the BBC, and Forbes. His written work appears in Harvard Business Review and The New York Times. Prior to founding Virtue, Reid was a Professor of Philosophy at Colgate University and UNC Chapel Hill. I have been following him for several years, and I'm really excited to have him on the show. Welcome, Reid.

Reid Blackman:

Hey, thanks for having me.

Debra Farber:

I'm glad that we were able to book you. I know you're off busy writing and talking about AI, and the whole field has blown up. So, I definitely want to talk about some of the overlaps of privacy challenges and ethical AI and Explainable AI. I'd love for you to just tell us a little bit about your background and how you became interested in AI ethics.

Reid Blackman:

Sure, my background is likely to be unlike the background of your listeners, from what I gather. So, I'm not a technologist by background; nor am I something like a privacy lawyer. I'm a philosopher by background. My PhD is in Philosophy. I was a Philosophy professor for 10 years. So, that means when you take into account undergrad, plus grad school, plus being a professor, I've been doing philosophy with an emphasis on ethics for the past 20+ years - researching, publishing, and teaching ethics for 20-plus years. I don't totally know how I got into AI ethics in particular, to be totally frank. Somehow - you know, I had this idea for an ethics consultancy many years ago, well before I started it. And, at some point, I don't know how - it was sort of in the ether - I became aware of engineers ringing alarm bells around the ethical implications of AI. And so, I just started digging in; I started reading up on stuff and learning more, and I found it really interesting. And then, a variety of other sort of both business and personal factors came into play. I left academia. I started the business. I was particularly interested in the ethical implications of AI, both because of their impacts and because it struck me as intellectually challenging. And so, off I went.

Debra Farber:

That makes a lot of sense. I really like to have varied guests, and I love the socio-technical perspective that you're bringing to the space. If we only had privacy engineers talking to other privacy engineers, I think we'd be missing broader societal challenges - anything outside of just pure tech, talking about tech stacks, and how you make things work - so I love that. What, then, motivated you to write this book, Ethical Machines? And, I know your intended audience isn't necessarily privacy engineers, but who is your intended audience, just to frame the book?

Reid Blackman:

Well, there are multiple audiences. I suppose the most obvious or explicit audience is senior executives at larger corporations who are responsible for getting their hands around this stuff - making sure that things don't go ethically, reputationally, regulatorily, and legally sideways when it comes to using AI and other digital technologies. So, that's the primary audience, but I really wrote it in a way that I hope basically anyone can understand. I gave it to a colleague of mine at AWS, who's a senior leader on responsible AI there, and she said, "Oh, this is great," after she read it. "This is great. I can give this to my grandmother and she'll finally understand what I do." And I thought, "Excellent!" I want the senior executives, the Board members, the C-suite, to understand what I'm talking about. I also want a layperson, I want students, I want data engineers, privacy engineers, etc. - I want everyone to be able to understand it. And, the reason I want everyone to understand it - the reason why I tried to make, to some extent, everyone my audience - is part of what explains why I wrote the book, to answer the first part of your question, which is that I just thought the issues were ill-understood - not well understood by the vast majority of people I encountered. Everyone knows the headlines; everyone knows Cambridge Analytica; everyone knows - I don't know, maybe not everyone, but a lot of people know about biased or discriminatory AI, especially from ProPublica's account in 2016. So, we sort of know the headlines - the headlines around blackbox models, you know, scary blackbox AI. But, I didn't think that anyone actually understood the issues very deeply. I thought that the understanding was very superficial. And then people went from that superficial understanding to try to remedy the situation. And then everyone started screaming, "What good is ethics? It's too thin. It's too flimsy. It's too subjective. You can't operationalize it; we need action!" And I just thought, "Okay, if people actually understood this at a deeper level, action would be relatively clear to see." And so, that's why I set out to write the book.

Debra Farber:

That's awesome. I think that's so needed in this space, just making it more accessible to people. It's really awesome. I have not read the book yet, but I look forward to doing so. I just ordered it.

Reid Blackman:

Yeah, great. One thing: it's not merely accessibility, although that's part of it. Part of what makes it accessible is the language I use - it's not filled with jargon. But what makes it accessible, I think primarily (at least I hope, if it is accessible, if I've succeeded in what I attempted to do), is showing people the landscape. So, rather than seeing a set, or a motley crew, of risks that are associated with AI, I really wanted to show people the landscape - the ethical risk landscape; show it to be a kind of coherent whole; and show people how to navigate around that landscape, both in thought and action. So, that's part of the goal, so that it's not just sort of like, "Oh, this is bad! What do we do about it?" but "Okay, let's take a bigger picture approach here. Let's see the system. Let's see the parts of the whole, how the parts relate to each other and to the whole." And I think that gives a better grip on how to actually accommodate or manage these risks.

Debra Farber:

Yeah, I totally agree on that. That makes a lot of sense. I've heard you, on a previous podcast, refer to two types of privacy in your mental model: 'active privacy' versus 'passive privacy.' Can you just share your thoughts on that?

Reid Blackman:

Yeah, sure. So, when it came to talking to data engineers, or people on the tech side of the house, and they talked about privacy, they took it that, "Oh, okay, we respect people's privacy on the condition that (or maybe it's a sufficient condition for respecting their privacy) their data is suitably anonymized or pseudo-anonymized or it's protected, people can't get access to it, et cetera." And so, that's a way in which you can think of privacy in a passive sense. So, you respect my privacy on the condition that you do certain things with my data responsibly: you anonymize it, pseudo-anonymize it, et cetera, you know, keep it behind the correct security walls, so to speak. And, I am then passive with respect to whether or not my privacy is respected. I'm relying on you to make sure that my privacy is respected. I don't do anything. But that's different from exercising a right to privacy. If I'm exercising my right to privacy, I'm doing something. I'm not passive; I'm active. And so, what would one be talking about if you're talking about data privacy from the perspective of exercising your right to privacy? Well, you're presumably talking about something like me having control over who has access to my data, under what conditions, etc. So, for instance, if I can control whether or not you see my data based on whether or not you pay me some money for it, that would be an example of me exercising my right to data privacy. So, it's much more active. Usually, tech types don't think about the active conception of privacy. They think almost exclusively in terms of the passive conception. They're not explicit about this, of course, but that's the way they think about it. So, then you find talk about anonymization issues - things like, or techniques like, differential privacy, that sort of thing. And that's all important; don't get me wrong. I think the passive conception of privacy is, in some broad sense, a legitimate one. We need technologists thinking about it and how we can do it better; but, it's not the same thing as an active conception of privacy, and I think we need to add that to the mix if we're going to get governance right.
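As a concrete, hedged illustration of the passive conception Reid describes - a technique the data holder applies so that individuals don't have to do anything - here is a minimal sketch of differentially private counting using the Laplace mechanism. The dataset, threshold, and epsilon values are purely illustrative, not from the episode.

```python
import numpy as np

def dp_count(values, threshold, epsilon=1.0):
    """Differentially private count of how many values exceed `threshold`.

    A count query has sensitivity 1 (adding or removing one person changes
    it by at most 1), so Laplace noise with scale 1/epsilon gives
    epsilon-differential privacy for this single query.
    """
    true_count = sum(v > threshold for v in values)
    noise = np.random.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise

# The protection is "passive": users do nothing; the data holder adds noise
# before releasing the aggregate statistic.
ages = [34, 29, 41, 52, 38, 45]
print(dp_count(ages, threshold=40, epsilon=0.5))
```

The active conception Reid contrasts this with would instead show up as consent and access controls that the individual operates directly.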

Debra Farber:

I totally agree. And to add on to that, I think we really need to remember that privacy is about individuals, and not about compliance from the perspective of a company's posture on privacy. Very often you can get into a passive privacy mindset where you're just thinking, "What is the minimum I need to do to not violate a law regarding privacy and data protection?" as opposed to, you know, really thinking about the customer experience and making sure customers feel trust - that they can take an active stance on, and take action on, their rights and also their preferences. It's elevating the individual back into the center of the mental model of a business, as opposed to just the data that a company is hoovering up about them.

Reid Blackman:

Yeah, that's right. And those decisions can get played out in various concrete ways. Like, are we collecting certain kinds of data by default? Or, do they have to opt in in order for us to collect it? If we do collect it by default, do we give them the ability to opt out of collecting that data? If we do give them the ability to opt out, how easy do we make it for them to do so? There are lots of decisions to be made here around what the individual needs in order to exercise their rights.
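A minimal, hypothetical sketch of how those defaults could be expressed in code - opt-in versus opt-out collection plus an easy opt-out path. The field names and policy values are illustrative assumptions, not anything specified in the episode.

```python
from dataclasses import dataclass

@dataclass
class ConsentPolicy:
    # Hypothetical knobs for the decisions Reid lists above.
    collect_by_default: bool   # do we collect without an explicit opt-in?
    opt_out_available: bool    # can the user turn collection off later?
    opt_out_clicks: int        # rough proxy for "how easy is it to opt out?"

def may_collect(policy: ConsentPolicy, user_opted_in: bool, user_opted_out: bool) -> bool:
    """Return True only if collecting is consistent with the policy and the user's choices."""
    if user_opted_out:
        return False
    if policy.collect_by_default:
        return True
    return user_opted_in

# An opt-in-only posture: nothing is collected unless the user says yes.
strict = ConsentPolicy(collect_by_default=False, opt_out_available=True, opt_out_clicks=1)
print(may_collect(strict, user_opted_in=False, user_opted_out=False))  # False
```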

Debra Farber:

Right? So, I've also heard you say that the fuel of AI is people's data. Why is this a problem?

Reid Blackman:

It's not a problem in itself, as it were. I mean, machine learning - which is the vast majority of AI we've got around now - machine learning gets fed on data. Everyone knows that. All else equal, the more data you have, the better your AI gets. That's because the more data it trains on, the more it learns, all else equal. But, that means that there is an unofficial incentive for organizations to collect as much data about as many people as they possibly can. Right? Because they want to go far, as it were, with their ML. They want to get as many ML models as they can running that will actually solve their business problems. They want them to be as accurate as possible. And so, they're incentivized to collect as much data as they can. So, what can happen in that pursuit of data, so that they can get their ML up and running and trained well, is that they could inadvertently - or intentionally, in some cases - collect data the possession of which itself constitutes a violation of privacy. Relatedly, it can be that the use of that data (forget about the collection of it or the storage of it), just the using of it, can itself constitute an additional or novel breach of someone's right to privacy. So, it's just the nature of AI that it incentivizes organizations to breach people's privacy.

Debra Farber:

Yeah, I think that's been one of the toughest things for a privacy professional: to change a culture within a very engineering-focused company, BigTech especially. Right? Where the goal has been to collect as much as possible ever since the Big Data days. From Big Data to AI now, it's still incentivizing engineers to collect as much data as possible. So, it definitely is expanding the risks, and I think it's really hard to change that culture in industry. I mean, do you have any ideas on how we could get business leaders to, you know, rein that in a little bit?

Reid Blackman:

Well, that's sort of a question around how we strategize to get buy-in for privacy-preserving governance, policies, practices, culture, etc. There are different ways. The thing I like to say is that different people have different motivational constitutions. They're made up of different motivational cocktails. And so, different people within an organization are going to react to different kinds of reasons for putting guardrails around what kind of data we collect. Some people, you say, "It's the right thing to do. We've got to respect people's privacy," and they're on board - although that's probably somewhat the minority. I think it speaks to most people, though, to varying degrees. Then there are just things like reputational risk, regulatory risk, legal risk, and those different kinds of risks are going to speak to different executives to different extents. You sort of switch up, as it were, your sales pitch, depending on who you're talking to. My general experience is, it never pays to go in there as an activist: "It's the right thing to do. We're going to do only the right thing. It's all about privacy and respecting people's privacy, and giving them the ability to control what data we collect about them and what we do with that data. You know, we need informed consent. We need dynamic consent...." Even if you're right, it's going to be a hard sell, because the truth of the matter is that you're talking to people who have other kinds of priorities, other kinds of things they need to think about, including, frankly, straightforward bottom-line concerns that they're responsible for. And if they don't hit their numbers, literally or metaphorically, then their job is in peril. So, trying to figure out how your privacy agenda gels with their agenda is crucial, just like in any kind of negotiation or collaboration.

Debra Farber:

Yeah, obviously, that makes sense. How can we get customers to trust us more, which will translate into a stickier customer that buys more from us? Or, really going further, explaining how privacy is going to enable the business, as opposed to just being some add-on - adding a box of privacy to something. Right? It's kind of about baking it in.

Reid Blackman:

Yeah, I put it slightly differently. I mean, most people put it the way that you did, and it's a perfectly fine way of putting it - something like, how do we get their trust, how do we increase their trust? To me - I don't know why (I do know why) - I focus on the negative. Let's not cause distrust. Right? Because, a leader could be like, "Yeah, that's good. Increased trust is good. That's a really nice thing to have." But, while increased trust is a "nice to have," avoiding distrust - avoiding the loss of trust - is a "need to have." Avoiding distrust. So, you know, "Let's do this because otherwise, we're going to lose their trust. If we violate their privacy, we'll lose their trust. They will stop working with us. They will shout about us on social media. We might invite regulatory investigation of our practices. Let's at least define a floor below which we're not comfortable going and make sure that it's operationalized."

Debra Farber:

I think that's actually really brilliant - the power of words there. Because then, it really kind of hangs a lantern on it. Most executives feel like they're building trust in some way or another. So, I like that - this way you're focusing on not wanting to lose that. And then, it's like, what does losing that look like? What's the metric? How many people are leaving? Then you can actually act upon that, and it also seems a little scarier, in a good way, to keep people on task about not wanting to lose that trust.

Reid Blackman:

Yeah. I always go for avoiding the bad first. We can strive towards the good; but first, let's make sure that we avoid the bad.

Debra Farber:

Yeah, I think one of the reasons, as a privacy professional, that I've stayed away from that is that we've had very few metrics over the years. It's getting better now that we've moved more into the technical, but we've had very few metrics other than breaches. And then, the security budget kind of took that away from the privacy folks like 15 years ago, and that became a security metric as opposed to really being about privacy. So, we had very few, and the ones people would really use to try to move the needle would be around, like, "Oh, you don't want fines" or "You don't want regulatory enforcement." Right? And, I have found over the years that just talking about avoiding fines and regulatory enforcement doesn't move the needle. Too few companies actually get stuck with those giant fines; a lot of them are really big, huge organizations. It is not what moves the needle. What does is really aligning more with what the business needs: you don't want to lose sales; you don't want your insurance provider refusing to insure you because you didn't meet some threshold - things like that. And, you just gotta get creative around how to get there, including using negative language; but I really liked the way you did it.

Reid Blackman:

Again, I think it varies. So, some people are going to like negative language; some won't. So, figure out what works for that person.

Debra Farber:

Yeah. And the org and the culture.

Reid Blackman:

Yeah, exactly. One thing I want to highlight, by the way: we talked about the training data. So, because of AI, companies are incentivized to collect as much data about as many people as they can so they can get more accurate models, yada, yada, yada. But, one thing that we didn't touch upon is the "inferred data" - privacy as it relates to that inferred data. Because ML is in the business of making certain kinds of inferences, that is to say, creating new data. And, that data is often about people, and so that's another significant way in which privacy can be violated. People might not have their eye on the inferred data at all, when they should. Training data is really important, but what you infer about people can be just as much of a privacy violation, if not more, than the original data you collected about them.

Debra Farber:

Yeah, I totally agree. Can you expound upon that a little more in terms of some use cases you're seeing out there?

Reid Blackman:

Yeah, I'll give you my favorite toy example. So, suppose you're a telecommunications company. You've got lots of data about people and their whereabouts. Right? Because you've got cell tower data, where their phone is, et cetera. So, let's suppose that you're such an organization and you have data about where I travel, where I am, throughout New York City, which is the city I live in. It's not, as such, a violation of my privacy for you to have that data, let's say. Let's say that you also have some data whose possession is perfectly fine - not a violation of my privacy - like the addresses of cancer treatment centers throughout New York City, or the addresses of therapists throughout the city, whatever it is. And suppose you've got those two data sets, you throw them into your AI, and what the AI "notices" is that at 3:30 pm every Wednesday, Reid goes to this address, which is a cancer treatment center or a therapist's office, whatever it is. So, what you've done is you've inferred, "Oh, Reid has cancer," or "he's seeing a therapist," or whatever it is. That's a piece of inferred data. Let's just say it's true. It's not, but for the sake of argument, let's say it's true. What you've inferred about me - your knowing that, as it were - is a violation of my privacy, even though you're not violating my privacy by virtue of any of the training data that you used.
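A toy sketch (not from the episode) of the kind of cross-referencing Reid describes: location pings joined against a list of sensitive venues to produce inferred, sensitive data that the person never disclosed. All of the data here is made up.

```python
import pandas as pd

# Hypothetical toy data mirroring Reid's example.
pings = pd.DataFrame({
    "person": ["Reid", "Reid", "Alice"],
    "address": ["123 Main St", "456 Park Ave", "789 Broadway"],
    "weekday": ["Wednesday", "Monday", "Friday"],
})
sensitive_venues = pd.DataFrame({
    "address": ["123 Main St"],
    "venue_type": ["cancer treatment center"],
})

# Neither dataset is especially sensitive on its own; the join produces
# inferred data about a person -- which is itself personal data.
inferred = pings.merge(sensitive_venues, on="address")
print(inferred[["person", "weekday", "venue_type"]])
```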

Debra Farber:

Right. That's really important to call out, especially in the privacy engineering space. You know, this is a GDPR requirement - not only a requirement, but the definition of personal data includes inferred data - but it's great to have the reminder for privacy engineers that we're also talking about inferred datasets, which could include biometric data, too, right?

Reid Blackman:

Oh, absolutely.

Debra Farber:

Yeah, that makes a lot of sense. What would people need to know regarding training data and 'input privacy,' and does that affect the output of inferred data at all?

Reid Blackman:

I mean, there are lots of things to say about the kinds of things that you need to look out for when you're collecting training data, and the ways in which the collection of that data may constitute a violation of privacy. But oftentimes, it might not be the case that the people who are collecting the data are at all responsible for the inferred data. And so, it's not just a matter of telling everyone privacy matters and to watch out for the training data and for the inferred data. It's going to be a matter of assigning particular roles and responsibilities related to the privacy of data throughout the AI lifecycle, which might mean different teams are handling or monitoring the data acquisition at different stages of the AI lifecycle, and so they're going to need different kinds of training. So, it might be the case that the team responsible for collecting the data did their job; but it wasn't their job to figure out whether the inferred data will ultimately constitute a violation of privacy, because they might not even know. If the communication is bad, they might not know exactly what they're collecting the data for or what it's going to be used for. Or, a team in an organization might take data that wasn't originally collected for their intended purpose, but they found a new purpose for it. They didn't tell the team who originally collected that data about it, but they had better have someone on their team who is responsible for checking whether the inferred data constitutes a violation of privacy.

Debra Farber:

Yeah, that's actually a really good point. And I think, you know, my takeaway there is: in the past, we talked about the software development lifecycle. Now, in my shifting privacy left - in looking closer to development, at how products and services get made, and making sure privacy is baked in - we're talking about the "DevOps lifecycle;" but then there's also the "personal data lifecycle" - the lifecycle of how personal data flows through organizations. I think you make a great point that there's now also a lifecycle of AI training, and it sounds to me like there should be some future framework or something that holds all these lifecycles together, so that you can, in an agile way, hit all of the requirements across these various lifecycles. Right? So that it's ethical, everyone understands what is expected, and what the requirements are in each phase of each lifecycle.

Reid Blackman:

Yeah, I mean, a lot of the work that we do with clients is to make sure that they have a well-defined AI lifecycle. Some do not - and I'm talking about Fortune 500 companies that don't have a defined lifecycle - and then mapping that lifecycle to your RACI matrix so you can define what roles are responsible for what, at what stage in the lifecycle. That's really important, because most organizations don't have a grip on that stuff. And then, you also need processes where, you know, the one team at Phase 1 communicates the right information to the other team at Phase 2, which communicates to Phase 3, etc. So, you've got to assign responsibilities at each stage and make sure that there's communication between those teams when handing off to other teams.

Debra Farber:

Right, right. Makes sense. So, I often hear in the ethics space, especially around AI, about 'human in the loop.' Can you describe what 'human in the loop' actually means and why you think it's necessary to have a human in the loop with respect to AI?

Reid Blackman:

So, the idea that there's a human in the loop just means something like: there's something going on in the world, there are inputs into the AI, there are outputs of the AI, and then there's something that results from the AI having an output. Having a human in the loop means, roughly, that you've got some human somewhere in that loop (it's not really a loop - it's more of a linear set of occurrences, but then it loops back as more data into the AI). The general point is something like: we have a human involved; the AI is not just doing its own thing without anyone looking. So, for instance - what's a good example - maybe the AI gives a judge a recommendation about whether a defendant is high risk and deserves a certain kind of sentencing, or whether they deserve probation, or something along those lines. Nonetheless, you might think, "Well, we want a human in the loop," which means it can't just go straight from AI risk rating to decision. It has to be AI risk rating to a judge who considers that risk rating along with other factors, and the judge making a decision. That would be an example of a human in the loop.

Debra Farber:

Got it. So it's not solely automated decision-making; that's avoided.

Reid Blackman:

Solely or exclusively automated decision-making, yeah. I do think, in many instances - again, this is going to be context-sensitive, so I'm not going to say you always want a human in the loop - there are cases in which it's not good to have a human in the loop, even in really ethically scary cases or gross cases. Suppose you're in the military and the other side is using AI in a variety of ways, making phenomenally fast "decisions." Suppose you've got AI as well, but you want to make sure that you always have a human in the loop, so that when your AI responds to their AI, before your AI responds, there's a human there to validate or assess the outputs of your AI. Well, that can be really dangerous, because you're slow. Humans are really slow. Meanwhile, the other side is making more and more decisions, doing more and more stuff. So, you're going to be at a place, in that particular instance, where there's no time for human decisions, because the opposing party has made it the case that you're just too slow for their technology. So, you could put a human in the loop, but that might lead to absolute disaster. In other cases, you had better have a human in the loop, or that will lead to ultimate disaster. So, whether to place a human in the loop, and where to place them, and how they interact with the outputs is all going to be context-sensitive, and it's complicated.

Debra Farber:

Yeah, I think it's fair to say that the instances where you would not want a human in the loop are probably the outliers, or I should say the exceptions to the general thinking. If someone's bringing something to market, for instance, versus military; but I otherwise agree. Again, everything's context-dependent.

Reid Blackman:

This is another - not a corporate example, but criminal justice. So, I had a conversation with a professor of law who thinks that, plausibly, we are either at, or will soon be at, a place where AI makes better risk judgments about, say, criminal defendants than people do. And that's not surprising, he says, when you realize just how bad people are at judging the risk of other people. We're really bad. So, since we're really bad at it, it's not that difficult for an AI to be better than us, because we're bad at stuff. I mean, we're, you know, limited beings. We've got these finite brains. We're tired. We're hungry. We're sleep-deprived. We're stupid or irrational or we have ulterior motives, but you know.... Yeah, we're kind of a mess. So, there might be...

Debra Farber:

...we're distracted....

Reid Blackman:

...loads of instances in which an AI systematically performs better - in some cases better than the average, in some cases better than the best. And it might be the case that you have a human in the loop, and they just screw it up, because ML is in the business of recognizing phenomenally complex patterns that people can't see. In some cases, those patterns are really there, and when the person says, "No, no, no, no, that person doesn't have cancer. The ML is wrong. The AI is wrong!" - actually, it's the human expert that's wrong. They don't see the pattern, as it were, that the AI sees. In other instances, the AI has gone haywire and "recognizes" a pattern that's not actually there, and the expert is right to step in. So, how we interact with these things is really complicated, and it's not a given that humans are usually better at it than the AI. Often, maybe, for now. Five years from now? Ten years from now? It gets dicey.

Debra Farber:

Interesting. Yeah. Well, definitely. You know, we all have front row seats to see how that's going to turn out. I've been watching it play out in my LinkedIn feed. As a privacy ethicist, I've been following a lot of AI ethicists, and I have a lot of AI ethicists in my feed who are pushing back against LLMs and pointing out the potential for a lot of risks; they're explaining how there are a lot of risks around bias today and that we really need to put guardrails around things. But, at the same time, there are so many calls from industry to move forward with certain AI technologies, certain new companies. I mean, it's like a crazy hype cycle right now. But then, you've also got the AI ethicists who are trying to verbally corral that and make sure that we've got guardrails. So, I guess the question I want to ask is: how do we avoid AI ethics fatigue? Because we want people to take action, but how do we do that so that people aren't throwing their hands up and saying, "I'm tired of this; I don't know how we can ever address the problem," and then avoiding it?

Reid Blackman:

Okay, so there are a couple of things to say here. One is that, usually, when you hear people in the AI ethics space who are railing against LLMs, they might be doing a couple of things. One thing they might be saying is: stop paying attention to these 'godfathers of AI' who say that this is going to lead to an existential threat, or that AI poses an existential threat and we have to address that. Don't pay attention to them, because that stuff is BS; it's not going to happen; or it's so far off, and we have real problems here today, right here, right now, like biased AI, privacy-violating AI, etc. That's sort of one camp. I'm relatively sympathetic to that line of thought. You then have some people who are either genuinely concerned about the existential threat stuff, or they're very concerned about the economic impact of job loss from AI, if that's going to happen - they think that's going to happen. You also have some people who are really worked up about the ease with which misinformation / disinformation can be created and scaled using generative AI. Those are all legit concerns - well, the misinformation is certainly a legit concern. Job loss - possibly. I think it's too soon to tell. But, I don't think that any of those concerns - the existential threat one or the job loss one or the misinformation one - is the kind of thing that the vast majority of corporations need to do anything about, or even can do anything about, including the job loss one, because - we could talk about this if you want - I don't think it's the responsibility or the obligation of businesses to hire people, or an obligation to retain people, when they have more efficient means by which they can do their work. But, another thing to say is that there are some ethical risks that I think are particular to things like LLMs that enterprises should pay attention to. So here, I'm thinking about things like the so-called hallucination problem, which is that LLMs output false information; what I call the deliberation problem, which means it looks like LLMs are deliberating and giving you advice for good reasons, when in fact they're not; what I call the sleazy salesperson problem, which is that you can make really manipulative chatbots; and the problem of shared responsibility - the fact that a small set of companies make foundation models like LLMs, and then downstream developers, whether enterprises or startups or whatever, tweak those models or fine-tune them. And then there's a question about who's responsible when things go ethically sideways. Those kinds of concerns really do matter quite a bit. Oh, and then your question was, how do we avoid ethics fatigue? Yes - AI ethics fatigue. There are a number of things. It depends on what the source of the fatigue is, of course, but if the source of the fatigue is people screaming about job loss and an existential threat, et cetera, then the response is, "Hey, don't worry about that stuff. That's not for you." If you're Kraft Heinz, if you're U.S. Bank or JP Morgan, you're not going to solve for the alleged existential risks of AI. So, one thing to stop the fatigue is to make sure that corporations focus on the risks that actually pertain to their organization. Another source of fatigue that's often cited is not the set of risks or the kind of risks, but the frustration with trying to translate ethics principles into action. So, people talk about justice and transparency, and privacy included: "We're for privacy. We respect people's privacy!" And then no one knows how to operationalize those things. I shouldn't say no one - lots of people don't know how to operationalize those things. They conclude from that, "Oh, ethics is just so fluffy. We can't operationalize it. I'm tired of all this talk about ethics. We need to implement stuff, solutions. What does that look like?" And in that case, the solution to the fatigue is not to abandon ethics; it's to actually do the ethics well. The problem is that you're at sort of layer 1 of 1,000 of how to think about the ethical risks of AI, and you've got to go deeper if you want to actually understand what you're dealing with here - if you want to understand your problem and design the appropriate solutions.

Debra Farber:

Yeah. And in fact, I love this. It's not a direct quote, but on one of your appearances on, I think, someone else's podcast, you had said something to the effect of: when you articulate a value on the front end, you must explain a guardrail that you've implemented to effectuate that value. For instance, in your privacy policy, or in your AI ethics policy if you have one, or whatnot, if you just say "we love privacy, we respect your privacy," but you can literally show nothing, you're just blowing hot air. You're not doing anything or taking any action. You're just placating.

Reid Blackman:

Yeah, exactly. Whenever I work with clients, and we start working on what their values are, how they want to write their AI ethics statement, as it were, we're not allowed to just say, "We're for privacy. We're for fair and just outputs and against biased AI." What the hell does that mean? You've got to put some more meat on the bones. And one way is to articulate what red lines those values create. If you value X, does that mean you'll either never do Y or you'll always do Z, something along those lines? So, to take one of my go-to toy examples: if you just say we're for privacy, it doesn't mean anything. If you say, "We respect people's privacy and so we will never sell user data to third parties" - okay, that's something. Not all companies will sign up for that. Some will. Some won't. There's more to say, but that's at least a guardrail. Now everyone knows - and you still have to talk about how to operationalize the guardrails - but at least in principle, you can sort of see how that gets done, roughly. Okay, I understand now what they mean when they say they value my privacy: it means they're not going to take my data and sell it to a third party. Now, I don't think it should end there. I think there should probably be 3 to 7 guardrails per value: because we value privacy, we'll always X; we will never Y; we'll always Z. You know, those set the guardrails. If you have, you know, 3 to 7 - let's call it 5 - let's say you've got 5 of those per value, and let's say you have 5 values, that's 25 guardrails for action right from the start, just from your statement. And now, you can see the road to implementing such guardrails. You can measure compliance with those guardrails. And, when people read about these things, it's now a much more credible statement than, "We love your privacy."
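A minimal, hypothetical sketch of how a statement like that could be made checkable: each value maps to a handful of concrete guardrails, and "measuring compliance" becomes counting which guardrails have evidence behind them. The specific values, guardrails, and evidence flags below are illustrative, not Reid's.

```python
# Hypothetical values-to-guardrails mapping in the spirit of the
# "3 to 7 guardrails per value" rule of thumb.
guardrails = {
    "privacy": [
        "We will never sell user data to third parties.",
        "We will always offer a one-click opt-out of data collection.",
        "We will never train models on data collected for an unrelated purpose without review.",
    ],
    "fairness": [
        "We will always run a bias evaluation before deploying a model.",
        "We will never use protected attributes as direct model inputs.",
    ],
}

# Evidence that a guardrail is actually operationalized (audits, controls, tickets).
evidence = {
    "We will never sell user data to third parties.": True,
    "We will always offer a one-click opt-out of data collection.": False,
}

for value, rules in guardrails.items():
    met = sum(evidence.get(rule, False) for rule in rules)
    print(f"{value}: {met}/{len(rules)} guardrails backed by evidence")
```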

Debra Farber:

Exactly. Makes sense. So, 'responsible AI' or 'ethical AI': what's the difference? Is there any? And why do companies try to avoid the word 'ethical'?

Reid Blackman:

Well, they try to avoid the word 'ethics' because they don't know what to do with it - because they think it's squishy or subjective or something like that. And so, they abandon it. I think that's a mistake, although it does explain why I primarily speak in terms of ethical risks and not just ethics. So I'll say we're going to build an AI ethical risk program, as opposed to an AI ethics program, because businesses can at least speak and understand the language of risk in a way that they don't understand the language of just ethics. That's one thing to say. I have been talking about AI ethics and AI ethical risks for a while now. What I've seen happen is that evolve into, as you've said, corporations talking about 'responsible AI.' For some reason, they're more comfortable with 'responsible' than they are with 'ethical.' But, what I see get done when people use the word or the phrase 'responsible AI' is that it encompasses a whole bunch of stuff. Yes, we're for responsible AI. That includes the ethics stuff, but it also includes regulatory compliance, say with the GDPR or CCPA, and of course legal compliance more generally. It includes engineering excellence, like model robustness and reliability. It includes cybersecurity practices - you know, best practices for cybersecurity or something along those lines. So, it's a bucket of stuff. It's security. It's regulation. It's ethics stuff. It's engineering excellence. It's a bunch of things. And then, when they start building out that responsible AI program, they don't really know what to do with the ethics stuff. So, they focus on regulatory compliance and cybersecurity and engineering excellence, and the ethics stuff gets short shrift. Usually, it'll just be a focus on bias, as though that were the entirety of AI ethics or AI ethical risks, and they carry on building this responsible AI program that ignores the vast majority of the ethical risks because they don't know what to do with them. So, I think responsible AI is a fine thing to do. If you want to use the phrase, there's nothing intrinsically wrong with it. What's problematic is that use of the phrase has led people to ignore ethical risks and pay more attention to the things with which they're already familiar, and that's a problem.

Debra Farber:

You threaded some things together for me there that really underline the point, so thank you for that. In fact, to extend on that point, I read your recent article, "Generative AI-xiety," which was published in Harvard Business Review, I think 10 days or a week ago, something to that effect, and it walks businesses through AI risks as you see them. You list four main areas that people should pay attention to. You mentioned them before; I'm just going to say them again, to make it clear that these are the four areas you really feel are the main risks, and then if there's anything you want to expound upon, I'd love to hear it. The first is the hallucination problem. The second is the deliberation problem. The third is the sleazy salesperson problem. And the fourth, the problem of shared responsibility. Do you want to go into a little more depth on those?

Reid Blackman:

Yeah, sure. So, one thing I want to highlight is that I think those are the, if you like, "the Big Four" that pertain to Generative AI in particular.

Debra Farber:

Oh, yes. I'm sorry. Yes. Generative AI.

Reid Blackman:

The old ones - bias, privacy, explainability - those are still around, and they're not going anywhere. They apply to Gen AI just as much as to non-Gen AI. But those are the big four for Gen AI, I think. Yeah, I can talk about those. I mean, people know the hallucination problem: LLMs output false information. One thing that I tried to highlight in the article, though, is that the problem is not merely the fact that it outputs false information. It's that people are lazy and they suffer from automation bias, and so they tend to believe the output. So, you can tell someone, "Hey, listen, this thing, you know, sometimes outputs false information." But, it's really easy to rationalize, like, "I know this thing sometimes outputs false information, but the answer that it just gave me seems really reliable...it seems true. That makes sense. That resonates. I don't think this is one of those cases in which it's hallucinating. This seems right." It's easy to make that rationalization when: A) we're already inclined to defer to the outputs of machines - because, for some reason, we think they're more reliable because they're mathy or something like that, so we suffer from automation bias; and B) we're lazy. Having to actually verify the outputs takes a lot of work, or we just think it's already got it right. And, we might just want a quick answer. Doing research takes time. This thing just gave me an answer. That seems good. All right, I'm just going to go with that. So, it's not merely the fact that it outputs false information. It's that we humans are bad at checking our own biases - our biases in favor of it - and making sure that we do our due diligence. That's the hallucination problem.

Debra Farber:

Great. And then the deliberation problem?

Reid Blackman:

The deliberation problem is harder to see, but I think it's a pretty significant one. So, let's say you're in financial services, and you create an LLM that gives financial advice. You're interacting with that LLM and you say, "Hey, listen, here's who I am, blah, blah, blah." LLMs don't deliberate. They don't weigh pros and cons. They don't give you advice based on reasons. What they're doing, in all cases, is predicting the next set of words that is maximally coherent with the words that came before it. It's a mathematical thing, right? So, when it gives you that explanation, it's not actually telling you the reasons that it came up with the output that it gave you previously. It's more like a post facto explanation - an after-the-fact explanation, where it spits out words that look like they explain what happened before (that cohere with the words that came before), but it doesn't actually explain why it gave the output that it did. So, you might think that it's deliberating and then giving you the contents of that deliberation, when in fact it is not doing any of those things.
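As a rough illustration (not from the episode) of "predicting the next set of words": a toy next-word model that simply returns the most frequent continuation from counted bigrams. There is no deliberation step anywhere, only frequency statistics; a real LLM learns the same kind of statistics at vastly larger scale.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus standing in for training data.
corpus = "buy low sell high . buy index funds . sell high risk assets".split()

# Count bigrams: which word tends to follow which.
following = defaultdict(Counter)
for prev_word, next_word in zip(corpus, corpus[1:]):
    following[prev_word][next_word] += 1

def predict_next(word: str) -> str:
    """Return the statistically most likely continuation - no reasoning involved."""
    candidates = following.get(word)
    return candidates.most_common(1)[0][0] if candidates else "."

print(predict_next("buy"))   # whichever word most often followed "buy" in the corpus
print(predict_next("sell"))  # likewise for "sell"
```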

Debra Farber:

So basically, it's not thinking. Its perceived follow-up answer could just be another completely fabricated response - or it is fabricated.

Reid Blackman:

Yeah, exactly. It can be completely fabricated, especially if it's in a realm that requires expertise. Right? It produces some seeming reasons for why it gave that output, reasons that look to reflect an appreciation of the complex world of financial services or financial advisory or something along those lines. And a non-expert is going to be like, "Yeah, that sounds good," but they don't know. They're not experts. That's the whole reason they went to the LLM in the first place. And, the problem is compounded by the fact that LLMs often deliver their recommendations and so-called explanations quite confidently. It's easy to be taken in by that, even if you did well to overcome the hallucination problem by saying, "Yeah, I don't know about this. Let me dig deeper." You then dig deeper, and now it's giving you an explanation - that's not really an explanation - for why it gave you the advice that it did, and you can be taken in by that.

Debra Farber:

Right. It's just more and more risks compounding one another. Okay, the third one that we should pay attention to for LLMs - the third risk - is the sleazy salesperson problem, which you did touch upon before. But could you maybe give a use case?

Reid Blackman:

Yeah, this one is easier to get a grip on. It's just the fact that you can fine tune your LLM. You can further train your large language model, your chatbot, to interact with people in various ways. So, for instance, you can say, "Go read this negotiation book, and then engage in conversation with customers in a way that reflects the lessons or the teaching of that negotiation book, or of that book on how to manipulate people - that book on how to press people's emotional buttons so that you get the sale." Right? And so, then what you're doing is you're training your chatbot to be manipulative in various ways. And so, then it goes out, talks to your clients or customers, and negotiates with them in ways that are perhaps sleazy, manipulative, ethically objectionable, and that result in loss of trust. That can easily happen.

Debra Farber:

Got it. Thank you. And the fourth risk is the problem of shared responsibility; and, I think that one is particularly interesting to my audience, if you could give a little more detail there.

Reid Blackman:

So, the issue there is: let's say it's OpenAI or Google or whoever that develops an LLM, a foundation model. They sell it to your organization - healthcare and life sciences, financial services, whatever your industry happens to be - they're selling it. And now your engineers fine-tune the model for their particular use case, for whatever you're trying to build in-house. So you're building on top of that foundation model. Now, if things go ethically sideways - who knows what happens; maybe it starts giving really bad recommendations to customers; maybe it gives recommendations internally that were supposed to increase efficiency but instead decrease efficiency; who knows how it goes sideways; maybe it violates people's privacy in ways that you didn't anticipate, and so on and so forth - now you say, "Well, who's responsible for it going sideways? Is it the people who built the foundation or the people who built on top of the foundation? Is it the organization from which you sourced that LLM, or is it your organization?" And so there's a question about how we distribute responsibility when things go ethically sideways. Generally, what I think here is that there's a question - and it's an empirical question on a use-case basis - which is: did the foundation model company give the downstream developers sufficient information such that they could do their due diligence - that they had enough information to check that this thing is safe enough to deploy?

Debra Farber:

What does that look like?

Reid Blackman:

Well, that's gonna vary by use case. I mean, one thing is that you might need to know... For instance, let's just take a sort of an abstract example: "Hey, listen, in order for us to really do your due diligence here, we need to know a lot more about the training data that you use to train this LLM."

Debra Farber:

Oh, I see.

Reid Blackman:

And then they say, "Nope, we're not going to give that to you." Okay. Let's say it's OpenAI, and they say, "Look, we won't tell you what kind of training data we've got. That's our IP. We're not giving that to you. You're not allowed to know that." If your organization says, "Okay, well, we need it in order to do our due diligence to make sure that this thing is sufficiently safe to deploy, but they're not giving it to us, so we're going to move forward anyway" - well, in that case, your organization certainly bears a lot of the responsibility when things go ethically sideways, because you chose to go forward knowing that you didn't know enough to do your due diligence.

Debra Farber:

I see a lot of that happening right now as people are like guinea pigs in this new world of LLMs.

Reid Blackman:

I think this gets really complicated because - probably, I could be dissuaded of this, but - I am inclined to think we need regulation around this sort of thing. Suppose that you are the government. If the government has any function, it's to protect people - to protect its citizens. If it has any function, it's one of protection. Suppose the government allows, implicitly or explicitly, the foundation model builders to say, "Listen, it's 'buyer beware.' We're going to build our stuff; there's a lot of the software we're going to keep secret; and it's buyer beware. If you want to buy it, and then fine-tune it, and then use it, and things go sideways, that's on you - that's not on us." What's going to happen is that the companies with the largest risk appetite are going to be the ones who use it. So, of course, what are we going to see? We're going to see people hurt. We're going to see people's privacy violated. We're going to see people discriminated against - all the bad stuff. That's a really bad outcome, if you're the government, right - to allow a marketplace in which only the riskiest players are going to play. That's bad. That's bad for society. So, it looks like we need certain kinds of requirements around foundation model companies disclosing enough information to companies such that they can do their due diligence. And if they won't disclose it to Company A, let's say in financial services, then they can't also sell it to Company B without giving that information.

Debra Farber:

Right. That definitely makes a lot of sense. I guess a big part of responsible AI is being able to explain it, too - to have explainable AI. So, as we're getting to the end of this amazing chat, I'm definitely going to put in the show notes a link to your book, Ethical Machines, which I know is for business leaders who are thinking about incorporating AI into their organizations. What I'd love for you to do is give any advice you have for technical staff - data scientists, architects, product managers, devs - when they're trying to build products and services that leverage LLMs in a responsible manner. It could be high-level advice, but what should technical staff think about as they're bringing this tech to market?

Reid Blackman:

If you're a privacy engineer, then, you know, one thing to think about is: what are the various ways that we're going to need to protect people's privacy? How are we going to ensure, for instance, that when people throughout our enterprise use it, we're not loading into it - not putting into the prompts - sensitive data, personal information, personal health information (if it's a healthcare company), sensitive corporate data...? One thing to think about is how we make sure that we educate people throughout the organization about how to use this stuff. There are also questions that privacy people need to think about with regard to whether you want your people using LLMs by way of an API - e.g., to an OpenAI or to a Microsoft or to a Google - or whether the LLM should live on premises, on your own server, so that no prompt data goes out to any other company. So, thinking about how that data gets handled is going to be one of the big things. The truth is that if you're a more junior engineer, the best thing you can do is join with others to push for stronger, more robust governance from the top down.
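A minimal, hypothetical sketch of one guardrail in this spirit: scrubbing obvious personal identifiers from a prompt before it ever leaves the company for an external LLM API. The regex patterns are deliberately crude, and `send_to_external_llm` is a placeholder for whatever API client or on-prem model an organization actually uses - none of this is a real vendor integration.

```python
import re

# Very rough patterns for obvious identifiers; a real deployment would use a
# reviewed PII/PHI detection service and an approved list of data categories.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def scrub_prompt(prompt: str) -> str:
    """Replace likely identifiers with placeholder tags before the prompt leaves the org."""
    for label, pattern in PII_PATTERNS.items():
        prompt = pattern.sub(f"[{label}]", prompt)
    return prompt

def send_to_external_llm(prompt: str) -> str:
    # Stand-in for the org's actual API client (hosted or on-prem); intentionally not implemented.
    raise NotImplementedError

def ask_external_llm(prompt: str) -> str:
    return send_to_external_llm(scrub_prompt(prompt))

print(scrub_prompt("Summarize the complaint from jane@example.com, SSN 123-45-6789."))
# -> "Summarize the complaint from [EMAIL], SSN [SSN]."
```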

Debra Farber:

Yeah, you know, as you were talking, I was even thinking to myself: oh, okay, so this is more data governance. It's just data governance for AI, and you can add to the data governance processes that an organization has already set up, or that they're putting in place (or that they should have in place); it's not like an independent thing you now have to do just for AI. There are probably processes that already exist for data governance that you can use to make sure the AI models - and the whole lifecycle for AI - are incorporated into the data governance.

Reid Blackman:

Yeah, that's exactly right. So, a pretty standard project for us is to do an analysis - you know, a gap and feasibility analysis - of an organization as it stands now with regards to AI ethical risks. One thing we're certainly looking at is their data governance structures, processes, policies, workflows, RACI matrices, etc., to see what is there that we can leverage and just augment so that it includes AI ethical risk management, as opposed to building things from scratch.

Debra Farber:

Right. You always want to reuse what's already in the org that's working. Is there anything else you'd like to leave the audience with? Whether it's something inspiring or a tip or go to XYZ conference?

Reid Blackman:

Yeah, I don't know. Inspiration - I think I'm bad at inspiration. It's not my... you know, motivational speaking is not my thing. One thing to walk away with, when it comes to the ethical risks of AI, is to understand that it's simply not true that we don't know what to do about these things. Some people like to say, "Oh, we've got to do more research. We've got to look into this. Everyone's trying to figure out what to do. It's the Wild West. No one knows the answers." And it's sort of like, "Well, no, that's not really true. There are some people who know the answers - people who have been working on this stuff for a long time." I'm not saying people like me know it all, but it's simply not the case that we don't know what to do here. If your organization is not doing things, it's not due to sort of humanity-wide ignorance; it's due to a lack of political will within the organization, which means that the thing to do is to figure out how we can generate that political will to engage in appropriate ethical risk mitigation.

Debra Farber:

Awesome. Well, thank you so much for joining us on Shifting Privacy Left today and for discussing your work on ethical AI and overlapping privacy issues. Hopefully, we'll have you back in the future.

Reid Blackman:

Sounds great. Thanks for having me.

Debra Farber:

Until next Tuesday, everyone, we'll be back with engaging content and another great guest.

Reid discusses what motivated him, a Philosophy Professor, to write his book, "Ethical Machines;" and who he wrote it for
Reid makes a distinction between 'active privacy' & 'passive privacy'
Challenges with the fact that 'the fuel of AI is other people's data' and how business leaders should put guardrails around this based on business goals and public privacy commitments
Why what you infer about people can be a privacy violation; and, what engineers should know regarding AI training data & 'input privacy' and whether that affects the output of inferred data
Automated decision making: when it's necessary to have a 'human in the loop' when making decisions with AI
Reid shares how we can avoid 'AI ethics fatigue' to encourage technologists to take action
Reid explains how to back a company's stated 'values' around privacy and AI with actions; why there should always be 3 - 7 guardrails put in place for each stated value
The differences between the terms 'Responsible AI' & 'Ethical AI,' and why companies seem reluctant to talk about ethics
Reid's advice for technical staff (i.e. data scientists, architects, product managers, devs) as they build products & services that leverage LLMs in a 'responsible' manner
