6 'P's in AI Pods (AI6P)

🗣️ AISW #072: Debbie Reynolds "The Data Diva", USA-based data privacy and emerging technology expert

Audio interview with "The Data Diva" Debbie Reynolds, USA-based data privacy and emerging tech expert, on her stories of using AI & how she feels about AI using people's data and content (55:23)

Introduction - Debbie Reynolds, “The Data Diva”

This post is part of our AI6P interview series on “AI, Software, and Wetware”. Our guests share their experiences with using AI, and how they feel about AI using their data and content.

This interview is available as an audio recording (embedded here in the post, and later in our AI6P external podcasts). This post includes the full, human-edited transcript. (If it doesn’t fit in your email client, click here to read the whole post online.)

Note: In this article series, “AI” means artificial intelligence and spans classical statistical methods, data analytics, machine learning, generative AI, and other non-generative AI. See this Glossary and “AI Fundamentals #01: What is Artificial Intelligence?” for reference.


Interview - Debbie Reynolds

Karen: I’m delighted to welcome Debbie Reynolds, “The Data Diva” [from the USA] as my featured guest today on “AI, Software, and Wetware”. Debbie, thank you so much for joining me on this interview! Please tell our audience about yourself, who you are, and what you do.

Debbie: First of all, thank you so much for inviting me. It's a pleasure to be here. I follow your work. The amount of detail that you put into things is astounding, and so I'm super excited to be here. So I'm Debbie Reynolds. They call me the Data Diva. I work at the intersection of data privacy and emerging technology. So that's everything from AI, biometrics, satellites, all types of wacky things that companies want to do with people's data.

And I have a consulting company, Debbie Reynolds Consulting, where I do a lot of advisory work in addition to a lot of writing, speaking, and consulting. I also have a podcast called the Data Diva Talks Privacy Podcast. It's the number one data privacy podcast in the world for six years now. And we have listeners in over 130 countries.

Karen: That is so impressive. And I can see why it's number one. Your episodes about data privacy are just incredible. You talk about real things, very practical. And it resonates with people around the world.

Debbie: That's so sweet.

Karen: Let's get into some specifics. I have standard questions that I ask everybody. Next one is, what is your level of experience with AI and machine learning and analytics? You obviously know a lot about data! But if you could talk a little bit about how you've used AI and machine learning, professionally or personally, or if you studied those technologies.

Debbie: Yeah. I'm a self-taught technologist. Back in the day when you bought computers and they had books, I read the books. And I figured out how to use them. My early career started in library science. So I would build systems for data retrieval and things like that. So I know a thing or two about data, databases, data systems.

And so, over the years, I've helped a lot of companies either build or implement those types of systems. For me, data is the food for any data system. So when people ask me about certain technologies, I'm like, "Well, all of them need data". So I know a lot about that. I know a lot about data flows and how data works.

And especially right now when everyone's super excited about artificial intelligence. We know that artificial intelligence isn't new. Obviously there have been monumental advancements in AI in terms of what it is capable of doing, and also in AI being, in some way, democratized. A lot of times, those AI uses were only for companies that could afford the talent to build those things or could afford those systems. And so I'm happy to see some AI uses that regular people can use every day. It's really cool. But I think from a privacy perspective, it creates more challenges for companies around transparency and things like that.

So I've built a lot of data systems. I can explain a lot of data systems. I've actually testified in court about data systems. So I know a lot about how data flows. And so to me it's really interesting to separate the hype of what you see in the news versus what actually happens under the hood.

Karen: Yeah, definitely. And certainly all of these systems run on data. So I think that's one advantage that those of us who have worked in data for a while have, when we think about AI and machine learning, we have more of an understanding of what's really going on.

It sounds like you've used it quite a lot professionally, building data systems for people. Are there any AI or machine learning tools that you use personally, either to help you with your work in your consulting business, or that you just use as part of your personal life?

Debbie: Those are great questions. Well, I guess in two ways. One I predicted many years ago, once ChatGPT came out and people just went crazy about AI. I work with a lot of companies where they're very sensitive about how their data is handled. A lot of them, their legal departments are saying, "We can't use AI. We're going to need to close the door on AI." But you can't, because almost every tool that you use now is implementing some type of AI, right? The AI is in the house, whether you want it to be or not. And then there are all these different models that are out there that people are playing around with.

One of the ones I like a lot: I use ChatGPT. All types of wacky things. I guess I try to break things. Or I try to figure out how far I can go with the technology. So I want to know what it can do. So for me, a lot of it is asking questions, testing what it can and can't do.

A lot of times, of course, they hallucinate or get things wrong. I'm like, "Hey, that's wrong", or this or that. So right now I'm playing around a lot with memory of those tools. As you know, especially the generative AI tools, they can forget things very easily. So when people are trying to build something as a foundation, it's hard for those systems, the way that they're built right now, to remember those things. And so I'm really having fun playing around with what it remembers, what it forgets, what are those thresholds, and things like that.

But yeah, I use it quite a bit. I love it for my podcast stuff, like show notes and summaries. If I have an article or a journal that I'm looking at, I may ask it to give me a summary or a couple of things from there.

Probably the thing I use it most for, that I like the most, is formatting documents. Let's say for instance if you had to do a publication for a particular journal or something, and they may have pages and pages of descriptions of how you have to format that document. And so it's really good at stuff like that. So I use it for stuff like that, or formatting bibliographies and things like that. Those are pretty much my major uses of it.

Karen: One of my guests had talked about using it for formatting. There's a regulation you may have heard of called the FAR, the Federal Acquisition Regulation. And submitting proposals that are compliant with the FAR is a lot of work. The regulations are huge. So having it help with something like that was very useful.

I know a lot of people have used AI for summarization. You mentioned that it does hallucinate. It makes things up sometimes. How often do you find that these summaries are accurate and useful, or how often do you find errors in them?

Debbie: Let's see, I'm trying to remember. A lot of times, when a big law comes out, I may read the whole thing, but then I'll say "Summarize this thing" and I want to see what they pulled out versus what I gleaned from the article. And sometimes they're like, "Well, there are 10 things that this thing talked about." And I'm like, "Well, no, there are actually 14." And they're like, "Oh, I'm sorry. Yeah, there were 14." You know, stuff like that.

So you really have to be careful. A lot of those bigger things, I would still read them myself, or I still check myself. Even for – you said someone you knew was using it for formatting – I would just double check their work, just like you would do if you had an assistant helping you or whatever. You would definitely check their work, even though these tools, especially the conversational ones, seem so confident, right? You can't be deceived by that level of confidence, because you're ultimately responsible for what the output is going to be. So you really need to look at that really closely.

Karen: Yeah, there was a recent study that said that when these tools are wrong, they're still highly confident that they're right 47% of the time.

Debbie: Right. And then sometimes they're like, "Yeah, okay, I'm sorry. Yeah, you're right, you're right. Okay. Okay. I'll clean this up." Or whatever. Try to be very apologetic. But yeah, definitely, use it as a tool. I wouldn't trust it blindly for any reason. A lot of lawyers have been getting in trouble 'cause AI has been citing cases that don't exist or creating, taking quotes from cases. The case may exist, but the quote isn't in the case. So people have to be really careful with that.

Karen: Yeah, law is definitely an interesting area. There's been a lot of publicity about bad case citations and things like that. And I know some people who are working on tools and technologies that use RAG, for instance, so that they constrain where it can pull quotations from, and making sure that they're all valid and useful citations, and things like that. So that's an interesting area. But there's obviously a lot of other topics as well.

So do you use AI tools voluntarily for anything? For instance, AI is around us everywhere. We have it in our movie recommenders and in our phones. And for writing, it's hard to find a word processor or a social site that doesn't somehow offer to help you write with AI, things like that. Have you tried any of those tools? Or do you deliberately avoid them? What are your thoughts about those?

Debbie: I use stuff like Grammarly. I've been using Grammarly forever, almost since they first came out. And it's advanced over the years, which is good. I like the advancements. But I don't like when they try to rewrite stuff or when they suggest, 'cause it's like, "I wouldn't say that. I wouldn't talk that way." It's not natural. A couple of times I say, "Okay, well, give me a suggestion". Nine times out of 10, I don't like their suggestions. So I don't bother with that, in terms of rewrite.

But there are times when, let's say for instance, you're writing and you have a paragraph or a sentence, and it just doesn't quite flow. And I may say, "This is what I want to say. Give me some suggestions or some pointers on how to say it." And that brainstorming loop is very, very helpful to me as well.

You had mentioned RAG and that's something I use a lot. I was doing RAG before people knew that you could do it. So it was basically, you define what the source is and then you tell it to do a particular thing. And so that's almost always the way that I use these tools. So I rarely ask it to search the internet and give me stuff. Like, I'll do all my own research, but then I'll say, "Okay, I want you to look at these four documents that I put here. And then do a summation of that." So because I know what those four documents are, when I'm reading some summary, I can tell what's right or what's wrong, or make the adjustment.
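
Note: What Debbie describes here, giving the tool a fixed set of source documents and asking it to work only from those, is essentially a retrieval-augmented (RAG-style) prompt. Below is a minimal illustrative sketch of that pattern using the OpenAI Python client; the model name and file paths are placeholders, not tools or files Debbie specifically uses.

```python
# Minimal sketch of a document-grounded ("RAG-style") summary request.
# Assumes the official `openai` Python package and an OPENAI_API_KEY in the
# environment. The model name and file paths are placeholders.
from pathlib import Path
from openai import OpenAI

client = OpenAI()

# Supply the exact documents to summarize, so the output can be checked
# against known sources afterward.
doc_paths = ["doc1.txt", "doc2.txt", "doc3.txt", "doc4.txt"]
documents = "\n\n---\n\n".join(Path(p).read_text() for p in doc_paths)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[
        {
            "role": "system",
            "content": (
                "Summarize ONLY the documents provided by the user. "
                "Do not add outside information. If something is not in "
                "the documents, say that it is not there."
            ),
        },
        {"role": "user", "content": documents},
    ],
)
print(response.choices[0].message.content)
```

Because the inputs are known, each claim in the summary can be checked against those four files, which is the verification step Debbie describes.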

Karen: That's great to hear about. And you mentioned using Grammarly. Obviously when Grammarly first started, they weren't using generative AI under the hood, and now they are. Do you have any concerns about Grammarly as far as data privacy, which obviously is one of your areas of expertise? There's been some talk about Grammarly where, if you don't have a paid or enterprise account, the terms and conditions of using Grammarly say that you are basically agreeing that they can use whatever you put into the tool as input for further training on their models. And so there's some risk of some potentially confidential content leaking out. Or maybe something that you're planning to publish, but not right away. There's obviously some risk of data leakage there. I'm curious what your thoughts are about that.

Debbie: Yeah. I've used Grammarly, I don't know, for as long as it's been out, right? So I've been a paid member. I'm not as concerned about them as a company. I use them in almost any publishing that I do to do spell checking. I'm not concerned about them.

But the thing that you're speaking about is a real thing, which is: if you use a tool and it's free, a part of that free layer is that there's almost no limit to what they can do with your data. But if you have a paid subscription to something, then you have a few more rights, right? But with a company like Grammarly, at this point, I don't think it's in their best interest to try to use people's writing for republishing or anything. I think they get a lot of rich data insights about how people write and the types of mistakes that they make so that they can make a better product. And so I really like them for that reason.

Karen: Are there any other tools that you use on a day-to-day basis? You mentioned ChatGPT, but it sounds like you're experimenting and pushing the boundaries, more so than trying to accomplish a specific task.

Debbie: Yeah. I just ask all types of wacky things. I mostly use that. The only other thing that I've used is NotebookLM. NotebookLM is a Google thing. I think it was made for research, like students doing research when they need to research materials. And it will summarize stuff and help them do papers and stuff. But they had a feature, and I can't remember what it's called right now, but it basically would take whatever you gave it, and it would create a podcast.

And so I played around with that. I have a newsletter that comes out every month. So I do an article with my newsletter every month. And just for the hell of it on a Sunday, I said, "Well, let me see what this thing could do". So I threw in my PDF of the newsletter and they did a 10 minute podcast with two people talking about the thing. They were like, "Hey, Debbie Reynolds, the Data Diva wrote this", and it was just mind-blowing. I couldn't believe what they were able to do just with that PDF.

And so I had put out a couple episodes. I hadn't had the chance to go back to it. But I think the way that they initially intended was, let's say for instance, someone is in school, they have a document that they want to understand better. It will create a conversation between two people where they're talking about the document and it helps you understand it.

But now a lot of people are using it to create podcasts from a paper or a PDF or something. You do it in 10 minutes. It's just astonishing what these tools can do, in terms of heavy lifting and things like that. So that's definitely a fun one I would recommend people play around with.

Karen: Yeah. I've seen a few times where people have posted articles that said, "This article was basically generated by taking something and putting it into NotebookLM". And I've always wondered if it wouldn't be a better use of my time just to read the paper that they're talking about than to listen to the podcast. I know some people prefer listening and learn better that way. I do better, still, at reading, I think. I'm not sure how many other people feel that way. I've heard people also say that if this was generated conversation, they feel like it's not worth their time reading it. Other times I've heard people say that it pulls out insights that they didn't necessarily latch onto when they read it for themselves, and it highlights some things that they thought were of interest. So I've heard a lot of conflicting opinions about it.

Debbie: Yeah, I think people learn in different ways and people consume or create content in different ways. So I think it's just another way that people can use to help them do stuff.

This may be a little bit off topic, but I have my own wacky note taking system that probably will never work for anyone else on Earth except me, right? And so I was trying to figure out a way to have it be more structured, right? You've heard things like bullet journals and different things like that. And so I'm like, "How can I make my thing more structured and less chaotic, the way that I do stuff?" So I went to ChatGPT and I said, "Hey, this is how I take notes. How can I structure it?" So it was just a better system. It gave me some good pointers that I probably wouldn't have otherwise thought about.

Or one thing that I really liked that I've been using it lately for is: let's say you have a tool that you use. Let's say you use Zoom, right? And then you decide, well, Zoom keeps going up on their prices. Are there other tools out there that I can look at? Instead of me going out trying to search each tool, I can ask ChatGPT, "Hey, I'm looking for this new tool. I'm looking for this, and this price range." And it'll actually go out and look for that stuff and it'll give you comparisons and stuff. And that saves me a ton of time.

Karen: So is there anything that you deliberately avoid using AI-based tools for? And if so, can you share an example of when and why you would choose not to use AI for that?

Debbie: Yeah. I don't like AI to write things for me. I know that you can get it to write stuff for you. And it irritates me. Let's say, for instance, I opened a Google Doc and they're like, "What do you want to write today? I'll write this for you today." I didn't ask you for that. So I'm swatting the notifications away.

Or I had tested out a couple tools to do presentations. And really what I wanted was to find something that will do the graphic part. So I have the content right. And all I wanted them to do is make it more pretty, so that I wouldn't have to be messing around trying to design it. And a lot of those tools I was looking at, what they wanted to do was literally write the presentation. It was like, "No, that's not what I want", right? So they wanted something where they could pull a bunch of pictures, put a few words, some razzle-dazzle. That's not the purpose of what I wanted, right?

So for me, I don't like it when it tries to take the reins and tries to write things. 'Cause I know what I want to say. And the things that I want to say are not what is in your database or what you think about.

And this is funny — so it's another Grammarly story. They send you stats of stuff about how you write or whatever. My most proud stat I get from Grammarly is that they tell me that my writing is 98% more unique than anybody else's. I'm like, "Yes, that's what I want", right? Because the things that I'm saying and the things that I'm writing aren't like other people, 'cause I'm not like other people. So I think just having that uniqueness, I think, is very important.

Karen: Very cool. So I'd like to switch topics a little bit. We talked about systems getting our data from the fact that we're using the tool, especially if the tools are free. So this is a common and growing concern, about where these AI machine learning systems get the data and the content that they train on. A lot of times they will use data that people have put in online systems or published online. It may be what some people call ‘publicly available’, but not really ‘public domain’. These companies are not always transparent about how they intend to use our data when we sign up for them. So I'd like to hear how you feel about companies that use our data and content for training their systems and tools. And specifically from an ethical perspective, should they be required to get consent from people, and credit and compensate them, if they want to use people's data for training?

Debbie: Yeah, that's a deep, deep question. I think that they should compensate people, give people credit for things, not use their stuff without people's permission. But the problem is that the internet was created to share information. It was not created to secure information, right? And so what these companies are taking advantage of is the fact that there's so much information out there that they can hoover up as much as they possibly can, right?

And it is so funny, 'cause I saw an article where they talked about this. So websites can have a file on their website called robots.txt. It's supposed to be a statement to tell whoever is encountering your website that you don't want your content to be scraped or pulled into engines and stuff like that. I saw one of these, I think it was ChatGPT, OpenAI. When I read their privacy statement not long ago, they said, "Well, we respect robots.txt". Like on a TV show, I was drinking water and I think I just spit it out. Oh my God. I was like, "What?!" I was quoted in an article in law.com, I think, about this. I'm like, robots.txt is like the original fig leaf on the internet, okay? It has no power to stop anyone from doing anything, right? It's almost like a pinky swear. Like, "You promise that you're not going to take my data" or whatever. It's like having a "Beware of dog" sign on your fence without a lock or something, right? So that is the problem with the internet.

And so the other problem is that it's hard for creators and people who don't have money and don't have access to be able to fight these things, right? So you have a website. These people have billions of dollars to spend on lawsuits to make you tired enough so that you won't fight for yourself and it won't be worth it.

So that's a huge problem with the internet, the way that it is now, that it's so open. It was really built on assuming that people had good intentions about that. And that's just not the way things are going at this point.
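
Note: For readers who haven't seen one, robots.txt is just a plain-text file of crawler directives served from a website's root, which is why Debbie calls it a fig leaf: honoring it is entirely voluntary. The sketch below, using Python's standard library, shows how a well-behaved crawler would consult it; the "GPTBot" user agent is the name OpenAI publishes for its crawler, and the site and paths are hypothetical.

```python
# Illustrative sketch: how a *well-behaved* crawler consults robots.txt.
# Nothing here technically prevents a crawler from simply ignoring the file.
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt asking one AI crawler not to scrape anything.
robots_txt = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# A compliant crawler checks before fetching; a non-compliant one just fetches.
print(parser.can_fetch("GPTBot", "https://example.com/my-article"))       # False
print(parser.can_fetch("SomeOtherBot", "https://example.com/my-article"))  # True
```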

Karen: Do you have any thoughts about what you'd like to see changed that could make that better?

Debbie: I do. I'm glad you asked me that. The thing that I would like to see changed is that, first of all, there are technological capabilities for people to store data in decentralized ways. What we're seeing with a lot of these big incumbents is that they want to have you not use those things, right? So for example, your phone is like a computer, a more advanced computer than we had when we started having computers in schools and things like that. And some of that decentralization is possible with things like smartphones and other systems. We're not seeing enough uptick, or enough companies trying to leverage it in that way. Right now it's like, you go to the mothership and you give them all your data, and then you have to hope that they don't get breached, or hope that they don't misuse your data or something like that, right? But you don't know what's happening behind the scenes.

I think in the future there's a possibility to create more decentralized infrastructure where you're only sharing what you need to share to get what you want, right? As opposed to Google or Facebook or whatever, having all this information about you. Like if you want to search a service or a product, instead of you having to share everything, you only share what you need, in that moment, for what you want. And then you can revoke that access if you want, right? So that's what I would love to see in the future. And so the technology is there to start down that path. But it really is going to take some really forward-thinking organizations or companies that feel like it's a benefit to do that.

Karen: You probably have some thoughts about the feature that Meta announced recently of what they call 'cloud processing'. Basically, "Here, let us look at every picture in your camera roll and we'll pick some out and we'll offer you some new features for it."

Debbie: Yeah, all that, what Meta is doing and what a lot of companies are doing, it is all under the banner of personalization, okay? When you see personalization, you should think, "privacy issue, red flag, red flag, red flag", right? So they're like, "If you give us more information about you, then we're going to try to give you at least something back for what we take", right? The idea is, if you give more information, they can monetize it or use it however they want, but then they're going to try to give you something back in return. It's a very asymmetrical relationship. So you give more, they get more, you get less, right? And so I would say, look at personalization, stuff like that, with a very critical eye. 'Cause really what they're looking for is to take more information than you would have normally given them, for something that may not even be worth it to you.

Karen: As someone who has used AI-based tools, do you feel like the tool providers are transparent about sharing where they got the data that they used and whether the original creators consented to it being used?

Debbie: I don't think so. I don't think so. If you go on ChatGPT, you ask for an image, you have no idea where it's coming from. Or even if you tried to look, you wouldn't know how to find out how they're doing it.

And part of that is because a lot of copyright and things like that don't necessarily cover derivative works, right? So if it's different enough, it may be something that could in a court of law not be considered duplicative of what that person is doing. But that line is shifting more and more every day, where these technologies are very capable of maybe taking something, maybe an original work, and changing it enough so that legally it isn't considered like a copy, or something like that. That's a huge problem.

There are companies that are trying to create situations where at least if you use an image from AI or something, it'll at least say, "Hey, this is AI-generated", or "This image exists in X repository". I think Adobe is trying to do something like that as well. I guess the problem with that is that a lot of the way that they're trying to do it relies on you using certain types of tools, right?

Let's say you use LinkedIn and you put in an image that's from OpenAI. And they may say, "Okay, this was an AI-generated image". But let's say you take that same image and put it somewhere else, and that tool may not recognize or may not have that feature. And so I think in the future what's going to have to happen is that there’s going to have to probably be an increase in the amount of metadata that's in these images or in these documents, so that when they move from place to place, these systems can read them in terms of, "Is this copyright protected? Where should it go?" Almost like what the music industry has done with music files. I think that's going to eventually have to be implemented with some AI tools.
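
Note: The metadata idea Debbie describes, provenance information that travels with a file the way rights data travels with music files, is the goal of efforts like Adobe's Content Credentials (the C2PA standard). As a much simpler illustration of the concept, here is a sketch that writes and reads a custom text chunk in a PNG using the Pillow library; the key names are invented for the example, and in practice this kind of metadata is often stripped when platforms re-encode images, which is part of the problem she points out.

```python
# Illustrative sketch only: embedding and reading a provenance note in a PNG
# with Pillow. The key names are invented; real provenance efforts such as
# C2PA / Content Credentials use cryptographically signed manifests instead.
from PIL import Image
from PIL.PngImagePlugin import PngInfo

# Write: attach provenance text chunks when saving the image.
img = Image.new("RGB", (100, 100), "white")
meta = PngInfo()
meta.add_text("ai_generated", "true")
meta.add_text("source_tool", "example-image-model")
img.save("tagged.png", pnginfo=meta)

# Read: a downstream system could check the tags before reusing the image,
# assuming the metadata survived any re-encoding along the way.
loaded = Image.open("tagged.png")
print(loaded.text.get("ai_generated"))  # "true"
```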

Karen: Yeah, the music industry is interesting. One is that they're obviously involved in some of those lawsuits. They have a better tradition of protecting the rights of musicians and performers and songwriters. So it's interesting to see what they're doing, and I'm sort of hoping that they will break some ground that sets a precedent that can be extended to other types of creations. There's a lot going on in that arena right now. And even with images, I think it's Disney and Universal that are now suing Midjourney for misuse of their content and their images. But like you said earlier, you have to have a whole lot of money and a lot of resources to be able to protect yourself. And the small creators simply don't have that.

Debbie: Yeah, it's true. It's a sad state of affairs. Actually, what is happening now is that these big companies like the New York Times, they're trying to put a lot more of their content behind paywalls, at least. So it creates a situation where, if a company wants more of their data, they have to at least pay a subscription or something.

Paywalls may not be a benefit to a smaller creator. But the fact that that's what these big companies are trying to do more and more is just really interesting, in terms of, they're basically doing all that they can do to protect the data or the assets that they have.

And so in the future, when I've talked about decentralization, I hope that even small creators can do that. Maybe not with a paywall, but some type of system where they can share only what they choose to share, with whom they choose to share it.

Karen: So as consumers or members of the public, there are times where our personal data or content has been used by AI-based tools or systems and it's really outside of our control. Some examples might be online test taking tools or TSA and biometrics and photo screening at airports, or just, there are cameras in the streets everywhere, right? And so our data is being captured and used without our consent in a lot of cases. I'm wondering what your thoughts are about that. And then if you know any times when your own data maybe has been used without your consent and you've become aware of it.

Debbie: I guess I'll answer the last question first. I'm not aware of my information being used without my consent. I'd obviously be upset about it if I saw that. But one thing I try to make sure is that I'm very purposeful about what I put out on the internet. And I make sure that when I put out things, they have my logo on it or whatever. So you can't confuse my stuff for other people's stuff 'cause I make sure that that's branded.

And I do recommend to people, even if you don't have a major following or a major thing, if you put out any content, I would highly recommend you at least put a logo on it. Or something that symbolizes you that can be traced back to you. Because that will help if in the future you have a situation where there's an issue about copyright or things like that. I highly recommend it.

There was a case, and there are cases that are happening. Well, first of all, if you are in a public space, say, walking down the street, from a legal perspective, they say you don't have any privacy, right? So the fact that you're walking down a public street, you're captured in some way, you don't have any privacy.

The problem with that now is that it is not just a static camera taking a picture of you that gets forgotten after a day or so. Maybe some of these cameras have facial recognition. Or maybe they're searching your face against a database or something. That's kind of that gray area that is in law, because the laws really lag behind technology, and they haven't really answered what that is.

But an example I'll give of a case I thought was really interesting, and this has to do with publicity law. So publicity laws in the US are state-based, which makes it suck, right? Because based on the state that you're in, the laws may be different. But there was a case where, I think it was a tool that tried to get people's yearbook pictures from school. And so they were advertising, and a guy found his picture in their advertisement. And he was like, “What? It's my picture on this advertisement. I didn't give permission”, right? This is in Ohio. They were fighting it out in court. And basically the company was saying they had a right to use his picture because he was not a notable person. So let's say if he was a famous person, then they probably wouldn't have used his picture. But they were saying, because he was not famous, that he didn't really have any rights.

And so I have to look back and see what's happened with that case. But this is going to be a huge problem, right? Because in olden days, your yearbooks were paper, and they weren't disseminated on websites, but now a lot of those are electronic. They are disseminated on websites. And so because they end up in the public domain, some of these companies feel like they can use that information. Unfortunately these battles happen at a state-by-state level. And most people don't have the money or the time to even file those types of cases. So it's definitely worrying.

Karen: Debbie, these are all really interesting thoughts that you're sharing. And it's great to have your perspective on data and the way that it's being used and not having the right to privacy in public. I'm really curious about that case you mentioned about the yearbook. There's a site called classmates.com that I think has done a lot of that, where they've actually bought yearbooks and scanned them and made them digital even though they weren't that way in the first place. And then they try to sell people the pictures, sell them accounts, sell them ability to message other people who they graduated with. And it's an interesting business model. But yeah, to say that they have the right to use somebody's picture just because they're not famous, that doesn't feel right.

Debbie: No, it doesn't. And that is a challenge, right, in today's world, because how do you define notability? Maybe they're famous in their neighborhood or they've done something locally, maybe not on a national or international stage, but they have some type of notability. So I think the question will be, what is notable in the future, right? Like, if you have 500 Facebook fans in your neighborhood, are you notable? So I wonder what that's going to mean in the future.

Karen: Yeah, I think it's interesting to look at the way that people's data gets used. And there's so much data that's out there that's been collected about us over the years, and these data brokers have scraped it up. On the one hand, it's kind of funny to find information about yourself that is so clearly wrong. But on the other hand, that data can get baked into a model now, a machine learning model. And once it's in there, it's very hard to ever get it out, even if you discover that you've been misrepresented. To be able to actually get that information out of the model is extremely difficult technically right now. And so this whole concept of the right to be forgotten or the right to be able to correct mistakes in your records, machine learning kind of disrupts that.

Debbie: Well, in a big way, right? Because really, data systems are made to remember data, not forget it. So when you're saying "forget it", it's like swimming upstream. And then when you bring AI into these systems, they don't forget, right? Mostly what these companies are doing, if you put in a "right to be forgotten" request, they're not deleting it. They are suppressing it in some way. And actually, if someone is really talented, they can get around those safeguards if they ask the right question in the right way.

So it's definitely concerning, in the future, what happens with people's data, even if they have rights. You're talking about the right to be forgotten. In the US we don't have a right to be forgotten. We have a deletion right, in some regard. Deletion and being forgotten are two different things, and deletion also has many different layers legally. Let's say for instance, you had an account with a company and you canceled your account with them and you say, "I want you to delete my data". So the deletion may mean that they'll delete some of it, but not all of it. So the fact that you were a customer ever, that will never get deleted. Certain other things will probably never be deleted. They may make it so it's less publicly accessible or available. A lot of the laws around deletion in the US are bound by time, so maybe two years. So let's say you had an account with someone for 10 years and then say, "I want you to delete my stuff". Well, maybe they legally only have to delete two years, not everything, whereas the right to be forgotten covers a longer period of time. It's just complicated. Delete doesn't mean delete in terms of what people think of delete. People think delete is “I press a button and it goes away forever”, and that's just not the way it is.

So my advice to people is to be very careful about what you put in the data systems. Do it with purpose, right? And make sure it's a good value for what you're giving.
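
Note: Debbie's distinction between suppression and deletion maps to a familiar data-systems pattern: a "soft delete" flag that hides a record versus actually removing the row. The sketch below uses Python's built-in sqlite3 module with an invented table to illustrate the difference; in the suppression case the data is still there and can resurface if the flag is bypassed, which is the risk she describes.

```python
# Illustrative sketch: "suppression" (soft delete) versus actual deletion.
# The table and columns are invented for the example.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT, suppressed INTEGER DEFAULT 0)")
conn.execute("INSERT INTO users VALUES (1, 'Alice', 0), (2, 'Bob', 0)")

# Suppression: the row is hidden from normal queries, but not gone.
conn.execute("UPDATE users SET suppressed = 1 WHERE id = 1")
print(conn.execute("SELECT name FROM users WHERE suppressed = 0").fetchall())  # [('Bob',)]
print(conn.execute("SELECT name FROM users").fetchall())  # Alice is still there

# Actual deletion: the row is removed from this table (backups aside).
conn.execute("DELETE FROM users WHERE id = 1")
print(conn.execute("SELECT name FROM users").fetchall())  # [('Bob',)]
```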

Karen: Yeah, questions about deletion came up with the ancestry site and the prospect of it being sold, and then someone else having access to all of that genetic data that people gave, thinking they would just find out something about their ancestors, not that it might be sold, and then used by insurance companies, for instance, later on.

So that was quite concerning. I think that woke up a lot of people about, "What do you mean they're going to have my genetic data?" But that's just one example. I guess that is still working its way through.

The interesting thing I think about the genetic data is: my genetic data isn't just mine, because I share genes with other human beings in this world who, even if I give my consent, they didn't give their consent, and yet I've basically given away their data. And so that is, I think, an interesting complication. I'm curious to hear your thoughts about that.

Debbie: Yeah, it's very concerning. For example, when the 23andMe situation happened, they filed bankruptcy. And this is still happening now, 'cause I think people were concerned about what happens with that data. The problem with that, just as you say, is that when you give your DNA, it's not just you, it's your other ancestors and things like that. When people did that, the purpose for which they were giving it was to find out more about their history, right? They were not intending for that stuff to end up in criminal databases, or they didn't think it was going to be sold, right, to another company.

And so one of the big loopholes in privacy is around, "Okay, I have this right because I gave this company my data". If a company is bought, or sold, it could be sold to anyone. And those people may not have the obligation to you that the original person had. So the obligation may not travel with that sale. And that's really the thing that makes people really nervous about these companies, right? Because if they go bankrupt, there may not be any limits to what the buyer may do with that data. 'cause they may not have to uphold those same standards or those obligations.

The only limit to that in the US right now is a law in Illinois called the Biometric Information Privacy Act. And what they've done is, if you are a customer of those companies and you live in Illinois, if they are sold, they cannot sell your data. So they have to delete it. Or if the new company wants it again, they have to request it from you again. So I'm hoping to see more stuff like that. 'Cause to me that makes a lot of sense. When you signed up for ancestry.com or 23andMe, you didn't agree that they were going to sell your data or go bankrupt and give it to somebody else for some other purpose. So I think those things make a lot of sense.

Karen: Yeah, I think 23andMe is the one that I was thinking about, going into bankruptcy and selling the data to a new owner. I think they did have some clauses saying that if they were sold, the buyer would have to observe the same data privacy policies that were there in the signup for 23andMe. So that seems like a good thing. Like they tried to do the right thing.

But then I think the question people always have is, “Can I actually trust that they aren't going to use it, and they aren't going to misuse my data, and that they are going to respect that commitment afterwards?” 'Cause we can't really necessarily tell or prevent them. We don't have visibility into what these companies do with our data. And that's, I think, where a lot of the distrust is coming in. I'd like to hear your thoughts about that.

Debbie: Yeah, I agree. Just 'cause you trusted company A doesn't mean that you're going to naturally trust company B. And then once company B gets it, they can create their own new policies or procedures. Let's say for instance, you were part of 23andMe. Your data’s sold to this new company. You're under the impression that they're going to uphold the same standards. But they can come out with a new document and say, "Hey, we have new terms of service and we want you to sign this", and maybe you're signing away something that you may not understand, right? That's always the problem with stuff like this, and that's always an issue.

And so I get really nervous when people talk about giving biometrics, 'cause you can't change your face, you can't change your DNA. And the wide ranging potential abuse of that data is broad and it is very long term. It's not like you lost your credit card and you get another one or something like that.

So really thinking closely about that, especially biometric data, I think is really important for people. Because you can't get that stuff back once it's out, and then you kind of lose control of that.

Karen: Another aspect that I'm curious to hear your thoughts on is with regard to medical privacy, because you pretty much can't get treatment in the US nowadays without signing your rights over to the medical provider. Or if you go to a hospital, you're signing that they can use data from that. If you get colonoscopy screening or something like that, or what's the Cologuard kit? You're sending them a biological sample. And there's really very little restriction on what they can do with that data once they have your sample. Anywhere that you go in the medical system, if you want to be treated, you have to agree to let them use your data the way that they say that they're going to use it, whether it's for research or for other purposes.

And one of my previous guests commented that when you sign, for instance, with the medical practice, and you say, "Yes, I agree that you can use my data". The data that they find most valuable isn't actually your personal medical history. It's more, “These are my emergency points of contact. If there's a problem with this, you can contact this person, and this is my sister.” And so they're getting information about how people are connected to each other, and that data is actually more valuable in some ways for these companies to use and to sell than the actual medical data. And I thought that was an interesting comment.

Debbie: Yep. That's so true. That's so true. Medical is complicated. And as your guest pointed out, there are a lot of different types of data in medical that are protected in different ways. So I think the thing that people misunderstand about medical data is that they assume that anything health-related is protected by HIPAA, and it isn't, right? So in a patient-provider situation, let's say you get a colonoscopy. You pay for it through insurance or pay for it yourself. Those things are protected by HIPAA. But just like you said, the relationships: this is your emergency contact. This is where you live. This is how much you make. This is where you work. All that stuff. So that stuff isn't protected in the same way.

Especially when you go on websites that are asking for medical information, forms on a website, right? You go to a hospital website, you ask questions or you try to get an appointment. Those things are not necessarily covered by HIPAA, right?

The whole other thing about health: let's say for instance, you get a Fitbit, or you have an Apple Watch that tracks your heart rate and stuff like that. None of those things are covered by HIPAA, right? In terms of what your rights are, you have basic consumer rights, just 'cause you gave those companies your data. But those are much lower rights, and they aren't as stringent as some of the HIPAA patient-provider protections.

Karen: You mentioned Fitbit. That's actually one of the kinds of devices that I've had some concerns about. Because when I got my first Fitbit watch, so many years ago, it was just this little company and I read through their terms. I know you and I read privacy terms, probably most people don't, because they're so dense and impossible to understand! But I did read them and I felt okay with it. And then, like you said: company got acquired. Google bought them. And now Google is trying to say, "You can't keep using your email address for Fitbit. You can't use that anymore. You have to have a Google account." I think it's by January or sometime early next year. And for me, I think that's going to be the last straw. I'm going to have to switch to a different type of watch. And so I started looking at other watches that have better privacy protections because Google has so much data about me already, I really don't think they need my heart rate and sleep data.

Debbie: I agree. I agree. And it's good that you can make those choices. I like to see a lot more alternative choices for people who really want the capability and functionality without giving up your privacy about that stuff. I agree.

Karen: Do you think that we're going to be seeing a backlash or more consumer pressure for people to say that? Once they have a better understanding of what's being done with their data to say, "No, I want better choices." Do you see that starting to happen already?

Debbie: I think unfortunately, a lot of times people aren't aware. What's the best way for me to put it? You can't imagine what companies could possibly do with your data. I think as people see things in the news like 23andMe and stuff like that, it raises that awareness with people about what could possibly happen. And then people are starting to ask more questions and looking for more alternative choices.

So I think, unfortunately, a lot of times it doesn't come to public consciousness until something bad happens. You're like, "Hey, I don't like that. What can I do differently?" So that particular breach or that particular issue has definitely raised that more to public consciousness. And I think people are trying to find ways to limit the data that they give, especially for people or companies they don't trust in the same way with that stuff.

Karen: Yeah, I think we are seeing that there's increased awareness, and going along with that, increased public distrust of these companies, 'cause they're saying "You're doing what with my data?" and becoming more aware of it.

So looking at it from the positive side, what is the most important thing that you think an AI or tech company should do to earn and keep the trust of their customers? And do you have any specific ideas on how they could do that? What would a company need to do to be a preferred provider of services?

Debbie: Yeah. A couple things. One is, to me, table stakes, which is you want to have good security around your data. And so that's probably one of the bigger downfalls of 23andMe, is that they didn't have that level of security that people expected from them. And that created a situation where they lost a lot of value. They lost a lot of trust. They ended up in this bankruptcy and stuff like that.

But then also for companies, they should be transparent about how they're using the data. Let's say for instance, you have a company you decide you want to use. You have API access to ChatGPT or some other type of model, right? You may not know what's happening under the hood of the tool, but you should be transparent about what you're using it for.

I tell companies, a lot of times they're very obsessed with what's in the model. "What's in the model? What's the model doing?" I'm like, "Don't get all twisted in knots about that. What you need to be concerned about is what goes in, what you put in, and what comes out. That's what you're responsible or accountable for." So, just like we were saying, if ChatGPT gave you an answer and it was wrong, and then you used that wrong information to get a wrong insight or did something to harm someone, it is your responsibility. You can't go back and blame ChatGPT. So you are responsible for what goes in, which means you need to know what goes in, and then you're responsible for what comes out. Because you should know what goes out and you need to make sure it's accurate, especially if it creates a situation that can create a harm to someone.

Karen: Yeah, that's an interesting point. I think a lot of people say, "Well, I don't know where that data came from. The companies don't tell us where they got the data that they used, and they gave me this answer, and why can't I trust it?" But I think there's maybe a false assumption that people make that, if the information coming out of the tool is wrong, the tool company is responsible. And that's not the case at all.

In fact, there's only one company that I know of that actually has tried to indemnify their users against any copyright infringement lawsuits, and that's Adobe. They put out this position that their Firefly tool was enterprise-safe. But then it turned out, well, they used Midjourney images in it, and those are not copyright safe. And so they've kind of poisoned the whole well with that. And so, even though they tried to indemnify people, that didn't really solve the problem.

And I think there are some small protections with Microsoft and Google saying "we'll protect your data" or "we'll partially indemnify some corporate customers", but that doesn't help most people. And it only covers things like copyright infringement. It doesn't cover, we've given you bad information, or information that came from an unauthorized source. Or, and this is happening more now, using information that was AI-generated and scraping that and feeding that back in to update the model. And so it's perpetuating things in a way that's not at all constructive towards getting better quality results.

Debbie: It's true. It's true. Yeah. AI doesn't like its own slop, under any circumstances!

Karen: Yeah, it's a hard cycle to break. And I think some of the initiatives you talked about, being able to identify if something was AI-generated – at least then the companies that are looking for more data, more data, more data would be able to know, "Okay, this is data that won't actually help my product improve".

Debbie: Right, exactly. I don't know. It is a very interesting time. I don't ever remember seeing products come out where they're basically like, "use it at your own risk". Like, I've never seen that. When you get in the car, it doesn't say "use it at your own risk". Stuff like that.

So I think people really need to understand that when they're using these tools, they are going to be the one responsible, not these big companies. They're making a lot of money. But you really take the risk. And so you just have to, you know, "Is the risk worth it? Does it make sense? Am I okay with this?" So you have to ask all those questions before you take on these tools.

Karen: Yeah, definitely very good points. Hopefully this conversation will help more people to be aware of the risks and the trade-offs that they have to think about, because it really is on us to know what risks we're taking. I don't know that we're going to get to informed consent anytime soon. I mean true informed consent, where the companies are actually sharing, "Here's what I'm actually going to do with your data, and is this okay?", and giving people true choice and true consent options. Most of them, I think, write the terms and conditions that say basically, "We can use your data for product improvement", just leaving it as wide open as possible. And so it's really hard for people to know, and it's hard to fault people for not knowing. And yet at the same time, we have to try to know, because we are responsible.

Debbie: It's true. They make it as difficult as possible. So I say, they're like, "Take the cotton candy today", but your teeth are going to fall out in six months. That's kind of the deal that we get with these things, right?

Karen: Yeah. Well, what can we as consumers do to combat that? Other than trying to be more aware, what advice can you give people?

Debbie: I would say, try to make sure that the exchange really does bring you value, right? And it's not so asymmetrical. For example, a couple years ago, Amazon was doing this thing where they were like, "If you give us your biometric, your fingerprint or something like that, we'll give you a 10 dollar coupon." I don't think a $10 coupon is an even exchange for that information. So I would say, is it a fair exchange? Even though we know that these companies get more than they give, is it a true value to you for those things? So just think about that in terms of the long term benefit, I would say.

Karen: Yep. That's good advice. Hopefully this will help more people to be aware of it and to be able to take constructive action toward protecting themselves. Any other thoughts that you'd like to share with our audience about AI or data or data privacy or anything happening in that area? Do you have any events coming up that you'd like to tell people about?

Debbie: Oh, so sweet. Let's see, what can I say? I do have some events coming up. They're not public events. I have some keynote addresses I have to do in Europe, one in France and one in Finland in the fall. I am always putting out new content on my LinkedIn. There's always a video or a podcast coming out every week that people can definitely look at. And then I have articles that come out as well, at different cadences. But I always try to make sure there are things people can read and learn about.

For me, I always want to make sure that I give people something practical that they can use, like, right now. So not pie in the sky and all these super 'leather patches on tweed jacket' insights. Those aren't the types of things I like to talk about. But yeah, just those things, and collaborating with smart people like you on some projects that will come to fruition very soon.

Karen: Yeah, it's very exciting to be working with you on those! I won't spill the beans here, but yes, very, very excited to have that coming up with you. So it sounds like you've got a lot of events. What's the best way for people to follow? Is it LinkedIn or through your website?

Debbie: Well, the best way to find me is on LinkedIn, probably. So you type in Data Diva Debbie Reynolds and my name pops right up. Or you can go to my website, DebbieReynoldsConsulting.com, where I have a lot of my videos and writings and things like that that people can look at.

Karen: And your podcast, where can people find that?

Debbie: Oh, I forgot about that. The Data Diva Talks Privacy podcast is on all the major podcast directories. And then also I do a post once a week of each episode that comes out with a link to the podcast and everything.

Karen: Awesome. Yeah, I'm definitely going to be checking out more of your podcast episodes. I've really enjoyed the ones that I've listened to so far, so I'm excited about that. Anything else that you'd like to say?

Debbie: No, just, I'm kind of starstruck hanging out with you here talking. I love your writing and I love the things that you do. You have an incredible mind, so I'm really happy that you're going forward into more publications, as I can say, because you really have a talent for that.

Karen: Wow, I'm so flattered! I thought I was the one being starstruck here on this interview! So happy that we finally connected. Thank you so much, Debbie. Appreciate it. And talk to you again soon!

Debbie: All right. Talk to you soon. Thank you so much.

Interview References and Links

Debbie Reynolds on LinkedIn

Debbie Reynolds Consulting

The Data Diva Talks Privacy podcast



About this interview series and newsletter

This post is part of our AI6P interview series on “AI, Software, and Wetware”. It showcases how real people around the world are using their wetware (brains and human intelligence) with AI-based software tools, or are being affected by AI.

And we’re all being affected by AI nowadays in our daily lives, perhaps more than we realize. For some examples, see the post “But I Don’t Use AI”.

We want to hear from a diverse pool of people worldwide in a variety of roles. (No technical experience with AI is required.) If you’re interested in being a featured interview guest, anonymous or with credit, please check out our guest FAQ and get in touch!

6 'P's in AI Pods (AI6P) is a 100% reader-supported publication. (No ads, no affiliate links, no paywalls on new posts). All new posts are FREE to read and listen to. To automatically receive new AI6P posts and support our work, consider becoming a subscriber (it’s free)!


Series Credits and References

Disclaimer: This content is for informational purposes only and does not and should not be considered professional advice. Information is believed to be current at the time of publication but may become outdated. Please verify details before relying on it.
All content, downloads, and services provided through 6 'P's in AI Pods (AI6P) publication are subject to the Publisher Terms available here. By using this content you agree to the Publisher Terms.

Audio Sound Effect from Pixabay

Microphone photo by Michal Czyz on Unsplash (contact Michal Czyz on LinkedIn)

Credit to CIPRI (Cultural Intellectual Property Rights Initiative®) for their “3Cs' Rule: Consent. Credit. Compensation©.”

Credit to the creator of the “Created With Human Intelligence” badge we use to reflect our commitment that content in these interviews will be human-created.

If you enjoyed this interview, my guest and I would love to have your support via a heart, share, restack, or Note! (One-time tips or voluntary donations via paid subscription are always welcome and appreciated, too 😊)

