AISW #007: Angeline Corvaglia, Italy-based Data Girl and Friends founder 🗣️ (AI, Software, & Wetware interview)
An interview with Italy-based 'Data Girl and Friends' founder Angeline Corvaglia on her stories of using AI and how she feels about how AI companies are using our data and content (audio; 16:30)
Introduction - Angeline Corvaglia interview
This post is part of our 6P interview series on “AI, Software, and Wetware”. Our guests share their experiences with using AI, and how they feel about AI using their data and content.
This interview is available in text and as an audio recording (embedded here in the post, and in our 6P external podcasts). Use these links to listen: Apple Podcasts, Spotify, Pocket Casts, Overcast.fm, YouTube Podcast, or YouTube Music.
Note: In this article series, “AI” means artificial intelligence and it spans classical statistical methods, data analytics, machine learning, generative AI, and other non-generative AI. See this Glossary for reference.
Interview
This is an interview with Data Girl And Friends founder Angeline Corvaglia. Angeline, thank you so much for joining me today! Please tell us about yourself, who you are, and what you do.
Thank you so much for having me. I’m Angeline, and I am the founder of Data Girl and Friends. It’s a collection of videos, activities, and stories to help children and youth be safer and thrive in the AI-powered digital world. The mission is to empower them with improved awareness and critical thinking. Importantly, it is done in a fun, engaging, and informative way.
I really admire that mission!
Thank you.
What is your experience with AI, or machine learning, or analytics? Have you used it professionally or personally, or studied the technology?
Before I started with Data Girl and Friends, I was a CFO for 10 years in a financial institution. Among the areas that I was responsible for was the data office. At the time, it was kind of too early to even consider incorporating AI, but I was heavily focused on data management and analytics.
In my next career step, I worked for a software provider that also offered AI-powered solutions to some customers, mainly as a part of the analytics tool they implemented. But AI was something that only the tech experts managed, and the non-tech experts like me didn’t have anything to do with it.
After I started working on my own, which was relatively shortly after OpenAI created a storm with the public release of ChatGPT, I started to get concerned about the fact that AI is an integral part of people’s lives, yet they usually don’t have any knowledge about it.
For this reason, I developed a very strong interest in learning about AI and spreading that knowledge to other non-tech experts like me, in a language that they can understand. I have been especially focused on identifying and working to counterbalance all of the unintended consequences of this huge explosion of AI for the masses.
I’m always looking for resources and research to help me learn about AI and its impact on society. As I said, I have a significant background in data management and analytics, but the AI part is self-taught with the support of tech experts who stand behind my mission.
That’s great! Could you share a specific story on how you have used AI tools? And what are your thoughts on how well the AI features of those tools worked for you, or didn’t?
Well, I use common chatbots like Gemini, Copilot, and ChatGPT to help me with brainstorming and fact-checking. They obviously have their limits, especially when I want to chat about novel ideas or new ways of seeing things. They get lost easily in those cases.
That said, I try not to use too many different tools. Not only because of the environmental impact of using tools unnecessarily, but also because I believe that all the AI tools are out there to collect our data. So I try not to spread mine across too many of them.
I went through a period where I used it to create images as well. But now I avoid that because of ethical considerations and, as I said, I feel it’s unnecessary.
Yeah, I understand not using the image generation tools. Do you have examples of other AI tools that you’re using, and why?
I use specialized tools for things like grammar, style checking of my writing, creation of character voices (otherwise all of my characters would have the same voice), and character mouth and head movements with Adobe Character Animator.
Finally, I sometimes use it to search for information, when traditional search engines produce results that are useless because of their heavy reliance on SEO.
We have Amazon Alexa at home, but it’s actually muted now, as I’m very aware of the data privacy issues and I don’t want it listening. So it’s just sitting there most of the time because we’re too lazy to unmute it.
Yeah, I’m with you on that. We were actually gifted an Alexa some years ago. And we plugged it in, and we looked into the privacy policies, and said, nope, we’re not using that. And it got put away.
Exactly! When it’s on, my daughter really likes it. She likes to be independent and pick songs to listen to. But other than that, it just sits there. My husband complains about not being able to use it, but he’s also gotten used to it. Privacy is more important.
Yep. A common and growing concern nowadays is where these AI systems get the data and the content that they train on. And you’re alluding to that with this discussion about privacy.
These companies often use data that users put into online systems or publish online. And they’re not always transparent about how they intend to use our data when we sign up.
How do you feel about companies using data and content for training their AI systems and tools? Should ethical companies get consent from people (and compensate them) when they want to use that data for training?
I think they should. I mean, I understand that a lot of people say, well, they can’t build their tools without just taking all of the data that’s out there, basically any data they can get their hands on. But I do believe that companies should get consent from the people whose data they want to use.
An extreme example is Meta. I am so unhappy that they decided to use Facebook and Instagram data from 2007 onwards to train their AI model. And not only that, they make it impossible for many users to opt out. I can, because I have GDPR on my side. I just had to write a sentence to the tune of “I have GDPR,” and I was opted out. But I know others in the US weren’t able to opt out.
And the problem is especially acute on Meta platforms, if you think about “sharenting” parents who have basically chronicled their children’s whole lives. If you calculate how many users those tools have, and what percentage are parents, that’s hundreds of millions of children whose lives are chronicled and fed into this AI system, and they couldn’t consent. So I feel very strongly that they shouldn’t do that.
And obviously for writers, artists, and actors - people who rely on their content for a living - I think that AI companies should get their consent and also compensate them.
That’s a great observation about sharenting and how posts and photos about minors are being shared without their consent. And now that information is out of their hands for good.
Yeah. I mean, obviously AI companies say they haven’t trained their models on this stuff before, but it’s kind of hard to believe. And there’s not much we can do about it. You never know what’s going to happen with this data in the future. So it’s very unfortunate, and I think it’s extremely unethical.
Yeah, the Meta situation, I found out about it the same way that you did, not from Meta. As a resident of Europe, you’re lucky that you have that protection. I actually ended up deleting my Facebook account and my content before the deadline, because they responded to my opt-out request saying that they didn’t need to honor it. I really had no other way to prevent that information from being used for Meta’s AI models, except to delete the content. And there’s still no real way to ensure they haven’t already used it, or that they actually deleted it in the back end.
Yeah, exactly. It’s hard to trust, unfortunately.
So as a member of the public, there are probably cases where your personal data or content may have been used, or has been used, by an AI-based tool or system. Do you know of any specific cases that you could share?
Well, generally I expect that anything I upload online, especially my professional videos and learning materials, will be used to train an AI tool. Because I’ve read enough articles about companies like Google thinking that anything that goes into their search engine is somehow content that they can use as they like.
I’m not saying I agree with this practice, but based on some things that they have said, I have the impression that large tech companies, like Google and OpenAI, think that any data they can get access to is “public domain”.
So until there is regulation to stop them, I just believe they’re doing it. Especially considering that the data needed to keep improving the models is such a scarce commodity, and public pressure isn’t enough to realistically expect them to do otherwise.
Yeah, you’re probably right that they ARE doing it. The latest uproar is the story that broke this week on how NVIDIA used scraped content to train their AI models. Fair use concepts vary quite a bit worldwide, and that makes enforcement even more challenging. Public awareness and pressure do seem to be building.
Yeah, I’m really happy about that. It’s a really good sign. Sometimes it feels like the tech companies are all-powerful, but public pressure does work. They obviously know tech much better than the rest of us, so most people are at an incredible disadvantage. But I do believe that public pressure will continue to increase, especially once people are more aware of what it means to share their data. That’s one of the reasons I do what I do: empower people and help grow the grassroots movement to take back our data and our lives as much as possible.
That’s great. Has a company’s use of your personal data and content created any specific issues for you, such as privacy breaches or phishing?
I haven’t had any specific issues. One of the things that I’m really concerned about is these companies that do DNA analysis, like 23andMe and Ancestry. The information is being collected by largely unregulated private companies. I know a lot of people who do that, and so much can go wrong for individuals, especially in the US. I was just in the US talking about it with my family. We were sitting around the table with three generations. Some doctors were there as well, and they were saying, imagine if this information gets to the health insurance companies. That is obviously forbidden, but apparently the companies can share it with life insurance companies and drug manufacturers in the US. All sorts of things could go wrong with that.
Yeah, that’s a great point. And going back to the issue of consent, if someone gives their DNA sample to one of those companies, it can impact many of that person’s relatives, even though the relatives never gave their DNA sample or consent. And there’s a lot of potential value in medical research to understand how our genetic profiles affect our health. But there’s also very high potential for harm, in insurance (as you point out), or there are other ways that bias can slip in.
Well, I’ve read a lot of articles about the police catching serial killers from 20-30 years ago because of this kind of information. On the one hand, this can be seen as positive. But on the other hand, it’s a very slippery slope when the police are able to find criminals based on information that someone else has put out there on platforms like Ancestry or 23andMe.
Yeah, we’re seeing a lot lately that public distrust of AI and tech companies has been growing, and maybe that’s related to the increasing awareness that you and others are trying to create.
What do you think is THE most important thing that AI companies need to do to earn and keep your trust? And do you have specific ideas on how they can do that?
Well, I think that’s a really hard one. I had to think about that for a long time when I was preparing for this podcast.
The problem is that this mistrust stems from people only slowly understanding what big tech has been doing for such a long time, such as collecting and using their personal data, and how it shapes society in the companies’ favor. I deal a lot with parents and children, and I see all of the mental health issues that teenagers and children are getting into as a result of how their data is used to create feeds solely focused on increasing engagement. There is often a lack of ethical consideration for whether the children are even ready for what they are being shown. I mean things like info bubbles, the spread of misinformation, and interference in elections.
If someone is on social media for 5 hours a day and is constantly being fed misinformation, by definition this influences elections, and that’s a problem.
Another issue is the empowerment of people who are looking to harm children. Before the internet, before social media, pedophiles and people who want to manipulate children couldn’t easily access them the way they can now.
So that’s kind of the starting point. People are starting to realize this, so it’s going to be more and more difficult for tech and AI companies to earn trust. But obviously there are some things that could help to build trust, like:
Slow down the AI race. I know this is kind of a dream, but hope dies last.
Invest more in sustainable AI before pushing to grow engagement and model sizes. There are ways for AI to use less water and electricity, but the technology lags behind. AI and tech companies should invest in it first, not as an afterthought. That would really help.
Accept regulation. I mean really accept external regulation, not just say you’re going to.
Pay the owners of the data they’re using to train the models. I know it’s a very difficult concept, but it should be a given that we need to try to find a solution for it.
Finally: stop saying they’re committed to helping groups like children when it’s easy to prove that their actions are not sufficient. For example, Meta. I know I keep bashing Meta today, but they were so proud of an initiative against sextortion of teenage boys. They announced that they had a solution that could check whether images being sent contain nudity. And the solution was: if their AI sees that there’s nudity, they put up a popup asking the teen, “Are you sure that you want to do this?” And the teenager can just push Yes or No. Obviously, we were all teenagers at some point; of course you’re going to push Yes. And then it’s done. So much can go wrong! That’s claiming to protect without really protecting. If you have the AI to recognize that nudity is there, and it’s illegal for a minor to be sharing or receiving nudity, then you can just block it! I think this would be a big step towards trust.
That makes a lot of sense. Thank you for sharing all of those suggestions - those are all really good points.
Is there anything else that you’d like to share with our audience?
Well, as I mentioned, I am on a mission to empower the next generation with AI awareness and online safety and privacy. To do this, I’ve got programs that work with parent groups, schools, and other organizations that are on a similar mission. So I would love to be contacted by anyone interested in working together to make a difference, because there is so much to do and the resources are scarce. By working together, we can make a difference, especially working directly with children and youth to help them help themselves. The digital generation just knows the online world better than older people like me ever will, because they’re growing up with it. But they don’t know the realities of life as well as someone who’s lived longer. So I want to build a collaboration between the generations on this.
That sounds like a great plan. Angeline, what’s the best way for people to contact you if they are interested in working with you on this mission?
The easiest way is LinkedIn. Just my name, Angeline Corvaglia, or Data Girl and Friends. Or you can go to the Data Girl and Friends website, which is Data Girl And Friends with a dash between the words, dot com.
All right! Well, that’s a wrap! Angeline, thank you so much for joining our interview series today. It’s been really great learning about what you’re doing with artificial intelligence, and why you still use your human intelligence for some things!
And best of luck in your mission to empower our next generation with AI awareness!
Thank you so much.
My pleasure - thank you 😊.
Conclusion
Folks, I highly recommend checking out Data Girl and Friends. Angeline’s videos and educational resources are FREE, and they will help both you and the kids you care about to protect themselves online. It’s data dash girl dash and dash friends dot com.
About this interview series and newsletter
This post is part of our 2024 interview series on “AI, Software, and Wetware”. It showcases how real people around the world are using their wetware (brains) with AI-based software tools or being affected by AI.
We want to hear from a diverse pool of people worldwide in a variety of roles. If you’re interested in being featured as an interview guest (anonymous or with credit), please get in touch!
6 'P's in AI Pods is a 100% reader-supported publication. All new posts are FREE to read (and listen to). To automatically receive new 6P posts and support our work, consider becoming a subscriber! (If you like, you can subscribe to only People, or to any other sections of interest. Here’s how to manage sections.)