AISW #005: Ralf Gitzel, Germany-based industrial researcher 🗣️ (AI, Software, & Wetware interview)
An interview with German industrial researcher Ralf Gitzel on his stories of using AI and how he feels about how AI is using his data and content (audio; 17:27)
Introduction
Today I’m delighted to welcome Ralf Gitzel as our next guest in this 6P interview series on AI, Software, & Wetware! Ralf is a highly experienced industrial software researcher based in Germany, with 18 years at ABB Corporate Research. Today he’s sharing with us his experiences with using AI and with AI using his data and content.
This interview is available in text and as an audio recording (embedded here in the post, and in our 6P external podcasts). Use these links to listen: Apple Podcasts, Spotify, Pocket Casts, Overcast.fm, YouTube Podcast, or YouTube Music.
Note: In this article series, “AI” means artificial intelligence and spans classical statistical methods, data analytics, machine learning, generative AI, and other non-generative AI. See this Glossary for reference.
Interview
Ralf, thank you so much for joining me today! Please tell us about yourself, who you are, and what you do.
Hi, I’m Ralf Gitzel. I’m a principal scientist working for ABB, a company that’s into automation and also power. As a researcher, I work on AI topics at the moment. I’m also quite interested in generative AI. But I should make clear that at the moment, I’m speaking as a private person, so this is my personal opinion, and not necessarily the position of ABB.
Fair enough! Thank you.
Tell us about your experience with AI, ML, and analytics. Have you used it professionally or personally, studied the technology to some extent, built tools using the technology, etc.?
I’ve worked with what I would call classical machine learning. So not generative methods, but more in the classification and maybe regression direction. And I’ve worked with that mainly professionally. I mean, it’s no secret - I’ve worked a lot on condition monitoring for switchgear. And I think I spent more than my fair share of time looking at infrared images of the interior of a switchgear, trying to find a good machine learning algorithm to detect hotspots in them. You reach the point where you start dreaming about infrared images of switchgear, which is when you know, okay, we have to finish this algorithm. (laughs)
And on the private side, I’m a user of AI, but I don’t really develop things. But I think I have good insight into how these things work.
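Note: For readers curious what this kind of classical (non-generative) machine learning can look like in practice, here’s a minimal, purely illustrative sketch - not Ralf’s or ABB’s actual method; the data, features, and values below are all hypothetical. It classifies small temperature patches from an infrared image as “hotspot” vs. “normal” using a few hand-picked features and a random forest.

```python
# Purely illustrative sketch - NOT the actual ABB condition-monitoring code.
# Classical ML: classify patches of an infrared image as hotspot vs. normal
# from simple hand-picked temperature features (all values hypothetical).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def patch_features(patch: np.ndarray) -> list:
    """Simple features for one temperature patch (degrees Celsius)."""
    return [patch.max(), patch.mean(), patch.std(),
            patch.max() - patch.mean()]  # peak contrast vs. surroundings

# Hypothetical labeled training data: 8x8 patches cut from infrared frames.
rng = np.random.default_rng(0)
normal = [rng.normal(40, 2, (8, 8)) for _ in range(200)]  # ~40 C, healthy
hot = [rng.normal(40, 2, (8, 8)) + rng.uniform(15, 40) for _ in range(200)]

X = np.array([patch_features(p) for p in normal + hot])
y = np.array([0] * len(normal) + [1] * len(hot))  # 1 = hotspot

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
test_patch = rng.normal(40, 2, (8, 8)) + 25  # a clearly elevated patch
print(clf.predict([patch_features(test_patch)]))  # expected: [1]
```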
Can you share a specific story on how you have used AI/ML? You talked a little bit about your experience with switchgear.
What are your thoughts on how well the AI features worked for you, or didn’t work? What went well and what didn’t go so well?
What I like about AI - and I think that’s the strong point - is that if you have a rough understanding of how things work, but you don’t really know the exact parameters, the exact thresholds for things, then machine learning is really great. Because it’s a statistical approach, you can just say, okay, that’s what happened before, so we think it’s going to happen in the future. There is the caveat that correlation and causation are not necessarily the same. But if you are in a more or less well-understood field, you can say, okay, in this case the correlation is also a causal relationship. And that has helped us quite a bit. I’ve worked a lot with people in the sensors field who have a classical signal processing background. And it’s really interesting how, with machine learning, you can sometimes do things where you would otherwise spend a lot of time trying to manually find the right parameters to get a good signal out of your input. So I think that’s the strong point of AI.
On the other hand, sometimes you find that you’ve spent so much time working on your model and you don’t have enough data, so you start to do feature engineering. And I’ve also seen the case where, in the end, you’ve done so much feature engineering that you think, okay, now we’ve solved the problem analytically - we don’t have to do machine learning anymore. So that was also an interesting experience.
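Note: A concrete way to picture “letting machine learning find the parameters you’d otherwise tune by hand”: a depth-1 decision tree is essentially a threshold detector whose threshold is fitted from data. This sketch uses made-up numbers; it also hints at the feature-engineering effect Ralf describes - if one engineered feature separates the classes this cleanly, the “model” collapses into a single analytic rule.

```python
# Sketch: let ML fit the threshold you would otherwise tune manually.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
# One engineered feature per sample, e.g. peak temperature rise (made up).
x_healthy = rng.normal(5.0, 1.5, 300)   # healthy equipment
x_faulty = rng.normal(15.0, 2.0, 300)   # developing fault
X = np.concatenate([x_healthy, x_faulty]).reshape(-1, 1)
y = np.array([0] * 300 + [1] * 300)

# A depth-1 tree (a "decision stump") is just a learned threshold.
stump = DecisionTreeClassifier(max_depth=1).fit(X, y)
print(f"learned threshold: {stump.tree_.threshold[0]:.2f}")
# If a single feature separates the classes this well, you no longer need
# machine learning at all - a fixed analytic rule does the same job.
```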
It sounds like you’ve had some good successes with using AI for some of your work purposes.
Are there any things that you’ve avoided using AI-based tools for, for your work, for some things? Can you share an example of when, and why you chose not to use it?
Yeah, if we see AI as what it’s considered to be today - which is machine learning, and mostly neural network machine learning - sometimes the problem is that you absolutely do not have enough data. You don’t have any chance to get enough data, because it would be too expensive to generate. Then I tend to avoid it.
And the second reason to avoid AI is if you need to understand what is happening. You can’t have a black-box model when it’s really critical that you understand why you get a certain result and you want to be able to audit what’s happening. Then it’s not a good choice, I think.
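Note: To make the auditability point concrete, here’s a minimal sketch (on a standard toy dataset) of an interpretable model whose complete decision logic can be printed and reviewed - exactly what a black-box model does not give you.

```python
# Sketch: an auditable model - every prediction traces to explicit rules.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(data.data, data.target)

# The full decision logic is human-readable - no black box:
print(export_text(tree, feature_names=list(data.feature_names)))
```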
Great, thank you - those were great examples, Ralf.
So, one thing: there’s a common and growing concern nowadays about where AI and machine learning systems get the data and the content they train on. Often they’re using data that users put into online systems or publish online, and they’re not always transparent about how they intend to use our data when we sign up.
So how do you feel about companies using data and content for training their AI/ML systems and tools? Do you think ethical AI tool companies should get consent from (and compensate) people whose data they want to use for training? (Some examples: musicians, artists, writers, actors, software developers who write code, medical patients, students, or social media users)
So I would distinguish three cases.
So the first case is where it’s something that’s useful, but it doesn’t have a negative impact on me. I don’t know - for example, I’m at the doctor’s, and that can help to train a model to detect certain diseases. And the data is anonymized, and it’s used to create such a system. Even if they make money from it afterwards, I’m totally fine. That’s okay for me.
The second case is where they use my data to do something to trick me into doing certain things that are not to my benefit. And in that case, of course, I’m against it.
But I think the worst case is if data is taken to generate new data to sell instead of the data that maybe I would like to sell. So this is the case you were referring to, where there are musicians or painters or other artists, and their data is used to create a system to replace them. I think that is not a very good approach.
Yeah, that makes sense.
So you’ve worked with building AI-based systems. To the extent that you can, and if you can’t share anything that’s fine, but can you share anything about where the data came from for that system, and how it was obtained?
At work, I’ve been working on two big projects that are publicly funded, so it’s also no secret. There was Fleming, and there was AProSys. And these two are about condition monitoring. We actually did endurance tests in the laboratory, so the data was generated by us. There was, I think, some customer data, but that was really not significant - that was just to test how the system reacts to real data. Everything else was created in the laboratory - quite expensive, I can say. I mean, the equipment is expensive, and it takes months to run these accelerated aging tests. So that’s where our data came from.
Great! So as someone who has used AI-based tools: as a user of the tools, would you know if the tool providers have been transparent about sharing where their data came from? For instance, a lot of people use ChatGPT or Copilot for generating code, and tools like that.
Yeah, I think they are absolutely not transparent. And I mean, I hear things like they deleted certain databases because they were afraid there would be legal consequences if it was found out what was in there. There are cases where people didn’t copyright their creations because they felt they wouldn’t need it. I’m not so deep into copyright law, but I think that’s also associated with a fee and legal overhead, so many people don’t copyright their stuff. I think they never expected it would be used to train such a system. That’s at least ethically dubious, if not legally.
My impression is that a lot of copyrighted material was also used to train these systems. Not because that was revealed, but - you know, when you use an image generator and all of a sudden you see something that looks like a watermark from, I don’t know, one of these big companies that sell images - Pixabay and those companies. When you see that, you get the impression that it was scraped off the net.
But there is really no transparency. And I think there is a good reason why they are not transparent about the data they are using - because they are afraid that there will be legal consequences.
Yeah, there’s a case that I’ve been following in the music area, where it showed up in the recent lawsuits that the music labels filed: one of the tags of a producer for a music clip showed up in an artifact that was generated by their tool. So they obviously scraped that producer’s content without giving them credit or compensating them. Unfortunately, I think that’s more common than we would like it to be.
As a member of the public, there are probably cases where your personal data or content may have been used, or has been used, by an AI-based tool or system. (For instance, there’s been a lot of ruckus about Meta using people’s Facebook and Instagram photos for training their AI.)
Do you know of any cases where your personal data or content has been used?
I don’t know of any. I think that’s a bit part of the problem - there isn’t the transparency. On the other hand, I’m not using WhatsApp, and I’m not on Facebook. As you know, I’m on LinkedIn. Sometimes I wonder what happens with that data. There’s no good transparency.
Yeah, LinkedIn is an interesting example, because there’s the information that we post for the people to see, and they also have the information about our job history and what we’ve done and our contact information. So they have all of that information.
Do you know of any cases where a company’s use of your personal data and content created any issues for you, as far as privacy or phishing or anything like that?
I don’t really know of any particular case. But that’s not so much about AI; it’s about the use of data.
Sometimes it’s a bit spooky, you know. Going back to the LinkedIn example: when you like a post written by someone you haven’t talked to for a while, all of a sudden you get more of his or her posts. And then you really wonder how this pushes you in a certain direction.
It’s the same on YouTube: you’re interested in a certain topic, and you get more and more extreme positions on it. A more harmless example is, you’re interested in vegetarian cooking, and then they give you content about vegan cooking, and then maybe about eating nothing at all, I don’t know!
That is my experience. I read recently, and I tend to agree, that the problem is not so much AI. The problem is that algorithms are using our data to make decisions. That could be AI, but it could be a simple rule-based thing. And there’s an emergent behavior on the internet that’s not going in a good direction. It’s not like there’s an evil mastermind pushing us in one direction; it’s more an emergent behavior that comes about and leads to really strange effects. There’s polarization, and you go into bubbles, and whatnot.
I’ve been seeing that public distrust of AI companies - but also of companies that use data in general - has been growing. What do you think is THE most important thing that these companies would need to do to earn and keep your trust? Do you have specific ideas on how they can do that?
That’s a good question. I mean, there is always a limit to what you can do to instill trust. You can be more open about what you do. But then the question is, are you as the company really telling the truth? Or is this just a PR campaign to make you look good?
So the next step is to have open source. Then the question is, are you really open sourcing everything? And in some cases, is it really a good idea to open source everything? Because that also means that some technologies get into hands where maybe you don’t want them. I guess in principle I’m pro open source, but I also see that there can be some issues with it. So it’s very hard for a company to come up with a good line.
But I think a good first step would be to give you the feeling that they are not manipulating you. There are things that are quite obvious: you get certain advertisements and you realize that’s because you visited that website and looked at books to learn a certain language, and all of a sudden you are flooded with stuff related to that country. And then you really get the impression: my data is being abused.
They have to move away from these obvious abuses. But that’s not enough, and I think it will be hard. I mean, once trust has been lost, you know, it’s very hard to re-establish. And to be honest, I don’t know what they could do. Maybe if there were a longer time where I have the feeling that everything is going okay, that would establish more trust.
Very fair - thank you, Ralf.
Is there anything else that you’d like to share with our audience today?
Yeah, I mean, when you invited me to this interview, I thought, okay, how do I feel about generative AI? And that’s really a topic that I read a lot of posts about, and also comment on posts about. It’s really fascinating, because on the one hand, this is something that really appeals to me, both on the technology side and on the side of using it. But on the other hand, I can see bad consequences coming from it in many, many directions - being bad for content creators, and also being used for fake news and stuff.
I think that is maybe also a big challenge for us in this context: we have to see how we can consolidate our different views. Even as a single person, I see different aspects of it. And how can that be put to good use?
And what I would like to see is generative AI designed more to help content creators. And I’m not just talking about typing some stuff and getting some random image to slap onto a post or something, but really helping people to create things in a creative way.
Maybe let’s compare it to a word processor, like Microsoft Word. It’s really nice that I can type a text and I don’t have to photocopy it, or handwrite it, but I can print it or I can send it electronically. I can send it to as many people as I like. That is really something that would help me to perpetuate my ideas and give them to different people. That’s a useful tool.
But if some tool writes a text for me, then I’m no longer creating content. I’m just spreading content that’s not mine. And if we can find a good way to have a tool that is more like a word processor, maybe using AI, I think that would be a nice tool.
On the other hand, we have to make sure that we are not producing a lot of CO2 just doing stuff that we don’t really need to do. This is a really complex field that I think about a lot, actually.
It definitely is complex! Thank you so much for sharing all of that today, Ralf. I appreciate you joining our interview series and for sharing what you’re doing with AI, and how you’re using your human intelligence for things.
Thank you, Karen - thanks for having me. And I really appreciate that you are also shedding a bit of light into this topic, with this series of interviews.
Conclusion
That’s a wrap! I hope you all enjoyed this interview as much as I did. Let me know what you think!
About this interview series and newsletter
This post is part of our 2024 interview series on “AI, Software, and Wetware”. It showcases how real people around the world are using their wetware (brains and human intelligence) with AI-based software tools, or being affected by AI.
We want to hear from a diverse pool of people worldwide in a variety of roles. If you’re interested in being an interview guest (anonymous or with credit), please get in touch!
6 'P's in AI Pods is a 100% reader-supported publication. All new posts are FREE to read. To automatically receive new 6P posts and support our work, consider becoming a subscriber! (If you like, you can subscribe to only People, or to any other sections of interest. Here’s how to manage sections.)
Credits
Ralf Gitzel on LinkedIn
Audio Sound Effect from Pixabay