An interview with Ireland-based data protection and privacy expert Carey Lening on her stories of using AI and how she feels about how AI is using people's data and content (audio; 54:20)
Interesting to read your perspectives about the use of copyright-protected materials. Not getting why companies can pay lawyers, data scientists, rent, etc. but they can't pay content creators? Why is that so impossible? I think it's odd that when educators use my work I get a check, but when billionaires use my work I get nothing.
You noted, "Or they're working with actual public domain or Creative Commons materials." However, CC folks are now realizing that the licenses aren't adequate. CC points out the need for an addition to the current license structure because:
- The use of openly available content within generative AI models may not necessarily be consistent with creators’ intention in openly sharing, especially when that sharing took place before the public launch and proliferation of generative AI.
- With generative AI, a handful of powerful commercial players concentrated in a very small part of the world produce unanticipated uses of creator content on a global scale.
(https://creativecommons.org/2024/08/23/six-insights-on-preference-signals-for-ai-training/)
In other words, people chose open access so those in less fortunate situations could access their work, not so billionaires could chew it up and spit it out with other stolen garbage. SO - best not to advise people to upload articles etc. to AI unless the writer has given express permission.
I think you assume I'm making a moral argument/justification here (i.e., it's expensive/hard/would cut into profits). I'm not. I'm making a very practical argument: namely, how?
Billions of people have posted innumerable amounts of content on the Internet. How do you track all those folks down? How do you assess the value of the content? By volume? By impact? To whom?
Let's say we only focus on licensable content. That's fine, except at that point, the billionaires are paying other millionaires and billionaires -- publishing houses and media companies -- not the individual content creators. Is that any better?
I'm not saying what was done was the right/moral/best thing to do, but it is what was done, and solving that problem post facto is really, really hard! We've already been here before -- search engines/social media sites also harvest and copy content from the internet, and various laws and lawsuits have tried to extract $$ from the likes of Google & Facebook to compensate the authors/publishers. They've all been phenomenally unsuccessful.
It's going to take a significant technical change to solve this problem. Well, that, or shutting it all down.
Hi Janet, thank you for your comment. Since you're quoting Carey's comments about Anthropic's data sourcing from public domain or Creative Commons materials, I'll defer to her to reply :)