Introduction - Karen Spinner

This article features an audio interview with Karen Spinner, a 🇺🇸 USA-based writer and software developer, and the author of Wondering About AI. We discuss:

how she built on her experience as a copywriter to begin building with AI

why she shut down her successful StackDigest aggregator and analysis project (hint: it wasn’t due to its use of AI)

the terrible travel itinerary her AI tool gave her & how her family adjusted during the trip

using an LLM with Reddit to help solve the mystery of her cat’s seizures

her experiments with using AI to generate infographics

why she thinks the big AI and tech companies should be broken up

how LLMs based on Common Crawl are like the tofu of writing

building her new Future Scan prototype and looking for beta testers

and more. Check it out, and let us know what you think!

This post is part of our AI6P interview series on “AI, Software, and Wetware”. Our guests share their experiences with using AI, and how they feel about AI using their data and content.

This interview is available as an audio recording (embedded here in the post, and later in our AI6P external podcasts). This post includes the full, human-edited transcript. (If it doesn’t fit in your email client, click HERE to read the whole post online.)

Note: In this article series, “AI” means artificial intelligence and spans classical statistical methods, data analytics, machine learning, generative AI, and other non-generative AI. See this Glossary and “AI Fundamentals #01: What is Artificial Intelligence?” for reference.

Interview - Karen Spinner

Karen Smiley: I’m delighted to welcome Karen Spinner from the USA as my guest for “AI, Software, and Wetware”. Karen, thank you so much for joining me on this interview! Please tell us about yourself, who you are, and what you do.

Karen Spinner: I am a writer on Substack. I’ve got a newsletter called Wondering About AI, and I use that as a place where I can write about my personal experiences with AI. When I try something out, I can write about if it works. And a lot of times it doesn’t work. And I try to be very transparent as far as the story that I’m telling about AI, which in my view is very mixed. There’s a lot going on there.

Professionally, I’ve been a copywriter for many years. Currently I am also building some AI projects. One of them is a research tool that makes it easier for me to find and interpret papers on AI-related subjects that I’m interested in.

I think that’s enough.

Karen Smiley: Yeah. I have to say, as someone that reads your newsletter, I think these stories that you post about the things that don’t work are the most interesting, because most of our learning comes from there.

Karen Spinner: Yeah. You don’t necessarily want to read about “I built this thing and it went flawlessly.” You can’t learn anything from that.

Karen Smiley: Exactly. Yeah. And I hope we’ll talk more about your latest projects later in the interview. For now, could you share with our audience what your level of experience with AI and machine learning and analytics is, and how you’ve used it professionally or personally, or if you’ve studied the technology?

Karen Spinner: I think I could be characterized as an AI power user. I typically will use Claude Code every day for something. Sometimes I’ll use Gemini and some of the other models. I’ve had kind of an evolution with AI. I started out when I was a copywriter. I researched AI to figure out “Okay, can this thing actually write as well as a human?” And I found, no, maybe not so much. But I found it really interesting. I was learning Python at the time and I realized that AI is really good at writing Python, better than I am. And so I started using it to build a variety of software tools for my own use and it kind of grew from there.

So as I learned more, I took a machine learning class, Google’s crash course for machine learning. And then I took some additional courses on Hugging Face to get an idea of how it operated. In some of my projects, I’ve actually been applying machine learning to various problems. It’s been a really interesting, fun thing to learn. And it’s given me a kind of a perspective when I’m also just interacting with AI models as a user in terms of what may also be going on behind the scenes.

Karen Smiley: Great. Can you tell us a little bit about some of the projects that you’ve worked on?

Karen Spinner: I think the first kind of big project that I worked on is I had this idea. This is the point at which I was writing maybe a hundred different articles a month, and I was getting very burned out. I thought maybe a little bit of AI help would be welcome, you know? And I didn’t like how any of the existing tools really worked as far as copywriting.

So I used AI to help me code, basically, an AI-powered writing platform. It was set up to basically mimic an agency workflow. It would let you upload a style guide so you could spell out exactly, “Okay, this is how we want to write.” You could upload a bunch of writing samples. And then when you wanted to initiate a writing project, you would fill out a big old detailed creative brief, just like you would give a human copywriter.

And so I got that working. And I thought, “Oh, this is pretty cool.” It still didn’t reach a hundred percent of what I wanted, but I was like, “Wow, this is actually better than just dealing with ChatGPT”. And I’d built in an editor, so I was also able to have the AI update specific passages of copy in real time. And I was like, “Okay, this thing is really cool. It’s doing what I want. Maybe I can turn it into a product.”

And so not knowing any better, I decided, “Okay, my first step in this case is I’m going to just run some Google ads for this thing.” And so I did. I had a couple hundred bucks and I’m like, “Okay, we’re going to see if I can get some interest, get some other people using this.” And I discovered that the main market for this tool was kids cheating on their tests. At first I was like, “Oh, this is awesome. People are using this. They’re signing up.” And I’m like, “Oh, no, no. They’re all, like, 17. They’re all kids cheating on their essays.”

And so then I shut off that ad. And I was like, “Okay, well, let me see if there are other professionals like me who might be interested in this kind of tool.” And when I did reach out to folks in my position, they were like, “Oh my God, why would we want to do that? I’m a professional writer. I don’t want to use AI.” And I was like, “Oh, okay.” And I found that that was really the majority opinion when I talked to other writers about AI, that folks really didn’t want anything to do with it.

I still have the project for my own personal use and it still lives on, on my desktop, but I did take it out of production. So it’s no longer a product, it is a project.

Karen Smiley: Thanks for sharing that story. How did you know that they were 17 year olds?

Karen Spinner: Google gives you demographic information, so if you go into Google Analytics, you can get an idea of who’s clicking on your ad and where people are going. And so I was able to see that it was a real young crowd.

Karen Smiley: That’s a shame, because it does seem like there are people out there who would benefit from this. I especially hear about a lot of people who are, say, neurodivergent, and they find that an AI tool actually helps them. It seems like the ads didn’t reach that segment of people, though, who might have found it useful.

Karen Spinner: Yeah, and I think it’s interesting because I built it, and AI time is, like, so fast. So I built this thing maybe like a year and a half ago, and I feel like it’s now totally out of date!

Karen Smiley: Yeah. That can definitely happen! Do you have another project that you’d like to share with our audience?

Karen Spinner: Sure. Well, I think, since this is Substack, I did a second project, something called StackDigest. When I first signed up to Substack, I went kind of cuckoo with subscriptions. I ended up with a couple hundred and my inbox was getting killed! So I shut off the emails, but then I wasn’t actually seeing what any of my subscriptions were doing. I thought it would be kind of cool to build a tool that would allow me to get a quick little summary of what all of my subscriptions have been writing and have that in just one little package digest.

And so I built that tool. It started out as a little Substack note and I got enough interest. I’m like, “Okay, I’ll build this thing.” And I built it, got it into production. I added a lot of features because as people used it, they would come to me and say, “Hey, we really need to add this.” And so I was usually happy to oblige. And I ended up having an analytic tool. It actually looked at the whole Substack universe of newsletters – I mean, not the whole universe, but a nice cross section – and provided information, like, “What percentage of newsletters have X number of subscribers?” And things like that. And it allowed people to map their newsletter in the universe of similar newsletters. And in that I’d actually used some machine learning clustering.

And unfortunately, one of the things with software development is, if you are using information from another platform, you typically interact with that platform through something called an API or an application programming interface. And typically these APIs, they’re usually pretty well documented. There’s usually rules about what you can and cannot do with an API.

Now Substack, they have an API that their internal developers use, but it is not documented officially, and they have no policies or rules on it. And so while people in GitHub have mapped it, and you can map it pretty easily yourself by basically using Google developer tools and taking a look at the endpoints and the URLs, if you use something like this in production, you have two main risks.

One risk is that whoever owns that API will just radically change it and it becomes unusable, and so your software is instantly unusable.

And the second problem is that if there’s no Terms of Service [TOS], it’s really hard legally to figure out what you actually are permitted to do with the information that you’re getting over that API. And when I’d had a lawyer take a look at Substack’s agreement, with just the TOS that we sign when we become Substack publishers, all of my activity with the API sounded like it was going to violate these Terms of Service.

I know there are people who do have production apps that are running off of the Substack API. And I think everybody’s got to make their own personal risk determination, but I’m risk-averse. So I thought it would probably be a good idea to back off having a large scale production app using that API.

Karen Smiley: I was, I think, one of the first people that saw your note and said, “Oh my God, this would be wonderful.” Because I had the same problem. I had subscribed to, I don’t even know how many hundreds it is right now. I don’t get to see them all and I end up missing articles. So having the digest was just amazing. And I was able to use it for our She Writes AI community, which has just been incredible. And it’s been taking off. It’s a great tool. It’s a shame that Substack doesn’t just release that API and make it something that people can use, with guardrails, with rules about how often you can call these within an hour or every minute or whatever. I would hope that they would do that. But since it hasn’t happened yet, I totally understand why you had to shut it down, and fully support that.

You mentioned doing a new tool now, which is using AI to analyze research papers. Can you say a little bit about that?

Karen Spinner: Sure. Well, when I research subjects for my articles, I like to go to some primary technical source material. And sometimes it’s helpful. Sometimes it’s too technical for me. But I found this data source called archive, but it’s spelled with an X in the middle [arxiv.org] and it is a repository of all these recent preprints of papers by AI and machine learning experts. And they come out of different corporate research labs and universities and things like that. And they’re really interesting.

And the other thing I noticed is that that data source is entirely open. That as long as I have their little disclaimer on my website, you can use their API to access all of the information to get abstracts and metadata from all of these papers, and even access the papers themselves as long as you’re not reproducing them.
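For readers curious what that kind of access can look like, here is a minimal Python sketch. It is not the code behind Future Scan. It assumes arXiv's public query endpoint at export.arxiv.org/api/query, which returns results as an Atom XML feed. The function names (build_query_url, parse_feed, fetch_papers) are hypothetical and chosen only for illustration.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Assumption: arXiv's public query endpoint, which answers with an Atom feed.
ARXIV_API = "http://export.arxiv.org/api/query"
ATOM = "{http://www.w3.org/2005/Atom}"  # XML namespace used by Atom feeds


def build_query_url(terms, max_results=10):
    """Build a search URL; the 'all:' prefix searches every metadata field."""
    params = {"search_query": f"all:{terms}", "start": 0, "max_results": max_results}
    return ARXIV_API + "?" + urllib.parse.urlencode(params)


def parse_feed(xml_data):
    """Turn an Atom feed (str or bytes) into a list of paper metadata dicts."""
    root = ET.fromstring(xml_data)
    papers = []
    for entry in root.findall(ATOM + "entry"):
        papers.append({
            "id": entry.findtext(ATOM + "id", default="").strip(),
            # Titles and abstracts are often wrapped across lines; collapse whitespace.
            "title": " ".join(entry.findtext(ATOM + "title", default="").split()),
            "summary": " ".join(entry.findtext(ATOM + "summary", default="").split()),
            "published": entry.findtext(ATOM + "published", default="").strip(),
        })
    return papers


def fetch_papers(terms, max_results=10):
    """Call the API over the network and return parsed metadata."""
    url = build_query_url(terms, max_results)
    with urllib.request.urlopen(url, timeout=30) as response:
        return parse_feed(response.read())
```

Calling fetch_papers("multi-agent collaboration", max_results=5) would return up to five dictionaries with an identifier, title, abstract, and publication date. Only metadata and abstracts are handled here; nothing is reproduced from the papers themselves.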

And so I thought, “Wow, I can take this and apply what I know about machine learning and do things like, Hey, let’s look at what kinds of trends are happening in different aspects of AI research.” So, like, “What are people researching in terms of multi-agent collaboration?” I am able to actually take a look at papers that match that query, and then look for papers that are similar, and build little clusters using those papers that are similar. So you can kind of map that universe of, “Okay, what are papers in this area actually talking about?”
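The interview does not say which similarity or clustering method the tool uses, so the following is only a rough sketch of the general idea. It assumes plain TF-IDF word weights, cosine similarity, and a simple greedy grouping rule; a production tool would more likely rely on learned text embeddings and a library clustering algorithm. All function names and the 0.3 threshold are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic word pieces only."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse {term: weight} vector per document."""
    counts_per_doc = [Counter(tokenize(doc)) for doc in docs]
    n_docs = len(docs)
    doc_freq = Counter()
    for counts in counts_per_doc:
        doc_freq.update(counts.keys())
    vectors = []
    for counts in counts_per_doc:
        # Smoothed inverse document frequency: rarer terms get more weight.
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_similar(index, vectors, top_k=3):
    """Indices of the top_k documents most similar to the one at `index`."""
    scored = [(cosine(vectors[index], v), j) for j, v in enumerate(vectors) if j != index]
    scored.sort(reverse=True)
    return [j for _, j in scored[:top_k]]


def greedy_clusters(vectors, threshold=0.3):
    """Assign each document to the first cluster whose seed it resembles."""
    clusters = []  # each cluster is a list of indices; the first one is its seed
    for i, vector in enumerate(vectors):
        for cluster in clusters:
            if cosine(vector, vectors[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters
```

Given a handful of abstracts, tfidf_vectors builds the vectors, find_similar answers the "is this an outlier or part of a trend?" question for one paper, and greedy_clusters groups the whole set into rough topic clusters.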

And then what I’m doing is using AI to then make some inferences about, “Okay, what does that mean for what’s going to be developed next?” So I have a variety of research trend reports that do that. And I’d added on to that capability a little bit.

So now we’ve got a broad semantic search for recent papers. So you can just put a query in and instead of getting the full analysis, you can just get a regular list of results with little AI overviews, just like Google. And I added an option (no ads) for ‘find similar papers’. So if you find a paper that seems intriguing and you’re like, “Oh, is this a weird outlier or is this part of a trend?”, you can then get a list of papers that are similar to it.

And I have one more thing in the tool that I’m building. I have to restrain myself from building too many features because it’s fun. But this feature actually takes search results from a query, and then it takes the highest matching papers and then extracts data from them. So you get like a little CSV file of data points that live in those papers. There’s a whole process it goes through, in terms of screening papers that score well, to make sure they do have a result section and graphics, things like that.
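As a very rough illustration of that screen-then-extract flow, here is a Python sketch under loud assumptions. The real tool's screening rules and extraction logic are not described in the interview. In this sketch a paper is a dictionary with hypothetical keys "id", "score", and "text"; a "data point" is simply any percentage found in the text; and the 0.5 score cutoff is arbitrary.

```python
import csv
import io
import re


def passes_screen(paper, min_score=0.5):
    """Keep papers that score well and appear to have results plus visuals."""
    text = paper["text"].lower()
    # A line that starts with "results", optionally numbered ("4. Results").
    has_results = re.search(r"^\s*(\d+\.?\s*)?results\b", text, flags=re.MULTILINE) is not None
    # Any numbered figure or table reference counts as a visual.
    has_visuals = re.search(r"\b(figure|fig\.|table)\s*\d+", text) is not None
    return paper["score"] >= min_score and has_results and has_visuals


def extract_data_points(paper):
    """Collect each percentage together with the sentence it appears in."""
    rows = []
    # Split on sentence-ending punctuation followed by whitespace, or on newlines.
    for sentence in re.split(r"(?<=[.!?])\s+|\n+", paper["text"]):
        for value in re.findall(r"\d+(?:\.\d+)?(?=\s*%)", sentence):
            rows.append({
                "paper_id": paper["id"],
                "value_percent": value,
                "context": sentence.strip(),
            })
    return rows


def to_csv(papers, min_score=0.5):
    """Return CSV text holding data points from every paper that passes the screen."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["paper_id", "value_percent", "context"])
    writer.writeheader()
    for paper in papers:
        if passes_screen(paper, min_score):
            writer.writerows(extract_data_points(paper))
    return buffer.getvalue()
```

Keeping the sentence next to each number is a deliberate choice in this sketch: a bare value in a spreadsheet is hard to trust without the context it came from.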

So that one, I’d say I have a minimal version of it online. It’s still in a little bit of development. I probably want to see if anyone other than me wants to use this tool first before I really go crazy with that.

Karen Smiley: Yeah. I’m really interested in what you’re doing with your project. I think it’s called Future Scan, is that right?

Karen Spinner: Mm-hmm.

Karen Smiley: I spent a large chunk of my career in corporate research and working in AI. And so we would do things like this. We would be working with PhD students who needed to do a literature review and find out what was relevant and then keep up with new changes in the field. This type of tool, I think, would’ve been very useful back then. And especially the fact that you’ve got it constrained to arXiv, which anybody can contribute to, but if you know what you’re looking for, like if you’re looking for NeurIPS papers or such, then it’s easy to make sure that you’re getting what should be good quality papers. With no hallucinations about making up papers that don’t actually exist, that seems like it would be really a useful tool. So I’m really excited to see what people might do with that tool. I’m not in that field anymore, but I’m really curious to see how people use it. I would think you would get some interest.

Karen Spinner: I think the biggest challenge is that, with a Substack tool, I’m a Substack writer. I know a bunch of Substack writers. This is easy. For something like this, I’m not a machine learning AI researcher. I don’t know anybody in the field. So I’m reaching out to a few people. I did an article on AI and education and I interviewed a few people who are university professors. So I’m going to start with those folks and then see if I can get referred out to actual AI machine learning researchers, because cold emails are not my forte. And being someone who does not have a research background, I feel like there might be a credibility issue when reaching out to somebody with a tool like this.

Karen Smiley: You mentioned not being good at cold emails. In tech, often they say engineers are terrible writers and they don’t know how to market themselves. And some of this does come down to, I think, marketing. That’s one thing that I’ve noticed more and more, even with my books, is that some people feel like marketing is ‘ew’. But it’s really just an essential skill, is what I am coming to realize.

Karen Spinner: Yeah, no, and I feel like it’s actually more challenging than the coding part. Because it is really kind of relaxing to just sit there and build something and you tinker with it and then it works. That’s cool. And you move on to the next thing. Whereas with marketing, so much of it is a black box. ‘Cause you’re just, you’re trying to reach out to people in a way that’s authentic and respectful, but you also need to get their attention, and there’s a lot of competition out there for attention.

Karen Smiley: That’s definitely true. So it sounds like you’re making really good use of AI for your projects and hopefully products as well.

Karen Spinner: Right.

Karen Smiley: Are there any cases where you have used AI for personal non-software purposes? Like, I think you mentioned one time that you have chickens and dogs. Do you use it for giving you tips on how to take care of them, or recipes, or personal communications, or photography, or any of those types of purposes?

Karen Spinner: Well, there are a couple of things that I used AI for in my personal life. One I say worked out well and one I’d say worked out terribly. I can start with a terrible one first, where I was planning a road trip from Arkansas to Boston, Massachusetts for a family thing and I was like, “Okay, so let me plan an itinerary.” And it gave me an itinerary, and I’m like, “Okay, this is awesome.” I was kind of busy and I just sent it to my husband and I said, “Here you go.”

And then we all got ourselves ready. And then we’re driving, and I’d made hotel reservations where it told me to. And then we’re like, “Okay, they gave us a first day of 15 hours of driving!” And obviously we screwed up by not really scrutinizing it ‘cause we were just really busy, and just hadn’t looked at it carefully. At the last minute, we were just like, “Oh my God, this is just insanity.” So we ended up rerouting everything and changing the various reservations and doing it old-fashioned style. Because I think the AI doesn’t recognize that people can only drive for so long before they fall over.

Karen Smiley: Yeah, definitely. Wow. 15 hours. That would be rough.

Karen Spinner: And then the other thing that I used AI for, which was kind of interesting, is that we have a five-year-old tabby cat and she started having seizures. And she would only have these seizures in her covered litter box. And it was the weirdest thing. We took her to the vet, and they weren’t really sure what was going on. And they gave us some phenobarbital to give her if she had too many seizures. Of course, what is too many seizures? I think one seizure is, right?

So I asked ChatGPT and Reddit. I’m like, “Well, what’s going on with this cat?” And I plugged in her lab work and all the various things that weren’t wrong with her. And what ChatGPT came up with, which was validated by Reddit users, was environmental seizures. That there was something in her environment that was causing her to have seizures. And we looked at the litter box and we thought, with the cover over it, maybe the strong ammonia scent is pushing her over the edge into seizures. And we thought this is the easiest thing we’ll do. We’ll just uncover the litter box and see if she has more seizures. Because she’d had seizures on and off for maybe about a month before I resorted to ChatGPT. And so we literally removed the cover from the litter box, which meant we started cleaning it a few times a day because, you know, it’s stinky. And seizures stopped.

Karen Smiley: Oh, wow.

Karen Spinner: So we were like, wow, that was it.

Karen Smiley: Wow. Yeah. That’s really interesting. So do you have more than one cat using the same box?

Karen Spinner: Yes. Yes, we do. We have two cats. They’re sisters. They were little barn cats and we adopted them.

Karen Smiley: Aw. That was how I got my first two cats many, many years ago. They were sisters too. It was fun. Well, I’m glad you were able to figure that out. That’s a really good story of how it was productive. I’m wondering what tools you choose to use other than ChatGPT. If you have more than one, how do you choose which tools you use for what?

Karen Spinner: Yeah, I feel like I need to cut back on my AI tools. I use Claude pretty exclusively for coding. I do all that through the command line interface. And then I will occasionally use NotebookLM, which is a wonderful free tool. If, let’s say, I’ve got a bunch of research papers, I can pop it into NotebookLM, and it does a pretty good job of grouping them into categories and pulling out little snippets and translating stuff into plain language. I don’t use ChatGPT much anymore. I probably need to just turn that one off. I have whatever the lowest level of subscription is.

And then for graphics, I’ve been experimenting with Nano Banana, the Pro edition, and actually testing it out on infographics, which has been really interesting, both in terms of how far it’s come and also how stubborn some of the limitations are, as well.

Karen Smiley: That’s good. My next usual question is about a specific story on AI and machine learning features and what worked well and what didn’t. Could you say a little more about Nano Banana Pro on how that works well, and what doesn’t work well, especially for the infographics?

Karen Spinner: Yeah, well, it’s funny because AI only very recently has been able to kind of understand text and be able to actually incorporate text into an image. I remember just six months ago, you’d get all these weird garbled symbols and stuff. But I think with Nano Banana Pro right now, it actually does a really decent job of incorporating text into imagery. And I started out using it to build a little avatar and little branded thumbnails for all my Substack articles. And I thought, “Wow, it did such a nice job on this. Maybe it can help me create infographics based on research papers.”

So I played around with that and discovered that, first of all, you can’t just give it a research paper and say, “Write me an infographic.” Can’t do that. It will die. It will think of all these alternatives and you’ll use up the context window and you’ll get a little “Something went wrong” error. I was hoping that would be the most brain dead way to use it. I was like, “All right, so, let me develop some copy.” So I went ahead and developed some proper infographic copy, and then fed it to the AI.

And it has its own style, graphically speaking. It loves a lot of colors and it tries to cram as much copy onto the page as possible. It’s amazing because it is an infographic, but it’s horrifying because it’s ugly as all get out. So I was like, “All right, this is bad. This would be mildly embarrassing. So let me move on.” So then I took a style guide that I developed for my Future Scan product. I use that style guide to teach Claude what it should use in terms of fonts and colors and stuff. So I was like, “All right, let me clean this style guide up and I’ll give Nano Banana the style guide. I’ll give it a copy and see if it does any better.” And it did do much, much better. Very respectable output. At first, I was like, “Oh my God, I’ve discovered it. I don’t have to use Canva ever again.” And then when I took a closer look, unfortunately, I was like, “Oh, wait a minute. If I zoom into this thing, I see that all the color fills, all the little background blobs are blotchy. They have blotches. What is this?” And I was unable to prompt it away. You can’t get rid of it. It’s an artifact of just how it’s doing its thing.

Karen Smiley: Oh, wow. Okay.

Karen Spinner: And so that’s a problem. So then you have to either tell it not to use color fills with text, or you have to go into a Photoshop type tool and manually fix it. And I know some people who are wizards at Photoshop who could do that. I’m not there yet, so my solution is to tell it, “Just don’t use those.” Then you can get a copy and you can fill in yourself. However, then you get some of the issues with the resolution and the text. The text is usually mostly good, but again, if you zoom in on it, or if you have it at any kind of decent resolution, it’s pixelated. You’ll see little places where it’s supposed to be a curve, but it’s a little jaggy thing, or it looks like something’s taken a little bite out of the font someplace. So there’s that aspect of it as well.

And again, for some purposes, if it’s like a social media post and it’s only going to be visible for like six to eight hours, you may not care. But in terms of putting on a branded website or something, I’d say no.

Karen Smiley: Mm-hmm.

Karen Spinner: I’m very torn because I am a terrible designer. So I think in some cases I probably will just say that’s good enough. Put it on LinkedIn! But I think in terms of, would that be best practice? No. And is it really ready to press a button and publish an infographic? I’d say no. If you want to actually do it the right way and have it be a nice job, you’re going to have to commit to spending some time in post-production, depending on your skill level and your preferences. It’s the same thing with AI written copy. You have to edit it. And would you rather spend your time editing, or would you rather spend your time writing? I think it’s probably the same kind of situation with the graphics as well. Would you rather be messing in Photoshop, or would you rather just create the graphic? And for somebody like me who’s not a good designer, I’m like, “Yes, I’ll just mess around in Photoshop. I don’t want to try to create it.”

Karen Smiley: Yeah. I’m curious if you’ve noticed how people respond to the graphics that you’ve generated with Nano Banana or other tools like that, as opposed to the ones that you’ve created yourself?

Karen Spinner: Let’s see. The response to the branded graphics that I have in my thumbnails has been overwhelmingly positive. They all use the same kind of accent color. And Nano Banana Pro was actually able to keep a fairly consistent cartoon avatar. So it looks like pretty much the same little person from thumbnail to thumbnail. I created four or five different reference images that I use for that. So I’ve been really pleased with the thumbnails and will continue to use it unless they change the model in a bad way.

And then on LinkedIn, I have published three images that honestly, from a design standpoint, are fairly atrocious. One of them was just a raw infographic made by the model. And it actually got more engagement than anything that I’ve published recently. I was really surprised. And I don’t get very much engagement on LinkedIn. It’s a platform I struggle with. But I was like, “Oh my God. It got like 30 likes.” And then I tried again and got, not quite as enthusiastic, but fairly similar results. And this was one where I’d applied some branding to it, so I thought it actually looked much sharper and more coherent than the previous one.

And then the final thing is, I’ll try one of the little cartoons that they create. So I did a comic strip and I published that on LinkedIn. And then that one got flagged by the LinkedIn editor, where it was like, “Oh, this is AI-generated content. You have to put this badge on.” I’m like, oh, okay. But I did it anyway.

Karen Smiley: Really? Okay. Yeah, I didn’t know LinkedIn was flagging AI-generated image content.

Karen Spinner: An automated thing where it’ll pop up and it puts this little content mark on it, which unfortunately will usually sit in the middle and in part of your title. If you don’t know where the content mark is going to be, you can’t design around it. But I’d never seen that before. And so I’m like, “Okay, well, I’ll publish this thing anyway.” And then it got really no impressions. So I think there’s a throttling aspect. If LinkedIn detects that it feels like it’s an AI-generated image, it’s going to affect your reach. So don’t use the comic strip format.

Karen Smiley: Wow. Okay. Yeah, that’s a really interesting story. I didn’t know that LinkedIn checked that or flagged it. Perhaps they’re shadow-banning it, I think, is a term some people use for that kind of content?

Karen Spinner: And LinkedIn is an interesting platform ‘cause you can do some automation with it. I think through Buffer, you can schedule posts. And I think there’s a few different tools that have gotten LinkedIn’s blessing to exist. But I’m also aware of a lot of tools that accessed LinkedIn and were sued and were banned, because I guess LinkedIn is really selective about who they allow to use their API and how it works. They’re just an interesting platform.

Karen Smiley: Yeah. So you’ve talked about a lot of ways that you use AI based tools. Are there any examples that you could share about ways that you don’t use AI tools, or situations where you would not use an AI tool?

Karen Spinner: Well, it’s interesting because when I write articles, I generally write the first draft myself. And I think this is just because, with design, I would be like, “Oh no, I’m going to have to design an infographic. It’s going to take me, like, two days.” Writing is something that I’m relatively experienced at. I can be, like, “Okay, I can crank this out in a couple of hours and then use AI to help me fact-check and clean up copy and edit.” So I think how we use AI really depends a lot on what our own expertise is. And at least me personally, I am using it to backfill things that I’m not very expert in.

Karen Smiley: You mentioned you also use it for writing code. Do you also write drafts of code yourself and then use AI to follow up? Or do you use a different approach on code?

Karen Spinner: With code, typically I’ll write my own specifications. So I’ll write out in a pretty detailed fashion, okay, exactly what something is supposed to do. Like, “Okay, the user does this, and then the data goes here, and then this is how it should be stored.” So I get really granular in the specification, but then I usually do have AI write most of the code. And then I continue to learn coding by just taking classes and doing practice exercises and things like that. Because I know enough now that I can look at some AI-generated code and know if it’s screwed up. So I know enough to debug. But in terms of writing first drafts of code, I’m going to be slow.

Karen Smiley: Yeah. Yeah. That’s fair. You mentioned LinkedIn putting a label on automatically when you have used AI to generate something. Do you have anything that you do to identify whether or not you used AI to any extent, or to some degree, on some of your content?

Karen Spinner: Well, it’s interesting because this comes up as a big debate. I’ve heard both sides of it, but I typically will disclose how exactly I’ve used AI. So if it’s an image, I’ll be really clear, like, “Hey, I used Nano Banana Pro for this.” Or if I used Claude to synthesize a paper or to edit my copy, I’ll disclose that in a little blurb at the beginning of my articles. And I kind of like doing this because first, a lot of my audience are writers, and I think those folks, writers in particular, are really sensitive to how AI is being used. It is a whole separate conversation. But it’s really been terrible for freelance writers and for a lot of people in creative fields. So I just want to be sensitive to that and be really clear what I did and did not do with AI, so that if folks are like, “No, I don’t want to consume content that has been touched with AI”, I want to be respectful of that.

Karen Smiley: Mm-hmm.

Karen Spinner: On the other hand, there are a lot of folks who use AI for accessibility purposes. They simply struggle to communicate without it, but they have their creative thoughts and they’re finally able to express them via AI using it as a language tool.

So I feel like there shouldn’t be any shaming of people for using AI. And I wonder, well, would a disclosure requirement, for people using it for accessibility, would it then limit their audience by having that label? So it was really kind of a complicated topic.

Karen Smiley: It is definitely complicated, yeah. The idea of disclosing whether or not AI was used – it kind of reminds me of using pronouns. Just normalize it, you know? If everybody does it, then it doesn’t make other people stand out when they choose to or need to use theirs. You’ve probably seen, I’ve written articles about shaming. The shaming needs to stop because it solves nothing. If the problem is job loss for creative people because their content was stolen, let’s focus the shame on the people who did the stealing.

Karen Spinner: Professionally, I think almost half of people are obliged to use AI at this point, for basic survival. So the idea that “Okay, we’re going to boycott it”, I don’t think that’s realistic in terms of, could that even be organized? So I think we’re kind of stuck with it, and then we just need to find ethical ways of using it and communicating about it. And yeah, I think, as you’d written, just shaming people is not super productive.

Karen Smiley: Yeah, you were referring earlier to it being hard to avoid. It’s almost impossible: even as you’re writing, autocorrect jumps in. Even though I don’t seek out using AI tools for my writing, could I say that there was no AI that ever touched any word that I wrote? Probably not, because even autocorrect kicked in. If I’m typing on my phone, I make typos and I use it for fixing that. That’s a form of accessibility, in a way. So I think it’s very difficult to avoid. I think some of the objections are ethical, because of the way that the tools were sourced and labeled, and everything else. Sometimes it’s that some people use the tools in a very lazy way, and what some people object to is just this crap all looks the same and they don’t want to see it, ‘cause they assume that the words aren’t unique, human thoughts either. And that’s not a safe assumption, but it’s an understandable assumption, I think.

Karen Spinner: Well, it’s interesting because all AI models are trained on the Common Crawl. So they’re all based on the same huge collection of text. All AI models gravitate to the same middle-of-the-road writing style, that they have the same kind of — I’ll just say a flavorless sort of — it’s like the tofu of writing! And so I think style is one of those things where the AI can get some words out, but if you’re creating art, and not just a communication, they need to be adjusted.

Karen Smiley: Yeah. The tofu, I love that analogy. You have to put your own mapo tofu spice, or curry, or whatever other flavoring on it, for it to taste like something that you created.

Karen Spinner: Yeah. Although it is interesting because AI is getting better. I did a little lunch-and-learn training for a marketing agency that I freelance for occasionally. I had these different passages that I had Claude write in really different styles, and I’m going to use them as examples of AI writing or not-AI writing. So I had everybody vote: Was this AI or was this not AI? And everybody thought most of it was human. And the only thing that they picked up on as being AI-written was the stuff with the em dashes.

Karen Smiley: Oh no!

Karen Spinner: And granted, these were short passages. I think AI is much, much better at short form copy, because it just loses the plot after a certain point. It’s going to run into context window and continuity issues after, I’d say, about 1500 words; it craps out. But it was really surprising how close it really is getting. In the examples I had, one was technical documentation, where nobody cares about style. One was a financial report. One was a little fictional passage, which people actually all thought a human wrote. It was just really fascinating how there’s a lot of AI out there that we’re probably not noticing.

Karen Smiley: Yeah, I’m sure that there is. And the tools that detect AI are flawed. And the tools that are generating AI content are getting better at making it seem more human-like. So I think even detecting it reliably is a problem. A lot of students have gotten burned by that, when they didn’t use AI, but it thought that they were. There’s a lot of issues with those tools. You mentioned the use of Common Crawl, and obviously that contains copyrighted material that was used without people’s consent. What are your thoughts about whether or not we should be trying to help to protect creative people by making sure they get what some people call the three C’s: to have their consent before their work is used, to credit them, and to compensate them?

Karen Spinner: I think that’s best practice, and I think that’s what we should be doing. But I think practically, unless courts step in, I feel like it’s almost a lost cause. I am hoping to be proved wrong, but I feel like a lot of these large organizations that have really benefited financially from it, I think they may end up paying, like, a slap on the wrist, some kind of amount of money in a class action suit that’s going to give everybody like 55 cents. I think it would be great to have that moving forward.

On the other hand, then I feel like, well, let’s say somebody’s got a startup. Should they now be prohibited from using Common Crawl because it has copyrighted info, even though all of these large companies were able to benefit?

One thought that I’ve had was, would there be any way of taking something like Common Crawl and sanitizing it? There’s also some nasty stuff in there. So if you could take it and find some way to do a copyright filter and take all the Nazi stuff out, there might be some way of getting a data set that isn’t compromised, and then successfully training a model on it. This would probably be the best way of promoting that ethical treatment of content.

Karen Smiley: As someone that uses all these different tools, do you feel like the companies have been transparent about where they got the data that they used for it?

Karen Spinner: No, no, no, no, they’re not. Absolutely not. It’s just like, “Here’s a magic box, press a button.” They’re not telling us what’s in the box.

Karen Smiley: With Future Scan, you mentioned you’re using arXiv documents, so that’s obviously a source that you’re well within your rights to use within their terms of service. So there shouldn’t be any issues with that. As far as use of your personal data, do you know of any cases where your data has been used by an AI based tool? I mean, aside from LinkedIn – we all know that they do that.

Karen Spinner: Yeah, I feel like anything I put on the internet now is going to get ingested and used by something. And if I got better distribution, I probably wouldn’t care at all. I’ve been fortunate that I haven’t had plagiarism. I’ve heard some people, they’ll publish an article on Substack and then they’ll still see the content elsewhere in some new form. Although I think in that case, that’s not AI: that’s just human problems.

But I wonder about the extent to which AI-based tools could contribute to data breaches, because I do see a lot of folks moving to AI agents. And so when you have a bunch of AI agents communicating with each other and potentially handling people’s data, I feel like that is a large risk waiting to happen. And granted, I am not a security expert. I am not formally trained in that, but it seems like common sense that you have AI tools that are non-deterministic, which means they can give you a slightly different answer or even behave a little differently every time. And you give these non-deterministic tools access to information and some of it’s going to get stolen.

Karen Smiley: Yeah, yeah, definitely. Humans have been plagiarizing other humans long before AI came along, I don’t know that that’s going to go away. One would hope that maybe AI could help to detect that and maybe stop it. With your Substack newsletter, we have this option in our sub settings to disable basically the robots.txt file equivalent, to say no, don’t let other people train on my content. Not that Substack can stop the bots that ignore that, but at least they try. I’m wondering if you’ve set yours to block that training or if you allow it. Some people say it would hurt discoverability, like you said, giving you more distribution.

Karen Spinner: I welcome all bots. I think that one of the issues too is that even if you turn it off, I believe the free content is published by RSS feed by default. And RSS feeds are designed to be programmatically crawled. So you’re going to get crawled through the RSS feed. So I don’t know if the robots.txt instruction from Substack means anything if you’re distributed on RSS.

Karen Smiley: Yeah, that’s a really good point. And I’ve heard some people say that it hasn’t hurt their discoverability ‘cause they’ll turn up in Google searches, and I’ve seen that for myself. But others say that, if there’s going to be discussion about the topic and you want your views to be represented, it’s a good way to try to get your point of view out there. So I can see both sides.

Karen Spinner: Yeah. And it’s interesting because Google search still is really the primary method of discovery. There’s a lot of hype over AI discovery, and I don’t optimize any of my content for search, but I’ve probably got hundreds of people visiting my content from Google. I’ve got maybe 10 from ChatGPT and one from Claude, although the one from Claude did subscribe.

Karen Smiley: As far as your personal information, have you ever given a company your information knowing that they were going to use it for AI? Do you feel like you have ever had a choice about whether or not your data gets used?

Karen Spinner: I don’t really feel like there’s much of a choice in a lot of cases. I think our data is just out there. Even in situations like, let’s say healthcare where I know that insurance companies have models that they use all our information to train. You get into the doctor’s office and you have to sign like maybe seven, eight, page after page after page of stuff, which is basically signing away your rights to information. And it’s all couched in HIPAA language, but most of the stuff you’re signing is exceptions to HIPAA, and this, and this, and this. I think this is actually a problem with virtually all documentation referencing how our data is used. It’s legalistic, it’s impenetrable, and we’re presented with stuff to sign in situations where there really is no choice. Are you going to say “Oh, I’m not going to get medical treatment. I’m not going to get my antibiotics today because I want to read 10 pages of privacy stuff and see what I’m comfortable with”? And I think that’s the same thing with software. Like, okay, you need to buy this particular tool to complete your freelance assignment, and you’re confronted with screen after screen after screen of stuff, most of which I’m sure is just consenting to use of data.

Karen Smiley: Mm-hmm.

Karen Spinner: My husband watches South Park and there is an episode of it, a long time ago, about how a kid accidentally sold his soul by clicking ‘like’ on a TOS document.

Karen Smiley: Yeah, there was actually an April Fool’s joke some years ago where they put something like that into the terms of service just on April 1st for that one day. And it was kind of a test to see if anybody actually read it. And of course people didn’t read it.

Karen Spinner: Yeah.

Karen Smiley: You mentioned data breaches earlier. Have you ever been involved in a data breach or had your information stolen or anything like that, that you’re aware of?

Karen Spinner: Usually about every year or so, I get a letter from my insurance company saying oh, well we had a data breach from one of our contractors. And it happens all the time. I get notices that my kids’ information has been leaked or stolen from one of the insurance company contractors. Happened maybe five or six times. I see that stuff and I just change everybody’s passwords and move on.

But in terms of what you can do about it, there’s not too much, unless you can produce some kind of specific evidence that, “Oh, you guys did this and then my identity was stolen the next day.” Connecting the dots is virtually impossible.

Karen Smiley: Yeah, that’s a really good point. Being proactive when you are notified is in many cases about all we can do. Last question. The public distrust of AI and tech companies has been growing partly because of things that they’re doing with our data, which we hadn’t realized before. What’s the one most important thing that they could do to earn and keep your trust? Or is that not even possible, do you think?

Karen Spinner: Let’s see. They could dissolve themselves and divide themselves up into little small companies.

Karen Smiley: Do you mean like when the big telecom companies were broken up due to antitrust regulations?

Karen Spinner: I would even chop them up, dice them up into even smaller components. Meta has made their model available as open source. I think OpenAI has actually made one of their models open as well. But I think the big issue for AI and competitiveness is not necessarily access to the models at this point, it’s access to the infrastructure. And I think these huge companies owning all the infrastructure is just really deeply problematic.

Karen Smiley: Yeah. The infrastructure and data, I would think. They’ve gone to so much trouble to scrape terabytes of data, in many cases, and most people don’t have the capacity to do that. And some of them are certainly very huge. I think it would be very easy to argue that the big AI and cloud providers have more control over our society nowadays, even than the telecom companies used to, back in the day.

Karen Spinner: Yeah. I’m not super comfortable with them. But it’s sort of where our society and the technology is going, and so I guess I have to navigate it.

Karen Smiley: Yeah. I just saw something, I guess it was yesterday, from someone in Europe who was saying that Europe really needs to detach themselves from the big US-based cloud providers and other infrastructure companies, because they need to have more autonomy. The way they put it was: they should have done it 20 years ago, but the next best time to do it is now.

Karen Spinner: Yeah. And then with the infrastructure, there’s also the whole environmental impact side of things where nobody has any incentive to design, “Okay, can we get this compute power more efficiently with less cost?” And it seems like nobody’s really building in that direction right now ‘cause there’s no reason to.

Karen Smiley: Yeah, one I’ve heard of is a Swiss company, Swiss Public AI, they call it. They trained it on these Alps computer systems that are powered by renewable energy only. And they only used data that was not copyrighted, data that is legally allowed to be used and consented. Apertus. It’s chat.publicai.co. They’re the prime example, I think, of someone that’s trying to do it right. They’re doing it as a public service in Switzerland. I thought that was really interesting.

Karen Spinner: Yeah. I assume that they have some kind of private funding for that, or just people donating their time?

Karen Smiley: Yeah, it’s a consortium. Some of it is, I think, from some universities in Switzerland. So that’s one example that I’m aware of where they are trying to do the right thing and do it as a public service. I think a lot of people would like to see more of that. The whole digital divide is now being amplified by who has access to AI tools and who doesn’t, who can afford them, and where the connectivity is available, where the tools are available, and they can afford them. So I think that’s a really interesting move to see. But yeah, they’re notable because they’re an exception.

Karen Spinner: Well, that’s interesting. The idea of AI as a public utility would be kind of interesting. The question would be, “Okay, so who owns it?”

Karen Smiley: Right, exactly.

Karen Spinner: We have power companies that are heavily regulated. So maybe that might be sort of a model for some of the AI, which would be interesting also to couple it with power regulation because they use so much of it.

Karen Smiley: Yes. Have data centers been an issue in your state?

Karen Spinner: Fortunately, no. We’ve got a large pasture that we’d looked into putting a data center on. And then we looked at it, and then we saw just how horrible they are for the local communities in terms of noise and water consumption and power consumption. We looked at that and we’re like, “No, we’re going to get laughed out of the town hall if we try to do this.”

Karen Smiley: A lot of communities are fighting back. It doesn’t really bring the jobs to local people like they hoped for. And then the local residents end up bearing the burden of the increases in costs for power and the water competition with agriculture and everything else. A lot of communities are now trying to fight back, at least in the US. I’m hearing a lot about it in California and Virginia.

Karen Spinner: Yeah, no, I think a data center, because literally so much of it is automated, it’s not a wonderful jobs thing at all.

Karen Smiley: Yeah. Yeah. Well, thanks so much for sharing your insights on these! Is there anything else that you would like to say? Anything that you wish I had asked you that I didn’t? Any other thoughts that you’d like to add?

Karen Spinner: No, I think this was fun. I enjoyed participating!

Karen Smiley: I want to thank you for making the time to share your AI experiences with us. And if somebody wanted to try out Future Scan, what would be the best way for them to get in touch with you?

Karen Spinner: Just send me a DM on Substack or LinkedIn, and then I would just send an access code.

Karen Smiley: Okay. That sounds great. Right, well, thanks so much! I hope your project turns into a product and it’s successful. I think it could help a lot of people, so I’m really looking forward to following its progress.

Karen Spinner: Cool. Well, thank you so much!

Interview References and Links

