Is Google Workspace data private?
Many AI tool companies, especially chatbot makers, use our data whenever we use their tools. Is Google Workspace safe?
This article is a quick response to a comment posted by Valerie on a thread about a post from the Center for Humane Technology (CHT) on how chatbot companies are using our data. Valerie’s question was: “I’m curious how it varies across paid models. The OpenAI/Shopify news and the creepiness I’ve been feeling even as a paid user of ChatGPT has me shifting away from it. But as a Google Workspace user, my understanding is that I’m NOT training anything. So is this mainly an issue for free versions and this companion tools?”
And it’s an excellent question.
For many AI tools, whenever we type in questions or upload photos to manipulate, the AI tool company claims the right to use our submitted questions, photos, etc. however it wishes, including for further training of its AI models. (See CHT’s article for more about this.)
For some tools, but not all, paid users do get more privacy protections. Grammarly is one example: when I evaluated it last summer for readability scoring, I learned that free users’ data (mine included) would be used to train the public product and models, whereas paid users’ data is used only within the user’s domain.
Can we trust any AI tool company to keep our data private?
In general, it comes down to:
Which tool features you use - e.g. does your Workspace admin have Gemini AI or NotebookLM enabled, and do you use them?
What the Privacy Policy and Terms & Conditions SAY the company will & won’t do, in general and for specific tool features, and
Whether the company can be trusted to do what they say, and not do what they say they won’t.
I hadn’t looked at the Google Workspace terms and conditions lately, so I took a peek and wrote up what I found here when it got to be too long for a Note.
What Google Says They Do (& Don’t Do)
Google’s Workspace FAQ claims they’re compliant with various data protection standards. They also claim that they don’t use or share customer files without user permission. The Google AI solutions page says:
Confidentiality
Your data is your data, and it’s not used to train Gemini models or for ads targeting. You can delete your content or export it.
and, in the FAQ:
How does Gemini keep my organization’s data private and secure?
Your organization’s data in Workspace is your data, and it’s not used to train or improve Gemini models or for ads targeting. Gemini only retrieves relevant content in Workspace that the user has access to and does not share user prompts or generated responses with other users or organizations. You can restrict Gemini’s access to sensitive data in Workspace with built-in data loss prevention (DLP), information rights management (IRM), and client-side encryption (CSE) controls. For more detailed information, please refer to the Generative AI in Workspace Privacy Hub.
And their Jan. 2025 announcement on the rollout of AI features for Workspace asserts:
“Your data is your data: We don’t use your data, prompts, or generated responses to train Gemini models outside of your domain without permission. We don’t sell your data or use it for ads targeting.”
However, Google’s current privacy policy on genAI and their privacy hub page do not say that they never use your data for training AI. Here’s what the policy does say:
Your data is your data. The content that you put into Google Workspace services (emails, documents, etc.) is yours. We never sell your data, and you can delete your content or export it.
Your data stays in Workspace. We do not use your Workspace data to train or improve the underlying generative AI and large language models that power Bard, Search, and other systems outside of Workspace without permission.
Your privacy is protected. Interactions with intelligent Workspace features, such as accepting or rejecting spelling suggestions, or reporting spam, are anonymized and/or aggregated and may be used to improve or develop helpful Workspace features like spam protection, spell check, and autocomplete. This extends to new features we are currently developing like improved prompt suggestions that help Workspace users get the best results from Duet AI features. These features are developed with strict privacy protections that keep users in control. (See below for more detail on additional privacy, security, and compliance commitments we make for business customers).
Your content is not used for ads targeting. As a reminder, Google does not collect, scan, or use your content in Google Workspace services for advertising purposes.
Note: They specifically say they don’t train the underlying genAI and LLMs that power systems ‘outside of Workspace’ ‘without permission’. This wording raises three questions:
When, where, and how might a user be prompted for permission?
Can a Workspace admin opt in everyone in the Workspace (grant permission) without individual users’ consent, so that all of the data can be used outside of Workspace?
Do they use content within a Workspace to train AI models used within that Workspace? (This could ‘leak’ information internally in an undesirable way.)
This policy also makes it clear that Google does use Workspace user data (‘anonymized and aggregated’) to train AI tools inside of Workspace. Is this ok? Maybe.
What Others Say Google Does (& Doesn’t Do)
Vox reported in July 2023 that:
“Google says it doesn’t use data from its free or enterprise Workspace products — that includes Gmail and Docs — to train its generative AI models unless it has user permission, though it does train some Workspace AI features like spellcheck and Smart Compose using anonymized data.”
If still true today, this implies that free Workspace users have the same protections as enterprise users.
On the question of trust, the Vox report notes:
“We know that Google scanned users’ emails for years in order to target ads (the company says it no longer does this).”
Disclosures in 2024 that Chrome’s incognito mode was not actually private have also cost Google the trust of many people.
There is also an active class-action lawsuit asserting that “Google misused content they posted to social media and information shared on Google platforms to train its chatbot Bard and other generative AI systems.”
The Bottom Line
The short answer seems to be: Google says most of your content in Workspace is protected, but:
Google acknowledges that they will use some ‘anonymized and aggregated’ user data for training some of their AI models.
It’s not clear how and when ‘permission’ to use the data more widely might be requested, or whether models used within a Workspace are trained on that Workspace’s content.
It’s not clear how much Google can be trusted to do what they say they will do, and not do what they say they won’t do.
What do you think? Do you trust Google, and are you ok with how they say they will use your Workspace data?
Thanks for the analysis!
It'd be interesting to know if there's any workspace software that is 'safe' or ... if they are all pretty much the same (which is more likely), in which case users like us need to constantly monitor the privacy policies.
It'd be cool if there were a digital product that could monitor the privacy policy and send a reminder when something dodgy comes up... :)
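The core of that monitor idea is small enough to sketch. Below is a rough Python version; the URLs and state-file path are placeholders, hashing raw HTML is crude (cosmetic markup changes will trigger false alarms), and a real tool would diff extracted text and send an email or chat alert instead of printing:

# policy_watch.py: a rough sketch, not a product.
# Fetch each policy page, hash it, and flag anything that changed
# since the last run. URLs and the state file are placeholders.
import hashlib
import json
from pathlib import Path

import requests

POLICY_URLS = [
    "https://workspace.google.com/terms/premier_terms/",  # placeholder
    "https://support.google.com/a/answer/13585206",       # placeholder
]
STATE_FILE = Path("policy_hashes.json")

def page_hash(url: str) -> str:
    resp = requests.get(url, timeout=30)
    resp.raise_for_status()
    return hashlib.sha256(resp.content).hexdigest()

def main() -> None:
    # Load hashes from the previous run, if any.
    old = json.loads(STATE_FILE.read_text()) if STATE_FILE.exists() else {}
    new = {url: page_hash(url) for url in POLICY_URLS}
    for url, digest in new.items():
        # First run (None) and unchanged pages are silent.
        if old.get(url) not in (None, digest):
            print(f"CHANGED since last check: {url}")
    STATE_FILE.write_text(json.dumps(new, indent=2))

if __name__ == "__main__":
    main()

Run it from cron or any scheduler (say, weekly) and it prints a line whenever a watched page’s content changes between runs.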
Thanks for the quick response and analysis! To be honest, I had avoided Gemini in large part because of the lack of trust from the prior situations you mentioned. However, the Workspace designations have me feeling much more comfortable compared to some of the other companies. I do wish they were more specific about when feedback was used. With Claude at least it is clear that it’s the thumbs up/down that triggers feedback. I just avoid using those features across all models. FWIW I am a paid workspace user for my business and am also the admin. I am working with an org however that is investigating turning on some of the features and we have the same questions about “internal” leaking. I will let you know how it goes and what I learn!