What ChatGPT Actually Keeps When You Upload a Document

Rob Hudson · March 22, 2026 (Updated April 10, 2026) · 4 min read

The question nobody asks before uploading

You have a 40-page insurance policy and you want to know whether water damage is covered. You open ChatGPT, upload the PDF, and type your question. Ten seconds later, you have an answer. It's fast, it's useful, and you probably didn't think about what happened to that document after the answer appeared.

We read the privacy policies and terms of service for ChatGPT, Google Gemini, and Anthropic's Claude so you don't have to. Here's what each one actually does with your uploaded documents.

ChatGPT (OpenAI)

What they keep

Warning

Uploaded files and conversation content are stored on OpenAI's US servers.
By default, your conversations and uploaded files may be used to train and improve future models. This is opt-out, not opt-in: you have to disable it manually in Settings > Data Controls.
Even with training disabled, OpenAI retains conversations for up to 30 days for safety monitoring.
Defaults differ by tier: Team and Enterprise conversations are not used for training, while Free and Plus conversations are.

What's unclear

Whether disabling training is retroactive for data already submitted.
How uploaded documents are handled by OpenAI's subprocessors and cloud partners.
Long-term retention of uploaded files after a conversation is deleted.

Source: OpenAI Privacy Policy (privacy.openai.com) and Terms of Use

Google Gemini

What they keep

Warning

Conversations with Gemini are stored by Google and may be used to improve Google products, including AI models. This is the default.
Uploaded files are processed through Google's infrastructure. Data is subject to Google's standard privacy terms.
Google retains Gemini conversations for up to 3 years, even with activity controls turned off, for safety and abuse monitoring.
Gemini Advanced (paid) offers different terms, but the default consumer tier uses your data.

What's unclear

The boundary between “improving Google products” and “training AI models” — Google's terms are broad.
Whether uploaded documents are indexed by Google's broader data systems.
How Google Workspace Gemini features differ from standalone Gemini for data handling.

Source: Google Gemini Privacy Notice and Google Privacy Policy

Claude (Anthropic)

What they keep

Note

Anthropic does not use API inputs to train models by default. However, the free Claude.ai tier has different terms.
Free-tier conversations may be used for training and safety research. Pro and Team tiers are excluded from training.
Anthropic retains conversation data for safety evaluation. Retention periods are not publicly specified in detail.
All processing happens on US servers.

What's unclear

Exact retention periods for different tiers.
Whether uploaded file content is treated differently from typed conversation text.

Source: Anthropic Privacy Policy and Usage Policy

The comparison

Here's how these three tools handle your uploaded documents side by side:

ChatGPT / Gemini / Claude

Training on your data: Yes by default on the free consumer tiers of all three; some paid tiers are excluded.
Opt-out: Requires manual action.
Anonymization before AI processing: None.
SIN/credit card blocking: None.
Data residency: US servers.
Retention: 30 days (OpenAI), up to 3 years (Google), unspecified (Anthropic).

Archevi

Training on your data: Never.
Opt-out: Not needed.
Anonymization before AI processing: Automatic boundary anonymization before any AI sees your documents.
SIN/credit card blocking: Hard block on SINs and credit card numbers.
Data residency: Canada (Toronto).
Retention: You control deletion.
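To make the "hard block" on SINs and credit card numbers concrete, here is a minimal sketch of one way such a check could work. It is not Archevi's actual implementation, which this article does not describe. It relies on the fact that Canadian Social Insurance Numbers (9 digits) and payment card numbers (13 to 19 digits) both carry a Luhn check digit. The function names `luhn_valid`, `find_blocked_numbers`, and `guard` are hypothetical.

```python
import re

# Runs of 9 to 19 digits, allowing a single space or hyphen between digits.
CANDIDATE = re.compile(r"\d(?:[ -]?\d){8,18}")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:          # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_blocked_numbers(text: str) -> list[str]:
    """Return digit strings that look like a SIN (9 digits) or a card number."""
    hits = []
    for match in CANDIDATE.finditer(text):
        digits = re.sub(r"\D", "", match.group(0))
        plausible_length = len(digits) == 9 or 13 <= len(digits) <= 19
        if plausible_length and luhn_valid(digits):
            hits.append(digits)
    return hits


def guard(text: str) -> str:
    """Raise instead of forwarding text that appears to contain a blocked number."""
    if find_blocked_numbers(text):
        raise ValueError("blocked: text appears to contain a SIN or card number")
    return text
```

A check like this can produce false positives, since any 9-digit number has roughly a one-in-ten chance of passing Luhn. For a hard block that is arguably the safer direction to err in.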

The architectural difference

Important

Every tool on this list relies on a policy to protect your data. Policies can change with the next terms-of-service update. Archevi relies on architecture.

Before any query reaches an AI provider, Archevi's boundary anonymization layer replaces your real names, addresses, and personal details with realistic surrogates. The AI processes a query about fictional people and returns an answer. We swap the real names back before you see the result.

Even in a worst-case scenario — if every provider broke every commitment — they would only have synthetic surrogate data. There is nothing identifiable to train on, leak, or subpoena.
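The swap-and-restore flow described above can be sketched in a few lines. This is an illustrative toy, not Archevi's code: the surrogate table is hard-coded, whereas a real system would have to detect names and addresses automatically, and `fake_provider` is a hypothetical stand-in for the external AI service.

```python
import re

# Hypothetical surrogate table mapping real values to realistic fakes.
SURROGATES = {
    "Maria Chen": "Laura Bennett",
    "14 Elm Street": "82 Birch Avenue",
}


def _swap(text: str, mapping: dict[str, str]) -> str:
    """Replace every key of mapping found in text with its value."""
    if not mapping:
        return text
    # Longest keys first so overlapping entries resolve predictably.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: mapping[m.group(0)], text)


def anonymize(text: str, surrogates: dict[str, str] = SURROGATES) -> str:
    """Swap real details for surrogates before text leaves the boundary."""
    return _swap(text, surrogates)


def restore(text: str, surrogates: dict[str, str] = SURROGATES) -> str:
    """Swap surrogates back to the real details in the provider's answer."""
    reverse = {fake: real for real, fake in surrogates.items()}
    return _swap(text, reverse)


def fake_provider(prompt: str) -> str:
    """Stand-in for an AI provider; it simply echoes what it received."""
    return f"Provider saw: {prompt}"


def ask(question: str, provider=fake_provider) -> tuple[str, str]:
    """Return (what the provider received, what the user sees)."""
    outbound = anonymize(question)
    return outbound, restore(provider(outbound))


outbound, answer = ask("Is water damage covered for Maria Chen at 14 Elm Street?")
print(outbound)  # contains only the surrogate name and address
print(answer)    # real details restored for the user
```

The key property is that only `outbound` ever crosses the boundary. The mapping between real and surrogate values stays on the near side, so the provider has nothing it could use to reverse the substitution.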

What to do if you've already uploaded sensitive documents

ChatGPT: Go to Settings > Data Controls > disable “Improve the model for everyone.” Delete the conversation containing the upload.
Gemini: Go to myactivity.google.com, find the Gemini activity, and delete it. Adjust activity controls.
Claude: If on the free tier, consider upgrading to Pro where training is excluded. Delete conversations with sensitive content.

These steps help, but they don't undo what's already been processed. The safer approach is to use a tool designed for sensitive documents from the start.

Try the alternative

Start free, forever

Sign up free with Archevi. Upload the documents you'd never paste into ChatGPT. Ask questions in plain English. Check the privacy vault to see exactly what the AI received — surrogate names, surrogate addresses, and nothing real.

Your family's documents deserve a tool that protects them by design, not by policy.
