How Archevi Protects Your Family's Privacy
When you upload sensitive family documents -- tax returns, insurance policies, legal papers, medical records -- you need to know they're protected: not just stored, but shielded from exposure by design.
At Archevi, privacy isn't a toggle you switch on. It's built into the foundation of how we process every query. Here's exactly how it works.
Right now, millions of people are uploading private documents to ChatGPT
Tax returns. Insurance policies. Wills. Medical records. People paste them into ChatGPT or Gemini, ask a question, and get an answer. It works. What most people don’t think about is what happens next.
When you upload a document to ChatGPT, the AI receives your real name, your real address, your real policy numbers — everything in the document, verbatim. By default, OpenAI may use that data to train future models. Google’s Gemini follows a similar pattern. Even if you find the opt-out toggle, your data has already crossed the wire in plain form.
Archevi was built to do the same job -- answer questions about your family's documents -- without that tradeoff.
At its core, Archevi takes a fundamentally different approach.
Boundary Name Removal: How It Works
We call our approach boundary name removal. The idea is simple: anonymize data at the boundary between our system and external AI providers. Your real data lives securely in our database. Only when a query needs AI processing do we replace personal information with realistic surrogates.
Here's what happens when you ask a question:


1. You ask: "What did Sarah Thompson say about the mortgage renewal?"
2. Archevi detects entities: Sarah Thompson (PERSON), mortgage (financial term)
3. The AI receives: "What did James Chen say about the mortgage renewal?"
4. The AI processes the query using surrogates, finds the relevant document passages, and generates an answer referencing "James Chen"
5. You see the answer with your real names restored: "Sarah Thompson mentioned the renewal is due in March..."
The AI never knew your real name. It processed a realistic but fake identity, found the right information, and returned a useful answer. We swapped the surrogates back before you saw the result.
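The five steps above can be sketched end to end. This is a simplified illustration, not Archevi's actual code: the vault here is a hard-coded dictionary standing in for Presidio-driven entity detection, and the AI call is faked with a canned answer.

```python
# Minimal sketch of boundary name removal: substitute surrogates before
# the query leaves the system, restore real names before display.

def anonymize(text: str, vault: dict[str, str]) -> str:
    """Replace each real entity with its surrogate."""
    for real, surrogate in vault.items():
        text = text.replace(real, surrogate)
    return text

def deanonymize(text: str, vault: dict[str, str]) -> str:
    """Swap surrogates back to real entities."""
    for real, surrogate in vault.items():
        text = text.replace(surrogate, real)
    return text

# Per-conversation vault: real entity -> realistic surrogate
vault = {"Sarah Thompson": "James Chen", "Toronto": "Halifax"}

query = "What did Sarah Thompson say about the mortgage renewal?"
safe_query = anonymize(query, vault)
# safe_query == "What did James Chen say about the mortgage renewal?"

# The AI sees only the surrogate identity and answers with it...
ai_answer = "James Chen mentioned the renewal is due in March."

# ...and the user sees real names restored before display.
print(deanonymize(ai_answer, vault))
```

The real pipeline detects entities dynamically rather than from a fixed dictionary, but the round trip -- anonymize out, de-anonymize back -- is the same shape.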
If you accidentally include sensitive data like a SIN or credit card number in a question, Archevi's hard redaction layer catches it before it reaches any AI service. The query is blocked entirely, not anonymized.
What Gets Anonymized
Our entity detection system, powered by Microsoft Presidio (Microsoft's open-source PII detection and NER engine), automatically detects and replaces:
- Names -- personal and family names become different realistic names
- Email addresses -- replaced with generated surrogate emails
- Phone numbers -- swapped with different numbers
- Locations -- cities and addresses replaced (Toronto becomes Halifax, etc.)
- Organizations -- company names replaced with generated alternatives
Each conversation maintains its own name removal vault -- a mapping between real entities and their surrogates. This means the same person always maps to the same surrogate within a conversation, so the AI can reason consistently across multiple questions.
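The key property of the vault is stability: ask about the same person twice and the AI sees the same surrogate both times. A minimal sketch of that behavior, with a hypothetical surrogate pool in place of Archevi's actual name generator:

```python
import itertools

class NameRemovalVault:
    """Per-conversation mapping from real entities to surrogates.
    The same real entity always gets the same surrogate, so the AI
    can reason consistently across multiple questions."""

    def __init__(self, surrogate_pool: list[str]):
        self._pool = itertools.cycle(surrogate_pool)
        self._mapping: dict[str, str] = {}

    def surrogate_for(self, real_name: str) -> str:
        # Assign a surrogate on first sight, then reuse it forever.
        if real_name not in self._mapping:
            self._mapping[real_name] = next(self._pool)
        return self._mapping[real_name]

vault = NameRemovalVault(["James Chen", "Maria Lopez"])
print(vault.surrogate_for("Sarah Thompson"))  # same surrogate...
print(vault.surrogate_for("Sarah Thompson"))  # ...on every mention
```

Because the vault is scoped to one conversation, the same person can map to a different surrogate in a different conversation, which limits what any single AI request reveals.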
Hard Redaction: The Second Layer
Some data is too sensitive even for surrogates. When our system detects highly sensitive information like Social Insurance Numbers, credit card numbers, bank account numbers, or passport numbers, it doesn't anonymize them -- it blocks the query entirely.
This two-layer approach uses:
- Layer 1: Regex pattern matching -- instant detection of structured data formats (SIN patterns, credit card numbers, IBANs)
- Layer 2: Presidio NER analysis -- deep entity recognition for unstructured mentions of sensitive data
If either layer detects highly sensitive data, the query is rejected before it reaches any external service. You'll see a clear message explaining what was detected and why the query was blocked.
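The decision logic can be sketched as a regex pass plus a stubbed NER pass, with the query rejected if either fires. The patterns below are illustrative only, not Archevi's actual rule set, and the `ner_hits` parameter stands in for Layer 2 (Presidio) results:

```python
import re

# Layer 1: regex patterns for structured sensitive data (illustrative).
SENSITIVE_PATTERNS = {
    "SIN": re.compile(r"\b\d{3}[- ]?\d{3}[- ]?\d{3}\b"),
    "CREDIT_CARD": re.compile(r"\b(?:\d[ -]?){15}\d\b"),  # 16 digits
}

def detect_sensitive(query: str) -> list[str]:
    """Return the labels of every Layer 1 pattern that matches."""
    return [label for label, pattern in SENSITIVE_PATTERNS.items()
            if pattern.search(query)]

def check_query(query: str, ner_hits: tuple[str, ...] = ()) -> str:
    """Block the query if either layer detects sensitive data."""
    hits = detect_sensitive(query) + list(ner_hits)
    if hits:
        return f"Blocked: detected {', '.join(hits)}"
    return "OK: safe to anonymize and forward"

print(check_query("My SIN is 046-454-286"))
print(check_query("When does my insurance policy renew?"))
```

Note that a blocked query never reaches the anonymization step at all; hard redaction runs first.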
Canadian Data Residency
Your documents are stored on Canadian infrastructure (DigitalOcean, Toronto region) and are subject to Canadian privacy law (PIPEDA). Your files never leave our servers. Only anonymized query text -- containing surrogates, not your real data -- reaches cloud AI providers for processing.
AI Providers We Trust
We use AI providers with contractual commitments not to use customer data for model training. For the full technical comparison, see our AI with guardrails post.
Family Isolation
Every family on Archevi operates in a completely separate tenant with database-enforced row-level security. Your documents, conversations, name removal vaults, and search history are invisible to other families. There is no query path that crosses tenant boundaries.
What We Don't Do
- We don't sell your data
- We don't use your documents for AI training
- We don't share your data with advertisers
- We don't send real personal information to cloud AI providers
- We don't retain data longer than needed
Privacy-preserving AI isn't just a feature we added. It's the architecture we built. Learn more on our security page, or sign up free to see it in action.
For a deeper technical dive into how we run LLMs without data exposure, see our post on AI with guardrails.
For a comparison of how AI providers handle training data, read why your family AI won't train on your data.