Here's the uncomfortable truth about most AI-powered apps: when you type "find my dad's insurance policy," your dad's name, his policy number, and the fact that he has insurance all get sent to a server farm in Virginia. The AI company pinky-promises not to train on it. Maybe they even mean it. But the data still left your country, still hit their servers, and still exists in their logs.
We decided that wasn't good enough for family documents.
Most AI document tools work like this: you upload a file, their system extracts the text, sends it to OpenAI or Google, and returns a summary. Your data makes a round trip through US infrastructure. The privacy policy says they won't misuse it. That's the entire security model -- a legal document.
For a recipe app, fine. For a vault that holds your family's tax returns, medical records, wills, and insurance policies? That's a different conversation.
We wanted to build an AI assistant that could actually search and answer questions about your family's documents -- without ever exposing the real content to cloud AI services. Not through legal promises. Through architecture.
Let's say you type: "What did Margaret say about the house insurance renewal?"
Here's what happens behind the scenes, broken into layers:
1. Hard redaction. Your query hits our hard redaction check first. If you accidentally paste a SIN, credit card number, or bank account number into the chat, we block it before it goes anywhere. Not anonymized -- blocked. Those categories of data never leave the server, period.
2. Detection. Our privacy engine (built on Microsoft's Presidio) scans your query for personal information. It finds "Margaret" and flags it as a person's name with 92% confidence.
3. Surrogate generation. The system generates a surrogate -- a fake but realistic replacement. "Margaret" becomes "Sarah Chen." This mapping is stored in a per-conversation vault that only your session can access.
4. Cloud inference. The anonymized query -- "What did Sarah Chen say about the house insurance renewal?" -- goes to the cloud LLM (Groq running Llama 3.3). The AI has never heard of Margaret. It only knows Sarah Chen.
5. Tool call. The AI decides it needs to search your documents and calls our search tool with "Sarah Chen house insurance." Before executing the search, we reverse the surrogate: "Sarah Chen" becomes "Margaret" again. The search runs against your real documents on our Canadian server.
6. Re-anonymization. Search results come back with real names and real content. Before sending them back to the cloud AI, we re-anonymize everything: "Margaret" becomes "Sarah Chen" again in the results.
7. Response. The AI reads the anonymized results and writes a response using "Sarah Chen." We intercept the response, swap "Sarah Chen" back to "Margaret," and show you the answer with the correct names.
The cloud AI processed your question, searched your documents, and wrote a helpful response -- without ever seeing a single real name.
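The round trip above can be sketched in a few lines. This is a minimal illustration, not our implementation: entity detection (which Presidio handles in production) is omitted, and the class and method names are hypothetical.

```python
class SurrogateVault:
    """Per-conversation mapping between real values and surrogates."""

    def __init__(self):
        self.real_to_fake = {}
        self.fake_to_real = {}

    def register(self, real, fake):
        # Reuse the existing surrogate so "Margaret" is always "Sarah Chen".
        if real not in self.real_to_fake:
            self.real_to_fake[real] = fake
            self.fake_to_real[fake] = real
        return self.real_to_fake[real]

    def anonymize(self, text):
        # Outbound: replace every known real value with its surrogate.
        for real, fake in self.real_to_fake.items():
            text = text.replace(real, fake)
        return text

    def deanonymize(self, text):
        # Inbound: restore real values before local search or display.
        for fake, real in self.fake_to_real.items():
            text = text.replace(fake, real)
        return text


vault = SurrogateVault()
vault.register("Margaret", "Sarah Chen")

# What the cloud LLM receives: no "Margaret" anywhere.
outbound = vault.anonymize("What did Margaret say about the house insurance renewal?")

# What the local search executes: the real name, restored server-side.
tool_args = vault.deanonymize("search: Sarah Chen house insurance")
```

The same two methods wrap every boundary crossing: anonymize on the way out, de-anonymize on the way in.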
The name-swapping isn't random. Each conversation gets its own vault -- a database table that maps real values to surrogates. If Margaret comes up five times in a conversation, she's "Sarah Chen" every time. If you mention her email address, it gets its own consistent surrogate too.
Surrogates are conversation-scoped. When you delete a chat thread, the entire vault is cascade-deleted. There's no persistent record mapping your family members to fake names across sessions.
The vault handles five entity types with surrogates: names, email addresses, phone numbers, IP addresses, and URLs. These get realistic replacements generated by the Faker library -- believable enough that the AI treats them as real context.
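A sketch of how typed, consistent surrogates might work. The production system generates replacements with Faker; here a small fixed name pool stands in so the example has no dependencies, and the entity labels and fallback format are illustrative.

```python
import itertools

# Stand-in pool; production uses Faker to generate realistic values.
NAME_POOL = itertools.cycle(["Sarah Chen", "David Park", "Anna Novak"])


class ConversationVault:
    """One vault per chat thread; deleted when the thread is deleted."""

    def __init__(self):
        self.mapping = {}  # (entity_type, real_value) -> surrogate

    def surrogate(self, entity_type, real_value):
        key = (entity_type, real_value)
        if key not in self.mapping:
            if entity_type == "PERSON":
                self.mapping[key] = next(NAME_POOL)
            elif entity_type == "EMAIL_ADDRESS":
                # A believable but fake address, consistent per real address.
                self.mapping[key] = f"user{len(self.mapping)}@example.com"
            else:
                self.mapping[key] = f"<{entity_type}_{len(self.mapping)}>"
        return self.mapping[key]


vault = ConversationVault()
first = vault.surrogate("PERSON", "Margaret")
again = vault.surrogate("PERSON", "Margaret")   # same surrogate every time
email = vault.surrogate("EMAIL_ADDRESS", "margaret@example.org")
```

The key property is the `(entity_type, real_value)` lookup: the fifth mention of Margaret maps to the same surrogate as the first, so the AI sees a coherent conversation.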
Some data is too sensitive for surrogates. We don't anonymize your SIN -- we refuse to process it at all.
Eight categories of data are hard-blocked before reaching any cloud service: Social Insurance Numbers, credit card numbers, bank account numbers, driver's licence numbers, passport numbers, medical licence numbers, cryptocurrency wallet addresses, and international bank account numbers (IBAN). If detected in your query, you'll see an error message explaining why -- and the query never leaves Canada.
The detection uses pattern-matching (regex), not AI inference. A 9-digit number matching the SIN format gets caught in about 5 milliseconds. We don't need a language model to spot 4532-XXXX-XXXX-XXXX as a credit card.
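A simplified sketch of that pattern-based check, covering two of the eight categories. The regexes and the Luhn checksum here are standard techniques, but they are illustrative; the real detectors handle more formats and edge cases.

```python
import re

# Canadian SIN: nine digits, optionally grouped 3-3-3.
SIN_RE = re.compile(r"\b\d{3}[- ]?\d{3}[- ]?\d{3}\b")
# Candidate card numbers: 13-19 digits with optional separators.
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,19}\b")


def luhn_valid(number: str) -> bool:
    """Standard Luhn checksum, which valid card numbers satisfy."""
    digits = [int(d) for d in re.sub(r"\D", "", number)]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def hard_block(query: str):
    """Return the blocked category, or None if the query is clean."""
    if SIN_RE.search(query):
        return "SIN"
    match = CARD_RE.search(query)
    if match and luhn_valid(match.group()):
        return "CREDIT_CARD"
    return None
```

A blocked query short-circuits the whole pipeline: the user gets an error, and nothing is sent anywhere.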
We should be upfront about what this costs:
Latency. The full anonymization pipeline adds 1-2 seconds to every query. Scanning for entities, looking up the vault, replacing text, then reversing everything on the way back -- it's real work. Most users don't notice because the AI response itself takes 3-5 seconds, but it's there.
False positives. Presidio sometimes flags words that aren't actually personal information. "Victoria" might be a person's name or a city. We tune confidence thresholds per entity type and filter obvious false positives (document headers like "Name:" and short abbreviations), but it's not perfect.
US cloud services. Our LLM providers -- Groq and Cohere -- run on US infrastructure. We chose them for speed and cost, not geography. The anonymization layer exists precisely because we couldn't find a Canadian LLM provider that met our performance requirements. We're honest about this: the AI is American, but it only ever sees Canadian-generated surrogates, never your real data.
Your documents, embeddings, and vector indexes all stay on our Canadian server in Toronto. Only the anonymized query text crosses the border -- and even that contains zero real personal information.
We actually do run the search and embedding models locally. Document chunking, vector indexing, and semantic search all happen on our Canadian server using open-source models. No cloud round-trip for any of that.
But for the conversational AI -- the part that understands your question and writes a natural-language answer -- local models aren't there yet. A local 8B-parameter model can embed text, but it can't reliably interpret a question like "what was the deductible on the policy Margaret mentioned last Tuesday?" and decide which tool to call. The cloud models (70B+) can.
So we split it: local for storage and search, cloud for reasoning -- with a privacy wall between them.
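The local half of that split can be pictured with a toy example. Here a bag-of-words vector stands in for a neural embedding; the real server uses open-source embedding models, but the shape of the pipeline -- embed once, index, find the nearest neighbour -- is the same.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    # Toy "embedding": word counts. Production uses a neural model.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


docs = [
    "home insurance policy renewal notice",
    "furnace repair invoice and receipt",
]
# Built once at ingest time; never leaves the server.
index = [(doc, embed(doc)) for doc in docs]


def search(query: str) -> str:
    q = embed(query)
    return max(index, key=lambda item: cosine(q, item[1]))[0]
```

Nothing in this half ever talks to a cloud API -- the only thing the cloud model contributes is deciding *what* to search for and phrasing the answer.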
Every family's comfort level is different. A family managing an estate might want stricter controls than a household tracking warranties. Our redaction engine supports per-tenant policies:
Administrators can adjust confidence thresholds per entity type. If you're getting too many false positives on person names, you can raise the threshold from 0.5 to 0.7 -- fewer catches, but higher confidence in each one.
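A sketch of how per-tenant threshold filtering might look, assuming Presidio-style findings with an entity type and a confidence score. The threshold values here are examples, not our production defaults.

```python
# Example baseline thresholds per entity type (illustrative values).
DEFAULT_THRESHOLDS = {"PERSON": 0.5, "EMAIL_ADDRESS": 0.6, "PHONE_NUMBER": 0.6}


def filter_findings(findings, tenant_overrides=None):
    """Keep only detections that clear the tenant's confidence bar."""
    thresholds = {**DEFAULT_THRESHOLDS, **(tenant_overrides or {})}
    return [f for f in findings
            if f["score"] >= thresholds.get(f["entity_type"], 0.5)]


findings = [
    {"entity_type": "PERSON", "score": 0.92, "text": "Margaret"},
    {"entity_type": "PERSON", "score": 0.55, "text": "Victoria"},  # city or name?
]

default_kept = filter_findings(findings)                  # both pass at 0.5
strict_kept = filter_findings(findings, {"PERSON": 0.7})  # estate-mode tenant
```

Raising the PERSON threshold to 0.7 drops the ambiguous "Victoria" detection while still catching the high-confidence one -- exactly the fewer-catches, higher-confidence trade-off described above.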
You type: "Find Margaret's insurance policy"
|
v
[Hard Redaction] -- SIN/credit card? -> BLOCKED
|
v (clean)
[Anonymize] -- "Margaret" -> "Sarah Chen"
|
v
[Cloud LLM] -- sees only "Sarah Chen"
|
v (tool call: search "Sarah Chen insurance")
[De-anonymize args] -- "Sarah Chen" -> "Margaret"
|
v
[Local Search] -- searches YOUR docs with real name
|
v (results with real names)
[Re-anonymize results] -- "Margaret" -> "Sarah Chen"
|
v
[Cloud LLM] -- writes response with "Sarah Chen"
|
v
[De-anonymize response] -- "Sarah Chen" -> "Margaret"
|
v
You see: "Margaret's home insurance policy expires March 2027..."

The cloud AI helped you find what you needed. It never learned who Margaret is.
You can ask your vault anything -- "when does Dad's passport expire?", "what was the deductible on our home insurance?", "find the receipt for the furnace repair" -- and the AI that processes your question never learns your dad's name, never sees your address, and never touches your actual documents.
The documents stay in Canada. The names stay in Canada. Only sanitized, surrogate-filled text crosses any border. And when the conversation ends, the mapping between real and fake names is deleted.
That's what we think privacy-first AI should look like. Not a promise in a legal document -- a wall in the architecture.