Recipe: share safely with an LLM and restore
You want a cloud LLM — ChatGPT, Claude, whatever — to work on a document that can’t leave your control. The move is a round trip: anonymize on the way out, deanonymize on the way back. The cloud only ever sees fakes.
Goal: get a useful answer about a real document from a tool you don’t control, with zero real PII crossing the boundary.
The loop, in one picture
real doc ──anonymize──▶ safe copy ──paste──▶ cloud LLM
   ▲                                              │
   └──── deanonymize ◀──── answer (in fakes) ◀────┘

1. Anonymize for reversibility
Use synthetic (not redaction, which can’t be reversed), and attach a dictionary with add to dictionary on. Synthetic keeps the document readable, so the LLM actually does a good job; the dictionary records the swaps so you can get back even if the LLM rewrites the text. Keep detection local: the whole point is not sending the raw document anywhere.
Run it, review in the editor, Save to output.
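To make the dictionary idea concrete, here is a minimal sketch of what "recording the swaps" amounts to. It is not Piixie's implementation: `swap_all`, the `swaps` mapping, and the sample sentence are hypothetical, and the name and NHC values are simply the ones used later in this recipe.

```python
import re


def swap_all(text: str, mapping: dict[str, str]) -> str:
    """Replace every key of `mapping` found in `text` with its value, in one pass."""
    if not mapping:
        return text
    # Longest keys first, so a short key never matches inside a longer one.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: mapping[m.group(0)], text)


# Hypothetical dictionary: real value -> synthetic fake.
swaps = {"Marcos Patel": "David Romero Gil", "1029384": "84913366"}

# Made-up sample sentence, for illustration only.
real_doc = "Marcos Patel (NHC 1029384) reports chest pain on exertion."
safe_copy = swap_all(real_doc, swaps)

# Inverting the dictionary is what makes the trip reversible.
reverse = {fake: real for real, fake in swaps.items()}
restored = swap_all(safe_copy, reverse)
```

Because the mapping lives outside the document, it still applies after the text around the fakes has been rewritten, which is the case the dictionary is there for.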
2. Use the safe copy in the cloud tool
Paste the anonymized text (or upload the safe file) and ask your question:
“Summarize this patient’s cardiac history and flag any medication interactions.”
The model answers about David Romero Gil, NHC 84913366 — the fakes. It has no idea who the real patient is, because it never saw them.
3. Bring the answer back
Save the model’s answer to a .txt (or keep the edited file it produced). It’s written in fake values.
4. Deanonymize it
Drop that file into Piixie. Two cases:
- If it’s the unchanged safe file, Piixie recognizes it from history and offers an exact reverse.
- If it’s the LLM’s new text (a summary, an edit), Piixie won’t recognize it — choose the dictionary route. It finds the fakes in the answer and swaps them back, tolerating the case and accent changes a chat tool introduces.
LLM answer: "David Romero Gil (NHC 84913366): stable angina, review statin dose."
        ↓ deanonymize (dictionary)
Restored: "Marcos Patel (NHC 1029384): stable angina, review statin dose."

Now the summary is about your real patient, and the cloud only ever held the fake one.
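Tolerating case and accent changes can be sketched as matching on a "folded" copy of the answer while replacing in the original. This is one plausible approach, not a description of how Piixie does it; `fold`, `restore`, and the `reverse` mapping are hypothetical names, and the whole-word check is an added assumption.

```python
import re
import unicodedata


def fold(s: str) -> str:
    """Lowercase and strip accents, so 'ROMÉRO' and 'Romero' compare equal."""
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).casefold()


def restore(answer: str, reverse: dict[str, str]) -> str:
    """Swap fakes back to real values, ignoring case and accent differences."""
    keys = {fold(fake): real for fake, real in reverse.items() if fold(fake)}
    if not keys:
        return answer
    answer = unicodedata.normalize("NFC", answer)

    # Fold character by character, remembering where each folded char came from.
    folded, origin = [], []
    for i, ch in enumerate(answer):
        for f in fold(ch):
            folded.append(f)
            origin.append(i)
    folded_text = "".join(folded)

    # Longest fakes first; the lookarounds require whole-word matches.
    alternatives = "|".join(re.escape(k) for k in sorted(keys, key=len, reverse=True))
    pattern = re.compile(r"(?<!\w)(?:" + alternatives + r")(?!\w)")

    out, last = [], 0
    for m in pattern.finditer(folded_text):
        start, end = origin[m.start()], origin[m.end() - 1] + 1
        out.append(answer[last:start])
        out.append(keys[m.group(0)])
        last = end
    out.append(answer[last:])
    return "".join(out)


# Hypothetical dictionary: fake -> real.
reverse = {"David Romero Gil": "Marcos Patel", "84913366": "1029384"}
shouted = "DAVID ROMÉRO GIL (NHC 84913366): stable angina, review statin dose."
restored = restore(shouted, reverse)
```

The index map matters because folding can change string length, so positions in the folded text do not line up with positions in the original.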
What crossed the boundary
Only the fake version, in both directions. The real document and the real answer existed solely on your machine. That’s the guarantee that makes external tools usable on data you couldn’t otherwise share. The full reasoning is in the round-trip workflow and privacy.
Watch-outs
- Redaction breaks the trip: you can’t reverse [REDACTED]. Use synthetic. (Why.)
- If the LLM paraphrases a fake away (“the patient” instead of the fake name), there’s no fake left to restore for that value. Synthetic’s natural text makes models keep the names more often than redaction would.
- The restored file is real PII again — keep it on the trusted side.
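The paraphrase watch-out can be checked by hand before restoring: list which fakes from the dictionary no longer appear in the answer. A minimal sketch, assuming you have the fake values available as a plain list; `missing_fakes` is a hypothetical helper, not a Piixie feature.

```python
def missing_fakes(answer: str, fakes: list[str]) -> list[str]:
    """Return the fakes that do not appear in the answer (case-insensitive)."""
    haystack = answer.casefold()
    return [fake for fake in fakes if fake.casefold() not in haystack]


fakes = ["David Romero Gil", "84913366"]
paraphrased = "The patient (NHC 84913366) has stable angina; review statin dose."
gone = missing_fakes(paraphrased, fakes)  # the name was paraphrased away
```

Any value in the returned list has nothing to swap back, so the restored text will say “the patient” where the real name would have gone.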