Deanonymization limits & safety

Deanonymization is powerful but not magic. Knowing its limits keeps you from expecting a restore that can’t happen — and from leaking PII by assuming a file is reversible when it isn’t.

The big one: redacted values can’t be restored. If a run blacked out a value, replaced it with [REDACTED], or overwrote it with filler characters, the original text was never kept, so there’s nothing to reverse to. Piixie will tell you:

This file was produced by a redaction run; the original content cannot be recovered.

This is a feature, not a bug. Redaction is the mode you choose precisely because it’s one-way. If you need to reverse later, use synthetic or labeled output instead.

Mode              Reversible?
Synthetic         Yes
Label ([NAME_1])  Yes
Redaction         No (original is destroyed)
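
To make the difference concrete, here is a minimal sketch of why label output can be reversed and redacted output cannot. It is an illustration only, not Piixie's implementation: the function names and the simple string replacement are assumptions made for this example.

```python
def anonymize_label(text: str, names: list[str]) -> tuple[str, dict[str, str]]:
    """Hypothetical label mode: swap each name for a numbered label and keep the mapping."""
    mapping = {}
    for i, name in enumerate(names, start=1):
        label = f"[NAME_{i}]"
        mapping[label] = name          # the original survives in the mapping
        text = text.replace(name, label)
    return text, mapping


def anonymize_redact(text: str, names: list[str]) -> str:
    """Hypothetical redaction mode: overwrite each name and keep nothing."""
    for name in names:
        text = text.replace(name, "[REDACTED]")
    return text                        # no mapping is returned, so nothing can be restored


def restore(text: str, mapping: dict[str, str]) -> str:
    """Reverse label output by substituting each label with its stored original."""
    for label, original in mapping.items():
        text = text.replace(label, original)
    return text


source = "Ana Ruiz met Luis Vega."
labeled, mapping = anonymize_label(source, ["Ana Ruiz", "Luis Vega"])
redacted = anonymize_redact(source, ["Ana Ruiz", "Luis Vega"])
```

Here `restore(labeled, mapping)` gives back the source text, while `redacted` contains two identical [REDACTED] markers and no record of what they replaced.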

Exact reverse matches a file to a run by fingerprint. If the file was edited after anonymization — even slightly — that match breaks, and Piixie falls back to the dictionary. Keep the untouched output if you want the exact path; keep a dictionary if you expect the file to change.

Dictionary reverse needs the fakes to survive

Dictionary reverse works by finding fake values in the text. If a downstream tool paraphrased a fake away — turned “David Romero Gil” into “the patient” — there’s no fake left to match, so that value can’t be restored. It tolerates case, accent, and spacing changes, but not a value that’s been rewritten out of existence.
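
A rough sketch of that kind of tolerant matching, under stated assumptions: text is folded (accents stripped, lowercased) before comparison, and runs of whitespace inside a fake are treated as equivalent. The `fold` and `dictionary_reverse` helpers and the sample original name are hypothetical, not Piixie's API.

```python
import re
import unicodedata


def fold(text: str) -> tuple[str, list[int]]:
    """Lowercase and strip accents, remembering each folded character's source index."""
    folded, source_index = [], []
    for i, ch in enumerate(text):
        for part in unicodedata.normalize("NFD", ch):
            if unicodedata.combining(part):
                continue                      # drop accent marks
            for low in part.lower():
                folded.append(low)
                source_index.append(i)
    return "".join(folded), source_index


def dictionary_reverse(text: str, dictionary: dict[str, str]) -> str:
    """Replace every fake that still appears in the text with its original."""
    for fake, original in dictionary.items():
        words = fold(fake)[0].split()
        if not words:
            continue                          # ignore empty dictionary entries
        folded_text, source_index = fold(text)
        pattern = re.compile(r"\s+".join(re.escape(w) for w in words))
        pieces, last = [], 0
        for m in pattern.finditer(folded_text):
            start = source_index[m.start()]
            end = source_index[m.end() - 1] + 1
            pieces += [text[last:start], original]
            last = end
        pieces.append(text[last:])
        text = "".join(pieces)
    return text


dictionary = {"David Romero Gil": "Marta Ruiz"}   # fake -> original (made-up original)
altered = dictionary_reverse("Seen by DAVID  Romero Gíl today.", dictionary)
paraphrased = dictionary_reverse("Seen by the patient today.", dictionary)
```

The first call still finds the fake despite the changed case, extra space, and added accent. The second has no fake left to find, so the text comes back unchanged and the original name stays unrecovered.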

If a run mapped two different originals to the same fake, reversing is ambiguous and Piixie refuses rather than guess. Rare with synthetic mode (distinct people get distinct fakes); more likely if you forced collisions by hand.
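
The refusal can be pictured as a collision check when the mapping is inverted. The sketch below assumes a run mapping stored as original-to-fake pairs; the function name, exception type, and sample names are hypothetical.

```python
from collections import defaultdict


class AmbiguousMappingError(ValueError):
    """Raised when one fake stands for more than one original."""


def invert_mapping(run_mapping: dict[str, str]) -> dict[str, str]:
    """Turn original -> fake into fake -> original, refusing if any fake is shared."""
    originals_by_fake = defaultdict(set)
    for original, fake in run_mapping.items():
        originals_by_fake[fake].add(original)

    collisions = {f: sorted(o) for f, o in originals_by_fake.items() if len(o) > 1}
    if collisions:
        # Picking one original would be a guess, so stop instead.
        raise AmbiguousMappingError(f"fakes with several originals: {collisions}")

    return {fake: next(iter(o)) for fake, o in originals_by_fake.items()}


clean = invert_mapping({"Ana Ruiz": "David Romero Gil", "Luis Vega": "Clara Soto"})
```

Passing a mapping where both "Ana Ruiz" and "Luis Vega" point to "David Romero Gil" raises `AmbiguousMappingError` rather than returning a partial result.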

Safety: the restored file is sensitive again

A successful reverse puts real PII back into the document. The moment it’s restored, it’s as sensitive as the source and belongs back inside your trust boundary. Don’t reverse a file and then send it somewhere it shouldn’t go. The whole point of the round trip is that only the fake version ever leaves.

Before you rely on being able to restore a document later, confirm it was produced reversibly — synthetic or label mode, with the run record or a dictionary kept. A redacted file is gone for good; if that’s a problem, re-anonymize the original (if you still have it) in a reversible mode.

  • Anonymize in synthetic mode, not redaction.
  • Attach a dictionary with “add to dictionary” turned on.
  • Keep the original until the trip is done.
  • Treat restored files as real PII again.
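
The reversibility rule behind this checklist can be written as a small pre-flight check. This is a sketch under assumptions: the mode strings and the `can_restore` helper are made up for illustration and are not part of Piixie.

```python
REVERSIBLE_MODES = {"synthetic", "label"}   # assumed mode names; redaction is excluded


def can_restore(mode: str, has_run_record: bool, has_dictionary: bool) -> bool:
    """True only for a reversible mode with a run record or a dictionary still available."""
    return mode in REVERSIBLE_MODES and (has_run_record or has_dictionary)


synthetic_with_dictionary = can_restore("synthetic", has_run_record=False, has_dictionary=True)
label_with_nothing_kept = can_restore("label", has_run_record=False, has_dictionary=False)
redacted_with_everything = can_restore("redaction", has_run_record=True, has_dictionary=True)
```

Only the first case passes: a reversible mode is necessary but not sufficient, and no amount of record keeping rescues a redacted file.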