Skip to content

Dictionary-based reverse

Sometimes the file you want to reverse isn’t a clean copy of a past run’s output. An external tool edited it, it came from a teammate’s machine, or its history was cleared. With no run to invert, Piixie falls back to a dictionary of known swaps.

If you try to deanonymize a file Piixie doesn’t recognize from history, it offers the dictionary route:

File deanonymization record does not exist, try with an existing dictionary?

Pick a dictionary, or All dictionaries (most recent entry wins) to search across everything you’ve got. Piixie then reverses using those pairs.

Unlike exact reverse, there’s no precise mapping to invert, so Piixie has to find the fake values in the text first:

  1. It reads the document and identifies the candidate entities (names, IDs, emails, and so on).
  2. For each candidate, it looks for a dictionary entry whose replacement matches — turning that pair around to restore the original.
  3. It applies the restorations it found.
Edited copy: "Summary for david romero gil (id 84913366): stable, discharged."
↓ dictionary reverse
Restored: "Summary for Marcos Patel (id 1029384): stable, discharged."

External tools don’t always preserve text exactly. Dictionary reverse tolerates the common drifts:

  • Casedavid romero gil still matches David Romero Gil.
  • AccentsLucia Saez still matches Lucía Sáez.
  • Spacing and surrounding punctuation — minor differences don’t block a match.

This makes it robust to a document that’s been through a summarizer, a translator, or a chat tool that lower-cased or reflowed the text.

When you reverse across all dictionaries, the same fake value might appear in more than one. Piixie resolves the conflict by recency — the most recently added entry for that value wins. Usually that’s what you want: your latest mappings reflect your current intent.

Why a dictionary makes the round trip durable

Section titled “Why a dictionary makes the round trip durable”

Exact reverse depends on the original output file staying intact. A dictionary doesn’t — it can restore values out of a document that’s been rewritten, as long as the fake values themselves survived in the text. That’s exactly the situation after an LLM edits your anonymized file. Keeping a dictionary that grows from your runs is what makes the whole round trip dependable.

If none of the dictionary’s pairs appear in the document, Piixie says so:

No dictionary entries matched this document.

That usually means the wrong dictionary, or a document whose fake values were further changed downstream. Try All dictionaries, or confirm the file is really one you anonymized with these mappings.