Building a dictionary from a run
The best way to fill a dictionary isn’t to type pairs — it’s to let a synthetic run generate them, then keep the ones you like. This turns a one-off anonymization into a permanent, reusable mapping.
The idea
When synthetic mode runs, it produces a full set of original → fake pairs for that document. Those pairs are exactly the shape a dictionary holds. Promoting them means: “the fake identities this run invented are now the canonical ones; use them again next time.”
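To make "promoting" concrete, here is a minimal Python sketch that treats both the run output and the dictionary as plain original → fake mappings. The `promote` function, the `keep` set, and the rule that an existing entry is never overwritten are assumptions made for illustration; they are not the tool's actual API or storage format. "Acme Ltd" and "Northwind SA" are made-up sample values.

```python
# Illustrative only: hypothetical data shapes, not the tool's real format.
def promote(run_pairs: dict[str, str], dictionary: dict[str, str], keep: set[str]) -> dict[str, str]:
    """Return a copy of `dictionary` with the selected run pairs added."""
    updated = dict(dictionary)
    for original, fake in run_pairs.items():
        # Promote only the pairs the user chose to keep. Assumption: an
        # identity already in the dictionary is left untouched.
        if original in keep and original not in updated:
            updated[original] = fake
    return updated


# Pairs one synthetic run might produce for a document.
run_pairs = {"Marcos Patel": "David Romero Gil", "Acme Ltd": "Northwind SA"}

dictionary = promote(run_pairs, {}, keep={"Marcos Patel"})
print(dictionary)  # {'Marcos Patel': 'David Romero Gil'}
```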
Turning it on in a profile
In the profile editor, the Synthetic settings include dictionary options:
- Add new entries to dictionary — after a run, any newly generated pairs are written into the chosen dictionary, each stamped with the source file and date.
- Replace with existing synthetic data — before generating anything new, reuse values already in the dictionary. A person you’ve seen before keeps their established fake identity; only genuinely new values get freshly generated.
Pick the target dictionary from the dropdown in the same panel (or open the manager from there to create one first).
The workflow it enables
Run these two switches together and you get a system that learns:
- First document mentions Marcos Patel. The run synthesizes David Romero Gil and adds the pair to the dictionary.
- Next document also mentions Marcos Patel. Because “replace with existing” is on, he’s recognized and becomes David Romero Gil again — same identity, no churn.
- New people in that second document get fresh fakes, which are added too.
Over a few documents the dictionary fills out, and your anonymized outputs become consistent across the whole set — the same real person is always the same fake person, everywhere.
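The loop described above can be sketched in Python as follows. This is a simplified model under stated assumptions, not the tool's implementation: `run_document`, `generate_fake`, and the two flag names are hypothetical stand-ins for the two profile switches, the generator just returns numbered placeholders, and "Ana Ruiz" is a made-up second name.

```python
from itertools import count

_counter = count(1)


def generate_fake(original: str) -> str:
    # Stand-in for the synthetic generator: a new placeholder on every call.
    return f"Fake Person {next(_counter)}"


def run_document(names, dictionary, reuse_existing=True, add_new=True):
    """Map each name in one document to a fake, updating `dictionary` in place."""
    result = {}
    for name in names:
        if name in result:
            continue  # keep the mapping stable within a single document
        if reuse_existing and name in dictionary:
            # "Replace with existing synthetic data": reuse the known identity.
            result[name] = dictionary[name]
        else:
            result[name] = generate_fake(name)
            # "Add new entries to dictionary": record only pairs not yet stored
            # (assumption: existing entries are never overwritten).
            if add_new and name not in dictionary:
                dictionary[name] = result[name]
    return result


dictionary = {}
first = run_document(["Marcos Patel"], dictionary)
second = run_document(["Marcos Patel", "Ana Ruiz"], dictionary)

# The same real person resolves to the same fake person in both documents.
assert first["Marcos Patel"] == second["Marcos Patel"]
print(len(dictionary))  # 2
```

With `reuse_existing=False`, the second run would invent a new fake for Marcos Patel, which is the churn the two switches together are meant to avoid.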
Where the entries land
Each promoted entry records the document it came from, so the manager’s Source file column tells you the provenance of every pair. Manually added entries show as Manual; run-promoted ones show the file name.
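One way to picture an entry with its provenance is a small record type. The sketch below is an assumption about shape only: the `Entry` class, its field names, and the `source_label` helper are hypothetical, and the file name and date are invented sample values.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Entry:
    original: str
    fake: str
    source_file: Optional[str] = None  # None means the pair was added by hand
    added_on: Optional[date] = None


def source_label(entry: Entry) -> str:
    """Text for the Source file column: the file name, or 'Manual'."""
    return entry.source_file if entry.source_file else "Manual"


promoted = Entry("Marcos Patel", "David Romero Gil", "report-01.docx", date(2024, 1, 15))
manual = Entry("Acme Ltd", "Northwind SA")

print(source_label(promoted))  # report-01.docx
print(source_label(manual))    # Manual
```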
When to use it
- Recurring documents: a monthly report on the same accounts, the same patients, the same clients.
- A document set: anonymizing a whole folder where entities recur across files.
- Test fixtures: generate a stable cast of fake people once, reuse them forever.
For a full walkthrough, see Consistent fake identities across a team.