The PII type catalog
Every entry carries a type — the model’s classification of what the value is. The type is more than a label: it decides what a synthetic replacement looks like, and it’s how you filter and search entries.
Common types
Section titled “Common types”| Type | Covers | Synthetic example |
|---|---|---|
NAME | People’s names, in any form (full, first, surname, with title) | Marcos Patel → David Romero Gil |
EMAIL | Email addresses | [email protected] → [email protected] |
PHONE | Phone and fax numbers | +34 612 345 678 → +34 698 112 530 |
ADDRESS | Street addresses, full or partial | C/ Mayor 12, Madrid → Av. del Sol 84, Sevilla |
ID | Government and record IDs — DNI/NIE, NHC, passport, SSN | 47829105T → 71530642M |
IBAN | Bank account numbers (IBAN and local formats) | ES91 2100 0418 45… → ES66 0049 1500 05… |
CARD | Payment card numbers | 4539 1488 0343 6467 → 4716 9012 3456 7890 |
DOB | Dates of birth | 14/03/1982 → 02/11/1979 |
DATE | Other identifying dates (employment spans, admission dates) | Jan 2020 – Mar 2023 → 02/2018 – 11/2021 |
ORG | Organizations, employers, institutions | Acme Ibérica SL → Globex Hispania SA |
URL | Personal sites and profile links | linkedin.com/in/mpatel → … |
IP | IP addresses | 192.168.1.42 → 10.0.4.207 |
OTHER | Anything sensitive that doesn’t fit a category |
The exact catalog the model uses is broad and may grow; these are the ones you’ll see most. The editor’s type dropdown lists the full set with descriptions.
Type decides the synthetic shape
Section titled “Type decides the synthetic shape”Synthetic mode doesn’t just swap one string for another — it generates a value of the right kind and shape. A DOB of 14/03/1982 becomes another day/month/year date, not a random number. An ID keeps its rough letter-and-digit pattern. An EMAIL reuses the fake person’s name. This is why getting the type right matters when you edit an entry: change a misclassified PHONE to ID and the regenerated value matches the real shape.
See synthetic data for the full set of generator families and how shapes are preserved.
Dates are a judgement call
Section titled “Dates are a judgement call”Not every date is PII. “Released in March 2023” in a press release is public. “Empleado de enero 2020 a marzo 2023” on a CV pins down a person’s timeline and is identifying. The model is prompted to treat personal-timeline dates as PII (DATE) and leave clearly public ones alone — a distinction a pattern matcher can’t make. See why LLMs for PII detection.
Numeric amounts
Section titled “Numeric amounts”Prices, totals, and quantities aren’t usually PII, but they can be identifying in context (a very specific salary, a unique claim amount). They’re handled separately by a per-profile switch — preserve, randomize, or zero out — rather than as typed entries.
Redaction characters (for the Redact method)
Section titled “Redaction characters (for the Redact method)”When an entry is redacted rather than replaced, you pick the fill character. The shared catalog (used in both the editor and profile settings):
█ full block · ▓ dark shade · ▒ medium shade · ░ light shade · ■ ▬ blocks · * asterisk · or a custom character or string.