Recipe: anonymize an insurance claim (PDF)
Insurance documents pack a lot of PII into one page: the insured party, a perito (assessor), DNI and policy numbers, an amount, plus a handwritten signature and a rubber stamp that have no text to detect. This recipe handles all of it on a Catalan claim letter.
Goal: a carta d'indemnització you can share with a vendor or use as a sample — readable, with every identifying detail gone, including the graphics.
1. Mode and model
Use Synthetic mode with region Català (see synthetic data; Català uses a regional name pack on a Spanish structural base for phones and IDs). A local model is fine; setting reasoning to Low helps it tell the insurer (keep) from the insured (scrub).
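To make the "Spanish structural base" for IDs concrete, here is a minimal sketch of how a replacement DNI can be kept structurally plausible. It relies on the public DNI rule (8 digits plus a control letter chosen by the number modulo 23). The helper names `dni_letter`, `synthetic_dni` and `is_valid_dni` are hypothetical, and this page does not say whether the product itself computes the control letter.

```python
import random
import re

# Control letters for the Spanish DNI, indexed by (number mod 23).
DNI_LETTERS = "TRWAGMYFPDXBNJZSQVHLCKE"


def dni_letter(number: int) -> str:
    """Return the control letter for a DNI number."""
    return DNI_LETTERS[number % 23]


def synthetic_dni(rng: random.Random) -> str:
    """Build a random DNI with a matching control letter (hypothetical helper)."""
    number = rng.randrange(0, 100_000_000)
    return f"{number:08d}{dni_letter(number)}"


def is_valid_dni(value: str) -> bool:
    """Check the shape (8 digits + 1 letter) and the control letter."""
    if not re.fullmatch(r"[0-9]{8}[A-Za-z]", value):
        return False
    return value[8].upper() == dni_letter(int(value[:8]))


if __name__ == "__main__":
    fake = synthetic_dni(random.Random())
    print(fake, is_valid_dni(fake))
```

A checksum-valid fake is useful when the anonymized sample will be fed to a downstream system that validates IDs; if the document is only read by people, any 8-digit-plus-letter string serves the same purpose.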
2. Run it
Process the PDF. Vision reads the page even where it’s scanned. You’ll get text entries plus, possibly, flagged images.
3. Review the text entries
Open in the editor. Typical entries:
| Type | Original | Replacement |
|---|---|---|
| NAME | Maria Josep Solà | Joan Carles Ribera |
| ID | 41234567Z (DNI) | 39876543M |
| ID | pòlissa 2024-001 | pòlissa 2024-005 |
| NAME | Dr. Antoni Miquel Fontana (perit) | Dra. Lucía Sáez Marín |
| DATE | 12/06/2024 | 15/06/2024 |
Decide on the amount: if the indemnity figure is identifying, set the profile’s numeric amounts to randomize; if it’s just context, preserve it.
Keep the insurer (e.g. CATSEGUR) by turning its entry off — the company name is usually fine to keep, and a template can make that a permanent “keep” rule.
4. The signature and the stamp
These are graphics with no text, so detection can’t touch them. Use the drawing tool:
- Turn on the draw tool (the view flips to the original).
- Drag a box over the signature; in the dialog the Trapped text reads “(no text — coordinates only)”, confirming it’s coordinate-based. Pick Black box, Add redaction.
- Do the same over the stamp.
Both boxes appear in the Drawn areas list, each toggleable and undoable. The patches are painted into the PDF itself — genuinely gone.
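The "coordinates only" message can be pictured as a simple geometry check: a drawn box either overlaps some extracted word boxes or it overlaps none. The sketch below illustrates that idea only. `Box`, `Word` and `trapped_text` are hypothetical names, and the product's real data model and hit-testing are not described on this page.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in page coordinates (assumed units: points)."""
    x0: float
    y0: float
    x1: float
    y1: float

    def intersects(self, other: "Box") -> bool:
        # Two rectangles overlap only if they overlap on both axes.
        return (self.x0 < other.x1 and other.x0 < self.x1
                and self.y0 < other.y1 and other.y0 < self.y1)


@dataclass(frozen=True)
class Word:
    """A piece of extracted text and where it sits on the page."""
    text: str
    box: Box


def trapped_text(drawn: Box, words: List[Word]) -> List[str]:
    """Return the words a drawn box overlaps; an empty list means coordinate-only."""
    return [w.text for w in words if w.box.intersects(drawn)]


if __name__ == "__main__":
    words = [Word("Signat:", Box(60, 700, 110, 712))]
    signature_area = Box(120, 690, 260, 740)  # drawn over the ink, beside the label
    hits = trapped_text(signature_area, words)
    print(hits if hits else "no text under this box, redact by coordinates")
```

An empty result is the expected case for a signature or stamp: there is nothing for text detection to match, so the box itself is the redaction.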
5. Save
Save to output. The delivered PDF has fake parties, fake IDs, the insurer kept, and the signature and stamp blacked out.
6. Make it repeatable
Claims from the same insurer share a shape. Save a template: keep the insurer, synthesize parties and assessors, redact policy IDs. The next claim applies it in one click; you’ll still draw the signature and stamp boxes per file, since coordinates don’t transfer.
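As a mental model, a template is a small set of per-entry rules applied to every detected entry. The sketch below is illustrative only: the `Entry` fields, the `subtype` label, the rule names and the `[REDACTED]` marker are all assumptions, not the product's actual template format.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Entry:
    """One detected item (hypothetical shape)."""
    type: str         # e.g. "NAME", "ID", "ORG"
    subtype: str      # e.g. "insured", "assessor", "policy", "dni"
    original: str
    replacement: str  # synthetic value proposed for this entry


# Hypothetical template: keep the insurer, redact policy IDs, synthesize the rest.
KEEP_VALUES: Set[str] = {"CATSEGUR"}
ACTIONS: Dict[str, str] = {"policy": "redact"}


def apply_template(entries: List[Entry]) -> List[str]:
    """Return the text that would be written out for each entry."""
    output = []
    for entry in entries:
        if entry.original in KEEP_VALUES:
            output.append(entry.original)      # permanent "keep" rule
        elif ACTIONS.get(entry.subtype) == "redact":
            output.append("[REDACTED]")        # no synthetic stand-in
        else:
            output.append(entry.replacement)   # default: synthesize
    return output


if __name__ == "__main__":
    claim = [
        Entry("ORG", "insurer", "CATSEGUR", "ASSEGURA"),
        Entry("NAME", "insured", "Maria Josep Solà", "Joan Carles Ribera"),
        Entry("ID", "policy", "pòlissa 2024-001", "pòlissa 2024-005"),
    ]
    print(apply_template(claim))
```

Nothing in these rules refers to page positions, which is why the text rules carry over to the next claim while the drawn boxes do not.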
For a folder of claims at once, see batch-anonymize invoices — the same approach works for any look-alike PDF set.