Batch output: copies & naming

Most runs produce one safe copy. But with synthetic mode you can ask one run to produce several variants of a document — each with a different fake cast — and you can control how all your output files are named. These are the batch output controls.

Copies: many variants from one document

When a profile uses synthetic mode, the staging table gains a Copies column with a stepper. Set it above 1 and that document is anonymized that many times in one run, each pass generating a fresh set of fake values.

expediente-clinico-marcos-patel.pdf   Copies: 3
        ↓  one run
expediente…__001-1.pdf   (David Romero Gil, NHC 84913366, …)
expediente…__001-2.pdf   (Hugo Navarro Ortiz, NHC 55120947, …)
expediente…__001-3.pdf   (Iker Pardo Vega, NHC 73048815, …)

Each copy is internally consistent — within a single variant, every mention of a person still maps to one fake identity — but the variants differ from each other.

Why generate copies

Training and test data. Turn one real document into many realistic, non-identifying examples that share its structure.
Demos. Show a workflow on several “different” records without touching production data.
Robustness testing. Feed a pipeline many shape-identical inputs with varied values.

Output naming

How output files are named is set in the profile’s Output section, under Output file naming. Each output gets a token appended as name__token:

Style	Token looks like	Notes
Number	`001`, `002`, …	A per-file counter kept on this computer
Hash	`ba72c58e`	A short hash, good for uniqueness
Date	`20260612`	The run date (YYYYMMDD)

The number style increments a counter per source file that persists between runs, so the third time you anonymize informe.pdf you get informe__003. When you also generate copies, the copy index is appended too: informe__003-1, informe__003-2.

The base file name is itself anonymized — if the original was invoice-marcos-patel.pdf, the PII in the name gets the same treatment as the body. See output location.

How batches show in history

A run that produced several copies is a single history entry, tagged with a count like 3 files. Opening it offers to open all the generated files (Piixie asks first if there are many). The replacement table reflects the run; each variant’s files sit together in your output folder.

Combining with dictionaries and templates

For varied output (the point of copies), leave dictionary reuse off — you want different casts.
For consistent output across runs, that’s a dictionary job, not copies.
A template shapes how each copy is anonymized (which fields, which methods); copies just multiply it.