Your first anonymization
This walkthrough takes one document from drop to safe copy.
1. Drop a file in
Drag a file (or a folder) onto the Piixie window, or click the drop zone to browse. Piixie accepts .txt, .md, .pdf, and .docx files. Files land in a queue where you can review them before anything is processed.
For PDFs, Piixie counts embedded images up front so you know whether image redaction will come into play.
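If you are about to drop a large folder, a quick pre-check can show which files fall outside the four accepted types. The sketch below is not part of Piixie; the function names are hypothetical, and matching extensions case-insensitively and walking subfolders are assumptions about how you might want to check, not statements about how Piixie handles folders.

```python
from pathlib import Path

# Extensions this walkthrough lists as accepted by Piixie.
SUPPORTED = {".txt", ".md", ".pdf", ".docx"}


def split_by_support(paths):
    """Partition paths into (supported, unsupported) by file extension.

    Assumption: extensions are compared case-insensitively.
    """
    supported, unsupported = [], []
    for p in map(Path, paths):
        target = supported if p.suffix.lower() in SUPPORTED else unsupported
        target.append(p)
    return supported, unsupported


def list_folder(folder):
    """Return every regular file under folder, recursively, in sorted order."""
    return sorted(p for p in Path(folder).rglob("*") if p.is_file())
```

Calling `split_by_support(list_folder("some/folder"))` gives you the two lists to inspect before you drag the folder in.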
2. Pick a mode
Each queued file gets an anonymization mode. The three everyday choices:
| Mode | What happens | Good for |
|---|---|---|
| Redaction | PII becomes [REDACTED] markers | Maximum certainty, legal review |
| Replacement | Numbered tokens like [NAME_1], [EMAIL_2] | Keeping entities distinguishable |
| Synthetic | Plausible fake values (“Marcus Patel” → “Ethan Vance”) | Documents that must read naturally |
Two more modes, LLM Gen and JavaScript, are available through profiles. The full comparison is on the modes page.
You can also pick a profile instead of a bare mode. A profile carries its own mode plus prompts, field rules, and replacement dictionaries.
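To make the three everyday modes concrete, here is a minimal sketch of what each one does to an already-detected list of entities. It illustrates the idea only and is not Piixie's implementation: the entity dictionaries, the function names, the per-type token counter, and the caller-supplied table of fake values are all assumptions made for the example.

```python
from collections import defaultdict


def redaction_mapping(entities):
    """Redaction: every entity collapses to the same marker."""
    return {e["text"]: "[REDACTED]" for e in entities}


def replacement_mapping(entities):
    """Replacement: numbered tokens, so distinct entities stay distinguishable.

    Assumption: numbering restarts per entity type, and a repeated value
    reuses its token.
    """
    counters = defaultdict(int)
    mapping = {}
    for e in entities:
        if e["text"] not in mapping:
            counters[e["type"]] += 1
            mapping[e["text"]] = f"[{e['type']}_{counters[e['type']]}]"
    return mapping


def synthetic_mapping(entities, fakes):
    """Synthetic: swap in plausible fake values from a caller-supplied table."""
    return {e["text"]: fakes[e["text"]] for e in entities}


def apply_mapping(text, mapping):
    """Rewrite text, replacing longer originals first to avoid partial hits."""
    for original in sorted(mapping, key=len, reverse=True):
        text = text.replace(original, mapping[original])
    return text


entities = [
    {"type": "NAME", "text": "Marcus Patel"},
    {"type": "EMAIL", "text": "marcus@example.com"},
]
text = "Contact Marcus Patel at marcus@example.com."
fakes = {"Marcus Patel": "Ethan Vance", "marcus@example.com": "ethan@example.com"}

print(apply_mapping(text, redaction_mapping(entities)))
# Contact [REDACTED] at [REDACTED].
print(apply_mapping(text, replacement_mapping(entities)))
# Contact [NAME_1] at [EMAIL_1].
print(apply_mapping(text, synthetic_mapping(entities, fakes)))
# Contact Ethan Vance at ethan@example.com.
```

The trade-off in the table shows up directly: redaction loses the distinction between the name and the email, replacement keeps it, and synthetic keeps the sentence readable.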
3. Process and watch
Hit process. A dialog streams the model’s progress in real time: extraction, detection, and rewriting stages, with a live token feed so you can see the model working. You can cancel at any point; nothing is written until the run completes.
If the selected model can’t analyze images and the file contains some, Piixie asks before proceeding: continue text-only, blur all images automatically, or apply the choice to every file in the queue.
4. Review the output
By default, the anonymized copy is written to Piixie’s output folder inside its app data directory. You can set a different output location instead: the same folder as the original, or a custom directory. The history tab records the run with a full replacement table: every detected entity, its type, the original text, and what it became.
Keep the original and the safe copy side by side until you trust the result. Models are good at this, but they are not infallible. Skim for anything that slipped through, especially short name forms and dates.
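A manual skim can be backed up with a small script that searches the safe copy for original values that should be gone. The sketch below is a hypothetical helper, not a Piixie feature. It assumes you have typed or pasted the original values from the replacement table into a list, and it checks each full value plus each word inside it, so a short form such as a bare surname is caught.

```python
import re


def find_leaks(safe_text, originals, min_len=3):
    """Return original values, or single words from them, still in safe_text.

    Matching is case-insensitive and anchored on word boundaries. Words
    shorter than min_len are skipped to avoid noise from initials.
    """
    haystack = safe_text.casefold()
    leaks = []
    for original in originals:
        # Check the full value and each word in it, so "Patel" is caught
        # even when only "Marcus Patel" was recorded.
        for candidate in [original, *original.split()]:
            needle = candidate.casefold()
            if len(needle) < min_len or candidate in leaks:
                continue
            if re.search(rf"(?<!\w){re.escape(needle)}(?!\w)", haystack):
                leaks.append(candidate)
    return leaks


safe = "Ethan Vance wrote to Patel on Monday."
print(find_leaks(safe, ["Marcus Patel", "marcus@example.com"]))
# ['Patel']
```

An empty result does not prove the copy is clean: this only finds values that were detected in the first place, so entities the model missed entirely still need a human read.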
5. Use the safe copy
Send the anonymized output wherever the original couldn’t go: an LLM prompt, a support ticket, a shared test fixture, a demo.