How this lab works
You're the new privacy engineer at ClaimCheck AI, an insurance claims chatbot. It has 47k daily users — and it's been quietly bleeding customer PII into three different sinks for months. Below you'll see exactly what's leaking and where. Your job: enable the right redaction rules until all three channels read 0 items leaked, then run the privacy audit to capture the flag. This is the same pattern real teams ship with Microsoft Presidio in production.
Raw customer input
A user types into the ClaimCheck chat: "my SSN is …, card is …, please reissue".
Three sinks, no filter
That text flows straight into the LLM context, the log file, and the vector DB — verbatim.
Insert redactor
You build a rule-based redactor that runs before any sink. Each rule you enable rewrites matched items in place.
Run the audit
All three sinks show 0. Validator passes. Flag drops.
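The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not the lab's actual implementation: the rule names, the simplified regex patterns, the `Sinks` stand-in, and the sample values are all made up for the example.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field

# Hypothetical rules: simplified patterns for illustration, not production detectors.
RULES = {
    "ssn": (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
    "card": (re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"), "[CARD]"),
}


def redact(text: str, enabled: set[str]) -> str:
    """Apply every enabled rule, rewriting matched items in place."""
    for name, (pattern, placeholder) in RULES.items():
        if name in enabled:
            text = pattern.sub(placeholder, text)
    return text


@dataclass
class Sinks:
    """In-memory stand-ins for the LLM context, the log file, and the vector DB."""
    llm_context: list[str] = field(default_factory=list)
    log_file: list[str] = field(default_factory=list)
    vector_db: list[str] = field(default_factory=list)

    def write(self, text: str) -> None:
        # Every sink receives the same text, exactly as passed in.
        self.llm_context.append(text)
        self.log_file.append(text)
        self.vector_db.append(text)


def audit(sinks: Sinks) -> dict[str, int]:
    """Count PII items still present in each sink, checking all rules."""
    def leaks(entries: list[str]) -> int:
        return sum(len(p.findall(e)) for e in entries for p, _ in RULES.values())

    return {
        "llm_context": leaks(sinks.llm_context),
        "log_file": leaks(sinks.log_file),
        "vector_db": leaks(sinks.vector_db),
    }


# Made-up test values, not real customer data.
message = "SSN 123-45-6789 and card 4111 1111 1111 1111 on file"

unfiltered = Sinks()
unfiltered.write(message)  # no filter: PII reaches all three sinks

filtered = Sinks()
filtered.write(redact(message, enabled={"ssn", "card"}))  # redactor runs first

print(audit(unfiltered))
print(audit(filtered))
```

The point of the design is that `redact` runs once, before the fan-out, so no sink can receive text that bypassed the rules. Enabling only some rules leaves the remaining item types visible in every sink's audit count.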
Why this lab matters — real money, real lawsuits, real regulations
This isn't theoretical paranoia. Each of the incidents below cost real revenue, real headlines, or both, and the underlying bug class is still being shipped in 2026.
Samsung — internal source-code leak
Engineers pasted proprietary code into ChatGPT. Samsung responded by banning generative AI tools on company devices and networks (May 2023).
ChatGPT history bug
A bug in a Redis client library exposed other users' chat titles, plus partial payment data for about 1.2% of ChatGPT Plus subscribers who were active during the incident window.
Air Canada chatbot ruling
A British Columbia tribunal held Air Canada liable for its chatbot's inaccurate advice on bereavement fares (2024). Takeaway: AI behaviour = company liability.
LLM02 — Sensitive Information Disclosure
An entry in the OWASP Top 10 for LLM Applications (2025 edition), the canonical risk list. This lab is the hands-on companion to that entry.
Art. 25 — Privacy by Design
Legal requirement under the GDPR (EU): build redaction into the architecture from the start rather than adding it as an afterthought.
Microsoft Presidio
The de facto open-source library for PII detection and anonymization. Use it in production rather than relying on hand-rolled regexes alone.
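For orientation, a minimal sketch of what a Presidio-based redaction call can look like. It assumes the `presidio-analyzer` and `presidio-anonymizer` packages and a spaCy English model are installed. The helper name `redact_with_presidio` and the choice of entities are illustrative, not something this lab prescribes.

```python
# Entity labels for two of Presidio's built-in recognizers (chosen for illustration).
ENTITIES = ["US_SSN", "CREDIT_CARD"]


def redact_with_presidio(text: str) -> str:
    """Detect the listed entity types and return the text with them anonymized."""
    # Imported inside the function so this module still loads where
    # Presidio is not installed (an assumption made for this sketch).
    from presidio_analyzer import AnalyzerEngine
    from presidio_anonymizer import AnonymizerEngine

    analyzer = AnalyzerEngine()
    findings = analyzer.analyze(text=text, entities=ENTITIES, language="en")
    result = AnonymizerEngine().anonymize(text=text, analyzer_results=findings)
    return result.text
```

Calling `redact_with_presidio(message)` on a chat message before it reaches any sink plays the same role as the rule-based redactor in the lab. In a real service the two engines would normally be built once and reused instead of being constructed on every call.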