Lab: M14: test for bias, protect privacy
You'll need: your M4 setup (venv, anthropic) for Part A; Part B is pure Python (no key).
Time: ~45 minutes • Work in your breakout pair.
Heads up: this is educational and authorized: we probe our own app to make it fairer and safer; we don't build anything harmful. Bias findings are sensitive, discuss them thoughtfully. Nothing here can harm your computer.
This lab has two parts: - Part A: probe a model for unfair treatment (needs your key). - Part B: redact personal data before it's sent (no key).
flowchart LR
Same["same task"] --> Swap["swap a sensitive attribute"] --> Cmp["compare outputs"] --> Judge["human judges: bias?"]
PII["user text"] --> Red["redact PII"] --> Safe["safe to send"]
Part A: fairness probe
Step 1: Set up
Put bias_probe.py, privacy.py (from solution/) and responsible_starter.py
(from starters/) in a folder with your M4 .env. Activate your venv.
You should now see: (.venv) and those files.
Step 2: Run the bias probe
python bias_probe.py
You should now see: for each pair, two suggested salaries and either ~ same suggestion (good)
or DIFFERENT by $…, investigate. If a number moves when only the name changed, you've found
something worth investigating, that's a fairness signal you could never spot by trying it once.
Step 3: Judge it (the human part)
With your partner, look at any flagged pair. Is the difference a real bias (a stereotype), or just noise? Re-run once or twice, does the gap persist?
You should now see / say: "a difference isn't automatic proof, I decide if it reflects a stereotype, and whether it's consistent." That human judgment is the responsible part; the tool just surfaces it.
Step 4: Read how the probe works
Open bias_probe.py. Note the PROBES pairs differ only by a sensitive attribute, and the task
is numeric so differences are measurable.
You should now see / say: "same task, swap a should-not-matter attribute, compare." It's M8's eval mindset and M10's red-team mindset, pointed at fairness.
Part B: protect privacy (no key needed)
Step 5: Run the redactor
python privacy.py
You should now see: a sample message with the email, phone, SSN, and card number replaced
by [… REDACTED], and a count of what was removed. This runs entirely on your machine, perfect for
cleaning text before you send it to a hosted model.
Step 6: Add your own pattern (finish the starter)
Open responsible_starter.py. Add a PII pattern to EXTRA_PATTERNS (TODO 1), e.g. a postcode/zip, and test it on text containing one.
You should now see: your new pattern getting redacted too. (Regex is a first line, names and odd formats slip through, so also collect less data in the first place.)
Step 7: Wire it into a real app
Picture your M7 RAG or M9 agent: where would you call redact_pii so user input is cleaned before
the model (or a tool) ever sees it? Write the one line.
You should now see / say: redact at the boundary: as soon as user input arrives, before it reaches the model, logs, or tools. Privacy by design.
Step 8: The responsible-AI checklist
Skim the duties table in notes.md. For your capstone idea, name one thing you'll do
for fairness, privacy, and human oversight.
You should now see: three concrete commitments for your own app. That's responsible AI, habits, not an afterthought.
Stuck? Working examples are in
../solution/.
Your win
You can test an AI app for unfair treatment, strip personal data before it's sent, and name the responsible-AI duties you own as the engineer.
Post it to the chat wins board: "Same job, just a different name → a $7k salary gap my app suggested. Found it, flagged it. And my redactor scrubs PII before anything's sent "
Take-home (optional)
Add a fairness probe to your capstone's eval set (M8) and redact_pii at its input boundary.
Combined with M10's guardrails, that's a genuinely responsible app, exactly what the capstone's "how
would you secure it?" question is really asking.