M30: Agent data and feedback loops (Part D: Agentic Systems)
A deployed agent is not the end; it is a data source. Every real question it answers, and every thumbs up, thumbs down, or correction a user gives, is the most valuable signal you have for making it better, more valuable than any dataset you could buy. Today you build the flywheel: capture production interactions and feedback, curate them into new eval cases (M26) and fine-tuning examples (M15), and feed them back in. Usage compounds into quality, and you do it without leaking anyone's private data.
Today's win: a feedback pipeline that turns real interactions into a regression eval case AND a corrected training example, with PII redacted on the way in, all demonstrated offline.
Today you will
- Log production interactions (question, answer, sources, feedback) with PII redacted at write time
- Read the three core signals: thumbs up, thumbs down + correction, and down with no fix
- Curate logs into eval cases (M26): up becomes golden, down+correction becomes a regression
- Curate logs into fine-tuning examples (M15): learn from good answers and corrected ones
- See the flywheel: usage becomes the data that improves the agent, while respecting privacy (M14)
Run of show (about 50 minutes)
| Time | What we do |
|---|---|
| 0:00 | Hook: your agent is a data source |
| 0:05 | The one idea: capture, curate, feed back (read notes.md) |
| 0:12 | Lab Part A: log interactions with feedback, and redact PII |
| 0:28 | Lab Part B: curate into eval cases and fine-tuning examples |
| 0:45 | Show: post a down-vote that became a regression test and a training fix |
| 0:50 | Wrap |
If you get stuck
- Connects M20/M26 (evals) with M15 (fine-tuning) and M14 (privacy). The whole lab runs offline, free, no key (logging and curation are plain Python over JSONL).
- No new libraries. Nothing here can harm your computer; all the data is synthetic.
- The mental model: feedback is labeled data you already paid to generate. Curation is the judgement step that turns it into evals and training data.
Optional challenge
Open starters/add_signal.py and add an implicit feedback signal: treat
a user's edit of the answer as a correction, or "try again" as a weak negative. Implicit signals are
far more plentiful than thumbs, but noisier, so curate them carefully (and redact PII first).