
M30: Agent data and feedback loops (Part D: Agentic Systems)

A deployed agent is not the end; it is a data source. Every real question it answers, and every thumbs up, thumbs down, or correction a user gives, is a signal for making it better. Because that signal comes from your actual users, it is often worth more than any dataset you could buy. Today you build the flywheel: capture production interactions and feedback, curate them into new eval cases (M26) and fine-tuning examples (M15), and feed them back in. Usage compounds into quality, and you do it without leaking anyone's private data.

Today's win: a feedback pipeline that turns real interactions into a regression eval case AND a corrected training example, with PII redacted on the way in, all demonstrated offline.
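To make "PII redacted on the way in" concrete, here is a minimal sketch of write-time redaction, not the lab's actual code. The names redact and log_interaction, the record fields, and the two regex patterns are all assumptions made for illustration; the patterns catch only obvious emails and phone numbers, and a real deployment would need broader coverage.

```python
import io
import json
import re

# Hypothetical patterns: they catch only obvious emails and phone numbers.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact(text):
    """Replace obvious emails and phone numbers with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return PHONE_RE.sub("[PHONE]", text)


def log_interaction(fh, question, answer, sources, feedback, correction=None):
    """Redact free-text fields, then append one JSON line to an open file handle."""
    record = {
        "question": redact(question),
        "answer": redact(answer),
        "sources": sources,
        "feedback": feedback,  # "up" or "down" in this sketch
        "correction": redact(correction) if correction else None,
    }
    fh.write(json.dumps(record) + "\n")
    return record


# An in-memory buffer stands in for the JSONL log file.
buf = io.StringIO()
rec = log_interaction(
    buf,
    question="I'm ana@example.com, call 555-123-4567. What is the refund window?",
    answer="Refunds are accepted within 30 days.",
    sources=["policy.md"],
    feedback="up",
)
print(rec["question"])
```

Redaction happens before the write, so the raw email and phone number never reach the log.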

Today you will

  • Log production interactions (question, answer, sources, feedback) with PII redacted at write time
  • Read the three core signals: thumbs up, thumbs down with a correction, and thumbs down with no correction
  • Curate logs into eval cases (M26): a thumbs up becomes a golden case, a thumbs down with a correction becomes a regression case
  • Curate logs into fine-tuning examples (M15): learn from good answers and corrected ones
  • See the flywheel: usage becomes the data that improves the agent, while respecting privacy (M14)
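The curation rules in the list above can be sketched in a few lines of plain Python. This is an illustration under assumptions, not the lab's code: the function names, the record fields, and the "golden"/"regression" labels are hypothetical. Sending a thumbs down with no correction to a review pile is also an assumption of this sketch.

```python
def to_eval_case(record):
    """Map one log record to an eval case, or None if it has no usable label."""
    if record["feedback"] == "up":
        # The answer was accepted, so it becomes the expected output.
        return {"kind": "golden", "input": record["question"],
                "expected": record["answer"]}
    if record["feedback"] == "down" and record.get("correction"):
        # The correction is the expected output; the bad answer is kept for reference.
        return {"kind": "regression", "input": record["question"],
                "expected": record["correction"], "rejected": record["answer"]}
    return None  # thumbs down with no correction: nothing to assert against


def to_training_example(record):
    """Map one log record to a prompt/completion pair, or None."""
    case = to_eval_case(record)
    if case is None:
        return None
    return {"prompt": case["input"], "completion": case["expected"]}


# Synthetic records covering the three core signals.
logs = [
    {"question": "What is the refund window?", "answer": "30 days.",
     "feedback": "up", "correction": None},
    {"question": "Do you ship abroad?", "answer": "No.",
     "feedback": "down", "correction": "Yes, to the EU and UK."},
    {"question": "Is there a student discount?", "answer": "Unknown.",
     "feedback": "down", "correction": None},
]

eval_cases = [c for c in map(to_eval_case, logs) if c is not None]
train_examples = [t for t in map(to_training_example, logs) if t is not None]
needs_review = [r for r in logs if to_eval_case(r) is None]

print([c["kind"] for c in eval_cases])
print(len(train_examples), len(needs_review))
```

The third record produces neither an eval case nor a training example. It tells you something went wrong but not what the right answer is, so this sketch sets it aside for a person to look at.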

Run of show (about 50 minutes)

Time   What we do
0:00   Hook: your agent is a data source
0:05   The one idea: capture, curate, feed back (read notes.md)
0:12   Lab Part A: log interactions with feedback, and redact PII
0:28   Lab Part B: curate into eval cases and fine-tuning examples
0:45   Show: post a down-vote that became a regression test and a training fix
0:50   Wrap

If you get stuck

  • This module connects M20/M26 (evals) with M15 (fine-tuning) and M14 (privacy). The whole lab runs offline and free, with no API key: logging and curation are plain Python over JSONL.
  • No new libraries. Nothing here can harm your computer; all the data is synthetic.
  • The mental model: feedback is labeled data you already paid to generate. Curation is the judgement step that turns it into evals and training data.

Optional challenge

Open starters/add_signal.py and add an implicit feedback signal: treat a user's edit of the answer as a correction, or "try again" as a weak negative. Implicit signals are far more plentiful than thumbs, but noisier, so curate them carefully (and redact PII first).
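If you want a starting point, here is one possible shape for an implicit signal, written without knowledge of what starters/add_signal.py actually contains. The function name, the event dictionaries, and the "weak_down" label are hypothetical. The redact parameter defaults to a pass-through so the sketch runs on its own; in the lab you would pass in your real redaction function.

```python
def apply_implicit_signal(record, event, redact=lambda text: text):
    """Return a copy of record with feedback inferred from an implicit event."""
    updated = dict(record)
    if event["type"] == "edit":
        # A user edit is treated as a correction; redact it before storing.
        updated.update(
            feedback="down",
            correction=redact(event["edited_text"]),
            signal="implicit_edit",
        )
    elif event["type"] == "retry":
        # "Try again" is only a weak negative: no correction is available.
        updated.update(feedback="weak_down", signal="implicit_retry")
    return updated


base = {"question": "Do you ship abroad?", "answer": "No.",
        "feedback": None, "correction": None}

edited = apply_implicit_signal(
    base, {"type": "edit", "edited_text": "Yes, to the EU and UK."}
)
retried = apply_implicit_signal(base, {"type": "retry"})

print(edited["feedback"], edited["signal"])
print(retried["feedback"], retried["correction"])
```

Keeping a separate signal field lets you filter or down-weight implicit records during curation, since they are noisier than explicit thumbs.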