M6: Driving the model from code (the API, properly)
You can call a model and you can prompt it. Today you take the wheel: the knobs that shape every reply (length, randomness), streaming so words appear as they're written, and the big one, getting back guaranteed structured JSON your code can use, not just prose to read. By the end the API isn't a mystery box; it's a tool you drive.
Today's win: an app that turns messy free-text into clean, guaranteed-valid JSON your code
can use, plus fluency with max_tokens, temperature, and streaming.
Today you will
- Control replies with
max_tokensandtemperature(and learn why the newest Opus models drop the temperature knob) - Stream a reply so it appears word-by-word
- Get structured JSON output the API guarantees: and parse it into a Python dict you use
Run of show (~60 min)
| Time | What we do |
|---|---|
| 0:00 | Hook + the win we're chasing |
| 0:05 | The one idea: a request is data you control; a response is data you parse (full read in notes.md) |
| 0:10 | Lab Part A: max_tokens, temperature, streaming |
| 0:35 | Lab Part B: build the structured-JSON extractor |
| 0:55 | Show: post your extracted record to the wins board |
| 1:00 | Wrap + take-home |
If you get stuck
- No new setup, reuse M4's key,
.env, and libraries. Same.env/key fixes apply. - Two model facts that trip people:
temperatureerrors onclaude-opus-4-8(useclaude-haiku-4-5for the temperature demo, the lab does), and structured JSON output needs a current model (we use ones that support it). Nothing here can harm your computer. - Re-read the You should now see line and compare with your partner.
Optional challenge
Change the extractor's schema to pull fields you care about (e.g. add a priority enum to a
task extractor, or a due_date). Feed it three messy notes and confirm every reply parses on the
first try, that reliability is the whole point of structured output.