Lab: M6: driving the model from code

You'll need: your M4 setup, venv active ((.venv)), key in .env, anthropic + python-dotenv installed. No new setup. Time: ~50 min • Work in your breakout pair.

Heads up: you'll see a couple of real API quirks today (temperature only works on some models; JSON output needs a current model). Those aren't bugs, they're the kind of detail that separates "I called an API once" from "I drive the API." Errors are normal and safe.

This lab has two parts: - Part A: the controls: max_tokens, temperature, and streaming. - Part B: build the structured-JSON extractor (today's win).

flowchart LR
  Req["request you control<br/>model · max_tokens · temperature · schema"] --> API["Claude API"]
  API --> Resp["response you parse<br/>text · stream · guaranteed JSON"]

Part A: the controls

Step 1: Set up the folder

Put parameters.py, streaming.py, extract.py (from solution/) and extract_starter.py (from starters/) in a folder with your M4 .env. Activate your venv.

You should now see: (.venv) in your prompt and the files listed by ls (Windows dir).

Step 2: `max_tokens` and `temperature`

python parameters.py

This uses claude-haiku-4-5 (cheap, fast, and it accepts a temperature setting, which the newest Opus models don't).

You should now see: the max_tokens=15 answer is cut off mid-thought while max_tokens=80 is fuller (the cap limits length). The two temperature 0.0 coffee-shop names are the same or nearly so; the two temperature 1.0 names are different and more inventive. Low temperature = predictable; high = varied.

Step 3: Read why temperature is on Haiku, not Opus

Open parameters.py and read the top comment.

You should now see / say: the newest Opus models (like claude-opus-4-8) manage their own randomness and reject a temperature setting: sending one returns a 400 error. So you pick a model that supports the knob you need. Choosing models for the job is part of driving the API.

Step 4: Streaming

python streaming.py

You should now see: the pep talk appears word by word, like a chat app, instead of one delayed block. Same request as a normal call, client.messages.stream(...) just hands you the text as it's produced. (Great for anything longer than a sentence.)

Part B: guaranteed structured output (the build)

Step 5: See the M5 → M6 upgrade

In M5 you asked for JSON and parsed defensively (it could arrive fenced or broken). Open extract.py and find EXPENSE_SCHEMA and the output_config=... line. Run it and describe an expense in plain words (e.g. "grabbed lunch with the team at Nando's, about 48 quid, work expense"):

python extract.py

You should now see: your messy sentence turned into clean fields, Item, Amount, Category, Reimbursable: parsed straight from the reply with json.loads, no fence-stripping, no try/except. The API guaranteed the JSON matched the schema. That reliability is the upgrade.

Step 6: Read how the guarantee works

Look again at extract.py: output_config={"format": {"type": "json_schema", "schema": EXPENSE_SCHEMA}}.

You should now see / say: you hand the API a schema (the exact shape, with types and an enum for category), and it constrains the model to return valid JSON in that shape. You parse once, confidently, no defensive code.

Step 7: Build your own extractor (finish the starter)

Open extract_starter.py. It extracts a contact from a messy intro. Finish TODO 1: add company and email to the schema's properties and to required. Then run it:

python extract_starter.py

You should now see: the messy intro ("hi im Dana, i run ops over at BrightLeaf Coffee, reach me on dana@brightleaf.co") turned into name, company, and email fields. Change MESSY to your own intro and run again.

Step 8: Make it yours

Edit the schema to extract something from your world, a task (title, due, priority enum), a product review (product, sentiment, rating), a calendar event. Feed it messy text and confirm it parses first try.

You should now see: your own messy-text → structured-data tool, returning exactly the fields you defined, reliably. You're driving the API.

Stuck? Working examples are in ../solution/. Peek only after you've tried.

Your win

You drive the API: you set max_tokens and temperature, stream replies, and turn messy text into guaranteed-valid JSON your code can use.

Post it to the chat wins board: your messy input → structured output, e.g. "'lunch w team ~48 quid work expense' → {item: Lunch, amount: 48.0, category: food, reimbursable: true}, guaranteed JSON, first try "

Take-home (optional)

Combine M5 + M6: write a tool that takes a messy customer message, and returns JSON with a category (enum), a sentiment, and a polite suggested_reply. One call, structured output, genuinely useful. Bring it to next session, it's a real mini-app.