M5: Prompt engineering
In M4 you got a model talking. But "make this nicer" gets you mush, while a well-built prompt gets you exactly what you pictured. Today you learn to steer: the same model, the same code, wildly different results, just from the words you choose. You'll A/B two prompts on the same input and watch the quality jump.
Today's win: you build a prompt-powered tool from your own life or work, and you can reliably make the model do what you want, proven by A/B-ing a vague prompt against an engineered one.
Today you will
- Use system vs user prompts, few-shot examples, chain-of-thought, and structured (JSON) output
- A/B test two prompts on the same input and see the difference with your own eyes
- Know when prompting alone is enough: and when you'll need more (a preview of RAG and agents)
Run of show (~60 min)
| Time | What we do |
|---|---|
| 0:00 | Hook + the win we're chasing |
| 0:05 | The one idea: the prompt is the program you write for the model (full read in notes.md) |
| 0:10 | Lab Part A: A/B vague vs engineered; add few-shot |
| 0:35 | Lab Part B: structured JSON output + a chain-of-thought try; build your own tool |
| 0:55 | Show: post your A vs B difference to the wins board |
| 1:00 | Wrap + take-home |
If you get stuck
- No setup today, you reuse M4's key,
.env, and libraries. If a call fails, it's almost always the same.env/key fix from M4. - Prompts are experiments, there's no single right answer, and "worse" outputs are useful data. Re-read the You should now see line and compare with your partner.
- If a JSON reply won't parse, run it again (prompt-only JSON isn't guaranteed, M6 fixes that). Nothing here can harm your computer.
Optional challenge
Find a prompt that breaks your tool (vague input, a trick request, another language) and then add one sentence to your system prompt that fixes it. You just did real prompt engineering, and previewed M10's guardrails.