M28: Agent UX and streaming to a UI (Part D: Agentic Systems)
A correct answer delivered after eight seconds of blank screen feels broken. The same answer, with "searching..." then the words appearing live, feels fast and trustworthy, even if it took exactly as long. How an agent FEELS to use is part of whether it is good. Today you build the UX layer: the agent stops going silent and instead streams what it is doing (thinking, searching, writing) and the answer token by token, surfaces its citations and cost, and lets the user cancel a long run.
Today's win: an agent that streams its progress and its answer live to a UI, shows sources and cost, and can be cancelled mid-run, demonstrated offline and over a real Server-Sent Events endpoint.
Today you will
- See why perceived latency matters as much as real latency
- Turn the agent into an event stream: status, tool, citation, token, cost, done
- Stream the answer token by token so it appears live instead of all at once
- Surface citations and cost in the UI (from M24 and M20/M25)
- Support cancellation, and serve the stream over Server-Sent Events (SSE) with FastAPI
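
To make the event-stream idea concrete, here is a minimal offline sketch. It is not the lab's actual code: `mock_chat_stream`, `render`, the dict-shaped events, and the hard-coded answer and cost are all assumptions chosen for illustration. Only the event names (status, tool, citation, token, cost, done) come from the list above.

```python
from typing import Any, Dict, Iterable, Iterator

Event = Dict[str, Any]  # assumed event shape: a plain dict with a "type" key


def mock_chat_stream(question: str) -> Iterator[Event]:
    """Hypothetical offline agent: yields UX events instead of returning one blob."""
    yield {"type": "status", "text": "thinking"}
    yield {"type": "tool", "name": "search", "query": question}
    yield {"type": "citation", "id": 1, "title": "Example source"}
    yield {"type": "status", "text": "writing"}
    answer = "Streaming makes waits feel shorter [1]."
    words = answer.split(" ")
    for word in words:
        # One event per word stands in for one event per model token.
        yield {"type": "token", "text": word + " "}
    # Cost figures are invented for the mock; a real agent would report its own.
    yield {"type": "cost", "tokens": len(words), "usd": 0.0}
    yield {"type": "done"}


def render(events: Iterable[Event]) -> str:
    """Tiny console 'UI': show each event as it arrives, return the full answer."""
    parts = []
    for ev in events:
        kind = ev["type"]
        if kind == "status":
            print(f"[{ev['text']}...]")
        elif kind == "tool":
            print(f"[tool: {ev['name']}]")
        elif kind == "citation":
            print(f"[source {ev['id']}: {ev['title']}]")
        elif kind == "token":
            print(ev["text"], end="", flush=True)
            parts.append(ev["text"])
        elif kind == "cost":
            print(f"\n[cost: {ev['tokens']} tokens, ${ev['usd']:.4f}]")
        elif kind == "done":
            print("[done]")
    return "".join(parts).strip()


if __name__ == "__main__":
    render(mock_chat_stream("why stream?"))
```

The design choice worth noticing is that the agent and the UI only share the event vocabulary. The same generator could feed a console, a web page, or a test without changing the agent.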
Run of show (about 55 minutes)
| Time | What we do |
|---|---|
| 0:00 | Hook: the blank screen problem |
| 0:05 | The one idea: emit events as you work, render them live (read notes.md) |
| 0:12 | Lab Part A: run the streamed agent and watch progress, tokens, citations, cost |
| 0:30 | Lab Part B: cancellation, then serve it over SSE and read the wire format |
| 0:50 | Show: post your streamed run |
| 0:55 | Wrap |
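
The wire format you read in Lab Part B is plain text: each Server-Sent Events message is a few `field: value` lines followed by a blank line. The sketch below shows one way to frame an event that way. `format_sse` and `sse_stream` are hypothetical helper names, and JSON-encoding the payload is an assumption, not necessarily what the lab does. With FastAPI, a generator like `sse_stream` is typically handed to `StreamingResponse` with `media_type="text/event-stream"`; that part is left out here so the example runs with the standard library only.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def format_sse(event: Dict[str, Any]) -> str:
    """Frame one event as an SSE message: an event line, a data line, a blank line."""
    # json.dumps escapes newlines, so the payload always fits on one data line.
    payload = json.dumps(event)
    return f"event: {event['type']}\ndata: {payload}\n\n"


def sse_stream(events: Iterable[Dict[str, Any]]) -> Iterator[str]:
    """Turn an event generator into a generator of SSE-framed strings."""
    for ev in events:
        yield format_sse(ev)


if __name__ == "__main__":
    demo = [{"type": "status", "text": "searching"}, {"type": "token", "text": "Hi"}]
    for frame in sse_stream(demo):
        print(frame, end="")
```

In the browser, an `EventSource` can listen for each named event type, which is why the event name is sent on its own line as well as inside the payload.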
If you get stuck
- Builds on M6 (streaming), M24 (citations), M20/M25 (cost), and M11/M18 (serving). The core lab runs offline, free, no key (a streaming mock). The SSE step needs `fastapi` plus `uvicorn` (from M11).
- The key trick is a generator: `chat_stream` yields events as it works instead of returning one blob. If a consumer stops iterating, the agent stops: that is cancellation for free.
- Nothing here can harm your computer. Read `events.py` to see the whole event vocabulary in one place.
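
The "cancellation for free" point can be seen in a few lines. This is a sketch with made-up names (`slow_agent`, `consume`, and the `log` list are hypothetical, not the lab's `chat_stream`): the generator only does work when the consumer asks for the next event, so when the consumer stops, no further steps run. Calling `close()` makes the generator's cleanup happen at a predictable moment.

```python
from typing import Dict, Iterator, List, Tuple


def slow_agent(log: List[str], steps: int = 100) -> Iterator[Dict[str, str]]:
    """Hypothetical agent: records each unit of work, then yields a token event."""
    try:
        for i in range(steps):
            log.append(f"work {i}")  # stands in for a model or tool call
            yield {"type": "token", "text": f"t{i} "}
    finally:
        # Runs when the generator finishes or is closed early.
        log.append("cleanup")


def consume(limit: int) -> Tuple[List[str], List[str]]:
    """Read events until the 'user' cancels after `limit` tokens."""
    log: List[str] = []
    stream = slow_agent(log)
    seen: List[str] = []
    for ev in stream:
        seen.append(ev["text"])
        if len(seen) >= limit:
            break  # the user pressed cancel
    # Explicit close triggers the finally block now rather than at garbage collection.
    stream.close()
    return seen, log


if __name__ == "__main__":
    seen, log = consume(3)
    print(seen)
    print(log)
```

Only three of the hundred possible steps run. Over a network the same idea applies, but the server has to notice that the client went away before it stops iterating.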
Optional challenge
Open starters/add_event.py and add a UX event of your own: a step-progress
indicator ("step 2 of 3"), a typing indicator before the first token, or a retry notice when a transient
error is recovered (M22). Small touches like these are most of what makes an agent feel polished.
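
If you want a starting point for the step-progress idea, one possible shape is a wrapper generator that injects a new event in front of each tool call. This is only a sketch: the contents of the starter file are not shown here, so `with_step_progress`, the `progress` event type, and the dict-shaped events are all assumptions you would adapt to the real event vocabulary.

```python
from typing import Any, Dict, Iterable, Iterator

Event = Dict[str, Any]  # assumed event shape: a plain dict with a "type" key


def with_step_progress(events: Iterable[Event], total_steps: int) -> Iterator[Event]:
    """Yield a hypothetical 'progress' event before every tool event."""
    step = 0
    for ev in events:
        if ev["type"] == "tool":
            step += 1
            yield {"type": "progress", "text": f"step {step} of {total_steps}"}
        yield ev  # the original event always passes through unchanged


if __name__ == "__main__":
    run = [
        {"type": "tool", "name": "search"},
        {"type": "tool", "name": "read"},
        {"type": "token", "text": "Done."},
        {"type": "done"},
    ]
    for ev in with_step_progress(run, total_steps=2):
        print(ev)
```

Wrapping the stream keeps the agent untouched, so the new event can be added or removed without changing how the agent itself runs.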