Lab M22: make an agent survive the real world

You'll need: your venv with the anthropic and python-dotenv libraries from M4. The core lab needs no API key, costs nothing, and runs instantly (we inject fake failures and stub out the backoff waits). A live run at the end is optional. Time: about 45 minutes. Work in your breakout pair.

Heads up: we make the model fail on purpose. It errors, it hangs, it loops, and it tries to do something risky, while you watch each reliability pattern handle it. The "risky" tool only pretends to send an email; nothing leaves your machine and nothing can be harmed.

This lab has two parts:
- Part A: retry, timeout, and graceful degrade (surviving a flaky or down service).
- Part B: step caps and the human-approval gate (surviving the agent itself).

flowchart TB
  T["task"] --> CAP{"step cap<br/>tick()"}
  CAP -->|ok| CALL["model call"]
  CALL --> TO["timeout"]
  TO --> RE["retry + backoff"]
  RE -->|still failing| DEG["safe message<br/>(degrade)"]
  RE -->|tool wanted| GATE{"risky?<br/>approval gate"}
  GATE -->|safe / approved| RUN["run tool"]
  GATE -->|denied| BLK["blocked"]
  CAP -->|over cap| STOP["stop: runaway"]

Part A: surviving a flaky or down service

Step 1: Set up

Copy the solution/ files and starters/.env.example into a folder. Activate your venv.

python -c "import anthropic, dotenv; print('deps ok')"

You should now see: deps ok. (If not: pip install anthropic python-dotenv, the M4 libraries.)

Step 2: Run the fault-injection demo

python demo_mock.py

You should now see five sections (A to E). Look at section A first:

==== A. RETRY: model fails twice, then works ====
{'answer': 'The answer is 391.', 'steps': 2, 'blocked': [], 'degraded': False}

The fake model raised an error on its first two calls, but retry waited and tried again, so the agent still got the right answer. A blip did not become a failure.
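To make the pattern concrete, here is a minimal sketch of retry with exponential backoff. It is not the lab's reliability.py: the names TransientError and retry_sketch, the default of 3 attempts, and the 0.5s base delay are all assumptions chosen for illustration. The injectable sleep mirrors the way the demo stubs out its waits.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (rate limit, 503, timeout)."""


def retry_sketch(fn, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fn(); on TransientError wait base_delay * 2**attempt, then try again."""
    for attempt in range(attempts):
        try:
            return fn()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller decide what to do
            sleep(base_delay * 2 ** attempt)  # backoff: 0.5s, 1.0s, 2.0s, ...


# A fake call that fails twice, then works (like section A of the demo).
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] <= 2:
        raise TransientError("simulated blip")
    return "The answer is 391."


waits = []  # record the requested waits instead of really sleeping
result = retry_sketch(flaky, sleep=waits.append)
print(result, waits)
```

Any other exception type passes straight through retry_sketch without a second attempt, which is the "fail fast on permanent errors" half of the pattern.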

Step 3: See the timeout fire

Look at section E of the same output:

==== E. TIMEOUT: a slow call is given up on ====
caught: call timed out after 0.05s ...

The call took too long, so the agent gave up on it and raised a transient error, the kind that retry knows to try again. Open reliability.py and read call_with_deadline and retry. You should now see: retry only re-runs on transient errors and waits longer each time (backoff).
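One way to put a deadline on a blocking call is to run it in a worker thread and stop waiting after a fixed time. The sketch below assumes that approach; the lab's call_with_deadline may be built differently, and the names call_with_deadline_sketch and CallTimedOut are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


class CallTimedOut(Exception):
    """Hypothetical transient error raised when a call misses its deadline."""


def call_with_deadline_sketch(fn, seconds):
    """Run fn() in a worker thread; stop waiting for it after `seconds`."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fn)
    try:
        return future.result(timeout=seconds)
    except FutureTimeout:
        raise CallTimedOut(f"call timed out after {seconds}s") from None
    finally:
        # Do not block on the slow worker; it finishes in the background.
        pool.shutdown(wait=False)


def slow():
    time.sleep(0.3)  # stands in for a hung request
    return "too late"


message = None
try:
    call_with_deadline_sketch(slow, 0.05)
except CallTimedOut as exc:
    message = str(exc)
print("caught:", message)
```

Note the limitation of this design: the caller stops waiting, but the worker thread is not cancelled. For real network calls, a timeout set on the HTTP client itself is usually the cleaner control.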

Step 4: See graceful degradation

Look at section C:

==== C. GRACEFUL DEGRADE: outage, all retries fail ====
{'answer': 'Service unavailable, please try again later. ...', 'degraded': True}

Here the service is fully down, so every retry fails. Instead of crashing, the agent returns a calm message and marks the result degraded. Failing safely is the goal when you cannot succeed.
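The degrade step can be as small as one try/except around the already-retried call. This is a sketch under assumptions: answer_or_degrade is a hypothetical name, and the result dict copies only the two keys visible in the demo output.

```python
SAFE_MESSAGE = "Service unavailable, please try again later."


def answer_or_degrade(call_model):
    """Return the model's answer, or a calm fallback marked degraded=True."""
    try:
        return {"answer": call_model(), "degraded": False}
    except Exception:
        # Retries are already exhausted by this point, so fail safely.
        # A real agent would also log the exception here.
        return {"answer": SAFE_MESSAGE, "degraded": True}


def down():
    raise ConnectionError("simulated outage")


result = answer_or_degrade(down)
print(result)
```

The degraded flag matters as much as the message: it lets calling code tell a fallback apart from a real answer without parsing the text.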

Part B: surviving the agent itself

Step 5: Stop a runaway loop

Look at section B:

==== B. STEP CAP: model loops forever, cap stops it ====
{'answer': 'Stopped: exceeded 4 steps (possible runaway loop)', 'steps': 5, 'degraded': True}

The fake model never finishes; it asks for a tool again every turn. Without a cap, a real model doing this would run forever and spend real money. StepLimiter stopped it at the cap. This is your hard cost-safety control (and M20's observability is how you would have spotted the loop).
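A step cap is a counter that raises once it passes a limit. The sketch below is an illustration, not the lab's StepLimiter: the class StepLimiterSketch, the exception StepLimitExceeded, and the run_loop helper are hypothetical, with the tick() name borrowed from the flowchart.

```python
class StepLimitExceeded(Exception):
    """Hypothetical error raised when the loop goes over its cap."""


class StepLimiterSketch:
    def __init__(self, max_steps):
        self.max_steps = max_steps
        self.steps = 0

    def tick(self):
        """Count one step; raise once the count passes max_steps."""
        self.steps += 1
        if self.steps > self.max_steps:
            raise StepLimitExceeded(
                f"exceeded {self.max_steps} steps (possible runaway loop)"
            )


def run_loop(wants_tool, max_steps=4):
    """Keep looping while the fake model asks for a tool, up to the cap."""
    limiter = StepLimiterSketch(max_steps)
    try:
        while True:
            limiter.tick()
            if not wants_tool():
                return {"answer": "done", "steps": limiter.steps, "degraded": False}
    except StepLimitExceeded as exc:
        return {"answer": f"Stopped: {exc}", "steps": limiter.steps, "degraded": True}


result = run_loop(lambda: True)  # a model that always wants another tool call
print(result)
```

In this sketch the step count ends at 5 with a cap of 4 because the fifth tick() is the one that trips the limit, before any fifth model call is made.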

Step 6: Block a risky action behind human approval

Look at section D:

==== D. APPROVAL GATE: a risky action ====
default (deny):   {... 'blocked': ['send_email'] ...}
human approves:   {... 'blocked': [] ...}

The model tried to call send_email, a world-changing action. By default the agent BLOCKED it (no human said yes). When an approver returns yes, the same action runs. Open agent.py: multiply is in SAFE_TOOLS and runs freely; send_email is in RISKY_TOOLS and must pass approval_gate. The agent proposes; a human decides (human-in-the-loop, M14).
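The gate itself can be sketched as a lookup plus a callback. SAFE_TOOLS and RISKY_TOOLS take their names and contents from the text above; the function signature, the deny_all default, and the choice to deny unknown tools are assumptions, so expect agent.py to differ in detail.

```python
SAFE_TOOLS = {"multiply"}
RISKY_TOOLS = {"send_email"}


def deny_all(tool_name, args):
    """Default approver: no human said yes, so the answer is no."""
    return False


def approval_gate_sketch(tool_name, args, approver=deny_all):
    """Return True if the tool may run."""
    if tool_name in SAFE_TOOLS:
        return True  # safe tools run freely
    if tool_name in RISKY_TOOLS:
        return bool(approver(tool_name, args))  # a human decides
    return False  # unknown tools are denied (an assumption of this sketch)


# Default policy: the risky call is blocked, the safe one is not.
blocked = [
    name
    for name in ["multiply", "send_email"]
    if not approval_gate_sketch(name, {})
]
print(blocked)

# With an approver that says yes, the same action is allowed.
approved = approval_gate_sketch("send_email", {}, approver=lambda name, args: True)
print(approved)
```

Defaulting to deny means a forgotten or misconfigured approver blocks the action instead of letting it through.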

Step 7: Tune a policy yourself

From your terminal, prove the cap is yours to set:

python -c "import agent, demo_mock as d; print(agent.run('loop', client=d.looping_client(), max_steps=2, sleep=lambda x:None))"

You should now see: Stopped: exceeded 2 steps .... You changed the safety limit and the agent obeyed it. Try max_steps=8 to see it run longer before stopping.

Step 8 (optional, costs a few tokens): a real run

Put your key in .env (copy .env.example), then:

cp .env.example .env      # then edit .env and paste your key
python agent.py

You should now see: a normal result dict with the answer containing 391 and degraded: False. The reliability wrappers are invisible when nothing fails, which is exactly the point. Steps 1 to 7 need no key.

Step 9: Show it

Post in the chat one section from the demo where a fault was handled: the retry recovery (A), the loop stopped (B), the safe degrade (C), or the blocked risky action (D).

If you get stuck

ModuleNotFoundError: No module named 'anthropic' -> pip install anthropic python-dotenv (M4 libraries).
demo_mock.py cannot find agent/reliability -> run it from inside the folder with the solution .py files.
The demo seems to pause -> it should not; backoff sleep is stubbed with sleep=lambda x: None. If you call agent.run yourself, pass that too or it will really wait.
ANTHROPIC_API_KEY error in Step 8 -> your .env is not named exactly .env, or the key line is wrong. See api-keys.md. Steps 1 to 7 need no key.

Check yourself

Why retry only on "transient" errors, and not on every error?

A transient error (rate limit, 503, timeout) is likely to succeed if you wait and try again. A permanent error (an invalid prompt, a 400) will fail every time, so retrying it just wastes time and money. Retry the recoverable ones; fail fast on the rest.

What does a step cap protect you from?

A runaway loop: an agent that keeps calling tools and never finishes, spending tokens (money) on every call. The cap stops the run after a set number of steps. Observability (M20) shows the loop; the cap prevents the bill.

Why gate some tools behind human approval but not others?

Read-only or pure-computation tools (like multiply) are low-risk to automate. Actions that change the world (send email, delete, spend, deploy) are irreversible or outward-facing, so a wrong or manipulated agent could do real harm. Those get a human yes first; the agent only proposes them.

What does "graceful degradation" mean here?

When the agent cannot succeed (a real outage), it returns a calm, safe message and marks the result degraded, instead of crashing with an error. A predictable safe failure beats an ugly one.