Lab: M18: orchestrate a team of security sub-agents (and deploy it)

You'll need: your venv, your .env key (from M3), and one install for the deploy part: pip install fastapi "uvicorn[standard]" (you likely have these from M11). Synthetic data only. Time: ~50 minutes • Work in your breakout pair.

Heads up: this builds on M9 (agents), M16 (connectors/MCP), and M11 (deploy). The agents here investigate and recommend, they never take any action, and all the threat data is fake. Nothing here can harm your computer.

This lab has two parts: - Part A: run the orchestrator and watch four specialist sub-agents hand off to produce a report. - Part B: add your own sub-agent, then deploy the whole system behind an API.

flowchart LR
  A["alert"] --> O["ORCHESTRATOR<br/>investigate()"]
  O --> T["triage (L1)"]
  O --> E["enrich"]
  O --> C["correlate (L2)"]
  O --> R["report (lead)"]
  E -.->|lookup_ioc| K[("threat-intel<br/>connector")]
  C -.->|search_logs| L[("log<br/>connector")]
  R --> OUT["incident report"]

Part A: run the orchestrated SOC team

Step 1: Set up

Copy connectors.py, agents.py, orchestrator.py, and app.py from solution/ into a folder, and add_subagent.py + .env.example from starters/. Activate your venv. Copy .env.example to a file named exactly .env and paste your real key.

cp .env.example .env      # then edit .env and paste your key

You should now see: the five .py files and a .env in your folder. Never commit .env.

Step 2: Try the connectors alone (no key, no LLM)

The connectors are plain functions, run them first so you trust the data the agents see:

python -c "import connectors as c; print(c.extract_indicators('login FAIL admin from 185.220.101.45 then connect 45.137.21.9')); print(c.lookup_ioc('185.220.101.45')); print(c.search_logs('185.220.101.45'))"

You should now see: the two IPs extracted, 185.220.101.45 reported MALICIOUS, and 5 log lines (failed logins → a download of customer_db.csv → an outbound connection). That's the raw material your agent team will reason over.

Step 3: Run the orchestrator

python orchestrator.py

Press Enter to accept the sample alert (or paste your own synthetic one).

You should now see: four sections print in order, === indicators ===, === severity ===, === enrichment ===, === correlation ===, and then a written === INCIDENT REPORT ===. That's four sub-agents, each doing one job, with the orchestrator passing each result to the next.

What just happened: investigate() called triage (got severity + indicators), fed those indicators to enrich (which called the intel connector) and correlate (which called the log connector), then handed everything to report to synthesize. One coordinator, four specialists, two connectors. Re-read notes.md §3 alongside the output.

Step 4: Prove the specialists are really separate

Open agents.py. Find the triage system prompt ("You are a SOC L1 triage analyst…") and change it to demand the severity in ALL CAPS with an emoji. Re-run python orchestrator.py.

You should now see: only the severity line changes style, enrichment, correlation, and the report are untouched. That's the payoff of splitting into sub-agents: you can change one specialist without disturbing the others. Put the prompt back when you're done.

Part B: add a sub-agent, then deploy

Step 5: Add a 5th specialist

Open starters/add_subagent.py. It already contains a remediation_advisor sub-agent (suggests containment steps, recommend-only). Wire it into the pipeline. In a copy of orchestrator.py, after the report line, add:

import add_subagent
remediation = add_subagent.remediation_advisor(client, incident)   # incident = the report text

and add "remediation": remediation to the returned dict (and print it in __main__). Run it.

You should now see: your report followed by 2-3 concrete containment steps. You just grew the team by one specialist, no other agent changed. (Stretch: make it a comms or MITRE mapper agent.)

Step 6: Deploy the system (the "agentic deploy")

Install the server bits if you haven't, then serve the orchestrator:

pip install fastapi "uvicorn[standard]"
uvicorn app:app --reload

In a second terminal (venv active), send an alert:

curl -s -X POST http://127.0.0.1:8000/investigate \
  -H "Content-Type: application/json" \
  -d '{"alert":"Repeated failed logins for admin from 185.220.101.45, then download of customer_db.csv and a connection to 45.137.21.9."}'

You should now see: the uvicorn terminal log POST /investigate ok latency=…s indicators=2, and the curl terminal print JSON with indicators, severity, enrichment, correlation, and a full report. Your multi-agent system is now a deployable service: anything that can POST can hand it an alert and get an investigation back. (Visit http://127.0.0.1:8000/docs to try it in the browser.) Ctrl-C to stop the server.

Step 7: Show it

Paste, in the chat, the incident report your agent team produced, plus the one-line change you made in Step 4 or your extra sub-agent from Step 5.

If you get stuck

ANTHROPIC_API_KEY error → your .env isn't named exactly .env, or the key line is wrong. Re-check M3 / install-guides/api-keys.md.
Address already in use on uvicorn → an old server is still running; Ctrl-C it, or use --port 8001 and curl that port.
ModuleNotFoundError: fastapi → pip install fastapi "uvicorn[standard]" with your venv active.
A sub-agent gives a weird answer → it's a role/prompt issue. Sharpen that sub-agent's system prompt in agents.py; the others are unaffected.
Want to test without spending tokens? The orchestrator accepts an injected client, the course's verification mocks it. You don't need this for the lab, but it's how you'd unit-test an agent pipeline.

Check yourself

Why split one agent into four sub-agents instead of one big prompt?

Focus (each has one job and a short prompt), debuggability (you know which step failed), and independent improvement (change `report` without touching `triage`), Step 4 proved the last one. Cost is the trade-off: four sub-agents = four billed calls.

What does the orchestrator actually do?

It coordinates: decides who runs and in what order, and passes each sub-agent's output into the next (`triage` → `enrich`/`correlate` → `report`). It's the coordinator, not a specialist itself.

What's a "connector" here, and where are they used?

A function that talks to a system. `enrich` uses the threat-intel connector (`lookup_ioc`) and `correlate` uses the log connector (`search_logs`). These are the shape of thing you'd expose as an MCP server (M16) so any agent could call them.

Why do these agents never block an IP or disable an account?

Human-in-the-loop (M10/M14). They investigate and recommend; a human decides on action. An agent that can act needs an approval gate, and authorization, on synthetic data only.