M11: Deployment & productionizing + Capstone

Your apps work, on your laptop, in your terminal. The last step of being an AI engineer is making them run for real: behind a web API anyone can call, packaged so they run the same anywhere, with your key kept safe and basic monitoring on. Then you build the thing that's yours, the capstone: design, build, and demo a complete AI app, and take a bow.

Today's win: your AI app runs as a real web service (FastAPI), packaged in a container (Docker), with your key passed safely and request latency/cost logged. Then you scope your capstone.

Today you will

  • Wrap an AI app in a FastAPI web API (/health, /chat) and try it in the browser
  • Containerize it with Docker, passing your key safely at run time (callback to Course 01)
  • Add basic monitoring (latency + token/cost logging), then kick off your capstone

Run of show (~70 min + capstone)

Time   What we do
0:00   Hook + the win we're chasing
0:05   The one idea: a script → a service anyone can call (full read in notes.md)
0:10   Lab Part A: wrap the app in FastAPI; hit it from the browser
0:35   Lab Part B: containerize with Docker; run it; read the logs
0:55   Capstone kickoff: pick a track, scope it (see CAPSTONE.md)
-      Build your capstone, then demo it
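The logs you read in the lab are where latency and token/cost numbers show up. Below is a rough sketch of that kind of logging using only the standard library. The prices, the Usage shape, and fake_model_call are placeholders (assumptions, not the lab's code); a real client reports its own token counts, and your provider sets the real rates.

```python
import logging
import time
from dataclasses import dataclass

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
logger = logging.getLogger("ai_app")

# ASSUMPTION: placeholder prices in USD per 1,000 tokens, not real rates.
PRICE_PER_1K_INPUT = 0.001
PRICE_PER_1K_OUTPUT = 0.002


@dataclass
class Usage:
    input_tokens: int
    output_tokens: int


def estimate_cost(usage: Usage) -> float:
    """Estimated cost in USD for one request, from token counts."""
    return (
        usage.input_tokens / 1000 * PRICE_PER_1K_INPUT
        + usage.output_tokens / 1000 * PRICE_PER_1K_OUTPUT
    )


def fake_model_call(message: str) -> tuple[str, Usage]:
    """Hypothetical stand-in: counts words as if they were tokens."""
    reply = f"(stub) you said: {message}"
    return reply, Usage(len(message.split()), len(reply.split()))


def chat_with_logging(message: str) -> str:
    start = time.perf_counter()
    reply, usage = fake_model_call(message)
    latency_ms = (time.perf_counter() - start) * 1000
    # One line per request: easy to read in the terminal or in container logs.
    logger.info(
        "chat latency_ms=%.1f input_tokens=%d output_tokens=%d est_cost_usd=%.6f",
        latency_ms, usage.input_tokens, usage.output_tokens, estimate_cost(usage),
    )
    return reply
```

Logging one key=value line per request keeps things simple: the same output is readable in your terminal and, once the app is containerized, through `docker logs`.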

If you get stuck

  • New installs: fastapi + uvicorn (pip), and Docker (see the Docker guide). Short on time or blocked on Docker? You can finish on the FastAPI/uvicorn path alone; the lab has a no-Docker fallback.
  • Never bake your key into the image. Pass it at run time (--env-file .env); .dockerignore + .gitignore keep .env out. Re-read the "You should now see" line.
  • Capstone feeling big? It isn't a new skill; it's combining M4-M10. Start from the smallest version that runs, then add.
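On the key-safety tip: one way to make "pass it at run time" concrete is to have the app read the key from an environment variable at startup and stop with a clear message if it is missing. This is a sketch under assumptions: the variable name API_KEY and both helper functions are hypothetical, so use whatever name your .env file actually defines.

```python
import os


def load_api_key(var_name: str = "API_KEY") -> str:
    """Read the key from the environment; fail loudly if it was not passed in."""
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{var_name} is not set; pass it at run time, e.g. with --env-file .env"
        )
    return key


def mask(key: str) -> str:
    """Log-safe form of the key: never more than the last 4 characters."""
    if len(key) <= 4:
        return "****"
    return "****" + key[-4:]
```

With this in place, the image contains no secret at all. The value only exists inside the running container, supplied by something like `docker run --env-file .env -p 8000:8000 my-ai-app` (the image name and port mapping here are placeholders). If you ever log the key to confirm it loaded, log `mask(key)`, never the key itself.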

Optional challenge

Deploy your container somewhere others can reach it (a free host, or share your screen running it), and have a classmate call your /chat endpoint. The moment someone else's request hits your AI service is the moment you're an AI engineer.