Lab: M11: deploy your AI app, then build your capstone
You'll need: your M4 setup (venv, key in .env, anthropic), plus fastapi + uvicorn,
and Docker for Part B (optional, there's a fallback). Time: ~70 min + capstone •
Work in your breakout pair.
Heads up: Part B needs Docker, which is a bigger install. If it fights you, do the no-Docker fallback in Step 7; you still finish with a real running web service. Errors are normal and safe.
This lab has two parts, then the capstone kickoff:
- Part A: wrap your app in a FastAPI web API.
- Part B: containerize it with Docker (or the uvicorn fallback).
flowchart LR
Script["your AI code<br/>(a function)"] --> API["FastAPI<br/>/chat endpoint"]
API --> Box["Docker image<br/>(app + dependencies)"]
Box --> Run["runs the same anywhere<br/>key passed at run time"]
Part A: wrap it in a web API
Step 1: Install FastAPI and set up the folder
With your venv active:
pip install fastapi "uvicorn[standard]"
Then copy app.py, requirements.txt, Dockerfile, and .dockerignore (from solution/)
into a folder with your M4 .env.
You should now see: Successfully installed fastapi-… uvicorn-…, and those files in the folder.
Step 2: Start the server
uvicorn app:app --reload
You should now see: Uvicorn running on http://127.0.0.1:8000. Your script is now a live web
service. Leave it running.
Step 3: Try it in the browser
Open http://127.0.0.1:8000/docs. This is FastAPI's auto-generated test page. Expand POST
/chat, click Try it out, enter {"message": "Say hi in one sentence"}, and Execute.
You should now see: a JSON response like {"reply": "Hi there! ..."}, and in the terminal a
log line: POST /chat ok latency=… in_tokens=… out_tokens=…. You just called your AI over HTTP, and you're already monitoring latency and cost.
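The /docs page is just sending an HTTP request for you. Here is a minimal sketch of the same call from Python, using only the standard library. It assumes the server from Step 2 is still running at http://127.0.0.1:8000 and that the response has a "reply" field, as shown above. The helper names build_chat_request and send_chat are made up for this example.

```python
import json
import urllib.request

# Assumption: the server from Step 2 is running at this address.
BASE_URL = "http://127.0.0.1:8000"


def build_chat_request(message: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the same kind of POST /chat request the /docs page sends."""
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat(message: str, base_url: str = BASE_URL) -> str:
    """Send the request and return the "reply" field of the JSON response."""
    request = build_chat_request(message, base_url)
    with urllib.request.urlopen(request, timeout=60) as resp:
        return json.loads(resp.read().decode("utf-8"))["reply"]
```

With the server running, print(send_chat("Say hi in one sentence")) should print the model's reply, and the terminal running uvicorn should show another POST /chat log line.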
Step 4: Read the API
Open app.py. Find /health (a liveness check that monitors hit), the ChatRequest/ChatResponse
models (FastAPI validates requests and responses against these), and the logging line.
You should now see / say: "a request comes in → FastAPI validates it → my function calls the model → I log latency + tokens → the reply goes back as JSON." That's a web service.
Part B: containerize it
Step 5: Build the image
Make sure Docker is installed and running (see the Docker guide). From the app folder:
docker build -t ai-app .
You should now see: Successfully tagged ai-app (or naming to …
ai-app). Your app + its dependencies are now one image.
Step 6: Run the container (key passed safely)
docker run -p 8000:8000 --env-file .env ai-app
You should now see: Uvicorn running on http://0.0.0.0:8000. Open
http://127.0.0.1:8000/docs and call /chat again: same app, now running inside a container.
Your key came from --env-file at run time; it is not inside the image.
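Because the key only arrives at run time, it helps if the app fails with a clear message when it is missing. The sketch below shows one way to do that. It assumes the key is stored as ANTHROPIC_API_KEY (the variable the anthropic SDK reads by default; check the name in your own .env), and load_api_key is a made-up helper, not part of the lab's app.py.

```python
import os
from collections.abc import Mapping

# Assumption: this is the variable name used in your .env.
KEY_NAME = "ANTHROPIC_API_KEY"


def load_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the API key from the environment, or fail with a clear message."""
    key = env.get(KEY_NAME, "").strip()
    if not key:
        raise RuntimeError(
            f"{KEY_NAME} is not set. Pass it at run time, for example with "
            "docker run --env-file .env"
        )
    return key
```

Calling load_api_key() once at startup turns a forgotten --env-file into an immediate, readable error instead of a failure on the first /chat request.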
Step 7: (Fallback if Docker won't cooperate)
No Docker? You already have a real service from Part A: uvicorn app:app --host 0.0.0.0 --port 8000.
That's a legitimate deployment; containers just make it portable and reproducible.
You should now see: either a running container (Step 6) or a running uvicorn server. Either way, your AI app is deployed as a service.
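Whichever way you are running it, you can confirm the service is up the same way a monitor would: by hitting /health. This is a small sketch using only the standard library. It assumes the service listens on port 8000 as in the steps above, and is_healthy is a made-up helper name.

```python
import urllib.request


def is_healthy(base_url: str = "http://127.0.0.1:8000", timeout: float = 2.0) -> bool:
    """Return True if GET /health answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers refused connections, timeouts, and HTTP error responses
        # (urllib's URLError and HTTPError are both OSError subclasses).
        return False
```

It should return True while the container or the uvicorn server is running, and False after you stop it.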
Capstone kickoff
Step 8: Pick a track and scope it
Open CAPSTONE.md. Choose RAG, agent, or your idea, and write down the
smallest version that runs. Use ../starters/app_starter.py and plug your
M7/M8 RAG or M9 agent into answer_question().
You should now see: a one-paragraph plan and a running skeleton (uvicorn app_starter:app
--reload → /docs). From here you build, then demo.
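CAPSTONE.md asks that your app handle a basic failure gracefully. One simple way to get there is to wrap answer_question() so an error becomes a friendly message instead of a crash. This is only a sketch: the signature answer_question(question: str) -> str is an assumption (check app_starter.py), the placeholder body is invented, and safe_answer is a made-up helper.

```python
def answer_question(question: str) -> str:
    """Hypothetical placeholder: swap in your M7/M8 RAG or M9 agent call."""
    if not question.strip():
        raise ValueError("question is empty")
    return f"(placeholder answer to: {question})"


def safe_answer(question: str) -> str:
    """Call answer_question() and return a friendly message if it fails."""
    try:
        return answer_question(question)
    except Exception as exc:  # broad on purpose: last line of defense for the demo
        return f"Sorry, I couldn't answer that ({type(exc).__name__}). Please try again."
```

For example, safe_answer("   ") returns the apology string with ValueError in it rather than raising. In a real service you would probably also log the error so you can see what went wrong.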
Stuck? The finished service is
../solution/app.py.
Your win
Your AI app runs as a real web service, wrapped in FastAPI, containerized with Docker, key kept safe, latency and cost logged, and your capstone is scoped and started.
Post it to the chat wins board: "POST /chat → my AI, served over HTTP from a Docker container.
It runs for real now."
Take-home (optional)
Finish and demo your capstone (see CAPSTONE.md for the requirements: it runs, handles a basic
failure gracefully, and you can explain how it works and how you'd secure it). Then present it; everyone claps. You're an AI engineer.