Course 02: AI Engineering
Build and ship real AI applications, starting from your very first line of Python and ending with deployed, multi-agent systems. This course is beginner friendly and lab first: every lesson ends with something that runs.
Live website: https://thinktechconsultingllc.github.io/course-02-ai-engineering/
New here? Read START-HERE/ first (a 2-minute read).
1. What this course is
- 36 modules, M0 through M35, grouped into five parts (see the table in section 4).
- No prior programming experience required. Module 0 explains what AI engineering is; modules 1 to 3 teach Python from scratch; the rest build AI apps on top of it.
- Lab first. Each module is a short read plus a hands-on lab with numbered steps. Every step tells you exactly what you should see when it works.
- You build real things: a chatbot, a question-answering system over your own documents (RAG), agents that use tools, a deployed web API, a fine-tuning dataset, and a multi-agent system.
2. Who it is for
Complete beginners who want to become AI engineers. If you can use a web browser, you can start. You add tools (Python, an API key, Docker) only at the exact module that needs them, never before.
3. System requirements
| What | Details |
|---|---|
| A computer | Windows, macOS, or Linux. Any laptop from the last 8 years is fine. |
| Internet | Needed to use the AI model API and to install tools. |
| A web browser | Modules 0 to 2 run entirely in the browser with Google Colab. No install needed. |
| Python 3.10, 3.11, or 3.12 | Needed from Module 3 onward. Use 3.12 if unsure. A few libraries (vector stores, some agent frameworks) do not yet work on Python 3.13 or 3.14. |
| An Anthropic API key | Needed from Module 4 onward, to call the Claude model. Costs a small amount per use. Set up in Module 4. |
| Docker (optional) | Only for Module 11 (deployment). |
| Ollama (optional) | Only for Module 13 (running local models, free and offline). |
You do not need a powerful computer or a graphics card. The heavy AI work runs on Anthropic's servers.
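To confirm that your installed Python is one of the versions in the table above, a small check like the following can help. This is a minimal sketch, not part of the course materials: the function name `is_supported` is hypothetical, and the version list is copied from the requirements table.

```python
import sys

# Versions the requirements table lists as supported from Module 3 onward.
SUPPORTED_MINORS = {(3, 10), (3, 11), (3, 12)}


def is_supported(version_info=sys.version_info):
    """Return True if the (major, minor) pair is one the course lists."""
    return (version_info[0], version_info[1]) in SUPPORTED_MINORS


if __name__ == "__main__":
    major, minor = sys.version_info[:2]
    status = "supported" if is_supported() else "not in the course's supported list"
    print(f"Python {major}.{minor}: {status}")
```

Save it to a file of your choice and run it with the same `python` command you plan to use for the labs, so you check the interpreter you will actually work with.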
4. The modules (in order)
Part A: Python foundation
| # | Module | You will be able to |
|---|---|---|
| M0 | AI Engineering, explained | Understand what AI engineering is, how language models work, and the model landscape (no code). |
| M1 | Your first Python program | Write and run Python: print, variables, types, input. |
| M2 | Logic & data | Make decisions and work with lists and dictionaries (the shape of API data). |
| M3 | Functions, files, libraries & errors | Organize code, install libraries, read and write files and JSON, handle errors. |
Part B: Building with AI
| # | Module | You will be able to |
|---|---|---|
| M4 | Ship your first AI app | Call a language model from Python and build a tiny chatbot. |
| M5 | Prompt engineering | Reliably get a model to do what you want. |
| M6 | Driving the model from code | Use the API fluently: parameters, streaming, structured JSON. |
| M7 | RAG I: give the AI your own knowledge | Answer questions over your own documents. |
| M8 | RAG II: make it good | Measure and improve retrieval quality. |
| M9 | Agents: tools & frameworks | Build an AI that takes actions, not just talks. |
| M10 | Evaluation, guardrails & security | Test your app and stop it being tricked. |
| M11 | Deployment & capstone | Wrap, containerize, and ship a real app. |
Part C: Breadth & responsibility
| # | Module | You will be able to |
|---|---|---|
| M12 | Multimodal AI | Give your app image understanding. |
| M13 | Open-source & local models | Run open models locally with Ollama: free, offline, private. |
| M14 | Ethics & Responsible AI | Test for bias, protect privacy, keep humans in the loop. |
| M15 | Fine-tuning & training | Build a dataset and know when and how to fine-tune. |
| M16 | Building MCP servers & clients | Expose your tools over MCP so any AI app can use them. |
| M17 | Build a language model from scratch | Train a tiny model and meet the transformer (optional deep dive). |
Part D: Agentic systems
| # | Module | You will be able to |
|---|---|---|
| M18 | Multi-agent orchestration | Deploy an orchestrator that coordinates specialist sub-agents and connectors. |
| M19 | Build agents in many frameworks | Build the same agent in LangGraph, CrewAI, AutoGen, smolagents, LlamaIndex, and no-code n8n. |
| M20 | Agent observability & evaluation | Trace every step an agent takes, score whether it was right, and catch regressions before shipping. |
| M21 | Agent memory & state | Give an agent short-term and long-term memory, plus checkpoint and resume, so it remembers across sessions. |
| M22 | Agent reliability & ops | Harden an agent for production: retries, timeouts, fallbacks, step caps, and human-approval gates. |
| M23 | Agent security | Attack and defend an agent: prompt injection, data exfiltration, least privilege, and defense in depth. |
| M24 | Agentic RAG & research agents | Make retrieval a tool: an agent that searches, reads, searches again, and answers multi-hop questions with citations. |
| M25 | Cost & performance optimization | Cut an agent's bill with caching, model routing, and trimming, and prove quality held with evals. |
| M26 | Evaluation-driven development & CI | Automate your evals: a gate that runs on every push and blocks any merge that drops quality. |
| M27 | Part D capstone: ship a complete agent | Integrate M18-M26 into one deployed, evaluated agent: RAG + memory + observability + reliability + security, behind an API. |
| M28 | Agent UX & streaming to a UI | Stream progress and the answer live, show citations and cost, support cancellation, and serve it over SSE. |
| M29 | Agent deployment & serving | Serve an agent like production: env config, health/readiness probes, graceful lifecycle, statelessness, containers. |
| M30 | Agent data & feedback loops | Turn real traffic and feedback into new eval cases and fine-tuning data, with PII redacted: the data flywheel. |
Part E: Operations Support
The safeguarding layer over everything above: it keeps what you built running, supported, and recoverable. AI engineering is the backbone; operations support protects the architecture, the databases, and the builds. Every lab runs offline.
Start with the Part E orientation, Operations Support, explained (operator vs builder; the LLMOps/AgentOps/AIOps lenses; the deploy → observe → respond → improve loop; a day in the role), then work modules M31 to M34 in order. M35 is optional.
| # | Module | You will be able to |
|---|---|---|
| M31 | Incident response & on-call | Define SLOs, get paged when the error budget burns too fast, run a runbook to mitigate, and write a blameless postmortem that becomes a regression test. |
| M32 | AI support desk & AIOps | Triage and route support tickets under an SLA, escalate breaches, send uncertain ones to a human, and correlate an alert storm into a few incidents. |
| M33 | Data & release operations | Keep the RAG index fresh, private, and restorable; canary new releases and roll back bad ones; rotate secrets with zero downtime. |
| M34 | Part E capstone: the on-call shift | Run M31–M33 on one incident: storm → page → rollback → postmortem → canaried fix, with an eval gate scoring the whole shift. |
| M35 | Operations Support, going deeper | (Optional) Five hands-on mini-labs: structured logging, a golden-signals dashboard, online eval, rate limits & quotas, and the reliability flywheel. |
Full pacing and capstone tracks are in Course-02-Syllabus.md. The visual
roadmap with progress checkboxes is in ROADMAP.md.
5. How each module folder is laid out
mN-name/
  README.md    a one-screen map: the hook, today's win, the run of show, help, a challenge
  notes.md     the in-depth read: concepts, diagrams, and a "check yourself" quiz
  lab/lab.md   the doing: tiny numbered steps, each with the result you should see
  solution/    the worked answer (look only after you have tried)
  starters/    ready-to-run starter files
  assets/      images and diagrams for that module
Recommended order for each module: read README.md, read notes.md, do lab/lab.md, then compare with
solution/.
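If you want to check that a module folder matches the layout above, the following sketch lists any entries that are missing. It assumes every module contains all six entries, which the repository may not guarantee; the function name `missing_entries` and the example folder name are hypothetical.

```python
from pathlib import Path

# Entries from the layout above; paths are relative to one module folder.
EXPECTED = ["README.md", "notes.md", "lab/lab.md", "solution", "starters", "assets"]


def missing_entries(module_dir):
    """Return the expected entries that do not exist under module_dir."""
    root = Path(module_dir)
    return [name for name in EXPECTED if not (root / name).exists()]


if __name__ == "__main__":
    # "m4-first-ai-app" is a made-up folder name used only for illustration.
    print(missing_entries("m4-first-ai-app"))
```

An empty list means the folder has everything the layout describes.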
6. Installation guides (read each one when its module asks)
Do not install anything in advance. Each guide is step by step and tells you what success looks like.
- Terminal basics: the command line, for Module 3.
- Python + virtual environment: install Python and make a project box, for Module 3.
- API keys: get and safely store your Anthropic key, for Module 4.
- Vector store: the database RAG uses, for Module 7.
- Docker: package an app to ship, for Module 11.
- Ollama (local models): run models on your own machine, for Module 13.
To install all Python libraries the course uses at once: pip install -r requirements.txt (or install
just what each lab asks for, which is the recommended path).
7. Secrets and safety
- Never commit API keys. Keys go in a file named .env, which is already ignored by Git. Each module ships a .env.example with a placeholder so you know the format.
- Security content is educational only and uses synthetic (fake) data. The security agents in Part D investigate and recommend; they never take action on real systems.
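To illustrate how a key stored in .env can reach your code without ever appearing in a source file, here is a minimal sketch that uses only the standard library. It is not the course's own loader: the function name `load_env_file` is hypothetical, the labs may use a dedicated library instead, and the variable name `ANTHROPIC_API_KEY` is an assumption about what your .env.example contains.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse KEY=VALUE lines from a .env file and copy them into os.environ.

    Returns the parsed values as a dict. Values already set in the real
    environment are left untouched.
    """
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    for key, value in values.items():
        os.environ.setdefault(key, value)
    return values


if __name__ == "__main__":
    load_env_file()
    # Assumed variable name; report only whether it is set, never the key itself.
    print("Key loaded:", "ANTHROPIC_API_KEY" in os.environ)
```

The script prints only whether the key is present, so the secret never lands in your terminal history or logs.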
8. Shared resources
- Glossary: every term used in the course, in plain language.
- Cheat cards: quick references.
- AI Engineering Resource Map: deeper external resources.
9. About the published website
This repository is the source. The live website is built automatically with MkDocs Material and published
to GitHub Pages whenever the main branch changes (the workflow is .github/workflows/deploy.yml).
The site is public, and its pages are not encrypted or password protected.