
Course 02: AI Engineering

Build and ship real AI applications, starting from your very first line of Python and ending with deployed, multi-agent systems. This course is beginner friendly and lab first: every lesson ends with something that runs.

Live website: https://thinktechconsultingllc.github.io/course-02-ai-engineering/

New here? Read START-HERE/ first (a 2-minute read).


1. What this course is

  • 36 modules, M0 through M35, grouped into five parts (see the tables in section 4).
  • No prior programming experience required. Module 0 explains what AI engineering is; modules 1 to 3 teach Python from scratch; the rest build AI apps on top of it.
  • Lab first. Each module is a short read plus a hands-on lab with numbered steps. Every step tells you exactly what you should see when it works.
  • You build real things: a chatbot, a question-answering system over your own documents (RAG), agents that use tools, a deployed web API, a fine-tuning dataset, and a multi-agent system.

2. Who it is for

Complete beginners who want to become AI engineers. If you can use a web browser, you can start. You add tools (Python, an API key, Docker) only at the exact module that needs them, never before.

3. System requirements

| What | Details |
| --- | --- |
| A computer | Windows, macOS, or Linux. Any laptop from the last 8 years is fine. |
| Internet | Needed to use the AI model API and to install tools. |
| A web browser | Modules 0 to 2 run entirely in the browser with Google Colab. No install needed. |
| Python 3.10, 3.11, or 3.12 | Needed from Module 3 onward. Use 3.12 if unsure. A few libraries (vector stores, some agent frameworks) do not yet work on Python 3.13 or 3.14. |
| An Anthropic API key | Needed from Module 4 onward, to call the Claude model. Costs a small amount per use. Set up in Module 4. |
| Docker (optional) | Only for Module 11 (deployment). |
| Ollama (optional) | Only for Module 13 (running local models, free and offline). |

You do not need a powerful computer or a graphics card. The heavy AI work runs on Anthropic's servers.
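
Once you reach Module 3, a quick way to confirm that your interpreter matches the Python row above is a version check like the one below. This is a minimal sketch, not part of the course materials: the function name `is_supported` and the constant `SUPPORTED_MINORS` are made up for illustration, and the supported list is copied from the table.

```python
import sys

# Versions the requirements table lists as supported (copied from the table).
SUPPORTED_MINORS = {(3, 10), (3, 11), (3, 12)}


def is_supported(version=None):
    """Return True if (major, minor) is one of the versions listed above.

    `version` is any tuple that starts with (major, minor); it defaults to
    the interpreter running this script.
    """
    major, minor = (version or sys.version_info)[:2]
    return (major, minor) in SUPPORTED_MINORS


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if is_supported():
        print(f"Python {current}: OK for this course")
    else:
        print(f"Python {current}: not in the supported list (3.10 to 3.12)")
```

Save it under any file name and run it with the same `python` command you plan to use for the labs, so you test the interpreter you will actually use.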

4. The modules (in order)

Part A: Python foundation

| # | Module | You will be able to |
| --- | --- | --- |
| M0 | AI Engineering, explained | Understand what AI engineering is, how language models work, and the model landscape (no code). |
| M1 | Your first Python program | Write and run Python: print, variables, types, input. |
| M2 | Logic & data | Make decisions and work with lists and dictionaries (the shape of API data). |
| M3 | Functions, files, libraries & errors | Organize code, install libraries, read and write files and JSON, handle errors. |

Part B: Building with AI

| # | Module | You will be able to |
| --- | --- | --- |
| M4 | Ship your first AI app | Call a language model from Python and build a tiny chatbot. |
| M5 | Prompt engineering | Reliably get a model to do what you want. |
| M6 | Driving the model from code | Use the API fluently: parameters, streaming, structured JSON. |
| M7 | RAG I: give the AI your own knowledge | Answer questions over your own documents. |
| M8 | RAG II: make it good | Measure and improve retrieval quality. |
| M9 | Agents: tools & frameworks | Build an AI that takes actions, not just talks. |
| M10 | Evaluation, guardrails & security | Test your app and stop it being tricked. |
| M11 | Deployment & capstone | Wrap, containerize, and ship a real app. |

Part C: Breadth & responsibility

| # | Module | You will be able to |
| --- | --- | --- |
| M12 | Multimodal AI | Give your app image understanding. |
| M13 | Open-source & local models | Run open models locally with Ollama: free, offline, private. |
| M14 | Ethics & Responsible AI | Test for bias, protect privacy, keep humans in the loop. |
| M15 | Fine-tuning & training | Build a dataset and know when and how to fine-tune. |
| M16 | Building MCP servers & clients | Expose your tools over MCP so any AI app can use them. |
| M17 | Build a language model from scratch | Train a tiny model and meet the transformer (optional deep dive). |

Part D: Agentic systems

| # | Module | You will be able to |
| --- | --- | --- |
| M18 | Multi-agent orchestration | Deploy an orchestrator that coordinates specialist sub-agents and connectors. |
| M19 | Build agents in many frameworks | Build the same agent in LangGraph, CrewAI, AutoGen, smolagents, LlamaIndex, and no-code n8n. |
| M20 | Agent observability & evaluation | Trace every step an agent takes, score whether it was right, and catch regressions before shipping. |
| M21 | Agent memory & state | Give an agent short-term and long-term memory, plus checkpoint and resume, so it remembers across sessions. |
| M22 | Agent reliability & ops | Harden an agent for production: retries, timeouts, fallbacks, step caps, and human-approval gates. |
| M23 | Agent security | Attack and defend an agent: prompt injection, data exfiltration, least privilege, and defense in depth. |
| M24 | Agentic RAG & research agents | Make retrieval a tool: an agent that searches, reads, searches again, and answers multi-hop questions with citations. |
| M25 | Cost & performance optimization | Cut an agent's bill with caching, model routing, and trimming, and prove quality held with evals. |
| M26 | Evaluation-driven development & CI | Automate your evals: a gate that runs on every push and blocks any merge that drops quality. |
| M27 | Part D capstone: ship a complete agent | Integrate M18-M26 into one deployed, evaluated agent: RAG + memory + observability + reliability + security, behind an API. |
| M28 | Agent UX & streaming to a UI | Stream progress and the answer live, show citations and cost, support cancellation, and serve it over SSE. |
| M29 | Agent deployment & serving | Serve an agent like production: env config, health/readiness probes, graceful lifecycle, statelessness, containers. |
| M30 | Agent data & feedback loops | Turn real traffic and feedback into new eval cases and fine-tuning data, with PII redacted: the data flywheel. |

Part E: Operations Support

The safeguarding layer over everything above: it keeps what you built running, supported, and recoverable. AI engineering is the backbone; operations support protects the architecture, the databases, and the builds. Every lab runs offline.

Start with Operations Support, explained, which is the Part E orientation. It covers operator vs builder; the LLMOps/AgentOps/AIOps lenses; the deploy → observe → respond → improve loop; and a day in the role. Then work the four modules in order.

| # | Module | You will be able to |
| --- | --- | --- |
| M31 | Incident response & on-call | Define SLOs, get paged when the error budget burns too fast, run a runbook to mitigate, and write a blameless postmortem that becomes a regression test. |
| M32 | AI support desk & AIOps | Triage and route support tickets under an SLA, escalate breaches, send uncertain ones to a human, and correlate an alert storm into a few incidents. |
| M33 | Data & release operations | Keep the RAG index fresh, private, and restorable; canary new releases and roll back bad ones; rotate secrets with zero downtime. |
| M34 | Part E capstone: the on-call shift | Run M31–M33 on one incident: storm → page → rollback → postmortem → canaried fix, with an eval gate scoring the whole shift. |
| M35 | Operations Support, going deeper (Optional) | Five hands-on mini-labs: structured logging, a golden-signals dashboard, online eval, rate limits & quotas, and the reliability flywheel. |

Full pacing and capstone tracks are in Course-02-Syllabus.md. The visual roadmap with progress checkboxes is in ROADMAP.md.

5. How each module folder is laid out

mN-name/
  README.md      a one-screen map: the hook, today's win, the run of show, help, a challenge
  notes.md       the in-depth read: concepts, diagrams, and a "check yourself" quiz
  lab/lab.md     the doing: tiny numbered steps, each with the result you should see
  solution/      the worked answer (look only after you have tried)
  starters/      ready-to-run starter files
  assets/        images and diagrams for that module

Recommended order for each module: read README.md, read notes.md, do lab/lab.md, then compare with solution/.
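
If you want to confirm that a module folder matches the layout above before you start, a small check like this one can help. It is a sketch under assumptions: `missing_entries` and `EXPECTED` are hypothetical names, the entry list is copied from the layout above, and some modules may legitimately omit a folder.

```python
from pathlib import Path

# Entries from the layout above; names ending in "/" are directories.
EXPECTED = ["README.md", "notes.md", "lab/lab.md", "solution/", "starters/", "assets/"]


def missing_entries(module_dir):
    """Return the expected entries that are absent from module_dir."""
    root = Path(module_dir)
    missing = []
    for entry in EXPECTED:
        path = root / entry
        # Directories and files are checked differently so a file named
        # "solution" would not count as the solution/ folder.
        present = path.is_dir() if entry.endswith("/") else path.is_file()
        if not present:
            missing.append(entry)
    return missing
```

Call it with the path to one module folder, for example `missing_entries("mN-name")`. An empty list means every entry in the layout was found.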

6. Installation guides (read each one when its module asks)

Do not install anything in advance. Each guide is step by step and tells you what success looks like.

The recommended path is to install only what each lab asks for. If you prefer to install every Python library the course uses in one go, run pip install -r requirements.txt.

7. Secrets and safety

  • Never commit API keys. Keys go in a file named .env, which is already ignored by Git. Each module ships a .env.example with a placeholder so you know the format.
  • Security content is educational only and uses synthetic (fake) data. The security agents in Part D investigate and recommend; they never take action on real systems.
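
The first rule above, keeping keys in .env and out of Git, can be sketched in plain Python as follows. This is an illustration, not the course's own loader: `load_env` and `get_api_key` are hypothetical helpers, and the variable name `ANTHROPIC_API_KEY` and the simple KEY=VALUE format are assumptions, so check the module's .env.example for the real format. Libraries such as python-dotenv handle more edge cases.

```python
import os
from pathlib import Path


def load_env(path=".env"):
    """Read KEY=VALUE lines from a .env file into os.environ.

    Blank lines, lines starting with '#', and lines without '=' are skipped.
    Variables already set in the environment are left unchanged.
    """
    env_file = Path(path)
    if not env_file.is_file():
        return
    for line in env_file.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        os.environ.setdefault(key.strip(), value.strip().strip('"').strip("'"))


def get_api_key(name="ANTHROPIC_API_KEY"):
    """Return the key, or raise a clear error if it is not set."""
    key = os.environ.get(name)
    if not key:
        raise RuntimeError(f"{name} is not set; copy .env.example to .env and fill it in")
    return key
```

Because the key is read at run time and never written into your code, the source files stay safe to commit.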

8. Shared resources

9. About the published website

This repository is the source. The live website is built automatically with MkDocs Material and published to GitHub Pages whenever the main branch changes (the workflow is .github/workflows/deploy.yml). The site is public and unencrypted.