Course 02: AI Engineering

AI Engineer

Introduction

What is an AI Engineer
AI Engineer vs ML Engineer
Impact on product development
AI vs AGI
Using AI to improve UX

How LLMs work

Large Language Models (LLMs)
How LLMs work
Neural networks
Transformers
Tokenization & token counting
Next-token prediction
Inference
Understanding model capabilities

Models & selection

Pre-trained models
Closed vs open-source models
Choosing the right model
Model selection
Understanding model capabilities
Smaller models
SKU / model variants
Model families
OpenAI GPT
Anthropic Claude
Google Gemini
Meta Llama
Mistral
Gemma
DeepSeek
Cohere
Perplexity

Using model APIs

OpenAI API
Claude Messages API
Google Gemini API
Hugging Face Inference SDK
Input format
Output format
Structured outputs
System prompts
Token counting

Prompt engineering

Prompt engineering
System prompts
Few-shot prompting
Chain-of-thought (CoT)
Context engineering
Prompt optimization
Prompt compression
Constraining inputs & outputs

AI safety & ethics

AI safety and ethics
Bias and fairness
Content moderation APIs
Conducting adversarial testing
Safety evaluation
Data classification
Anomaly detection
Adding end-user IDs in prompts
Know your customers / use cases

Open-source & local models

Open-source models
Ollama
LM Studio
Hugging Face
Hub
Models
Tasks
Inference SDK
Connect to a local server
Connect to a remote server
Quantization
Llama / Gemma locally

Embeddings

Embeddings
Vector embeddings
Embedding models
Indexing embeddings
Semantic search
Gemini embedding
Cohere embed
Jina

Vector databases

Vector databases
Chroma
FAISS
LanceDB
Pinecone
Qdrant
Weaviate

RAG (Retrieval-Augmented Generation)

RAG / retrieval-augmented generation
Chunking
Indexing embeddings
Retrieval
Ranking
Re-ranking
Generation
Data layer
Semantic search
RAG chatbots
RAG over multimodal documents

AI agents

What are AI agents
Agent use cases
Function calling
Tools
Manual implementation (the loop)
ReAct
External memory
Context compaction
Context isolation
Frameworks & tools
LangChain
LlamaIndex
Haystack
Development tools
Agent SDKs / coding agents
Claude Agent SDK
Google ADK
Claude Code
OpenAI Codex
Cursor

MCP (Model Context Protocol)

What is MCP
Building an MCP client
Building an MCP server
MCP client
MCP server

Multimodal AI

Multimodal
Image understanding
Image generation
DALL-E API
Vision-language models
Audio processing
Speech recognition
Whisper
Video understanding
Multimodal RAG
LangChain for multimodal apps
LlamaIndex for multimodal apps

Classic NLP tasks

NLP tasks
Sentiment analysis
Summarization
Data classification
Anomaly detection
Web search integration

Evaluation, optimization & monitoring

Model evaluation
Safety evaluation
Model optimization
Quantization
Monitoring
Monitoring LLM apps
Inference cost/latency

Fine-tuning & training

Fine-tuning
Training custom models
Transformers
Neural networks

Data & integrations

SQL databases
Web search integration
Data layer