AI Engineer

Introduction

  • What is an AI Engineer
  • AI Engineer vs ML Engineer
  • Impact on product development
  • AI vs AGI
  • Using AI to improve UX

How LLMs work

  • Large Language Models (LLMs)
  • How LLMs work
  • Neural networks
  • Transformers
  • Tokenization & token counting
  • Next-token prediction
  • Inference
  • Understanding model capabilities

Models & selection

  • Pre-trained models
  • Closed vs open-source models
  • Choosing the right model
  • Model selection
  • Understanding model capabilities
  • Smaller models
  • SKU / model variants
  • Model families
  • OpenAI GPT
  • Anthropic Claude
  • Google Gemini
  • Meta Llama
  • Mistral
  • Gemma
  • DeepSeek
  • Cohere
  • Perplexity

Using model APIs

  • OpenAI API
  • Claude Messages API
  • Google Gemini API
  • Hugging Face Inference SDK
  • Input format
  • Output format
  • Structured outputs
  • System prompts
  • Token counting

Prompt engineering

  • Prompt engineering
  • System prompts
  • Few-shot prompting
  • Chain-of-thought (CoT)
  • Context engineering
  • Prompt optimization
  • Prompt compression
  • Constraining inputs & outputs

AI safety & ethics

  • AI safety and ethics
  • Bias and fairness
  • Content moderation APIs
  • Conducting adversarial testing
  • Safety evaluation
  • Data classification
  • Anomaly detection
  • Adding end-user IDs in prompts
  • Know your customers / use cases

Open-source & local models

  • Open-source models
  • Ollama
  • LM Studio
  • Hugging Face
  • Hub
  • Models
  • Tasks
  • Inference SDK
  • Connect to a local server
  • Connect to a remote server
  • Quantization
  • Llama / Gemma locally

Embeddings

  • Embeddings
  • Vector embeddings
  • Embedding models
  • Indexing embeddings
  • Semantic search
  • Gemini embedding
  • Cohere embed
  • Jina

Vector databases

  • Vector databases
  • Chroma
  • FAISS
  • LanceDB
  • Pinecone
  • Qdrant
  • Weaviate

RAG (Retrieval-Augmented Generation)

  • RAG / retrieval-augmented generation
  • Chunking
  • Indexing embeddings
  • Retrieval
  • Ranking
  • Re-ranking
  • Generation
  • Data layer
  • Semantic search
  • RAG chatbots
  • RAG over multimodal documents

AI agents

  • What are AI agents
  • Agent use cases
  • Function calling
  • Tools
  • Manual implementation (the loop)
  • ReAct
  • External memory
  • Context compaction
  • Context isolation
  • Frameworks & tools
  • LangChain
  • LlamaIndex
  • Haystack
  • Development tools
  • Agent SDKs / coding agents
  • Claude Agent SDK
  • Google ADK
  • Claude Code
  • OpenAI Codex
  • Cursor

MCP (Model Context Protocol)

  • What is MCP
  • Building an MCP client
  • Building an MCP server
  • MCP client
  • MCP server

Multimodal AI

  • Multimodal
  • Image understanding
  • Image generation
  • DALL-E API
  • Vision-language models
  • Audio processing
  • Speech recognition
  • Whisper
  • Video understanding
  • Multimodal RAG
  • LangChain for multimodal apps
  • LlamaIndex for multimodal apps

Classic NLP tasks

  • NLP tasks
  • Sentiment analysis
  • Summarization
  • Data classification
  • Anomaly detection
  • Web search integration

Evaluation, optimization & monitoring

  • Model evaluation
  • Safety evaluation
  • Model optimization
  • Quantization
  • Monitoring
  • Monitoring LLM apps
  • Inference cost/latency

Fine-tuning & training

  • Fine-tuning
  • Training custom models
  • Transformers
  • Neural networks

Data & integrations

  • SQL databases
  • Web search integration
  • Data layer