
Notes: M15: Fine-tuning & training

You've steered models with prompts (M5), fed them your data (RAG, M7), and given them tools (M9). Fine-tuning is the fourth lever: training a pre-trained model a bit further on your own examples so that a behaviour (a voice, a format, a narrow task) is baked in, with no big prompt required. This module demystifies how training works (enough to use it, no maths), has you build the thing that actually decides quality (the dataset), and, most importantly, teaches when fine-tuning is the right tool and when it isn't.

How models are trained (the 2-minute version)

Recall M0: an LLM is a neural network, a web of simple units with billions of adjustable numbers (weights), built in the transformer design, whose attention mechanism lets it weigh which earlier words matter. "Training" means adjusting those weights so the model's outputs get better, by showing it examples and nudging the weights toward the right answer (a process called gradient descent; you don't need the maths). Modern models are trained in stages:

flowchart LR
  PT["1 · Pre-training<br/>read the internet → learn language<br/>(huge, $$$, the model maker)"]
  PT --> SFT["2 · Fine-tuning (SFT)<br/>learn to follow instructions / a style<br/>(small data, you can do this)"]
  SFT --> RLHF["3 · RLHF / alignment<br/>learn to be helpful & safe<br/>(human feedback)"]
  • Pre-training builds the raw model from an enormous amount of text. It takes months and millions of dollars and is done by the model maker. You never do this.
  • Fine-tuning (supervised fine-tuning, SFT) continues training on a small set of your (input → ideal output) examples. This is the part you can do with a modest dataset.
  • RLHF (reinforcement learning from human feedback) is how makers make models helpful/safe; a detail to know the name of, not something you'll typically do.
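
To make "nudging the weights toward the right answer" concrete, here is a toy sketch with a single weight. It is not how any real LLM is trained: the model is just `prediction = w * x`, and the data, the learning rate, and the function name `train_one_weight` are all made up for illustration.

```python
def train_one_weight(examples, learning_rate=0.05, epochs=200):
    """Fit prediction = w * x by nudging w to reduce the squared error."""
    w = 0.0  # start from an arbitrary weight
    for _ in range(epochs):  # one epoch = one pass over the examples
        for x, target in examples:
            error = w * x - target         # how wrong is the current output?
            gradient = 2 * error * x       # slope of error**2 with respect to w
            w -= learning_rate * gradient  # nudge w a little "downhill"
    return w


# Toy data where the "right answer" is always 3 * x (an assumption for the demo).
examples = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]
learned_w = train_one_weight(examples)
print(round(learned_w, 3))  # ends up very close to 3
```

A real model repeats the same kind of nudge across billions of weights at once, which is why pre-training is so expensive and fine-tuning on a small dataset is comparatively cheap.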

What fine-tuning is good for (and what it isn't)

Fine-tuning changes how a model responds, learned from examples. It is good for:

- Style / tone / voice: always reply in your brand's voice, no long prompt.
- Format consistency: always output exactly your JSON/label scheme.
- A narrow task done reliably: a specific classification or transformation, cheaper/faster than a big prompt on every call.

It is not good for:

- Teaching new facts: fine-tuning is bad at this and tends to blur or hallucinate facts. For knowledge, use RAG (M7), not fine-tuning.

The rule of thumb (the whole module in one line): prompt first → add RAG for knowledge → fine-tune for behaviour at scale. Reach for fine-tuning when a good prompt works but is long, repetitive, or inconsistent, and you call it often enough that baking the behaviour in pays off.

The dataset is the whole game

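A fine-tune is only as good as its examples. The format (for chat models) is JSONL: one example per line, each a little conversation ending in the ideal assistant reply. Here is one example, spread over several lines for readability (in the file itself it sits on a single line):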

{"messages": [
  {"role": "system", "content": "You are GreenLeaf Support: warm, brief, solution-first."},
  {"role": "user", "content": "My order is late."},
  {"role": "assistant", "content": "So sorry for the wait! I've flagged it, tracking within the hour."}
]}
What makes a dataset good:

- Consistency: every example shows the same style/format you want. Inconsistent examples teach inconsistency.
- Enough examples: dozens at minimum for a clear style; hundreds+ for a robust task. (Quality and consistency beat raw count.)
- Realistic inputs: examples should look like what real users will actually send.
- A held-out set: keep some examples aside to evaluate the fine-tune (M8's eval mindset).
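
One way to set a held-out set aside is a seeded shuffle-and-split. This is a minimal sketch, not the lab's code: the helper name `split_examples`, the 20% fraction, and the placeholder examples are assumptions chosen for illustration.

```python
import random


def split_examples(examples, held_out_fraction=0.2, seed=0):
    """Shuffle a copy of the examples and set a fraction aside for evaluation."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # fixed seed -> same split every run
    n_held_out = max(1, round(len(shuffled) * held_out_fraction))
    return shuffled[n_held_out:], shuffled[:n_held_out]


examples = [f"example {i}" for i in range(50)]  # stand-ins for real conversations
train, held_out = split_examples(examples)
print(len(train), len(held_out))  # 40 10
```

The point of the fixed seed is that the held-out examples never leak into training between runs; if they did, the evaluation would look better than it really is.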

Building and validating this file is exactly the lab, and it needs no GPU and no key.
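
As a rough idea of what "validating this file" can look like, here is a sketch that checks each line using only the standard library. The specific checks and the function name `validate_jsonl_lines` are assumptions for illustration; they are not the lab's solution or a provider's official rules.

```python
import json

ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_jsonl_lines(lines):
    """Return a list of (line_number, problem) for chat-format training lines."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            problems.append((number, "blank line"))
            continue
        try:
            example = json.loads(line)
        except json.JSONDecodeError:
            problems.append((number, "not valid JSON"))
            continue
        messages = example.get("messages") if isinstance(example, dict) else None
        if not isinstance(messages, list) or not messages:
            problems.append((number, "missing or empty 'messages' list"))
            continue
        for message in messages:
            if not isinstance(message, dict) or message.get("role") not in ALLOWED_ROLES:
                problems.append((number, "message with missing or unknown role"))
                break
            if not isinstance(message.get("content"), str) or not message["content"].strip():
                problems.append((number, "message with empty content"))
                break
        else:  # runs only if no message above was rejected
            if messages[-1]["role"] != "assistant":
                problems.append((number, "last message is not the assistant reply"))
    return problems


sample = [
    '{"messages": [{"role": "user", "content": "My order is late."}, '
    '{"role": "assistant", "content": "So sorry! I have flagged it."}]}',
    '{"messages": [{"role": "user", "content": "Where is my refund?"}]}',
    "not json at all",
]
for number, problem in validate_jsonl_lines(sample):
    print(f"line {number}: {problem}")
```

To check a real file, pass the open file object instead of a list, for example `with open("train.jsonl", encoding="utf-8") as fh: validate_jsonl_lines(fh)`.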

Running a fine-tune

Two paths, same dataset:

A) Hosted (easiest): upload your JSONL to a provider's fine-tuning API, start a job, wait (minutes to hours), then call your new model by its id. The lab uses OpenAI's fine-tuning API (Anthropic doesn't offer a simple public one); on a small dataset, a job costs a few dollars:

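from openai import OpenAI
client = OpenAI()  # reads OPENAI_API_KEY from the environment
with open("train.jsonl", "rb") as fh:  # upload the training file
    f = client.files.create(file=fh, purpose="fine-tune")
job = client.fine_tuning.jobs.create(training_file=f.id, model="gpt-4o-mini-2024-07-18")
# ...re-fetch with client.fine_tuning.jobs.retrieve(job.id) until status == "succeeded", then use its fine_tuned_model like any model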
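
The "wait until the job has succeeded" step can be sketched as a small polling loop. The helper name `wait_for_job`, the 30-second interval, and the injectable `sleep` parameter are assumptions for illustration. The only SDK call it relies on is `client.fine_tuning.jobs.retrieve`, and the set of finished statuses reflects my understanding of the API, so check it against the provider's documentation.

```python
import time

# Statuses treated as "done" (assumed; verify against the provider's docs).
FINISHED = {"succeeded", "failed", "cancelled"}


def wait_for_job(client, job_id, poll_seconds=30, sleep=time.sleep):
    """Poll a fine-tuning job until it reaches a finished status, then return it."""
    while True:
        job = client.fine_tuning.jobs.retrieve(job_id)  # fetch the latest state
        if job.status in FINISHED:
            return job
        sleep(poll_seconds)  # the job object does not update on its own


# Usage (needs a real client and a real job, so it is left as comments):
# done = wait_for_job(client, job.id)
# if done.status == "succeeded":
#     print("call this model id from now on:", done.fine_tuned_model)
```

Passing `sleep` in as a parameter keeps the loop easy to test with a mocked client, in the same spirit as mocking the API calls in the lab.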

B) Local / open (free, needs a GPU): fine-tune an open model (Llama, Gemma) on your own machine with Hugging Face transformers + peft (or the friendly unsloth). The key trick is LoRA: instead of retraining all billions of weights, you train a few small adapter layers, which is far cheaper and fits on a modest GPU. Then run it with Ollama (M13). Same dataset idea.
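
To see why LoRA is so much cheaper, it helps to count weights. LoRA (low-rank adaptation) leaves a layer's big weight matrix frozen and learns its update as the product of two thin matrices. The sketch below is plain arithmetic; the layer size 4096 and rank 8 are hypothetical round numbers, not taken from any particular model.

```python
def lora_trainable_params(d_out, d_in, rank):
    """Weights in the two small adapter matrices: (d_out x rank) and (rank x d_in)."""
    return rank * (d_out + d_in)


d_out = d_in = 4096  # hypothetical size of one weight matrix
rank = 8             # hypothetical adapter rank
full = d_out * d_in  # weights you would train without LoRA
adapter = lora_trainable_params(d_out, d_in, rank)
print(full, adapter, f"{adapter / full:.2%}")  # 16777216 65536 0.39%
```

Under these assumptions the adapter holds well under 1% of the weights of the matrix it adapts, which is what lets the training fit on a modest GPU.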

Evaluating, and the risks

Treat a fine-tune like any other change and measure it (M8): run your held-out examples and a few real prompts through the new model and compare to the base. Watch for:

- Overfitting: it parrots the training examples but generalizes poorly (too few/too-similar examples).
- Catastrophic forgetting: it gets worse at general tasks while learning your narrow one.
- Staleness & cost: a fine-tune is frozen at training time; new data means re-training. It also costs money/time to train and (often) to host. RAG updates instantly and is usually cheaper, which is another reason to try it first.
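
Comparing the fine-tune to the base model on held-out examples can be as simple as the sketch below. It assumes a label-style task where exact match is a fair score. The two "models" are stand-in functions invented for the demo; in practice each would wrap a call to the base or fine-tuned model.

```python
def exact_match_rate(model, held_out):
    """Fraction of held-out examples where the model's reply equals the ideal reply."""
    hits = sum(1 for prompt, ideal in held_out if model(prompt).strip() == ideal.strip())
    return hits / len(held_out)


# Stand-in "models": plain functions from prompt -> reply, purely for illustration.
def base_model(prompt):
    return "unknown"


def tuned_model(prompt):
    return "billing" if "invoice" in prompt or "charge" in prompt else "shipping"


held_out = [
    ("Where is my parcel?", "shipping"),
    ("I was charged twice.", "billing"),
    ("Please resend my invoice.", "billing"),
    ("Can I change my password?", "account"),
]
print("base: ", exact_match_rate(base_model, held_out))   # 0.0
print("tuned:", exact_match_rate(tuned_model, held_out))  # 0.75
```

Exact match only suits tasks with one right output, such as labels or strict formats. For style or tone you would need a softer judgement, along the lines of the M8 eval approach.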

Go deeper (optional, not needed for today's win)

- **Hyperparameters:** number of **epochs** (passes over the data), learning rate, batch size. Start with the provider's defaults; tune only if results are off.
- **LoRA/QLoRA:** "parameter-efficient fine-tuning" (PEFT) trains small adapters (QLoRA adds quantization, M13) so you can fine-tune big open models on one GPU.
- **Other providers/tools:** Together, Fireworks, and Hugging Face AutoTrain also host fine-tuning; `unsloth` speeds up local LoRA.
- **Building a model from scratch** (your own transformer) is a deep-learning project, not AI engineering. If curious, Karpathy's "nanoGPT" / "Zero to Hero" is the classic on-ramp. As an AI engineer you fine-tune existing models, not train new ones from nothing.
- **Distillation** (training a small model to imitate a big one) is a related way to get a cheap, fast model for a narrow job.

Check yourself

Lock in today's win: answer each in your head, then check it against the answer below.

1. What does "training" actually change, and what are the three stages?

Training adjusts the model's weights (the billions of numbers in its neural network) so its outputs improve. Stages: pre-training (learn language from huge text, the maker), fine-tuning / SFT (learn a style/task from your small dataset, you), and RLHF (align to be helpful/safe, the maker).

2. What is fine-tuning good for, and what should you NOT use it for?

Good for style/tone, format consistency, and a narrow task done reliably without a long prompt. Not for teaching new facts: it blurs/hallucinates them; use RAG for knowledge.

3. State the prompt/RAG/fine-tune rule of thumb.

Prompt first → add RAG for knowledge → fine-tune for behaviour at scale. Reach for fine-tuning when a good prompt works but is long/repetitive/inconsistent and you call it often enough that baking the behaviour in pays off.

4. Why is the dataset the most important part of a fine-tune?

The model learns from the examples, so quality = the examples' quality. You need consistent style/format, enough realistic examples (dozens+), and a held-out set to evaluate. Inconsistent or too-few examples → a bad fine-tune, no matter the code.

5. What is LoRA, and why does it matter for open models?

LoRA trains a few small adapter layers instead of all the model's weights; this is "parameter-efficient fine-tuning" (PEFT). It's far cheaper and fits a big open model's fine-tune onto a modest GPU, making local fine-tuning practical.


New words (also in resources/glossary.md): training, pre-training (recap), fine-tuning / SFT, RLHF, weights/gradient descent, epoch, dataset (JSONL), LoRA / PEFT, QLoRA, overfitting, catastrophic forgetting, distillation, when-to-fine-tune.

Source: original, written for this course. The fine-tuning workflow follows OpenAI's documented fine-tuning API (verified against the installed SDK; dataset prep verified for real, the API calls mocked, see the solution README). The local path names Hugging Face transformers/peft/LoRA and unsloth as neutral reference. Diagrams are original.