M7: RAG I: give the AI your own knowledge
Ask any model about your meeting notes, your company policy, or your class handouts and it guesses or makes things up, because it never read them. Today you fix that. You'll give an AI a document of your choice and build a Q&A app that answers from it, with receipts. This is RAG (retrieval-augmented generation), the technique behind almost every "chat with your docs" product you've seen.
Today's win: a working Q&A app that answers questions about a document you chose, by finding the relevant parts and answering only from them.
Today you will
- Understand why models don't know your data, and what embeddings and vector search are
- Install & configure a vector store (Chroma) that finds text by meaning
- Build the retrieve → augment → generate RAG loop over your own document
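To make the loop concrete before the lab, here is a minimal sketch of the retrieve and augment steps in plain Python. It is not the lab code: it swaps Chroma's embedding model for a toy word-count vector so it runs with no install, and the names `embed`, `cosine`, `retrieve`, and `augment` are made up for illustration. A real embedding matches by meaning, while this toy version only matches shared words. The generate step (sending the prompt to a model) is left out.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding: a word-count vector (not a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors (1.0 = same direction)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 1) -> list[str]:
    """Retrieve: rank chunks by similarity to the question, keep the top k."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def augment(question: str, retrieved: list[str]) -> str:
    """Augment: put the retrieved text into the prompt ahead of the question."""
    context = "\n".join(f"- {c}" for c in retrieved)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Hypothetical document, already split into chunks.
chunks = [
    "Expense reports are due on the last Friday of each month.",
    "The office kitchen is cleaned every Tuesday morning.",
    "New laptops are issued by the IT desk on the second floor.",
]
question = "When are expense reports due?"

top = retrieve(question, chunks, k=1)
prompt = augment(question, top)
print(prompt)  # this prompt is what the generate step would send to a model
```

In the lab, a vector store such as Chroma takes over the `embed` and `retrieve` roles; the shape of the loop stays the same.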
Run of show (~70 min)
| Time | What we do |
|---|---|
| 0:00 | Hook + the win we're chasing |
| 0:05 | The one idea: don't fine-tune; retrieve the right text and put it in the prompt (full read in notes.md) |
| 0:10 | Lab Part A: install Chroma; index a document; watch retrieval find the right chunk |
| 0:40 | Lab Part B: wire up the answer step; ask your own document real questions |
| 1:05 | Show: post a Q&A from your doc to the wins board |
| 1:10 | Wrap + take-home |
If you get stuck
- The one new install is Chroma. If `pip install chromadb` throws a build error, your Python is too new; use Python 3.12 (see the vector-store guide). The first run downloads a small model once (needs internet).
- Re-read the "You should now see" line and compare with your partner. Nothing here can harm your computer.
- If an answer is wrong, look at which chunks were retrieved. Bad answers usually mean bad retrieval, which is exactly what M8 fixes.
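One way to look at the retrieved chunks is to print them in rank order before the model ever sees them. The sketch below assumes your query result has the shape Chroma's `collection.query` returns: a dict whose `ids`, `documents`, and `distances` entries each hold one inner list per query. The helper name `show_hits` and the sample data are made up for illustration; if your result looks different, adjust the keys.

```python
def show_hits(results: dict) -> list[str]:
    """Format the hits for the first query, best match first.

    Assumes a Chroma-style result: results["ids"][0], results["documents"][0]
    and results["distances"][0] are parallel lists for the first query.
    """
    lines = []
    hits = zip(results["ids"][0], results["documents"][0], results["distances"][0])
    for rank, (chunk_id, text, distance) in enumerate(hits, start=1):
        # Smaller distance means the chunk is closer to the question.
        lines.append(f"{rank}. [{chunk_id}] distance={distance:.2f} {text[:60]}")
    return lines


# Hand-made sample standing in for a real query result.
sample = {
    "ids": [["chunk-3", "chunk-7"]],
    "documents": [[
        "Expense reports are due on the last Friday of each month.",
        "The office kitchen is cleaned every Tuesday morning.",
    ]],
    "distances": [[0.21, 0.87]],
}

for line in show_hits(sample):
    print(line)
```

If the chunk that holds the answer is missing from this list, the model never had a chance, and no amount of prompt tuning will fix it.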
Instructor note: the vector-store install is the friction point. Have everyone on Python 3.12 and confirm `import chromadb` works before the lab. Budget time for the one-time model download on first run.
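For the pre-lab check, a short script can flag both problems at once. This is a sketch, not an official course tool: the function name `check_setup` is invented here, and the 3.12 cutoff is taken from the note above rather than from Chroma's own requirements.

```python
import sys


def check_setup(version=None) -> list[str]:
    """Return a list of setup problems; an empty list means ready for the lab."""
    major, minor = tuple(version or sys.version_info[:2])
    problems = []

    # Assumption from the course notes: anything newer than 3.12 may fail to build.
    if (major, minor) > (3, 12):
        problems.append(
            f"Python {major}.{minor} is newer than 3.12; the install may fail to build."
        )

    # Actually attempt the import, since that is what the lab depends on.
    try:
        import chromadb  # noqa: F401
    except Exception as exc:
        problems.append(f"import chromadb failed: {exc!r}")

    return problems


issues = check_setup()
print("Ready for the lab." if not issues else "\n".join(issues))
```

Passing a version by hand, for example `check_setup((3, 13))`, shows what a too-new interpreter would report without needing one installed.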
Optional challenge
Ask your app a question whose answer isn't in the document. A good RAG app should say "I don't know based on the document" rather than invent an answer. Check whether yours does, and tighten the prompt if it doesn't. (That honesty is a guardrail you'll formalize in M10.)
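If your app invents an answer, one place to start is the prompt itself. The sketch below shows one possible tightened prompt; the wording, the `REFUSAL` constant, and the `build_prompt` helper are illustrative assumptions, not the course's reference prompt, and models may still need some trial and error to follow it reliably.

```python
REFUSAL = "I don't know based on the document"


def build_prompt(question: str, chunks: list[str]) -> str:
    """Build a prompt that limits the model to the retrieved excerpts."""
    # Number the excerpts so the answer can point back to its source.
    context = "\n\n".join(f"[{i}] {c}" for i, c in enumerate(chunks, start=1))
    return (
        "Answer the question using ONLY the numbered excerpts below.\n"
        f'If the excerpts do not contain the answer, reply exactly: "{REFUSAL}."\n'
        "Cite the number of the excerpt you used.\n\n"
        f"Excerpts:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Hypothetical retrieved chunk and an off-document question.
prompt = build_prompt(
    "What is the parental leave policy?",
    ["Expense reports are due on the last Friday of each month."],
)
print(prompt)
```

Giving the model an exact refusal sentence makes the behavior easy to check: ask an off-document question and look for that sentence in the reply.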