
Lab M33: operate the data and gate the deploys

You'll need: Python and your venv. No API key, no cost, instant and deterministic (the index is in-memory and embeddings are stubbed as a hash, so the operations are real but nothing calls a model). Time: about 45 minutes. Work in your breakout pair.

Heads up: there is no real vector store or deploy here. Index stands in for the M7 store and the "releases" are plain functions, so you can practice the operations (reindex, backup, canary, rollback) safely and reproducibly.
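To make "embeddings are stubbed as a hash" concrete, here is a minimal sketch of what such a stub could look like. The name stub_embed and the 8-dimension size are assumptions for illustration, not the lab's actual code. The point is only that the same text always maps to the same vector, with no model call.

```python
import hashlib


def stub_embed(text: str, dims: int = 8) -> list[float]:
    """Deterministic stand-in for an embedding (hypothetical helper)."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    # Scale the first `dims` bytes of the digest into floats in [0, 1].
    return [byte / 255 for byte in digest[:dims]]


first = stub_embed("refund policy")
again = stub_embed("refund policy")
print(first == again)  # True: same text always yields the same vector
print(len(first))      # 8
```

A hash-based vector carries no semantic meaning, so it is good for practicing operations but not for judging search quality.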

This lab has two parts:

- Part A: data ops (redact on write, reindex what changed, retention, backup/restore).
- Part B: release ops (canary a good and a bad build, roll back, and rotate a secret).

flowchart TB
  subgraph Data
    S["source changed"] --> ST{"stale?"}
    ST -->|yes| RE["reindex only it"]
    BK["snapshot"] -.restore.-> IX["index"]
  end
  subgraph Release
    C["candidate build"] --> CN{"canary vs live<br/>on eval set"}
    CN -->|holds quality| PR["promote"]
    CN -->|regresses| RB["rollback"]
  end

Part A: data operations

Step 1: Set up

Copy the solution/ files into a folder and activate your venv. Nothing to install.

python -c "import index_ops, release_ops; print('ops ok')"
You should now see: ops ok. (If not: run it from inside the folder with the .py files.)

Step 2: Build the index and watch PII get redacted

python demo_mock.py
You should now see six sections, A to F. Look at section A:
==== A. INDEX BUILD + PII REDACTION ====
indexed 4 docs
policy doc as stored: Contact admin at [email], SSN on file [ssn].
The policy source contained an email and an SSN; redact_pii scrubbed them before anything was stored. Open index_ops.py and read redact_pii and upsert. You should now see: the safest data is the data you never wrote down.
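If you want to see the redact-on-write idea in isolation, here is a minimal sketch. The regex patterns and the names redact_sketch, upsert_sketch, and store are assumptions for illustration; the real redact_pii and upsert in index_ops.py may differ. What matters is the order: scrub first, store second.

```python
import re

# Hypothetical patterns; the lab's redact_pii may use different ones.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

store: dict[str, str] = {}


def redact_sketch(text: str) -> str:
    """Replace emails and SSN-shaped numbers with placeholders."""
    text = EMAIL_RE.sub("[email]", text)
    return SSN_RE.sub("[ssn]", text)


def upsert_sketch(doc_id: str, text: str) -> None:
    # Redaction happens before the write, so raw PII never lands in the store.
    store[doc_id] = redact_sketch(text)


upsert_sketch("policy", "Contact admin at admin@example.com, SSN on file 123-45-6789.")
print(store["policy"])  # Contact admin at [email], SSN on file [ssn].
```

Simple patterns like these miss many PII formats; treat them as a teaching aid, not a complete filter.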

Step 3: Detect a stale doc and reindex only it

Look at section B:

==== B. STALENESS -> REINDEX (only what changed) ====
stale docs detected: ['pricing']
reindexed only: ['pricing']  (re-embedding all 4 would waste money)
The pricing source changed upstream, so its indexed copy was stale; reindex re-embedded only that one. Read is_stale: it compares the source hash stored with the record to a hash of the live source. You should now see: you never re-embed the whole store for a one-doc change.
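The hash comparison can be sketched in a few lines. This is an illustration under assumptions: the record layout and the names source_hash, build_record, is_stale_sketch, and reindex_stale are hypothetical, and SHA-256 is one reasonable choice of hash, not necessarily what index_ops.py uses.

```python
import hashlib


def source_hash(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def build_record(text: str) -> dict:
    # Each record remembers the hash of the source it was built from.
    return {"text": text, "source_hash": source_hash(text)}


sources = {"faq": "Open 9 to 5.", "pricing": "Pro plan: $10/mo"}
index = {doc_id: build_record(text) for doc_id, text in sources.items()}


def is_stale_sketch(doc_id: str) -> bool:
    # Stale means the live source no longer hashes to the stored value.
    return index[doc_id]["source_hash"] != source_hash(sources[doc_id])


def reindex_stale() -> list[str]:
    stale = [doc_id for doc_id in sorted(index) if is_stale_sketch(doc_id)]
    for doc_id in stale:
        index[doc_id] = build_record(sources[doc_id])  # rebuild only these
    return stale


sources["pricing"] = "Pro plan: $12/mo"  # simulated upstream change
print(reindex_stale())  # ['pricing']
print(reindex_stale())  # []
```

The second call returns an empty list because the rebuilt record now carries the new hash.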

Step 4: Retention and restore

Sections C and D sweep the expired promo doc (its TTL passed), then restore the whole index from the snapshot taken in section B:

==== D. BACKUP / RESTORE ====
before restore: 3 docs ['faq', 'policy', 'pricing']
after restore : 4 docs ['faq', 'policy', 'pricing', 'promo']  (promo & old pricing are back)
You should now see: the snapshot brought back the swept and changed docs. A backup is only real if you can restore it.
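Here is a minimal sketch of the sweep-then-restore sequence, assuming each record carries a hypothetical expires_at timestamp and that a snapshot is just a deep copy of the in-memory index. The names snapshot and sweep_expired are illustrative and may not match index_ops.py.

```python
import copy

index = {
    "faq": {"text": "Open 9 to 5.", "expires_at": None},     # never expires
    "promo": {"text": "Spring sale", "expires_at": 100.0},   # TTL deadline
}


def snapshot(idx: dict) -> dict:
    # Deep copy, so later edits to the live index cannot alter the backup.
    return copy.deepcopy(idx)


def sweep_expired(idx: dict, now: float) -> list[str]:
    expired = sorted(
        doc_id
        for doc_id, rec in idx.items()
        if rec["expires_at"] is not None and rec["expires_at"] <= now
    )
    for doc_id in expired:
        del idx[doc_id]
    return expired


backup = snapshot(index)
print(sweep_expired(index, now=200.0))  # ['promo']
print(sorted(index))                    # ['faq']

index = snapshot(backup)  # restore from a copy so the backup stays pristine
print(sorted(index))      # ['faq', 'promo']
```

Passing now explicitly keeps the sweep deterministic, which is handy in tests.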

Step 5: Prove redaction yourself

python -c "import index_ops as i; print(i.redact_pii('ping me at sam@acme.com or 555-12-3456'))"
You should now see: ping me at [email] or [ssn]. You just stopped PII at the door.


Part B: release operations

Step 6: Canary a good build and promote it

Look at section E:

==== E. RELEASE: canary a GOOD version -> promote ====
candidate v2-good: pass=1.0 vs baseline=1.0 -> PROMOTE
live is now: v2-good   rollback target: v1-baseline
The candidate was scored against the live baseline on a small eval set; it held quality, so it was promoted, and the old version was recorded as the rollback target. Open release_ops.py and read canary and release.
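Scoring a build on an eval set can be as simple as a pass rate. The sketch below is an assumption about the general shape, not a copy of release_ops.py: the eval questions, the substring check, and the names pass_rate, v1_baseline, v2_good, and v3_bad are all hypothetical.

```python
# Hypothetical eval set: (question, substring the answer must contain).
EVAL_SET = [
    ("refund window?", "30 days"),
    ("support hours?", "9-5"),
    ("free tier?", "yes"),
    ("sso supported?", "yes"),
    ("data region?", "EU"),
]
ANSWERS = dict(EVAL_SET)


def v1_baseline(question: str) -> str:
    return ANSWERS.get(question, "")


def v2_good(question: str) -> str:
    # Different wording, same facts: should hold quality.
    return "Answer: " + v1_baseline(question)


def v3_bad(question: str) -> str:
    # Simulated regression: answers "yes" to everything.
    return "yes"


def pass_rate(build, eval_set) -> float:
    passed = sum(1 for question, expected in eval_set if expected in build(question))
    return passed / len(eval_set)


print(pass_rate(v1_baseline, EVAL_SET))  # 1.0
print(pass_rate(v2_good, EVAL_SET))      # 1.0
print(pass_rate(v3_bad, EVAL_SET))       # 0.4
```

The score is only as meaningful as the eval set; five cases is enough for a demo, not for a real release gate.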

Step 7: Canary a bad build and watch it roll back

Look at section F:

candidate v3-bad: pass=0.4 vs baseline=1.0 -> ROLLBACK
live stayed: v2-good   (the bad build never reached users)
emergency rollback -> live is now: v1-baseline
The bad candidate scored 0.4, a clear regression, so it was not promoted; live stayed on the good version. Then an emergency rollback() returned to the last-good build. You should now see: a green CI run is not proof a build is safe; the canary is.
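The promote and rollback bookkeeping can be sketched with two fields: what is live and what to fall back to. The class name ReleaseState and its methods are hypothetical, and the lab's release and rollback functions may track more than this.

```python
from typing import Optional


class ReleaseState:
    """Tracks the live build and a single rollback target (illustrative only)."""

    def __init__(self, live: str) -> None:
        self.live = live
        self.rollback_target: Optional[str] = None

    def promote(self, candidate: str) -> None:
        # Record the outgoing build before switching, so rollback has a target.
        self.rollback_target = self.live
        self.live = candidate

    def rollback(self) -> str:
        if self.rollback_target is None:
            raise RuntimeError("no rollback target recorded")
        self.live, self.rollback_target = self.rollback_target, None
        return self.live


state = ReleaseState("v1-baseline")
state.promote("v2-good")   # canary passed
# v3-bad failed its canary, so promote() is never called for it.
print(state.live)          # v2-good
print(state.rollback())    # v1-baseline
```

Note that a rejected candidate changes nothing here: the safest rollback is the promotion that never happened.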

Step 8: See zero-downtime secret rotation

The end of section F rotates a secret:

during grace: old 'key-v1' valid? True  new 'key-v2' valid? True
after grace : old 'key-v1' valid? False  new 'key-v2' valid? True
During the grace window both keys work, so clients still using the old one keep running; after expire_grace only the new key is valid. That is how you rotate a credential without an outage.
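A grace window boils down to keeping at most two valid keys. This is a minimal sketch under assumptions: the class RotatingSecret and the methods rotate and is_valid are hypothetical, and only the name expire_grace comes from the lab output above.

```python
import hmac
from typing import Optional


class RotatingSecret:
    def __init__(self, current: str) -> None:
        self.current = current
        self.previous: Optional[str] = None  # set only during a grace window

    def rotate(self, new_key: str) -> None:
        # Open the grace window: the old key stays valid alongside the new one.
        self.previous = self.current
        self.current = new_key

    def expire_grace(self) -> None:
        # Close the grace window: only the current key remains valid.
        self.previous = None

    def is_valid(self, key: str) -> bool:
        accepted = [k for k in (self.current, self.previous) if k is not None]
        # compare_digest avoids leaking match position through timing.
        return any(hmac.compare_digest(key.encode(), k.encode()) for k in accepted)


secret = RotatingSecret("key-v1")
secret.rotate("key-v2")
print(secret.is_valid("key-v1"), secret.is_valid("key-v2"))  # True True
secret.expire_grace()
print(secret.is_valid("key-v1"), secret.is_valid("key-v2"))  # False True
```

In a real system the window would usually close on a timer or once metrics show no client still using the old key.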

Step 9: Tune the canary bar (it has two guards)

The canary blocks the bad build for two reasons: its score (0.4) is below min_pass (0.8), and it regresses against the live baseline (0.4 < 1.0). To promote it you must relax both guards. Open demo_mock.py and, in section F, change min_pass=0.8 to min_pass=0.3, regress_tol=1.0, then re-run. You should now see the bad build PROMOTE, which is exactly why those two guards, and the eval set behind them, are the most important decisions in a release.
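The two guards can be written as one small decision function. This sketch assumes regress_tol is the largest allowed drop below the baseline score and that it defaults to 0.0; both are assumptions, and the function name canary_decision is hypothetical, so check release_ops.py for the real rule.

```python
def canary_decision(
    score: float,
    baseline: float,
    min_pass: float = 0.8,
    regress_tol: float = 0.0,  # assumed default: no regression allowed
) -> str:
    meets_floor = score >= min_pass                  # guard 1: absolute bar
    holds_vs_live = score >= baseline - regress_tol  # guard 2: relative to live
    return "PROMOTE" if meets_floor and holds_vs_live else "ROLLBACK"


print(canary_decision(0.4, 1.0))                                 # ROLLBACK
print(canary_decision(0.4, 1.0, min_pass=0.3))                   # ROLLBACK
print(canary_decision(0.4, 1.0, regress_tol=1.0))                # ROLLBACK
print(canary_decision(0.4, 1.0, min_pass=0.3, regress_tol=1.0))  # PROMOTE
```

Relaxing either guard alone still blocks the bad build; only relaxing both lets it through.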

Step 10: Show it

Post in the chat your section B (the stale doc reindexed) and section F (the bad build rolled back).


If you get stuck

  • ModuleNotFoundError -> run from inside the folder with index_ops.py and release_ops.py.
  • "Nothing was stale" -> is_stale compares hashes; if you did not change a source's text, it is correctly not stale. Edit a value in sources and re-run.
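  • The bad build promoted -> check min_pass and regress_tol. The canary has two guards, so the bad build promotes only if you lowered min_pass below its score and also widened regress_tol enough to cover its drop from the baseline. Together those settings tell the canary that score is acceptable.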

Check yourself

How does the index know a document is stale? Each record stores the hash of the source it was built from. `is_stale` re-hashes the current source and compares; if they differ, the source changed since indexing and the record is stale. Then you re-embed only the changed docs, not the whole store.
Why redact PII on write instead of when you read it back? Because once raw PII is in the store it can leak through search, logs, or a backup. Redacting on write means the sensitive data is never stored at all: the safest data is the data you never wrote down (M14/M30).
What does a canary prove that CI (M26) does not? CI proves the build passed your current tests. A canary proves the build is at least as good as what is already live, by scoring the candidate against the baseline on an eval set before it serves real traffic. It catches a regression that did not happen to have a test yet.
Why keep the previous secret valid during a grace window? A hard swap would instantly reject every client still using the old key, causing an outage. The grace window lets old and new keys both work until in-flight clients move to the new one; then you close the window. Zero-downtime rotation.