M33 solution: data and release operations
Two dependency-free toolkits, plus a demo that exercises both: keeping the RAG index healthy (data ops) and shipping new versions safely (release ops). Offline, deterministic, no API key.
Files
| File | Role |
|---|---|
| `index_ops.py` | `Index` (a stand-in vector store), `redact_pii` (scrub PII on write), `is_stale`/`stale_ids`/`reindex` (re-embed only changed docs), `sweep_expired` (retention/TTL), and `snapshot`/`restore` (backups you can actually restore). |
| `release_ops.py` | `ReleaseManager` (version registry, canary against the live baseline, release = canary-then-promote-or-stay, one-call rollback) and `SecretStore` (rotate a secret with a grace window, zero downtime). |
| `demo_mock.py` | A→F: build+redact the index, detect a stale doc and reindex only it, sweep an expired doc, restore from backup, then canary a good release (promote) and a bad one (rollback), and rotate a secret. Start here. |
| `../starters/canary_ramp.py` | Your turn: a progressive canary that ramps traffic 1%→10%→50%→100% and rolls back on any breach. |
Run it
# offline, free, instant, deterministic:
python demo_mock.py
No API key or `.env` file is needed for this module.
The ideas, and what they safeguard
- PII redaction on write — never let raw PII into the index in the first place (privacy first, M14/M30).
- staleness + selective reindex — a source doc changed, so its indexed copy is stale; re-embed only the docs that changed, not the whole store (re-embedding everything is the expensive mistake).
- retention (TTL) — sweep out expired docs so the index does not grow forever or keep data too long.
- backup / restore — a snapshot you can restore. A backup you have never restored is just a hope.
- canary + rollback — passing CI (M26) is not proof a release is safe; run the candidate next to the live baseline on a small eval set and promote only if quality holds, otherwise the bad build never reaches users. Builds directly on the eval sets from M20/M26.
- secret rotation with a grace window — keep the previous secret valid briefly so rotating a key never causes an outage for in-flight clients.
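The data-ops ideas above (redact on write, hash-based staleness, selective reindex, TTL sweep, snapshot/restore) can be sketched in a few lines. This is a minimal illustration and not the code in `index_ops.py`: `TinyIndex`, `redact`, `fingerprint`, the two regexes, and the sample documents are all assumptions made for the example, and real PII detection needs far more than two patterns.

```python
import copy
import hashlib
import re

# Hypothetical patterns for illustration only.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact(text):
    """Replace emails and US-style SSNs with placeholders before indexing."""
    return SSN_RE.sub("[ssn]", EMAIL_RE.sub("[email]", text))


def fingerprint(text):
    """Content hash used to tell whether a source doc changed since indexing."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class TinyIndex:
    """Toy stand-in for a vector store: doc id -> redacted text, source hash, expiry."""

    def __init__(self):
        self.docs = {}

    def upsert(self, doc_id, source_text, now=0, ttl=None):
        self.docs[doc_id] = {
            "text": redact(source_text),       # PII is scrubbed on write
            "hash": fingerprint(source_text),  # hash of the source as indexed
            "expires_at": None if ttl is None else now + ttl,
        }

    def stale_ids(self, sources):
        """Ids whose current source no longer matches the indexed hash."""
        return sorted(
            doc_id for doc_id, text in sources.items()
            if doc_id in self.docs and self.docs[doc_id]["hash"] != fingerprint(text)
        )

    def reindex(self, sources):
        """Refresh only the stale docs; returns the ids that were touched."""
        changed = self.stale_ids(sources)
        for doc_id in changed:
            record = self.docs[doc_id]
            record["text"] = redact(sources[doc_id])
            record["hash"] = fingerprint(sources[doc_id])
        return changed

    def sweep_expired(self, now):
        """Delete docs past their TTL (assumption: expired means now > expires_at)."""
        expired = sorted(
            doc_id for doc_id, record in self.docs.items()
            if record["expires_at"] is not None and now > record["expires_at"]
        )
        for doc_id in expired:
            del self.docs[doc_id]
        return expired

    def snapshot(self):
        return copy.deepcopy(self.docs)

    def restore(self, saved):
        self.docs = copy.deepcopy(saved)


sources = {
    "policy": "Contact admin at admin@example.com, SSN on file 123-45-6789.",
    "pricing": "Basic plan is $10.",
    "promo": "Spring sale.",
    "faq": "Support hours are 9 to 5.",
}
index = TinyIndex()
for doc_id, text in sources.items():
    index.upsert(doc_id, text, now=0, ttl=5 if doc_id == "promo" else None)

sources["pricing"] = "Basic plan is $12."  # the source changed after indexing
reindexed = index.reindex(sources)         # only the changed doc is refreshed
backup = index.snapshot()
swept = index.sweep_expired(now=6)         # promo (ttl=5) is past its expiry
docs_after_sweep = len(index.docs)
index.restore(backup)                      # the backup is actually restored
```

Hashing the source text rather than comparing timestamps is one design choice among several; it keeps the sketch deterministic and means an unchanged document is never re-embedded, whatever its modification time says.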
Verified (offline)
- Data ops: the `policy` doc is stored as `Contact admin at [email], SSN on file [ssn].`; changing the `pricing` source marks only `pricing` stale and reindexes only it; `promo` (ttl=5) is swept at t=6; `restore` brings the index back to all 4 docs from the snapshot.
- Release ops: `v2-good` scores 1.0 vs baseline 1.0 → promote (live=v2-good, rollback target=v1); `v3-bad` scores 0.4 → rollback (live stays v2-good); emergency `rollback()` returns to v1; a rotated secret keeps the old key valid during the grace window and rejects it after `expire_grace()`.
- `index_ops.py` and `release_ops.py` are dependency-free and import without a key.
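
The release-ops behavior listed above can be approximated with the sketch below. It is an illustration under stated assumptions, not the code in `release_ops.py`: the `Releases` and `RotatingSecret` classes, the exact-match scoring, the `tolerance` parameter, and the toy eval set are all hypothetical.

```python
import hmac


class Releases:
    """Toy version registry with a canary gate (hypothetical names)."""

    def __init__(self, baseline_name, baseline_fn):
        self.versions = {baseline_name: baseline_fn}
        self.live = baseline_name
        self.previous = None  # rollback target

    def score(self, name, eval_set):
        """Fraction of eval questions the named version answers exactly."""
        fn = self.versions[name]
        hits = sum(1 for question, expected in eval_set if fn(question) == expected)
        return hits / len(eval_set)

    def release(self, name, fn, eval_set, tolerance=0.0):
        """Canary the candidate against live; promote only if quality holds."""
        self.versions[name] = fn
        candidate = self.score(name, eval_set)
        baseline = self.score(self.live, eval_set)
        if candidate + tolerance >= baseline:
            self.previous, self.live = self.live, name
            return "promote"
        return "stay"  # the candidate never becomes live

    def rollback(self):
        """One-call return to the previous live version."""
        if self.previous is None:
            raise RuntimeError("no rollback target")
        self.live, self.previous = self.previous, None
        return self.live


class RotatingSecret:
    """Keeps the previous secret valid until the grace window is closed."""

    def __init__(self, secret):
        self.current = secret
        self.previous = None

    def rotate(self, new_secret):
        self.previous, self.current = self.current, new_secret

    def expire_grace(self):
        self.previous = None

    def is_valid(self, presented):
        candidates = [s for s in (self.current, self.previous) if s is not None]
        # compare_digest avoids leaking match position through timing
        return any(hmac.compare_digest(presented, s) for s in candidates)


eval_set = [("q1", "a1"), ("q2", "a2"), ("q3", "a3"), ("q4", "a4"), ("q5", "a5")]
answers = dict(eval_set)


def good(question):
    return answers[question]


def bad(question):
    # Hypothetical regression: only two of the five answers are right.
    return answers[question] if question in ("q1", "q2") else "wrong"


releases = Releases("v1", good)
first = releases.release("v2-good", good, eval_set)  # scores 1.0 vs 1.0
second = releases.release("v3-bad", bad, eval_set)   # scores 0.4 vs 1.0
live_after_canaries = releases.live
restored = releases.rollback()                       # emergency rollback

secret = RotatingSecret("key-old")
secret.rotate("key-new")
old_ok_in_grace = secret.is_valid("key-old")
secret.expire_grace()
old_ok_after_grace = secret.is_valid("key-old")
```

In this sketch a failed canary simply leaves the live version untouched, so "rollback" is reserved for the explicit `rollback()` call after a promotion; the demo's own wording for the failed-canary outcome may differ.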