SnowOps — Project State
Current Phase
- M1: Sign-offs in flight
- M2a: Core Baseline FUNCTIONALLY CODE-COMPLETE (v0.40) — every non-postponed asset is 🟦+. W-series postponed (D35).
- M2b: 14-asset core CODE-COMPLETE — E0 (v0.42) + V2 + V3 (v0.43) + S1 (v0.44) + S2 (v0.45) + L1 (v0.46) + L2 (v0.47) + L4 (v0.48) + F12 (v0.49) + B6 (v0.50) + F11 (v0.51) shipped; K1 + K2 (PR #12) + D5 (PR #13) landed externally and merged. 14/14 in-repo — nothing external remaining. Focus now: runbook sign-offs.
- GTM: Track A COMPLETE (v0.34) — 39 files under gtm/. Awaiting human sign-offs.
Last Updated: 2026-05-31 (v0.51)
In Flight
| Item | Status | Notes |
|---|---|---|
| M2b 14-asset core | 🟦 code-complete | all 14 in-repo (E0+V2+V3+S1+S2+L1+L2+L4+F12+B6+F11 + K1+K2+D5); nothing external remaining |
| M2a runbook sign-offs | 🟧 | X7+U+N+M+J offline parts are ~5 min each |
| GTM Track A | 🟦 complete | Human sign-offs pending (see §0) |
Open Issues / Tech Debt (v0.55 repo review — 2026-06-04)
Surfaced by a full-repo review. Full detail + IDs in docs/context/10-gap-register.md (G15–G20). Fixed in the same pass except G18 (seeded, ongoing).
| # | Issue | Severity | Status |
|---|---|---|---|
| G15 | B1 test red — apps/github-onboarder/src/load-template.test.ts asserted a stale workflow file list | High | ✅ Fixed — test now walks disk dynamically; 38/38 pass |
| G16 | No PR CI job ran apps/* unit tests; B1 had no workflow | High | ✅ Fixed — added .github/workflows/app-tests.yml (dynamic matrix, 13 apps) |
| G17 | CLAUDE.md §2 repo layout stale | Med | ✅ Fixed — §2 refreshed |
| G18 | Zero ADRs despite 50 decisions; DoD #5 + §3 require ADRs | Med | 🟧 Seeded — docs/adr/ README + template + ADRs 0001–0004; backfill rest ongoing |
| G19 | F7 live/validate.sh false-positive hclfmt warning (deprecated CLI flags) | Low | ✅ Fixed — modern hcl fmt --check, now a hard fail |
| G20 | U3/K3 scaffold READMEs described absent behavior | Low | ✅ Fixed — "SCAFFOLD — postponed" banners + §2 tags |
Verified passing in this review: terraform fmt -check (modules/live/sandbox) ✅ · Go terratest vet + compile ✅ · all 14 app suites now green (github-onboarder fixed) · live/validate.sh ✅. Tooling: node 26, go 1.26, terraform 1.15.
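The G15 fix above replaced a hard-coded file list with a walk of the directory on disk. The real test is TypeScript (load-template.test.ts); the sketch below only illustrates the idea in Python. The function names and the `.github/workflows` layout inside the template are assumptions for illustration, not taken from the repo.

```python
import tempfile
from pathlib import Path


def list_workflow_files(template_root: Path) -> list[str]:
    """Return the workflow file names actually present on disk, sorted."""
    # Hypothetical layout: workflows live under <template>/.github/workflows.
    workflows_dir = template_root / ".github" / "workflows"
    return sorted(p.name for p in workflows_dir.glob("*.yml"))


def loaded_matches_disk(loaded: list[str], template_root: Path) -> bool:
    """Compare what a loader returned with what is on disk, ignoring order."""
    return sorted(loaded) == list_workflow_files(template_root)


if __name__ == "__main__":
    # Build a throwaway template tree so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        workflows = root / ".github" / "workflows"
        workflows.mkdir(parents=True)
        (workflows / "ci.yml").write_text("")
        (workflows / "release.yml").write_text("")
        print(loaded_matches_disk(["release.yml", "ci.yml"], root))
```

Because the expected list is derived from disk at test time, adding or removing a workflow file no longer requires touching the assertion.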
Runbook Sign-Off Backlog (Ordered)
Execute offline Parts A+B first (~5 min each). Cloud parts (C/D) are optional promotion-path extenders.
| Priority | Runbook | Cloud? | Time | Cost |
|---|---|---|---|---|
| 1 | V2 docs/runbooks/test/V2.md | No | ~5 min | $0 |
| 2 | V3 docs/runbooks/test/V3.md | No | ~5 min | $0 |
| 3 | E0 docs/runbooks/test/E0.md | Partial (Part C) | ~6 min offline | $0 |
| 4 | X7 docs/runbooks/test/X7.md | No | ~5 min | $0 |
| 5 | U1 docs/runbooks/test/U1.md | Yes (Part C) | ~5 min offline | $0 |
| 6 | U2 docs/runbooks/test/U2.md | Yes (Part C/D) | ~5 min offline | $0 |
| 7 | N5 docs/runbooks/test/N5.md | Yes (Part C/D) | ~5 min offline | $0 |
| 8 | N6 docs/runbooks/test/N6.md | Yes (Part C/D) | ~5 min offline | $0 |
| 9 | M1 docs/runbooks/test/M1.md | Yes | ~5 min offline | $0 |
| 10 | M2 docs/runbooks/test/M2.md | Yes (Part C) | ~5 min offline | ~$1 |
| 11 | M3 docs/runbooks/test/M3.md | Yes | ~5 min offline | $0 |
| 12 | M6 docs/runbooks/test/M6.md | Yes | ~5 min offline | $0 |
| 13 | J1 docs/runbooks/test/J1.md | Yes (Part C/D) | ~5 min offline | $0 |
| 14 | J2 docs/runbooks/test/J2.md | Yes (Part C/D) | ~5 min offline | $0 |
| 15 | J6 docs/runbooks/test/J6.md | Yes (Part C/D) | ~5 min offline | $0 |
| 16 | H5 docs/runbooks/test/H5.md | Yes (Part C) | ~5 min offline | $0 |
| 17 | H7 docs/runbooks/test/H7.md | Yes (Part C/D) | ~5 min offline | $0 (needs P1) |
| 18 | F8 docs/runbooks/test/F8.md | Optional (kind) | ~5 min offline | $0 |
| 19 | B5 docs/runbooks/test/B5.md | Yes (Part C) | ~5 min offline | $0 (needs P2) |
| 20 | B4 docs/runbooks/test/B4.md | Yes (Part C) | ~8 min | $0 |
| 21 | B3 docs/runbooks/test/B3.md | Yes (Part C) | ~12 min | $0 |
| 22 | B2 docs/runbooks/test/B2.md | Yes (Part C) | ~25 min | $0 |
| 23 | C3 docs/runbooks/test/C3.md | Yes (Parts C–F) | ~75 min | $0 |
| 24 | C2 docs/runbooks/test/C2.md | Yes (Parts C–E) | ~40 min | <$1 |
| 25 | H1 docs/runbooks/test/H1.md | Yes | ~25 min | $0 |
| 26 | H2 docs/runbooks/test/H2.md | Yes (needs P1) | ~30 min | $0 |
| 27 | H3 docs/runbooks/test/H3.md | Yes (needs P2) | ~30 min | $0 |
| 28 | F3 docs/runbooks/test/F3.md | Yes (Part C) | ~30 min | ~$10 |
| 29 | F5 docs/runbooks/test/F5.md | Yes (Part C) | ~25 min | ~$2 |
| 30 | F4 docs/runbooks/test/F4.md | Yes (Part C) | ~30 min | ~$5 |
| 31 | D4 docs/runbooks/test/D4.md | Optional (kind) | ~5 min offline | $0 |
| 32 | F2 docs/runbooks/test/F2.md | Yes (Part C) | ~35 min | ~$5 |
| 33 | F0 docs/runbooks/test/F0.md | No | ~15 min | $0 |
| 34 | B1 docs/runbooks/test/B1.md | Yes | ~60 min | $0 |
| 35+ | D2, X1, X2, C1, G0–G6, A1, A5, F1, F6 | Mix | Various | Various |
M2b 14-Asset Core Progress (D36)
| # | Asset | Scope | Status | Est. Time | Cloud Cost |
|---|---|---|---|---|---|
| 1 | ✅ E0 | Compliance snapshot (Policy + Defender score, wired to C1) | 🟦 v0.42 | — | — |
| 2 | ✅ V2 | Architecture diagram generator (apps/diagram-generator/) | 🟦 v0.43 | — | — |
| 3 | ✅ V3 | Runbook generator (apps/runbook-generator/) | 🟦 v0.43 | — | — |
| 4 | ✅ S1 | Drift detection (scheduled terraform plan → ticket via TicketPlatform) | 🟦 v0.44 | — | — |
| 5 | ✅ S2 | Azure Policy compliance dashboard (apps/compliance-dashboard/) | 🟦 v0.45 | — | — |
| 6 | ✅ K1 | IR runbook library (docs/runbooks/incident/) | 🟦 (external, merged) | — | — |
| 7 | ✅ K2 | On-call integration (modules/azure/oncall-integration/) | 🟦 (external, merged) | — | — |
| 8 | ✅ L1 | Azure Backup policy module (modules/azure/backup-policy/) | 🟦 v0.46 | — | — |
| 9 | ✅ L2 | Cross-region replication (object replication + SQL failover group) | 🟦 v0.47 | — | — |
| 10 | ✅ L4 | Automated restore drill (apps/restore-drill/ → S2 DR panel) | 🟦 v0.48 | — | — |
| 11 | ✅ D5 | Policy waiver engine (waivers/, OPA exception records, CI expiry enforcement) | 🟦 (external, merged PR #13) | — | — |
| 12 | ✅ F12 | Brownfield import library (modules/azure/import-blocks/, 9 modules) | 🟦 v0.49 | — | — |
| 13 | ✅ B6 | Client self-service bootstrap (prerequisite checker + validator) (apps/client-bootstrap/) | 🟦 v0.50 | — | — |
| 14 | ✅ F11 | Module versioning + private registry (apps/module-registry/) | 🟦 v0.51 | — | — |
Rule (D36): ALL other M2b/M3 assets are POSTPONED until these 14 are code-complete + signed off. Depth before breadth.
Next 5 Selected (v0.53 — D48)
After the M2b 14-asset core went code-complete, the next 5 most-important postponed/unbuilt
items were selected. Reconciliation: C5 (ADO pipelines) was found already built in commit
51c7fc4 — docs were stale; it's now marked 🟦 code-complete and dropped from the candidate set.
| # | Asset | Rationale | Status |
|---|---|---|---|
| 1 | I3 — CodeQL SAST | No code-analysis layer existed; D2 covered only IaC/secrets | 🟦 code-complete (v0.53) |
| 2 | I2 — Dependency scanning | dependabot.yml existed but no PR gate / alert digest | 🟦 code-complete (v0.53) |
| 3 | I1 — Container image scanning | Closes G6 (non-K8s container security); reusable image scan | 🟦 code-complete (v0.53) |
| 4 | E7 — TicketPlatform adapters | Closes G8; unblocks E6/I5/K4/P3; generalizes the S1 seed | 🟦 code-complete (v0.54) |
| 5 | F7 — Terragrunt live-infra reference | The missing per-env/region module wiring repo; deploy enablement | 🟦 code-complete (v0.55) |
Done (v0.53): I1 + I2 + I3 — the M2a CI security-scanning suite (docs/runbooks/test/I1.md, I2.md, I3.md).
Done (v0.54): E7 — apps/ticket-platform/ (GitHub/Jira/Linear/ADO adapters + CLI, 26 tests); S1 repointed (interface-compatible). Runbook docs/runbooks/test/E7.md.
Done (v0.55): F7 — live/ Terragrunt reference (root + _envcommon + bootstrap + per-env/region units; baseline→net/kv/acr DAG; offline validate.sh). Runbook docs/runbooks/test/F7.md.
✅ Next-5 (D48) COMPLETE. Candidate next batch: runbook sign-offs (in parallel), then M3 tail (W4 client offboarding) / M2b additional (J4 alert pack, X5 pipeline integration tests, X8 synthetic monitoring) / M4 advanced.
Next 5 Selected (v0.56 — D51)
With the next-5 (D48) complete, the next 5 most-important postponed items were selected. Priority rule: depth before breadth (D36) + milestone order — finish the remaining M2b "additional" assets (M2b §84) before advancing to M3 tail / M4. All five are M2b. The heavier network items (N3 WAF, N4 DDoS — CO-owned, cloud-cost) are deferred to a later batch.
| # | Asset | Rationale | Status |
|---|---|---|---|
| 1 | J4 — Alert rule pack | No detection-rule layer existed; identity/network/privilege/data-exfil KQL alerts over the J1 LAW, wired to K2 action groups | 🟦 code-complete (v0.56) |
| 2 | I5 — Defender → ticket via E7 | Newly unblocked by E7 (D49); first consumer proving the TicketPlatform adapter; closes the Defender-alert→ticket loop | 🟦 code-complete (v0.58) |
| 3 | X5 — Pipeline integration tests | M2a CI gates (C1–C3) had no integration test consumers; reusable-workflow test repos | 🟦 code-complete (v0.57) |
| 4 | X8 — Synthetic monitoring | No availability/latency synthetic probes; Azure Monitor standard webtests + alert rules | 🟦 code-complete (v0.57) |
| 5 | R2 — Production change log | Merged-PR → changelog generator; reuses E7 for change-record tickets where required | 🟦 code-complete (v0.59) |
✅ D51 batch COMPLETE (all 5): J4 (v0.56) · X5 + X8 (v0.57) · I5 (v0.58) · R2 (v0.59).
Done (v0.56): J4 — modules/azure/alert-rule-pack/ (curated scheduled-query alert rules across four threat domains — identity/privilege/network/data-exfil; domain toggles + per-rule overrides + freeform custom rules; consumes the J1 workspace + K2 action groups by ARM ID). Offline TestAlertRulePackValidate green. Runbook docs/runbooks/test/J4.md.
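The J4 entry above describes three configuration layers: domain toggles, per-rule overrides, and freeform custom rules. The module itself is Terraform; the Python sketch below only illustrates one plausible way those layers could resolve into an effective rule set. Every rule name, the settings shape, and the "domains default to on" behavior are assumptions for illustration.

```python
from copy import deepcopy

# Hypothetical curated pack: domain -> rule id -> settings.
# The real rule names and KQL queries live in the module, not here.
CURATED = {
    "identity": {"risky_signin": {"severity": 2, "enabled": True}},
    "privilege": {"role_assignment_spike": {"severity": 1, "enabled": True}},
    "network": {"nsg_rule_opened": {"severity": 2, "enabled": True}},
    "data_exfil": {"bulk_blob_download": {"severity": 1, "enabled": True}},
}


def effective_rules(domains=None, overrides=None, custom=None):
    """Resolve toggles, then overrides, then custom rules into one dict."""
    domains = domains or {}
    overrides = overrides or {}
    custom = custom or {}
    rules = {}
    for domain, pack in CURATED.items():
        # Assumption: a domain is on unless explicitly switched off.
        if not domains.get(domain, True):
            continue
        for rule_id, settings in pack.items():
            merged = deepcopy(settings)
            # Per-rule overrides replace individual settings only.
            merged.update(overrides.get(rule_id, {}))
            rules[rule_id] = merged
    # Custom rules are added last and are independent of domain toggles.
    for rule_id, settings in custom.items():
        rules[rule_id] = deepcopy(settings)
    return rules


if __name__ == "__main__":
    print(effective_rules(
        domains={"network": False},
        overrides={"risky_signin": {"severity": 0}},
        custom={"my_rule": {"severity": 3, "enabled": True}},
    ))
```

Copying the curated settings before merging keeps the shared defaults untouched, so one caller's overrides cannot leak into another's.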
Done (v0.57) — X series complete (X5 + X8):
- X5 — tests/pipeline-integration/ reusable-workflow contract gate (contract_check.py + test_contract_check.py, 11 unit tests; offline validate.sh; CI .github/workflows/pipeline-integration.yml) + live it-{container-build-sign,aks-deploy,terraform-plan-apply}.yml consumers driving the existing fixtures against the sandbox. Catches workflow_call interface drift across all callers offline. Runbook docs/runbooks/test/X5.md.
- X8 — modules/azure/synthetic-monitoring/ (App Insights standard availability tests + per-test availability metric alerts; optional workspace-based AI component; consumes the J1 workspace + K2 action groups by ARM ID). Offline TestSyntheticMonitoringValidate green. Runbook docs/runbooks/test/X8.md.
X series: X1✅ X2✅ X3🟩 X4✅ X5✅ X6 (ongoing runbooks) X7✅ X8✅ — all X assets now code-complete/shipped except the ongoing X6 runbook track.
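The X5 contract gate described above catches workflow_call interface drift offline. A minimal sketch of that kind of check follows, assuming the workflows have already been parsed into dictionaries. It is not the contents of contract_check.py; the function names and message wording are hypothetical. The only GitHub Actions facts relied on are that a reusable workflow declares inputs under `on.workflow_call.inputs` and that a caller passes values under `with`.

```python
def declared_inputs(reusable: dict) -> dict:
    """Extract the inputs a reusable workflow declares under on.workflow_call."""
    trigger = reusable.get("on", {}).get("workflow_call", {}) or {}
    return trigger.get("inputs", {}) or {}


def check_caller(reusable: dict, caller_with: dict) -> list[str]:
    """Return drift problems between a caller's `with` block and the declared inputs."""
    inputs = declared_inputs(reusable)
    problems = []
    # A caller passing an input the workflow no longer declares is drift.
    for name in sorted(caller_with):
        if name not in inputs:
            problems.append(f"unknown input: {name}")
    # A caller omitting an input marked required is also drift.
    for name, spec in sorted(inputs.items()):
        required = bool((spec or {}).get("required", False))
        if required and name not in caller_with:
            problems.append(f"missing required input: {name}")
    return problems


if __name__ == "__main__":
    reusable = {
        "on": {
            "workflow_call": {
                "inputs": {
                    "image": {"required": True, "type": "string"},
                    "tag": {"required": False, "type": "string"},
                }
            }
        }
    }
    print(check_caller(reusable, {"tag": "v1", "registry": "example"}))
```

Running the same check over every caller of a reusable workflow gives a list of mismatches without touching the sandbox.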
Done (v0.58): I5 — apps/defender-ticketer/ (Defender for Cloud alerts → idempotent tickets via the E7 snowops-ticket CLI; the first E7 consumer). Pure normalize/filter/dedupe core behind a Collector seam; consumes E7 at run time (no build-coupling, per D49). 21 jest tests; offline dry-run + E7 output-contract verified. Runbook docs/runbooks/test/I5.md.
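The normalize/filter/dedupe core mentioned for I5 can be pictured as a pure function over alert records. The app is TypeScript with jest tests; the Python below is only an illustrative sketch. The field names, the severity threshold default, and the hash-based key are assumptions, not the app's actual schema.

```python
import hashlib

# Assumed severity ladder, lowest to highest.
SEVERITY_ORDER = {"Informational": 0, "Low": 1, "Medium": 2, "High": 3}


def normalize(raw: dict) -> dict:
    """Reduce a raw alert to the few fields the rest of the pipeline uses."""
    return {
        "alert_id": str(raw.get("id", "")).strip().lower(),
        "severity": raw.get("severity", "Informational"),
        "title": str(raw.get("title", "")).strip(),
    }


def dedupe_key(alert: dict) -> str:
    """Stable key for an alert, so re-runs map to the same ticket."""
    return hashlib.sha256(alert["alert_id"].encode("utf-8")).hexdigest()[:16]


def plan_tickets(raw_alerts, min_severity="Medium", already_ticketed=frozenset()):
    """Return the alerts that still need a ticket, in input order."""
    threshold = SEVERITY_ORDER[min_severity]
    seen = set(already_ticketed)
    planned = []
    for raw in raw_alerts:
        alert = normalize(raw)
        # Filter: drop anything below the severity threshold.
        if SEVERITY_ORDER.get(alert["severity"], 0) < threshold:
            continue
        # Dedupe: skip keys ticketed earlier or seen in this batch.
        key = dedupe_key(alert)
        if key in seen:
            continue
        seen.add(key)
        planned.append({**alert, "key": key})
    return planned


if __name__ == "__main__":
    alerts = [
        {"id": "A1", "severity": "High", "title": "Example alert"},
        {"id": "a1 ", "severity": "High", "title": "Same alert again"},
        {"id": "B2", "severity": "Low", "title": "Below threshold"},
    ]
    first_run = plan_tickets(alerts)
    second_run = plan_tickets(alerts, already_ticketed={a["key"] for a in first_run})
    print(len(first_run), len(second_run))
```

Feeding the keys from one run into the next is what makes a repeat run plan nothing, which is the idempotency property the entry describes.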
Done (v0.59): R2 — apps/change-log/ (production change log: merged PRs / squash commits → categorized Keep-a-Changelog markdown; pure categorize/render core behind a collector seam — git log / gh pr list / fixture; optional E7 change-record ticket per release via the same run-time bridge as I5). 32 jest tests; offline + live git log + prepend verified. Runbook docs/runbooks/test/R2.md. D51 batch complete — next: open backlog (runbook sign-offs in parallel; M4 advanced; heavier net N3/N4).
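The categorize/render core described for R2 turns PR or commit titles into Keep-a-Changelog sections. A rough Python sketch of that shape follows; the app itself is TypeScript. The prefix-to-section mapping and the fallback to "Changed" are assumptions for illustration; only the section names and their order come from the Keep a Changelog format.

```python
import re
from collections import defaultdict

# Assumed mapping from a title prefix to a Keep a Changelog section.
PREFIX_TO_SECTION = {
    "feat": "Added",
    "fix": "Fixed",
    "security": "Security",
    "remove": "Removed",
    "deprecate": "Deprecated",
}
SECTION_ORDER = ["Added", "Changed", "Deprecated", "Removed", "Fixed", "Security"]
# Matches "prefix: summary" and "prefix(scope): summary".
_PREFIX = re.compile(r"^(\w+)(?:\([^)]*\))?:\s*(.+)$")


def categorize(title: str) -> tuple[str, str]:
    """Map a title to (section, summary); unrecognized titles go to Changed."""
    cleaned = title.strip()
    match = _PREFIX.match(cleaned)
    if not match:
        return "Changed", cleaned
    prefix, summary = match.groups()
    return PREFIX_TO_SECTION.get(prefix.lower(), "Changed"), summary


def render(version: str, date: str, titles: list[str]) -> str:
    """Render one release block in Keep a Changelog style."""
    buckets = defaultdict(list)
    for title in titles:
        section, summary = categorize(title)
        buckets[section].append(summary)
    lines = [f"## [{version}] - {date}"]
    for section in SECTION_ORDER:
        if buckets[section]:
            lines.append("")
            lines.append(f"### {section}")
            lines.extend(f"- {item}" for item in buckets[section])
    return "\n".join(lines)


if __name__ == "__main__":
    print(render("1.2.3", "2000-01-01", [
        "feat(change-log): add renderer",
        "fix: handle empty range",
        "Bump deps",
    ]))
```

Keeping categorize and render free of I/O means a git log, a PR listing, or a fixture file can all feed the same functions.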
Sequenced Full Roadmap (Track B — Sagar)
| Priority | Action | Status | Milestone |
|---|---|---|---|
| Next code | K1 + K2 (IR + on-call) | 🟦 external (merged) | M2b |
| After K-series | L1 + L2 + L4 (backup + DR + restore drill) | 🟦 v0.46–v0.48 | M2b |
| After L-series | D5 (policy waivers) | 🟦 external (merged PR #13) | M2b |
| After D5 | F12 (brownfield imports) | 🟦 v0.49 | M3 |
| After F12 | B6 (self-service bootstrap) | 🟦 v0.50 | M3 |
| After B6 | F11 (module versioning) | 🟦 v0.51 | M3 |
| Then | C5, E7, W4 (ADO + ticket adapters + client offboarding) | 🟦 C5 (commit 51c7fc4) + E7 (v0.54); ⬜ W4 | M3 |
| Then | Advanced package (E1–E6 full, J3, O, P, Q, T, V1, V4) | ⬜ | M4 |
| Then | Multi-cloud (F9, F10, W5, U3) | ⬜ | M5 |
| Last | W1–W3 (multi-tenant) | ⏸️ postponed | after M2b/M3 |
GTM Track A — Status
| Batch | Assets | Status | Notes |
|---|---|---|---|
| A1 | Y0, Y1, Y2, §3.8 | 🟦 drafted (v0.33) | Awaiting Nidhi (Y1 claims) + Sagar (Y2 real numbers) |
| A2 | Y3, Y4 | 🟦 drafted (v0.33) | Awaiting Sagar's 50-account seed list |
| A3 | Y5, Y6, Y7 | 🟦 drafted (v0.34) | — |
| A4 | Y8, Y9 | 🟦 drafted (v0.34) | Y8 needs brand assets; Y9 is synthetic |
| A5 | Z0, Z1 | 🟦 drafted (v0.34) | — |
| A6 | Y10, Y11, Y12, Y13 | 🟦 drafted (v0.34) | Y12 needs counsel; Y13 needs HubSpot config |
| A7 | Z2, Z3 | 🟦 drafted (v0.34) | Unshipped delta assets flagged with milestone |
Human prerequisites before going live with outbound:
- Nidhi: compliance-claim review on Y1/Y5/Y7/Y9/Z2/Z3 + Y9 sanitization
- Sagar: Y2 real numbers, Y3 50-account seed list, Y13 HubSpot pipeline config
- Counsel: Y12 contract pack
- Brand: Y8 deck design assets