SnowOps — Project State

Current Phase

M1: Sign-offs in flight
M2a: Core Baseline FUNCTIONALLY CODE-COMPLETE (v0.40) — every non-postponed asset is 🟦+. W-series postponed (D35).
M2b: 14-asset core CODE-COMPLETE — E0 (v0.42) + V2 + V3 (v0.43) + S1 (v0.44) + S2 (v0.45) + L1 (v0.46) + L2 (v0.47) + L4 (v0.48) + F12 (v0.49) + B6 (v0.50) + F11 (v0.51) shipped; K1 + K2 (PR #12) + D5 (PR #13) landed externally and merged. 14/14 in-repo — nothing external remaining. Focus now: runbook sign-offs.
GTM: Track A COMPLETE (v0.34) — 39 files under gtm/. Awaiting human sign-offs.

Last Updated: 2026-05-31 (v0.51)

In Flight

Item	Status	Notes
M2b 14-asset core	🟦 code-complete	all 14 in-repo (E0+V2+V3+S1+S2+L1+L2+L4+F12+B6+F11 + K1+K2+D5); nothing external remaining
M2a runbook sign-offs	🟧	X7+U+N+M+J offline parts are ~5 min each
GTM Track A	🟦 complete	Human sign-offs pending (see §0)

Open Issues / Tech Debt (v0.55 repo review — 2026-06-04)

Surfaced by a full-repo review. Full detail + IDs in docs/context/10-gap-register.md (G15–G20). Fixed in the same pass except G18 (seeded, ongoing).

#	Issue	Severity	Status
G15	B1 test red — `apps/github-onboarder/src/load-template.test.ts` asserted a stale workflow file list	High	✅ Fixed — test now walks disk dynamically; 38/38 pass
G16	No PR CI job ran `apps/*` unit tests; B1 had no workflow	High	✅ Fixed — added `.github/workflows/app-tests.yml` (dynamic matrix, 13 apps)
G17	CLAUDE.md §2 repo layout stale	Med	✅ Fixed — §2 refreshed
G18	Zero ADRs despite 50 decisions; DoD #5 + §3 require ADRs	Med	🟧 Seeded — `docs/adr/` README + template + ADRs 0001–0004; backfill rest ongoing
G19	F7 `live/validate.sh` false-positive hclfmt warning (deprecated CLI flags)	Low	✅ Fixed — modern `hcl fmt --check`, now a hard fail
G20	U3/K3 scaffold READMEs described absent behavior	Low	✅ Fixed — "SCAFFOLD — postponed" banners + §2 tags

Verified passing in this review: terraform fmt -check (modules/live/sandbox) ✅ · Go terratest vet + compile ✅ · all 14 app suites now green (github-onboarder fixed) · live/validate.sh ✅. Tooling: node 26, go 1.26, terraform 1.15.
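
The G15 fix replaced a hard-coded file list with one derived from disk. The idea can be sketched as follows, in Python rather than the suite's TypeScript. The `.github/workflows` layout under a template root and both function names are assumptions for illustration, not the contents of `load-template.test.ts`.

```python
from pathlib import Path


def list_workflow_files(template_root: Path) -> list[str]:
    """Return the workflow file names present on disk, sorted for stable comparison."""
    # Assumed layout: <template_root>/.github/workflows/*.yml
    workflows_dir = template_root / ".github" / "workflows"
    return sorted(p.name for p in workflows_dir.glob("*.yml"))


def loaded_matches_disk(loaded: list[str], template_root: Path) -> bool:
    """True when the loader's file list equals what is actually on disk."""
    return sorted(loaded) == list_workflow_files(template_root)
```

Because the expected list is computed at test time, adding or removing a workflow file no longer requires editing the assertion.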

Runbook Sign-Off Backlog (Ordered)

Execute offline Parts A+B first (~5 min each). Cloud parts (C/D) are optional promotion-path extenders.

Priority	Runbook	Cloud?	Time	Cost
1	V2 `docs/runbooks/test/V2.md`	No	~5 min	$0
2	V3 `docs/runbooks/test/V3.md`	No	~5 min	$0
3	E0 `docs/runbooks/test/E0.md`	Partial (Part C)	~6 min offline	$0
4	X7 `docs/runbooks/test/X7.md`	No	~5 min	$0
5	U1 `docs/runbooks/test/U1.md`	Yes (Part C)	~5 min offline	$0
6	U2 `docs/runbooks/test/U2.md`	Yes (Part C/D)	~5 min offline	$0
7	N5 `docs/runbooks/test/N5.md`	Yes (Part C/D)	~5 min offline	$0
8	N6 `docs/runbooks/test/N6.md`	Yes (Part C/D)	~5 min offline	$0
9	M1 `docs/runbooks/test/M1.md`	Yes	~5 min offline	$0
10	M2 `docs/runbooks/test/M2.md`	Yes (Part C)	~5 min offline	~$1
11	M3 `docs/runbooks/test/M3.md`	Yes	~5 min offline	$0
12	M6 `docs/runbooks/test/M6.md`	Yes	~5 min offline	$0
13	J1 `docs/runbooks/test/J1.md`	Yes (Part C/D)	~5 min offline	$0
14	J2 `docs/runbooks/test/J2.md`	Yes (Part C/D)	~5 min offline	$0
15	J6 `docs/runbooks/test/J6.md`	Yes (Part C/D)	~5 min offline	$0
16	H5 `docs/runbooks/test/H5.md`	Yes (Part C)	~5 min offline	$0
17	H7 `docs/runbooks/test/H7.md`	Yes (Part C/D)	~5 min offline	$0 (needs P1)
18	F8 `docs/runbooks/test/F8.md`	Optional (kind)	~5 min offline	$0
19	B5 `docs/runbooks/test/B5.md`	Yes (Part C)	~5 min offline	$0 (needs P2)
20	B4 `docs/runbooks/test/B4.md`	Yes (Part C)	~8 min	$0
21	B3 `docs/runbooks/test/B3.md`	Yes (Part C)	~12 min	$0
22	B2 `docs/runbooks/test/B2.md`	Yes (Part C)	~25 min	$0
23	C3 `docs/runbooks/test/C3.md`	Yes (Parts C–F)	~75 min	$0
24	C2 `docs/runbooks/test/C2.md`	Yes (Parts C–E)	~40 min	<$1
25	H1 `docs/runbooks/test/H1.md`	Yes	~25 min	$0
26	H2 `docs/runbooks/test/H2.md`	Yes (needs P1)	~30 min	$0
27	H3 `docs/runbooks/test/H3.md`	Yes (needs P2)	~30 min	$0
28	F3 `docs/runbooks/test/F3.md`	Yes (Part C)	~30 min	~$10
29	F5 `docs/runbooks/test/F5.md`	Yes (Part C)	~25 min	~$2
30	F4 `docs/runbooks/test/F4.md`	Yes (Part C)	~30 min	~$5
31	D4 `docs/runbooks/test/D4.md`	Optional (kind)	~5 min offline	$0
32	F2 `docs/runbooks/test/F2.md`	Yes (Part C)	~35 min	~$5
33	F0 `docs/runbooks/test/F0.md`	No	~15 min	$0
34	B1 `docs/runbooks/test/B1.md`	Yes	~60 min	$0
35+	D2, X1, X2, C1, G0–G6, A1, A5, F1, F6	Mix	Various	Various

M2b 14-Asset Core Progress (D36)

#	Asset	Scope	Status	Est. Time	Cloud Cost
1	✅ E0	Compliance snapshot (Policy + Defender score, wired to C1)	🟦 v0.42	—	—
2	✅ V2	Architecture diagram generator (`apps/diagram-generator/`)	🟦 v0.43	—	—
3	✅ V3	Runbook generator (`apps/runbook-generator/`)	🟦 v0.43	—	—
4	✅ S1	Drift detection (scheduled `terraform plan` → ticket via TicketPlatform)	🟦 v0.44	—	—
5	✅ S2	Azure Policy compliance dashboard (`apps/compliance-dashboard/`)	🟦 v0.45	—	—
6	✅ K1	IR runbook library (`docs/runbooks/incident/`)	🟦 (external, merged)	—	—
7	✅ K2	On-call integration (`modules/azure/oncall-integration/`)	🟦 (external, merged)	—	—
8	✅ L1	Azure Backup policy module (`modules/azure/backup-policy/`)	🟦 v0.46	—	—
9	✅ L2	Cross-region replication (object replication + SQL failover group)	🟦 v0.47	—	—
10	✅ L4	Automated restore drill (`apps/restore-drill/` → S2 DR panel)	🟦 v0.48	—	—
11	✅ D5	Policy waiver engine (`waivers/`, OPA exception records, CI expiry enforcement)	🟦 (external, merged PR #13)	—	—
12	✅ F12	Brownfield import library (`modules/azure/import-blocks/`, 9 modules)	🟦 v0.49	—	—
13	✅ B6	Client self-service bootstrap (prerequisite checker + validator) (`apps/client-bootstrap/`)	🟦 v0.50	—	—
14	✅ F11	Module versioning + private registry (`apps/module-registry/`)	🟦 v0.51	—	—

Rule (D36): ALL other M2b/M3 assets are POSTPONED until these 14 are code-complete + signed off. Depth before breadth.
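
Read as a predicate, the D36 rule says the gate opens only when every core asset is both code-complete and signed off. A minimal sketch, assuming a hypothetical `Asset` record (the field names are illustrative, not a schema from this repo):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    asset_id: str
    code_complete: bool
    signed_off: bool


def blocking_assets(core: list[Asset]) -> list[str]:
    """IDs of core assets that still hold the D36 gate closed."""
    return [a.asset_id for a in core if not (a.code_complete and a.signed_off)]


def core_gate_open(core: list[Asset]) -> bool:
    """D36: other assets unlock only when nothing in the core is blocking."""
    return not blocking_assets(core)
```

A code-complete asset with a pending runbook sign-off still appears in `blocking_assets`, which is why sign-offs remain on the critical path.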

Next 5 Selected (v0.53 — D48)

After the M2b 14-asset core went code-complete, the next 5 most-important postponed/unbuilt items were selected. Reconciliation: C5 (ADO pipelines) was found already built in commit 51c7fc4 — docs were stale; it's now marked 🟦 code-complete and dropped from the candidate set.

#	Asset	Rationale	Status
1	I3 — CodeQL SAST	No code-analysis layer existed; D2 covered only IaC/secrets	🟦 code-complete (v0.53)
2	I2 — Dependency scanning	`dependabot.yml` existed but no PR gate / alert digest	🟦 code-complete (v0.53)
3	I1 — Container image scanning	Closes G6 (non-K8s container security); reusable image scan	🟦 code-complete (v0.53)
4	E7 — TicketPlatform adapters	Closes G8; unblocks E6/I5/K4/P3; generalizes the S1 seed	🟦 code-complete (v0.54)
5	F7 — Terragrunt live-infra reference	The missing per-env/region module wiring repo; deploy enablement	🟦 code-complete (v0.55)

Done (v0.53): I1 + I2 + I3, the M2a CI security-scanning suite (docs/runbooks/test/I1.md, I2.md, I3.md).

Done (v0.54): E7 in apps/ticket-platform/ (GitHub/Jira/Linear/ADO adapters + CLI, 26 tests); S1 repointed (interface-compatible). Runbook docs/runbooks/test/E7.md.

Done (v0.55): F7, the live/ Terragrunt reference (root + _envcommon + bootstrap + per-env/region units; baseline→net/kv/acr DAG; offline validate.sh). Runbook docs/runbooks/test/F7.md.
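
The E7 adapter layer lets callers such as S1 open tickets without knowing which backend is configured. A rough sketch of that shape in Python follows. The `Ticket` fields, the `create_ticket` method, and the in-memory adapter are all hypothetical stand-ins, not the actual apps/ticket-platform/ interface.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Ticket:
    title: str
    body: str
    labels: tuple[str, ...] = ()


class TicketPlatform(Protocol):
    """Hypothetical adapter contract: every backend returns a ticket ID."""

    def create_ticket(self, ticket: Ticket) -> str: ...


class InMemoryPlatform:
    """Stand-in adapter for offline runs; a real one would call a backend API."""

    def __init__(self, prefix: str) -> None:
        self.prefix = prefix
        self.created: list[Ticket] = []

    def create_ticket(self, ticket: Ticket) -> str:
        self.created.append(ticket)
        return f"{self.prefix}-{len(self.created)}"


def open_drift_ticket(platform: TicketPlatform, module: str) -> str:
    """Caller code depends only on the protocol, never on a concrete adapter."""
    ticket = Ticket(
        title=f"Drift detected: {module}",
        body=f"Scheduled plan reported changes for {module}.",
        labels=("drift",),
    )
    return platform.create_ticket(ticket)
```

Swapping the backend then means passing a different adapter object; the caller is unchanged, which is what "interface-compatible" implies for the S1 repoint.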

✅ Next-5 (D48) COMPLETE. Candidate next batch: runbook sign-offs (in parallel), then M3 tail (W4 client offboarding) / M2b additional (J4 alert pack, X5 pipeline integration tests, X8 synthetic monitoring) / M4 advanced.

Next 5 Selected (v0.56 — D51)

With the next-5 (D48) complete, the next 5 most-important postponed items were selected. Priority rule: depth before breadth (D36) + milestone order — finish the remaining M2b "additional" assets (M2b §84) before advancing to M3 tail / M4. All five are M2b. The heavier network items (N3 WAF, N4 DDoS — CO-owned, cloud-cost) are deferred to a later batch.

#	Asset	Rationale	Status
1	J4 — Alert rule pack	No detection-rule layer existed; identity/network/privilege/data-exfil KQL alerts over the J1 LAW, wired to K2 action groups	🟦 code-complete (v0.56)
2	I5 — Defender → ticket via E7	Newly unblocked by E7 (D49); first consumer proving the `TicketPlatform` adapter; closes the Defender-alert→ticket loop	🟦 code-complete (v0.58)
3	X5 — Pipeline integration tests	M2a CI gates (C1–C3) had no integration test consumers; reusable-workflow test repos	🟦 code-complete (v0.57)
4	X8 — Synthetic monitoring	No availability/latency synthetic probes; Azure Monitor standard webtests + alert rules	🟦 code-complete (v0.57)
5	R2 — Production change log	Merged-PR → changelog generator; reuses E7 for change-record tickets where required	🟦 code-complete (v0.59)

✅ D51 batch COMPLETE (all 5): J4 (v0.56) · X5 + X8 (v0.57) · I5 (v0.58) · R2 (v0.59).

Done (v0.56): J4 — modules/azure/alert-rule-pack/ (curated scheduled-query alert rules across four threat domains — identity/privilege/network/data-exfil; domain toggles + per-rule overrides + freeform custom rules; consumes the J1 workspace + K2 action groups by ARM ID). Offline TestAlertRulePackValidate green. Runbook docs/runbooks/test/J4.md.
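
The three configuration layers in J4 (domain toggles, per-rule overrides, freeform custom rules) compose in a fixed order. The sketch below models that resolution in Python for illustration only; the module itself is Terraform, and the rule names, fields, and precedence shown here are assumptions.

```python
def resolve_rules(
    catalog: dict[str, dict],
    enabled_domains: set[str],
    overrides: dict[str, dict],
    custom_rules: dict[str, dict],
) -> dict[str, dict]:
    """Return the effective rule set after toggles, overrides, and custom rules."""
    effective: dict[str, dict] = {}
    for name, rule in catalog.items():
        # Domain toggle: skip every curated rule in a disabled domain.
        if rule["domain"] not in enabled_domains:
            continue
        # Per-rule override: shallow-merge on top of the curated defaults.
        effective[name] = {**rule, **overrides.get(name, {})}
    # Custom rules are added last and are not filtered by domain (assumed).
    for name, rule in custom_rules.items():
        effective[name] = dict(rule)
    return effective
```

Under these assumptions, disabling a domain removes its curated rules even if an override exists for one of them.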

Done (v0.57), X series complete (X5 + X8):
- X5: tests/pipeline-integration/ reusable-workflow contract gate (contract_check.py + test_contract_check.py, 11 unit tests; offline validate.sh; CI .github/workflows/pipeline-integration.yml) + live it-{container-build-sign,aks-deploy,terraform-plan-apply}.yml consumers driving the existing fixtures against the sandbox. Catches workflow_call interface drift across all callers offline. Runbook docs/runbooks/test/X5.md.
- X8: modules/azure/synthetic-monitoring/ (App Insights standard availability tests + per-test availability metric alerts; optional workspace-based AI component; consumes the J1 workspace + K2 action groups by ARM ID). Offline TestSyntheticMonitoringValidate green. Runbook docs/runbooks/test/X8.md.
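
Interface drift on a reusable workflow means a caller's `with:` keys no longer match the inputs declared under `on.workflow_call.inputs`. A minimal sketch of such a check is shown here, operating on already-parsed dictionaries. It illustrates the technique and is not the repo's contract_check.py; the function name and message format are made up.

```python
def check_contract(declared_inputs: dict[str, dict], caller_with: dict[str, object]) -> list[str]:
    """Compare one caller's `with:` block against a workflow's declared inputs."""
    errors: list[str] = []
    # A key the callee never declared usually means a rename the caller missed.
    for name in caller_with:
        if name not in declared_inputs:
            errors.append(f"unknown input '{name}'")
    # A required input the caller does not pass would fail at run time.
    for name, spec in declared_inputs.items():
        if spec.get("required", False) and name not in caller_with:
            errors.append(f"missing required input '{name}'")
    return errors
```

Running this over every caller and failing on a non-empty result catches a renamed input offline, before any live pipeline run.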

X series: X1✅ X2✅ X3🟩 X4✅ X5✅ X6(ongoing runbooks) X7✅ X8✅ — all X assets now code-complete/shipped except the ongoing X6 runbook track.

Done (v0.58): I5 — apps/defender-ticketer/ (Defender for Cloud alerts → idempotent tickets via the E7 snowops-ticket CLI; the first E7 consumer). Pure normalize/filter/dedupe core behind a Collector seam; consumes E7 at run time (no build-coupling, per D49). 21 jest tests; offline dry-run + E7 output-contract verified. Runbook docs/runbooks/test/I5.md.
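
Idempotent ticketing depends on a stable dedupe key computed from normalized alert fields. The sketch below shows one way a pure normalize/filter/dedupe core could look. It is written in Python rather than the app's TypeScript, and the raw field names, severity ranks, and key recipe are assumptions, not the Defender schema or the I5 implementation.

```python
import hashlib
from dataclasses import dataclass

SEVERITY_RANK = {"Informational": 0, "Low": 1, "Medium": 2, "High": 3}


@dataclass(frozen=True)
class Alert:
    alert_type: str
    resource_id: str
    severity: str


def normalize(raw: dict) -> Alert:
    """Trim and case-fold the fields so equivalent alerts compare equal."""
    return Alert(
        alert_type=raw["alertType"].strip(),
        resource_id=raw["resourceId"].strip().lower(),
        severity=raw.get("severity", "Low").capitalize(),
    )


def dedupe_key(alert: Alert) -> str:
    """Stable key: the same alert type on the same resource maps to one ticket."""
    digest = hashlib.sha256(f"{alert.alert_type}|{alert.resource_id}".encode())
    return digest.hexdigest()[:16]


def select_tickets(raw_alerts: list[dict], min_severity: str, already_ticketed: set[str]) -> list[tuple[str, Alert]]:
    """Filter by severity, then drop alerts whose key was seen before."""
    seen = set(already_ticketed)
    selected = []
    for raw in raw_alerts:
        alert = normalize(raw)
        if SEVERITY_RANK.get(alert.severity, 0) < SEVERITY_RANK[min_severity]:
            continue
        key = dedupe_key(alert)
        if key in seen:
            continue
        seen.add(key)
        selected.append((key, alert))
    return selected
```

Feeding the keys from one run back in as `already_ticketed` makes a second run over the same alerts select nothing, which is the idempotency property.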

Done (v0.59): R2 — apps/change-log/ (production change log: merged PRs / squash commits → categorized Keep-a-Changelog markdown; pure categorize/render core behind a collector seam — git log / gh pr list / fixture; optional E7 change-record ticket per release via the same run-time bridge as I5). 32 jest tests; offline + live git log + prepend verified. Runbook docs/runbooks/test/R2.md. D51 batch complete — next: open backlog (runbook sign-offs in parallel; M4 advanced; heavier net N3/N4).
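
The categorize/render core of a changelog generator can be kept pure, with the collector supplying plain PR titles. A rough Python sketch follows. The prefix-to-category mapping is an assumption (the section names come from Keep a Changelog), and the version and date in the usage note are placeholders, not a real release.

```python
PREFIX_TO_CATEGORY = {
    "feat": "Added",
    "fix": "Fixed",
    "security": "Security",
    "remove": "Removed",
    "deprecate": "Deprecated",
}
CATEGORY_ORDER = ["Added", "Changed", "Deprecated", "Removed", "Fixed", "Security"]


def categorize(title: str) -> tuple[str, str]:
    """Map a 'prefix(scope): text' title to a section; unknown prefixes go to Changed."""
    prefix, sep, rest = title.partition(":")
    key = prefix.split("(")[0].strip().lower()
    if sep and key in PREFIX_TO_CATEGORY:
        return PREFIX_TO_CATEGORY[key], rest.strip()
    return "Changed", title.strip()


def render(version: str, date: str, titles: list[str]) -> str:
    """Render one release section in Keep-a-Changelog layout."""
    buckets: dict[str, list[str]] = {}
    for title in titles:
        category, text = categorize(title)
        buckets.setdefault(category, []).append(text)
    lines = [f"## [{version}] - {date}"]
    for category in CATEGORY_ORDER:
        if category in buckets:
            lines += ["", f"### {category}"] + [f"- {t}" for t in buckets[category]]
    return "\n".join(lines) + "\n"
```

For example, `render("1.2.3", "2030-01-01", ["feat(x): add thing", "Bump provider"])` yields an Added section followed by a Changed section; prepending that string to an existing changelog file is left to the caller.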

Sequenced Full Roadmap (Track B — Sagar)

Priority	Action	Status	Milestone
Next code	K1 + K2 (IR + on-call)	🟦 external (merged)	M2b
After K-series	L1 + L2 + L4 (backup + DR + restore drill)	🟦 v0.46–v0.48	M2b
After L-series	D5 (policy waivers)	🟦 external (merged PR #13)	M2b
After D5	F12 (brownfield imports)	🟦 v0.49	M3
After F12	B6 (self-service bootstrap)	🟦 v0.50	M3
After B6	F11 (module versioning)	🟦 v0.51	M3
Then	C5, E7, W4 (ADO + ticket adapters + client offboarding)	🟦 C5 (commit 51c7fc4) + E7 (v0.54); ⬜ W4	M3
Then	Advanced package (E1–E6 full, J3, O, P, Q, T, V1, V4)	⬜	M4
Then	Multi-cloud (F9, F10, W5, U3)	⬜	M5
Last	W1–W3 (multi-tenant)	⏸️ postponed	after M2b/M3

GTM Track A — Status

Batch	Assets	Status	Notes
A1	Y0, Y1, Y2, §3.8	🟦 drafted (v0.33)	Awaiting Nidhi (Y1 claims) + Sagar (Y2 real numbers)
A2	Y3, Y4	🟦 drafted (v0.33)	Awaiting Sagar's 50-account seed list
A3	Y5, Y6, Y7	🟦 drafted (v0.34)	—
A4	Y8, Y9	🟦 drafted (v0.34)	Y8 needs brand assets; Y9 is synthetic
A5	Z0, Z1	🟦 drafted (v0.34)	—
A6	Y10, Y11, Y12, Y13	🟦 drafted (v0.34)	Y12 needs counsel; Y13 needs HubSpot config
A7	Z2, Z3	🟦 drafted (v0.34)	Unshipped delta assets flagged with milestone

Human prerequisites before going live with outbound:
- Nidhi: compliance-claim review on Y1/Y5/Y7/Y9/Z2/Z3 + Y9 sanitization
- Sagar: Y2 real numbers, Y3 50-account seed list, Y13 HubSpot pipeline config
- Counsel: Y12 contract pack
- Brand: Y8 deck design assets