SnowOps — Full Asset Catalog

Status: ⬜ Not Started · 🟨 Scaffolding · 🟧 In Progress · 🟦 Code Complete · 🟩 Shipped

For compact status table: docs/context/04-asset-status.md

For full version history: docs/context/05-history.md


A. Sales & CRM Automation (HubSpot) [X][SO]

  • 🟦 A1 — HubSpot Private App + lead-enrichment Custom Code Action (Clearbit/Apollo → contact props). [M1][CA] → Test: Clearbit/Apollo lookups enriching contact properties. Manual runbook: docs/runbooks/test/A1.md.

  • A2 — ICP scoring Custom Code Action (TS) → routes to Sagar/Nidhi. [M2b] ⏸️ postponed → Test: fixture leads covering ICP / near-ICP / out-of-ICP; assert routing.

  • A3 — Proposal generator (Deal stage → templated PDF; DocRaptor or Google Docs API). [M2b] ⏸️ postponed → Test: golden-file diff on rendered PDF text content for 3 sample deals.

  • A4 — Project kickoff webhook (Closed Won → triggers B1 + Notion + Slack). [M2b] ⏸️ postponed → Test: simulate webhook payload → assert repo created + Slack channel.

  • 🟦 A5 — Discovery trigger Custom Code Action (Qualified → dispatches G4 + sends offer email). [M1][CA] → Test: Deal stage change → G4 dispatched, offer email sent. Manual runbook: docs/runbooks/test/A5.md.


B. Client Onboarding [B]

  • 🟦 B1 — GitHub App snowops-onboarder (Probot/TS): repo + branch protection + CODEOWNERS + checks + env vars + federated OIDC to client Azure AD. [M1][SO→CO] → Test: install on test org, trigger create → assert all settings via GH API. Manual runbook: docs/runbooks/test/B1.md.

  • 🟦 B2 — modules/azure/client-bootstrap/ — Azure AD application + service principal + federated identity credentials (GitHub OIDC only, repo: prefix validated) + role assignments (sub/ACR/KV/freeform). No client secret ever created. [M2a][SO→CO] → Test: Terratest validates SP + fed creds + role-assignment IDs. Manual runbook: docs/runbooks/test/B2.md.

  • 🟦 B3 — modules/azure/subscription-baseline/ — composes F1 + group RBAC (6 named shortcuts: Owner/Contributor/Reader/Security Admin/Security Reader/UAA + freeform escape hatch) + MCSB regulatory-compliance initiative (system-assigned identity, audit-only until remediation role granted out-of-band). Defender ON by default. azurerm-only (RBAC binds by group object ID). [M2a][CO] → Test: Terratest validates (offline composition + RBAC flatten + MCSB block) + integration applies to sandbox asserts LAW + SnowOps Standard set + MCSB assignment + system-assigned identity GUID + 3 RBAC role-assignment IDs. Manual runbook: docs/runbooks/test/B3.md.

  • 🟦 B4 — modules/azure/client-state-backend/ — wraps F6 + Blob Data RBAC (Contributor/Reader/Owner, principal_type unset for mixed SP+group) + optional Private Endpoint + optional network-rule lockdown (standalone azurerm_storage_account_network_rules, default-deny; precondition requires reach-path) + optional diagnostics. backend_config forces use_azuread_auth = true. azurerm-only. [M2a][CO] → Test: Terratest validates (offline composition + RBAC flatten) + integration asserts SA ARM ID + backend_config + Blob Data role-assignment IDs + contract versioning_enabled. Manual runbook: docs/runbooks/test/B4.md.

  • 🟦 B5 — modules/azure/pim-azure-resources/ — PIM for Azure resource roles. Tier-0 (Owner + UAA) → MFA + justification + ticketing + approval + max 8h; tier-1 (Contributor) → MFA + justification + max 4h. Uses azurerm's native azurerm_pim_eligible_role_assignment + azurerm_role_management_policy. Eligibility permanent; time-box is on activation. Requires Entra ID P2 at apply. Precondition: ≥1 break-glass Owner so PIM can't lock the sub out. [M2a][CO] → Test: Terratest validates (offline). Live activation drill is manual (can't Terratest MFA/approval flow). Manual runbook: docs/runbooks/test/B5.md.

  • 🟦 B6 — apps/client-bootstrap/ — client self-service bootstrap: prerequisite checker + Azure permission validator a prospective client runs in their OWN tenant pre-engagement. Pure evaluator over an EnvironmentSnapshot behind a Collector seam (FixtureCollector tests/offline; AzureCliCollector live az). Read-only, no secrets. Checks tooling (az ≥ 2.50, terraform ≥ 1.6), auth, permissions (assign RBAC = Owner/UAA not Contributor; create Entra apps+SPs; required providers registered), Entra ID P2 (warn). READY only when every required check passes; summary.md ends in a remediation list; status.json carries ready/blockers/warnings. Single entrypoint bootstrap.sh (exit 0/2). [M3][CA] (14-asset M2b core) → Test: examples/snapshot.ok.json → READY/exit 0; examples/snapshot.restricted.json (restricted SP) → NOT READY + clear remediation/exit 2. Manual runbook: docs/runbooks/test/B6.md.
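
The B6 readiness rule (READY only when every required check passes, warn-only checks never block, exit 0/2) can be sketched as a pure function. This is a minimal illustration, not the tool's code: the `CheckResult` and `Status` shapes, the `atLeast` helper, and all field names are assumptions, since the catalog does not list the real EnvironmentSnapshot fields.

```typescript
// Hypothetical check shape; the real EnvironmentSnapshot fields are not listed in this catalog.
interface CheckResult {
  id: string;
  required: boolean; // false = warn-only (for example the Entra ID P2 check)
  passed: boolean;
  remediation: string;
}

interface Status {
  ready: boolean;
  blockers: string[];
  warnings: string[];
  exitCode: 0 | 2;
}

// Numeric, segment-wise version comparison so that "2.9" is lower than "2.50".
function atLeast(actual: string, minimum: string): boolean {
  const a = actual.split(".").map(Number);
  const m = minimum.split(".").map(Number);
  for (let i = 0; i < Math.max(a.length, m.length); i++) {
    const diff = (a[i] ?? 0) - (m[i] ?? 0);
    if (diff !== 0) return diff > 0;
  }
  return true;
}

// READY only when no required check failed; failed optional checks become warnings.
function evaluate(checks: CheckResult[]): Status {
  const failed = checks.filter((c) => !c.passed);
  const describe = (c: CheckResult) => `${c.id}: ${c.remediation}`;
  const blockers = failed.filter((c) => c.required).map(describe);
  const warnings = failed.filter((c) => !c.required).map(describe);
  const ready = blockers.length === 0;
  return { ready, blockers, warnings, exitCode: ready ? 0 : 2 };
}
```

Keeping the evaluator free of I/O is what lets a fixture-backed collector and a live `az` collector share the same verdict logic.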


C. CI/CD & Delivery Pipelines [B]

  • 🟦 C1 — terraform-plan-apply.yml GitHub Actions reusable workflow (Azure OIDC, plan-on-PR comment, apply-on-merge, environment gates, conftest OPA check post-plan). [M2a][CA] ← KEYSTONE → Test: create test repo consuming workflow; open PR → plan comment; merge → apply against sandbox. Manual runbook: docs/runbooks/test/C1.md.

  • 🟦 C2 — container-build-sign.yml (build → ACR push by digest → Notation v2 sign via AKV plugin → Syft SBOM → Grype scan with severity cutoff). Signing by digest not tag. 13 inputs; grype_severity_cutoff defaults critical; fail_on_scan_findings defaults true. [M2a][CA] → Test: clean image passes; planted-CVE image fails at critical cutoff; notation verify succeeds. Manual runbook: docs/runbooks/test/C2.md.

  • 🟦 C3 — aks-deploy.yml (ArgoCD image override → sync → wait Healthy → smoke probe → optional rollback drill; Kustomize XOR Helm modes, rejected if both/neither set). ArgoCD-token-only auth. [M2a][CA] → Test: happy-path Kustomize deploy + smoke green; rollback drill exercises revert + roll-forward; ambiguous + empty image-set inputs both rejected. Manual runbook: docs/runbooks/test/C3.md.

  • 🟩 C4 — GitOps branching standard doc + client-repo template + branch protection rules. [M1][QW][CA] → Test: branching standard doc + template structure + protection validation. Manual runbook: docs/runbooks/test/C4.md. Signed off 26/05 (Sagar).

  • 🟦 C5 — pipelines/azure-devops/ — ADO Pipeline templates: terraform-plan-apply.yml (mirrors C1), container-build-sign.yml (C2), aks-deploy.yml (C3), quality-gates.yml (D2) + caller examples + README. Same underlying tools as the GH Actions workflows; only the CI wrapper changes. Callers reference via resources: repositories:. [M3][CA] 🟦 code-complete (commit 51c7fc4) → Test: ADO test project consuming pipeline; PR triggers plan; stage approval gates apply on merge. Runbook: docs/runbooks/test/C5.md (pending).
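
The C3 rule "Kustomize XOR Helm, rejected if both/neither set" amounts to a two-input exclusive-or check. The real workflow is YAML and its input names are not shown here, so the sketch below only illustrates the rule in TypeScript; `kustomizeImage`, `helmImage`, and `resolveMode` are hypothetical names.

```typescript
// Exactly one deploy mode may be selected; both and neither are rejected.
type DeployMode =
  | { kind: "kustomize"; image: string }
  | { kind: "helm"; image: string };

// Hypothetical input names; the real aks-deploy.yml inputs are not listed in this catalog.
function resolveMode(kustomizeImage = "", helmImage = ""): DeployMode {
  const k = kustomizeImage.trim();
  const h = helmImage.trim();
  if (k !== "" && h !== "") {
    throw new Error("ambiguous: set the Kustomize or the Helm image input, not both");
  }
  if (k === "" && h === "") {
    throw new Error("empty: one of the Kustomize or Helm image inputs is required");
  }
  return k !== "" ? { kind: "kustomize", image: k } : { kind: "helm", image: h };
}
```

Failing before any ArgoCD call keeps a misconfigured caller from triggering a partial sync.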


D. Quality & Security Gates (PR-time) [B]

  • 🟩 D1 — .pre-commit-config.yaml (tflint, fmt, checkov, tfsec, gitleaks, trivy fs, conftest verify on pre-push). [M1][QW][CA] → Test: planted bad commits blocked locally. Manual runbook: docs/runbooks/test/D1.md. Signed off 26/05 (Sagar).

  • 🟦 D2 — PR-blocking GH Actions mirroring D1 (.github/workflows/quality-gates.yml). [M1][QW][CA] → Test: planted bad PRs blocked; clean PRs pass. Manual runbook: docs/runbooks/test/D2.md.

  • 🟩 D3 — Conftest/OPA policy bundle for terraform plan JSON (encryption, tags, no public network, allowed regions, cost caps). [M1][QW][CA] → Test: conftest verify suite (X3) — every rule has pass + fail fixture. Manual runbook: docs/runbooks/test/D3.md. Signed off 26/05.

  • 🟦 D4 — Kyverno policy bundle for AKS — 5 ClusterPolicies (disallow-latest-tag, require-signed-images, require-pod-labels, disallow-privileged-containers, require-network-policy with generate of default-deny NetPol). All Enforce. System namespaces excluded via exclude.any.resources.namespaces. [M2a][CO] → Test: kyverno test per rule (X4) — 5 suites / 21 assertions. Live admission round-trip is manual. Manual runbook: docs/runbooks/test/D4.md.

  • 🟦 D5 — waivers/ + policy/opa/rules/main.rego waiver engine — time-boxed OPA exception records (waivers/exceptions.yaml: rule_prefix + resource_address + expiry_date + owner + justification) + CI enforcement. The D3 rule files (tags/locations/network/encryption/cost) now emit raw_violation; main.rego filters them through has_active_waiver (suppress matching, non-expired) and hard-denies expired waivers (snowops.waiver_expired) so they fail the pipeline. Wired into terraform-plan-apply.yml via conftest test plan.json --data waivers/exceptions.yaml. [M2b][CA] 🟦 code-complete (external, gemini-work PR #13). Runbook: docs/runbooks/test/D5.md. → Test: unexpired waiver suppresses D3 finding; expired waiver causes CI failure.
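
The D5 filtering behaviour (an active waiver suppresses a matching raw violation; an expired waiver becomes a hard deny) is implemented in Rego, which is not reproduced here. The TypeScript sketch below only models the decision logic under stated assumptions: expiry dates are ISO `YYYY-MM-DD` strings, a waiver is still active on its expiry day, matching is "rule starts with `rule_prefix`" plus an exact `resource_address`, and the rule id `snowops.tags.required` in the usage note is made up.

```typescript
// Record fields follow waivers/exceptions.yaml as described above.
interface Waiver {
  rule_prefix: string;
  resource_address: string;
  expiry_date: string; // assumed ISO YYYY-MM-DD, so string comparison orders dates
  owner: string;
  justification: string;
}

interface RawViolation {
  rule: string;
  resource_address: string;
  msg: string;
}

// Returns the deny messages that would fail the pipeline.
function applyWaivers(raw: RawViolation[], waivers: Waiver[], today: string): string[] {
  const isExpired = (w: Waiver) => w.expiry_date < today;
  const matches = (w: Waiver, v: RawViolation) =>
    v.rule.startsWith(w.rule_prefix) && v.resource_address === w.resource_address;

  const denies: string[] = [];
  // Expired waivers are themselves violations, so stale exceptions cannot linger.
  for (const w of waivers) {
    if (isExpired(w)) {
      denies.push(`snowops.waiver_expired: ${w.rule_prefix} on ${w.resource_address} (owner ${w.owner})`);
    }
  }
  // A raw violation survives unless some non-expired waiver matches it.
  for (const v of raw) {
    const waived = waivers.some((w) => !isExpired(w) && matches(w, v));
    if (!waived) denies.push(`${v.rule}: ${v.msg}`);
  }
  return denies;
}
```

For example, a violation of the hypothetical rule `snowops.tags.required` on `azurerm_storage_account.logs` produces no deny while a matching waiver is in date, and two denies (the expired waiver plus the original finding) once the date has passed.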


E. Automated Evidence Collection [A]

  • 🟦 E0 — apps/evidence-collector/ — Lightweight compliance snapshot [B]: read-only TS tool collecting Azure Policy compliance state (summarize) + Defender secure score (secureScores/ascScore) into a versioned JSON artifact (schemaVersion 1.0). diffSnapshots is the regression signal. Wired into C1 as continue-on-error post-apply step. Reader + Security Reader only. [M2b][SH] ⬜ in scope (14-asset core) → Test: jest (3 suites / 18 tests) — required-field validation, seeded-policy-violation diff, markdown render. Manual runbook: docs/runbooks/test/E0.md.

  • E1 — EvidencePlatform TS interface (apps/evidence-collector/src/platforms/). [M4][SO] ⏸️ postponed

  • E2 — VantaAdapter. [M4][SO] ⏸️ postponed
  • E3 — DrataAdapter stub. [M4][SO] ⏸️ postponed
  • E4 — Azure Resource Graph query library (SOC2 CC + ISO27001 A.x). [M4][SH] ⏸️ postponed
  • E5 — Defender for Cloud → Vanta scheduled sync. [M4][SO] ⏸️ postponed
  • E6 — Quarterly access review automation → CSV → ticket via E7. [M4][SO] ⏸️ postponed
  • 🟦 E7 — apps/ticket-platform/ (@snowops/ticket-platform) — platform-neutral TicketPlatform interface + adapters (GitHub Issues, Jira REST v2, Linear GraphQL, Azure DevOps Boards) + DryRunTicketPlatform + selectPlatform factory + snowops-ticket CLI. One shared marker-based upsertByMarker (idempotent create-or-update via an HTML-comment dedupe marker); each adapter implements the same MarkerUpsertApi seam (listOpen/create/update) with an injectable fetch. Generalizes the S1 seed (D15/D39); closes G8. [M3][CO/CA] 🟦 code-complete (v0.54) → Test: npm test (6 suites / 26 — shared upsert + each adapter's HTTP mapping + factory) + dry-run CLI. Runbook: docs/runbooks/test/E7.md.
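
The marker-based idempotent upsert described for E7 can be sketched against a three-method seam. The method names `listOpen`/`create`/`update` come from the entry above, but their signatures, the `Ticket` shape, the marker text, and the in-memory adapter are assumptions made for illustration only.

```typescript
interface Ticket { id: string; title: string; body: string }

// Assumed signatures for the seam named above.
interface MarkerUpsertApi {
  listOpen(): Promise<Ticket[]>;
  create(title: string, body: string): Promise<Ticket>;
  update(id: string, title: string, body: string): Promise<Ticket>;
}

// Hypothetical marker format: an HTML comment is invisible in rendered ticket bodies.
const marker = (key: string): string => `<!-- snowops-dedupe:${key} -->`;

async function upsertByMarker(
  api: MarkerUpsertApi, key: string, title: string, body: string,
): Promise<{ action: "created" | "updated"; ticket: Ticket }> {
  const tag = marker(key);
  const fullBody = `${body}\n\n${tag}`;
  // Find an open ticket already carrying this marker; update it instead of duplicating.
  const existing = (await api.listOpen()).find((t) => t.body.includes(tag));
  if (existing) {
    return { action: "updated", ticket: await api.update(existing.id, title, fullBody) };
  }
  return { action: "created", ticket: await api.create(title, fullBody) };
}

// In-memory adapter, standing in for a real platform in tests or dry runs.
class InMemoryApi implements MarkerUpsertApi {
  private tickets: Ticket[] = [];
  async listOpen(): Promise<Ticket[]> { return [...this.tickets]; }
  async create(title: string, body: string): Promise<Ticket> {
    const ticket = { id: String(this.tickets.length + 1), title, body };
    this.tickets.push(ticket);
    return ticket;
  }
  async update(id: string, title: string, body: string): Promise<Ticket> {
    const ticket = this.tickets.find((t) => t.id === id);
    if (!ticket) throw new Error(`no open ticket ${id}`);
    ticket.title = title;
    ticket.body = body;
    return ticket;
  }
}
```

Because the dedupe key lives in the ticket body rather than in local state, repeated scheduled runs converge on one ticket per key without any database.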

F. SnowOps Module Library (Azure-First, Cloud-Agnostic Contracts) [B]/[A]

F0 sequencing constraint: F0 must land before any new F-module. F1 and F6 retrofitted in v0.17.

  • 🟦 F0 — modules/_contracts/ — 7 contracts: network, identity, cluster, registry, kv, observability, object_store. Each: typed variable "candidate" + echoing output "candidate" + no providers. [M2a][B][SO] → Test: every contract validates standalone; F1/F6 conformance tests; 4 negative-literal tests. Manual runbook: docs/runbooks/test/F0.md.

  • 🟦 F1 — modules/azure/baseline/ (Mgmt Group, Subs, Policy, Defender, Log Analytics, Activity Log). Emits identity_contract + observability_contract. [M2a][B][CO] → Test: Terratest applies to sandbox; asserts policy assignment + Defender plans. Manual runbook: docs/runbooks/test/F1.md.

  • 🟦 F2 — modules/azure/network-hub/ (hub-spoke vNets, optional Azure Firewall, optional Private DNS zones, NSG flow logs to F1, per-spoke route-table forcing 0.0.0.0/0 through firewall). Emits F0 spoke_network_contracts map. [M2a][B][CO] → Test: Terratest validates topology + routing. Manual runbook: docs/runbooks/test/F2.md.

  • 🟦 F3 — modules/azure/aks-secure/ (private AKS, Workload Identity, OIDC issuer, AAD-RBAC + local accounts disabled, Azure CNI Overlay + Calico NetworkPolicy, Defender for Containers, KEDA, Image Cleaner, AKV CSI driver, system + user node pools across 3 AZs on AzureLinux+Ephemeral OS). Emits F0 cluster_contract. [M2a][B][CO] → Test: Terratest provisions cluster; AAD-only kubectl smoke; private API confirmed. Manual runbook: docs/runbooks/test/F3.md.

  • 🟦 F4 — modules/azure/acr/ (Premium SKU, Private Endpoint + auto-A-record in privatelink.azurecr.io, AAD-only auth, public access disabled, optional geo-replication + AcrPull bindings + Defender scanning passthrough). Emits F0 registry_contract. [M2a][B][CO] → Test: Terratest provisions; push + sign sample image; pull from private endpoint only. Manual runbook: docs/runbooks/test/F4.md.

  • 🟦 F5 — modules/azure/key-vault/ (Premium SKU default, RBAC mode enforced, purge protection enforced, default-deny network ACLs + AzureServices bypass, Private Endpoint + auto-A-record in privatelink.vaultcore.azure.net, optional role bindings across 5 built-in KV roles, optional diag forward). Emits F0 kv_contract. [M2a][B][CO] → Test: Terratest; secret CRUD via Workload Identity; public access denied. Manual runbook: docs/runbooks/test/F5.md.

  • 🟦 F6 — modules/azure/state-backend/ (state SA + container per env, used by B4). Emits object_store_contract. [M2a][B][CO] → Test: Terratest applies; init a dummy Terraform stack against it; lease lock observed. Manual runbook: docs/runbooks/test/F6.md.

  • 🟦 F7 — live/ — Terragrunt live-infra reference: root.hcl (remote state in the F6 backend, generated OIDC azurerm provider, common §3 tags) + DRY _envcommon/ templates (F1 baseline, F2 network-hub, F5 key-vault, F4 acr) + bootstrap/ (F6 state account, local state — breaks the chicken-and-egg) + per-env/per-region units (prod eastus full chain + westus2, staging, sandbox). Real dependency DAG: baseline → network-hub/key-vault/acr via dependency.baseline.outputs.log_analytics_workspace_id (mock_outputs for pre-apply plan). Variance isolated in env.hcl/region.hcl; units are 3-line includes. In-repo source via get_repo_root(); F11 registry-pin form documented for external use. Offline live/validate.sh structural gate + terragrunt hcl validate. [M2b][B][CO] 🟦 code-complete (v0.55) → Test: live/validate.sh (offline structural gate, A) + terragrunt hcl validate/hclfmt --check (B); live: run-all plan/apply against sandbox (C). Runbook: docs/runbooks/test/F7.md.

  • 🟦 F8 — gitops/ — K8s reference manifests bundle as ArgoCD app-of-apps (cert-manager + Kyverno + ESO wave 0 → ingress-nginx wave 1 → D4 policies + ClusterSecretStore wave 2). D4 reused not forked. [M2b][B][CO] → Test: gitops/validate.sh (offline — 13 files / 7 Applications) + kyverno test. Live kind-cluster bootstrap is runbook. Manual runbook: docs/runbooks/test/F8.md.

  • F9 — modules/aws/* parity. [M5][B]/[A] ⏸️ deferred

  • F10 — modules/gcp/* parity. [M5][B]/[A] ⏸️ deferred
  • 🟦 F11 — apps/module-registry/ + modules/registry.json + per-module CHANGELOG.md + .github/workflows/module-release.yml — module versioning + private Terraform registry. Private registry = the monorepo itself: modules publish as git tags <module>/v<version>, consumers pin via source = "git::…//<path>?ref=<module>/vX.Y.Z" (no hosted service). Manifest is the source of truth (10 modules: F0 0.1.0; F1–F6 + J1/J2/J6 1.0.0). TS tool (B6/L4 mold; pure core over a RegistrySnapshot behind a Collector seam): validate (unique names/paths, strict semver, CHANGELOG top == manifest version, no version-regression), buildIndex, planReleases, auditPins (flags unpinned/ref-mismatch/unknown-version in a consumer tree). module-release workflow tags + GitHub-Releases pending modules on merge to main (CHANGELOG section as body, idempotent). 3 jest suites / 27 tests incl. a guard over the real manifest+CHANGELOGs. Convention: docs/conventions/module-versioning.md. [M3][SO] 🟦 code-complete (v0.51) → Test: npm test (27); --manifest modules/registry.json --fail-on-issues → OK/exit 0; --consumer-dir examples/consumer-unpinned --fail-on-issues → exit 2. Manual runbook: docs/runbooks/test/F11.md.
  • 🟦 F12 — modules/azure/import-blocks/ — brownfield import library [B]: config-driven Terraform import {} blocks (one <module>.tf per module) that adopt pre-existing Azure resources into the F-modules — covers F1 baseline, F2 network-hub, F3 aks-secure, F4 acr, F5 key-vault, F6 state-backend + J1 log-analytics, J2 policy-diagnostics, J6 audit-log-archive (9 modules). Each file pairs the import blocks with a placeholder module call so the whole directory is self-validating: terraform validate confirms every to = address resolves (incl. count[0] + for_each["key"] instances); for_each key schemes are derived from source + documented per file. Offline TestImportBlocksValidate gate. Each covered module's README brownfield section now points at its real file; adoption procedure in docs/runbooks/import/F12.md. [M3][B][CO] 🟦 code-complete (v0.49) → Test: docs/runbooks/import/F12.md — offline validate/fmt + TestImportBlocksValidate (A+B); live: adopt a real sandbox resource with a zero-change plan (C).
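
The F11 pin convention (`//<path>?ref=<module>/vX.Y.Z`) and the three pin findings named above (unpinned, ref-mismatch, unknown-version) lend themselves to a small parser. This is a sketch under assumptions, not the tool's auditPins: the `ManifestEntry` shape (in particular a `versions` list of released versions), the exact meaning given to each finding, and the example module name `acr` are all hypothetical.

```typescript
type PinIssue = "unpinned" | "ref-mismatch" | "unknown-version";

// Hypothetical manifest shape; the real modules/registry.json schema is not shown here.
interface ManifestEntry { name: string; path: string; versions: string[] }

// <module>/v<strict semver>: no leading zeros, exactly three numeric parts.
const REF = /^([^/]+)\/v((?:0|[1-9]\d*)\.(?:0|[1-9]\d*)\.(?:0|[1-9]\d*))$/;

function auditPin(source: string, manifest: ManifestEntry[]): PinIssue | null {
  const queryAt = source.indexOf("?ref=");
  if (queryAt === -1) return "unpinned";

  const location = source.slice(0, queryAt);
  const ref = source.slice(queryAt + "?ref=".length);
  // The module path follows the LAST "//" (the first one belongs to the URL scheme).
  const path = location.slice(location.lastIndexOf("//") + 2);

  const entry = manifest.find((e) => e.path === path);
  const parsed = REF.exec(ref);
  // Assumed reading of "ref-mismatch": the tag does not name the module at that path.
  if (!entry || !parsed || parsed[1] !== entry.name) return "ref-mismatch";
  return entry.versions.includes(parsed[2]) ? null : "unknown-version";
}
```

With a manifest entry `{ name: "acr", path: "modules/azure/acr", versions: ["1.0.0"] }`, a source ending in `//modules/azure/acr?ref=acr/v1.0.0` passes, while the same source without `?ref=` is reported as unpinned.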

G. Pre-Sales Discovery & Audit Automation [X][SH]

Azure-only through M3. AWS discovery mode (G7) lands in M4.

  • 🟦 G0 — Client-side scoped Reader + Security Reader SP bootstrap script + Bicep alt; time-boxed federated cred; no secrets leave client tenant. [M1]
  • 🟦 G1 — apps/discovery-auditor/ (Node/TS) collectors: Resource Graph KQL, Defender REST, Azure Policy state, AAD audit logs, Cost Mgmt. [M1]
  • 🟦 G2 — YAML rule pack mapped to SOC2 CC + ISO27001 A.x + CIS Azure Benchmark; severity/evidence/remediation/effort. Each finding includes remediation_asset_id. 11 rules / 22 fixtures. [M1]
  • 🟦 G3 — Report renderer (Markdown → PDF via Pandoc/Playwright; branded cover, exec summary, control table, prioritized roadmap). [M1]
  • 🟦 G4 — .github/workflows/discovery-run.yml (manual dispatch with tenant_id + sub_id; artifact upload; Slack notify with reviewer checklist). [M1]
  • 🟦 G5 — HubSpot integration (A5 → Deal property discovery_report_url). [M1]
  • 🟦 G6 — Immutable run audit log (client/scope/timestamp/operator/findings hash → WORM blob, SHA-256 hash chain). [M1]
  • G7 — AWS discovery mode. [M4][X][SH] ⏸️ postponed

H. Identity & Access Management [B]/[A]

  • 🟦 H1 — modules/azure/aad-baseline/ (IP + country named locations via azuread_named_location, custom Authentication Strength Policy for phishing-resistant MFA, password protection + tenant branding emitted as *_patch_body JSON for az rest PATCH). Precondition: verified custom domain required. [M2a][B][CO] → Test: Terratest validate; H1 runbook applies in sandbox + applies Graph PATCHes.

  • 🟦 H2 — modules/azure/conditional-access/ (6 SnowOps CA policies: MFA Mandatory / Tier-0 Phishing-Resistant+Compliant Device / Block Legacy Auth / Geo-Block / High-Risk Block / Medium-Risk MFA; every policy excludes break-glass group; risk policies gated on P2). [M2a][B][CO] → Test: Terratest validate; H2 runbook applies in report-only → CA What-If → enforce + live sign-in.

  • 🟦 H3 — modules/azure/pim-templates/ (tier-0 + tier-1 AAD role eligibility via azuread_directory_role_eligibility_schedule_request; activation rule bodies emitted as JSON for Graph az rest PATCH since roleManagementPolicies has no TF resource). Precondition: ≥1 permanent break-glass tier-0 holder. [M2a][B][CO] → Test: Terratest validate; H3 runbook applies eligibility + Graph PATCHes + live activation drill.

  • H4 — SCIM provisioning from Azure AD to SaaS. [M4][A][CO] ⏸️ postponed

  • 🟦 H5 — apps/sp-inventory/ (read-only-Graph TS, Application.Read.All) + .github/workflows/sp-inventory-rotation.yml (scheduled reusable workflow). Inventories app registration credentials; flags aged (≥ threshold_days=90) / expiring-soon (within expiry_warning_days=30) / expired. Opens/idempotently updates rotation PR. Never rotates a secret itself. Federated-OIDC-only SPs never stale. 2 test suites / 19 tests. [M2a][B][CA] → Test: jest unit suite covers stale SP path + federated-OIDC-only SP path. Live tenant read + PR drill is manual.

  • H6 — Access review automation. [M4][A][CO] ⏸️ postponed

  • 🟦 H7 — modules/azure/break-glass/ — dual-provider (azuread + azurerm). Role-assignable group + azuread_group_member per member + permanent (active, non-PIM) Global Administrator + severity-0 sign-in alert (azurerm_monitor_scheduled_query_rules_alert_v2, KQL on UserId, threshold=0). Producer of break-glass group H2/B3/B5 consume. Takes existing account object IDs as input (no account or password creation — Identity > Secrets). [M2a][B][CO] → Test: Terratest validate (offline — 4 preconditions). Live sign-in drill is manual (needs P1 + real LAW).
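
The H5 credential flags (aged at 90 days or more, expiring-soon within 30 days, expired) can be sketched as a date classifier. The thresholds come from the entry above; the `Credential` shape, the precedence order (expired, then expiring-soon, then aged), and the `isStale` helper are assumptions for illustration.

```typescript
type CredentialState = "expired" | "expiring-soon" | "aged" | "ok";

// Hypothetical shape for a password or certificate credential on an app registration.
interface Credential { startDate: Date; endDate: Date }

const DAY_MS = 24 * 60 * 60 * 1000;

function classify(
  cred: Credential, now: Date, thresholdDays = 90, expiryWarningDays = 30,
): CredentialState {
  const daysLeft = (cred.endDate.getTime() - now.getTime()) / DAY_MS;
  if (daysLeft <= 0) return "expired";
  if (daysLeft <= expiryWarningDays) return "expiring-soon";
  const ageDays = (now.getTime() - cred.startDate.getTime()) / DAY_MS;
  return ageDays >= thresholdDays ? "aged" : "ok";
}

// A service principal with no secret or certificate credentials (federated OIDC only)
// has nothing to rotate, so an empty list is never stale.
function isStale(creds: Credential[], now: Date): boolean {
  return creds.some((c) => classify(c, now) !== "ok");
}
```

Passing `now` in explicitly keeps the classifier deterministic under test, which matters for a tool that runs on a schedule.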


I. Vulnerability & Patch Management [B]/[A]

  • 🟦 I1 — .github/workflows/image-scan.yml — reusable (workflow_call) Trivy image scan; fails on High/Critical OS+library CVEs, SARIF → Code Scanning, optional registry login, ignore_unfixed/fail_on_findings/severity_cutoff inputs. Closes G6 (container security for non-K8s clients). Distinct from C2 (build-time grype): I1 scans an arbitrary image ref. [M2a][B][CA] 🟦 code-complete (v0.53) → Test: YAML lint (offline) + dispatch scan of an old image fails / current passes. Runbook: docs/runbooks/test/I1.md.
  • 🟦 I2 — .github/dependabot.yml (4 ecosystems) + .github/workflows/dependency-review.yml (PR-blocking SCA gate, fail-on-severity: high + licence deny-list) + .github/workflows/dependency-digest.yml (weekly idempotent Dependabot-alert digest issue). [M2a][B][CA] 🟦 code-complete (v0.53) → Test: config lint (offline) + PR introduces vuln dep → review fails; digest run upserts one rolling issue. Runbook: docs/runbooks/test/I2.md.
  • 🟦 I3 — .github/workflows/codeql.yml — CodeQL SAST over javascript-typescript (apps/) + go (terratest), security-extended,security-and-quality queries, PR + push + weekly schedule, SARIF → Code Scanning. [M2a][B][CA] 🟦 code-complete (v0.53) → Test: YAML lint (offline) + PR with a planted CWE finding surfaces in Code Scanning. Runbook: docs/runbooks/test/I3.md.
  • I4–I7 — DAST, Defender→ticket, Azure Update Manager, CVE triage. [M4][A] ⏸️ postponed

J. Logging, Monitoring & SIEM [B]/[A]

  • 🟦 J1 — modules/azure/log-analytics/ — standalone hardened LAW: per-table retention (30-730d interactive + archive, total >= retention validated), CanNotDelete management lock, scoped RBAC (Log Analytics Reader/Contributor + Monitoring Reader + freeform), self-audit azurerm_monitor_diagnostic_setting (who ran KQL queries). AAD-only by default. Optional daily_quota_gb cost cap. Emits F0 observability_contract. azurerm-only. [M2a][B][CO] → Test: Terratest validate + TestJ1ObservabilityContractConformance + build-tagged integration (~$0) asserts workspace + self-audit diag + contract shape.

  • 🟦 J2 — modules/azure/policy-diagnostics/ — custom azurerm_policy_set_definition (DINE initiative, not Deny) bundling built-in DeployIfNotExists diagnostic policies. GUID-agnostic (caller supplies GUIDs via diagnostic_policies input map sourced from az policy definition list). Sub- or MG-scope. System-assigned identity + remediation roles. Emits az policy remediation create command. Validate-only in CI. [M2a][B][CO] → Test: Terratest validate (offline). Live apply + DINE remediation drill is manual.

  • J3 — Microsoft Sentinel deployment. [M4][A][CO] ⏸️ postponed

  • J4 — Alert rule pack. [M2b][B][CO] ⏸️ postponed (outside 14-asset core)
  • J5 — Managed Grafana dashboards-as-code. [M4][A][CO] ⏸️ postponed

  • 🟦 J6 — modules/azure/audit-log-archive/ — RA-GZRS StorageV2 with account-level time-based immutability (allow_protected_append_writes = true, state defaults Unlocked for teardown safety). Forwards subscription Activity Log; optional Log Analytics data export; optional Storage Blob Data Reader grants. shared_access_key_enabled defaults true (platform diagnostic writer requires it). Distinct from F6 (state backend). [M2a][B][CO] → Test: Terratest validate + build-tagged integration (~$0, immutability OFF). WORM mutation-refused drill is manual.

  • J7 — Cost-controlled log strategy. [M4][A][CO] ⏸️ postponed


K. Incident Response & SecOps [B]/[A]

  • 🟦 K1 — IR runbook library (docs/runbooks/incident/: compromise, ransomware, data leak, DDoS, vendor breach). [M2b][B][CO] 🟦 code-complete (external, gemini-work PR #12)
  • 🟦 K2 — modules/azure/oncall-integration/ — PagerDuty/Opsgenie + Slack (Sentinel incidents → on-call). [M2b][B][CO] 🟦 code-complete (external, gemini-work PR #12)
  • K3–K5 — Sentinel SOAR playbooks / Post-incident review / Tabletop exercise. [M4][A] ⏸️ postponed

L. Backup & Disaster Recovery [B]/[A]

  • 🟦 L1 — modules/azure/backup-policy/ — Azure Backup policy module [B]: creates (toggleable) a GeoRedundant Recovery Services vault + a Data Protection Backup vault and the four per-env-retention backup policies — VM (azurerm_backup_policy_vm), Azure Files/"Storage" (azurerm_backup_policy_file_share), SQL-in-VM (azurerm_backup_policy_vm_workload), AKS (azurerm_data_protection_backup_policy_kubernetes_cluster). Per-env profiles (dev 7d / staging 14d+5w / prod 30d+12w+12m+7y) expand daily/weekly/monthly/yearly tiers via dynamic blocks; plan-time preconditions enforce CRR⇒GeoRedundant and the yearly⇒monthly⇒weekly nesting. Defines reusable policies (not per-instance bindings); vault MIs exported for consumers. GeoRedundant + cross_region_restore_enabled = the L2 on-ramp. Offline TestBackupPolicyValidate gate. [M2b][B][CO] 🟦 code-complete (v0.46) → Test: docs/runbooks/test/L1.md — fmt/validate + Terratest validate (A+B); live: apply both vaults + four policies to sandbox, assert redundancy/retention, destroy (C).
  • 🟦 L2 — modules/azure/cross-region-replication/ — cross-region replication wiring [B]: blob object replication (azurerm_storage_object_replication, source→DR account, rule per container mapping; optionally creates the destination containers) + geo-redundant SQL failover group (azurerm_mssql_failover_group, primary↔partner server, per-env failover posture). Consumes existing accounts/servers by ARM ID (brownfield-safe wiring, not resource creation — same stance as L1). Per-env SQL failover: dev Manual / staging Automatic 60m / prod Automatic 120m; preconditions enforce cross-region locations differ, distinct accounts/servers, and Automatic⇔grace / Manual⇔no-grace coherence. The active-replication half of DR; L1 is the recoverability half; L4 is the drill. Offline TestCrossRegionReplicationValidate gate. [M2b][B][CO] 🟦 code-complete (v0.47) → Test: docs/runbooks/test/L2.md — fmt/validate + Terratest validate (A+B); live: apply two storage accounts + two SQL servers + the links to sandbox, assert failover group Automatic/120m + cross-region, destroy (C).
  • L3 — DR runbook templates. [M4][A][CO] ⏸️ postponed
  • 🟦 L4 — apps/restore-drill/ — automated restore drill [B]: standalone TS tool (pure offline logic + thin executor seam + jest, same mold as E0/S1/S2) that restores an L1 backup (or fails over an L2 SQL failover group) into an ephemeral sandbox RG → validates → tears down → records a versioned RestoreDrillReport (schemaVersion 1.0). Outcome classified passed/partial/failed (partial = recovered but RTO missed or teardown failed); measured RTO = restore+validate duration; diffReports is the recoverability-regression signal. Executors: DryRunExecutor (deterministic — tests/demos/workflow rehearsal) + AzureCliExecutor (live az). Reports land in the compliance/restore-drills/ evidence store; S2 gains an additive --restore-drills-dir "DR restore drills" panel (gated, so its golden output is unchanged) — that's how pass/fail reaches the dashboard. Scheduled via .github/workflows/restore-drill.yml (monthly cron; dispatch defaults to dry-run, schedule runs live; commits the report). Teardown always runs (X7 backstop). [M2b][B][CO] 🟦 code-complete (v0.48) → Test: docs/runbooks/test/L4.md — offline classify/orchestrate/render + S2 panel wiring (A+B, 17 + 34 tests); live: real restore→validate→teardown in the sandbox, dated report to the evidence store (C).
  • L5 — RTO/RPO doc generator. [M4][A][CO] ⏸️ postponed
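
The L4 outcome rule (partial = recovered but RTO missed or teardown failed; measured RTO = restore + validate duration) reduces to a short classifier. The `DrillRun` field names are hypothetical, and treating "recovered" as "restore succeeded and validation passed" is an assumption; the real RestoreDrillReport schema is not shown in this catalog.

```typescript
type Outcome = "passed" | "partial" | "failed";

// Hypothetical per-run measurements.
interface DrillRun {
  restored: boolean;
  validated: boolean;
  tornDown: boolean;
  restoreSeconds: number;
  validateSeconds: number;
  rtoTargetSeconds: number;
}

// Measured RTO covers restore plus validation; teardown time is excluded.
function measuredRto(run: DrillRun): number {
  return run.restoreSeconds + run.validateSeconds;
}

function classifyDrill(run: DrillRun): Outcome {
  // Assumed definition of "recovered": data came back AND passed validation.
  const recovered = run.restored && run.validated;
  if (!recovered) return "failed";
  const rtoMissed = measuredRto(run) > run.rtoTargetSeconds;
  return rtoMissed || !run.tornDown ? "partial" : "passed";
}
```

Separating "failed" from "partial" keeps a leaked sandbox resource group or a slow restore visible on the dashboard without masking the fact that the data was recoverable.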

M. Data Protection & Privacy [B]/[A]

  • 🟦 M1 — modules/azure/encryption-policy/ — custom azurerm_policy_set_definition Deny initiative: encryption-at-rest built-ins (storage infrastructure encryption, SQL CMK, managed-disk double-encryption + CMK). Initiative-level effect parameter (Audit/Deny/Disabled). No system-assigned identity (Deny effect, not DINE). GUIDs caller-overridable. [M2a][B][CO] → Test: Terratest validate (offline). Live Audit→Deny rollout + "unencrypted create denied" is manual.

  • 🟦 M2 — modules/azure/cmk/ — Customer-Managed Key: HSM-backed azurerm_key_vault_key (RSA-HSM/EC-HSM only, software keys rejected) + auto-rotation policy (rotate_before_expiry_days < expire_after_days precondition) in an EXISTING F5 Premium RBAC-mode vault. Optional user-assigned identity auto-granted Crypto Service Encryption User. Consumers wire to versionless key ID for transparent rotation. [M2a][B][CO] → Test: Terratest validate + TestCMKModule integration (~$1 — Premium vault + deployer Crypto Officer + HSM key, asserts versionless ID + rotation policy).

  • 🟦 M3 — modules/azure/tls-policy/ — custom Deny initiative: secure-transport built-ins (storage secure-transfer, storage min-TLS, App Service + Function HTTPS-only). Two initiative parameters: effect + minimumTlsVersion. The storage_min_tls reference threads BOTH via explicit parameter_values. [M2a][B][CO] → Test: Terratest validate (offline). Live Audit→Deny + "HTTP/TLS<1.2 create denied" is manual.

  • M4–M5, M7 — Purview baseline / DLP policies / GDPR-CCPA evidence. [M4][A] ⏸️ postponed

  • 🟦 M6 — modules/azure/data-residency-policy/ — custom Deny initiative: Allowed-locations built-ins (resources + optional resource groups) with listOfAllowedLocations. No effect parameter (Allowed-locations is intrinsic-Deny); rollout uses enforce=false→true. Standalone residency boundary distinct from F1's bundled allowed-locations. [M2a][B][CO] → Test: Terratest validate (offline). enforce=false→true + "out-of-region create denied" is manual.


N. Network Security [B]/[A]

  • N1 — Landing-zone hub-spoke (extends F2). [M2a][B][CO] ⏸️ postponed
  • N2–N4 — Azure Firewall Premium / WAF / DDoS. [M2b/M4] ⏸️ postponed

  • 🟦 N5 — modules/azure/private-endpoint-policy/ — custom Deny initiative: "disable public network access" built-ins (storage / Key Vault / Cosmos DB / SQL). Initiative-level effect parameter. No system-assigned identity. Pairs with F4/F5 (PEs) + F2 (Private DNS). Curated, caller-overridable GUIDs. [M2a][B][CO] → Test: Terratest validate (offline). Audit→Deny rollout + "public PaaS create denied" is manual.

  • 🟦 N6 — modules/azure/nsg-baseline/ — hardened NSG (dynamic security_rule over merged baseline_rules + custom_rules map; curated defaults deny SSH/RDP/Internet-inbound; cross-map key collision = plan-time merge error) + optional subnet associations + optional NSG flow logs + Traffic Analytics (10-min interval, gated on BOTH workspace ID AND flow-log storage account ID). Standalone counterpart to F2's bundled NSG. [M2a][B][CO] → Test: Terratest validate + TestNSGBaselineModule integration (~$0, flow logs OFF). "Flow logs landing within 10 min" is manual Part D.

  • N7 — Zero-trust reference architecture. [M4][A][CO] ⏸️ postponed


O–Q. Endpoint, Vendor Risk, HR Security [A]

All items O1–O4, P1–P4, Q1–Q5 are ⏸️ postponed to M4.


R. Change Management [B]/[A]

  • 🟩 R1 — PR template enforcement + required-fields validation workflow. [M1][QW][CA] Signed off 26/05 (Sagar Chhabra).

  • R2–R4 — Production change log / Emergency change / CAB automation via E7. ⏸️ postponed


S. Continuous Compliance Monitoring & Drift [B]/[A]

  • 🟦 S1 — apps/drift-detector/ — Drift detection [B]: read-only TS tool that turns terraform show -json into a versioned DriftReport (schemaVersion 1.0), classifies managed-resource changes (create/update/delete/replace; no-op + data reads excluded), and files/updates one ticket per stack. diffReports is the change signal (mirrors E0's diffSnapshots). Ships the TicketPlatform interface (D15 — E7 seed) with GitHub Issues + dry-run adapters; idempotent upsert via an embedded dedupe marker. Scheduled via .github/workflows/drift-detection.yml (daily cron, per-stack matrix). Plans only, never applies. [M2b][B][CO] 🟦 code-complete (v0.44) → Test: docs/runbooks/test/S1.md — offline classifier/diff/ticket (A+B); live: mutate sandbox resource, next run opens/updates the issue (C).

  • 🟦 S2 — apps/compliance-dashboard/ — Azure Policy compliance dashboard [B]: fully offline TS tool that renders a history of E0 ComplianceSnapshots into a versioned ComplianceDashboard (schemaVersion 1.0) plus a self-contained static HTML page + markdown summary. Current posture, latest-vs-previous regression delta (vendors E0's diffSnapshots), a trend line, and a best-effort name-based framework rollup (SOC 2 / ISO 27001 / CIS Azure / HIPAA; MCSB fans out; unmatched → "Unmapped"). Reads the compliance/snapshots/ evidence store (fed by E0); never touches Azure. Scheduled via .github/workflows/compliance-dashboard.yml (E0 collect → S2 render → artifact; Pages opt-in). [M2b][B][CO] 🟦 code-complete (v0.45) → Test: docs/runbooks/test/S2.md — offline aggregation/framework/render golden (A+B); live: workflow collects a snapshot and uploads a renderable dashboard (C).

  • S3–S4 — Auto-remediation playbooks / Compliance scorecard. [M4][A] ⏸️ postponed
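
The S1 classification rule (managed resources only; create/update/delete/replace; no-op and data reads excluded) can be sketched as follows. This is a minimal illustration, not the shipped classifier: the `resource_changes` fields used here (`address`, `mode`, `change.actions`) come from Terraform's JSON plan output, while `DriftEntry`, `classify`, and `classifyPlan` are hypothetical names and the real DriftReport schema may differ.

```typescript
// Subset of the plan JSON emitted by `terraform show -json <planfile>`.
type Action = "no-op" | "create" | "read" | "update" | "delete";

interface ResourceChange {
  address: string;
  mode: "managed" | "data";
  change: { actions: Action[] };
}

type DriftKind = "create" | "update" | "delete" | "replace";

// Hypothetical entry shape; the real DriftReport schema is not reproduced here.
interface DriftEntry {
  address: string;
  kind: DriftKind;
}

function has(actions: Action[], a: Action): boolean {
  return actions.indexOf(a) !== -1;
}

// Returns null for anything that is not a managed-resource change.
function classify(rc: ResourceChange): DriftKind | null {
  if (rc.mode !== "managed") return null; // data reads excluded
  const actions = rc.change.actions;
  // Terraform reports a replacement as both "delete" and "create", in either order.
  if (has(actions, "create") && has(actions, "delete")) return "replace";
  if (has(actions, "create")) return "create";
  if (has(actions, "delete")) return "delete";
  if (has(actions, "update")) return "update";
  return null; // "no-op" and "read"
}

function classifyPlan(changes: ResourceChange[]): DriftEntry[] {
  const entries: DriftEntry[] = [];
  for (const rc of changes) {
    const kind = classify(rc);
    if (kind !== null) entries.push({ address: rc.address, kind });
  }
  // Sort by address so two runs over the same plan produce identical output.
  return entries.sort((x, y) =>
    x.address < y.address ? -1 : x.address > y.address ? 1 : 0
  );
}
```

Sorting by address is an assumption made here so that a report-to-report diff only fires on real changes, not on ordering noise.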


T. Trust Center & Customer-Facing [A]

T1–T4 all ⏸️ postponed to M4.


U. Cost Governance [B]/[A]

  • 🟦 U1 — modules/azure/budget-alert/ — azurerm_consumption_budget_subscription (Monthly, configurable thresholds) + dynamic notification blocks (actual + forecasted) + optional dedicated azurerm_monitor_action_group (email/sms/webhook, reusing H7 pattern) + optional filter_tag/filter_resource_groups. 4 preconditions (≤5 notifications per Azure cap, ≥1 recipient). Build-tagged integration test (~$0). [M2a][B][CO] → Test: Terratest validate + TestBudgetAlertModule integration asserts budget ARM ID + action-group ID + 4 notification keys + recipient_count.

  • 🟦 U2 — modules/azure/tag-policy/ — mandatory-tag Deny initiative: Require a tag on resources + Require a tag on resource groups per tag. Per-reference literal tagName (initiative has no parameters — each reference needs a distinct value, unlike M6's shared parameter). Default §8 set minus ManagedBy. enforce=false → true rollout (Deny is intrinsic to these built-ins). [M2a][B][CO] → Test: Terratest validate (offline). The enforce=false → true flip and the "untagged create denied" check are manual.

  • U3–U5 — Idle resource cleanup / FinOps dashboard / Cost anomaly. [M4/M5] ⏸️ postponed
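
The per-tag fan-out described for U2 (two built-in policy references per mandatory tag, each carrying its own literal tagName) can be modelled as below. This is only a TypeScript sketch of the shape, assuming nothing about the actual Terraform in the module: `buildReferences`, `PolicyReference`, the `referenceId` scheme, and the sample tag names are hypothetical. Only the two built-in display names and the `tagName` parameter come from the entry above.

```typescript
// Display names of the two Azure built-in policies named in U2.
const BUILT_INS = [
  { key: "res", displayName: "Require a tag on resources" },
  { key: "rg", displayName: "Require a tag on resource groups" },
] as const;

// Hypothetical in-memory shape of one initiative reference.
interface PolicyReference {
  referenceId: string;
  policyDisplayName: string;
  parameters: { tagName: { value: string } };
}

// Expand each mandatory tag into one reference per built-in policy.
// The tag name is written as a literal on every reference, because the
// initiative itself exposes no parameters to share a value through.
function buildReferences(tags: string[]): PolicyReference[] {
  const seen: { [id: string]: boolean } = {};
  const refs: PolicyReference[] = [];
  for (const tag of tags) {
    for (const policy of BUILT_INS) {
      const referenceId = `${policy.key}-${tag.toLowerCase()}`;
      if (seen[referenceId]) {
        throw new Error(`duplicate reference id: ${referenceId}`);
      }
      seen[referenceId] = true;
      refs.push({
        referenceId,
        policyDisplayName: policy.displayName,
        parameters: { tagName: { value: tag } },
      });
    }
  }
  return refs;
}
```

For N mandatory tags this yields 2N references, which is why each one needs a distinct reference id.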


V. Documentation & Policy Management [B]/[A]

  • V1 — Policy repo template (InfoSec, AUP, IR, BCP, Change Mgmt, Vendor). [M4][A][CO] ⏸️ postponed

  • 🟦 V2 — apps/diagram-generator/ — zero-cloud TS tool: terraform output -json → F0 contracts → cloud-neutral StackModel → d2lang architecture diagram. Shape-based contract detection (name-agnostic; classifies by fields not output names). Deterministic renderer (stable slug() node ids + sorted emission → clean golden-file diffs). [M2b][B][CO] ⬜ in scope (14-asset core) → Test: jest (2 suites / 9 tests) — adapter normalizes all F0 contracts from sample stack + golden-file byte match + determinism + empty-stack safety. Manual runbook: docs/runbooks/test/V2.md.

  • 🟦 V3 — apps/runbook-generator/ — zero-cloud TS CLI: terraform output -json → shape-based adapter (all 7 F0 contracts) → per-domain operational runbooks (infrastructure.md index + identity/network/compute/registry/secrets/storage/observability) with key-facts tables, Day-Zero Hardening posture digest (✅/⚠️), ops CLI + failure modes. Mirrors V2's adapt→model→render pattern. Legacy Handlebars mode kept via --templates. [M2b][B][CO] ⬜ in scope (14-asset core) → Test: jest (4 suites / 16 tests) — parser + legacy Handlebars + shape-based adapt + model-driven render. Manual runbook: docs/runbooks/test/V3.md.

  • V4 — Compliance manual generator. [M4][A][SH] ⏸️ postponed
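
V2's determinism claim (stable slug() node ids plus sorted emission, so golden-file diffs stay clean) can be illustrated with a small sketch. The real `slug()` and renderer are not shown in this catalog, so the slug rule, `StackNode`, `StackEdge`, and `renderD2` below are assumptions. The output uses only basic d2 syntax (`id: "label"` and `a -> b`), and labels are assumed to contain no double quotes.

```typescript
// Hypothetical minimal model; the real StackModel is richer.
interface StackNode {
  label: string;
}
interface StackEdge {
  from: string;
  to: string;
}

// Assumed slug rule: lowercase, collapse non-alphanumeric runs to "_", trim.
function slug(label: string): string {
  return label
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "_")
    .replace(/^_+|_+$/g, "");
}

function compare(a: string, b: string): number {
  return a < b ? -1 : a > b ? 1 : 0;
}

// Emit node declarations first, then edges, each group sorted, so the
// output depends only on the set of nodes and edges, not on input order.
function renderD2(nodes: StackNode[], edges: StackEdge[]): string {
  const nodeLines = nodes
    .map((n) => `${slug(n.label)}: "${n.label}"`)
    .sort(compare);
  const edgeLines = edges
    .map((e) => `${slug(e.from)} -> ${slug(e.to)}`)
    .sort(compare);
  return nodeLines.concat(edgeLines).join("\n") + "\n";
}
```

Because the ids derive from labels rather than from array positions, reordering the Terraform outputs leaves the rendered file byte-identical.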


W. Multi-Tenant Client Management [X][SO] ⏸️ ALL POSTPONED (D35)

W1–W3 are marked [M2a→postponed] and are NOT part of the M2a-complete bar. Pull up only when a second concurrent client forces the multi-tenant isolation question, or after M2b/M3 are underway.

  • ⬜⏸️ W1 — Client repo template + provisioning (extends B1).
  • ⬜⏸️ W2 — Per-client state backend (extends F6).
  • ⬜⏸️ W3 — Per-client secret scoping in SnowOps GH org (environments).
  • W4 — Client offboarding playbook. [M3]
  • W5 — SnowOps internal client dashboard. [M5]

X. Testing Framework & Sandbox [X]

Nothing in F/B/E/etc. can be marked 🟩 Shipped without X1, X2, X6 in place.

  • 🟦 X1 — SnowOps Azure sandbox subscription (Terraform-managed): isolated tenant or sub, budget-capped, auto-cleanup tag ephemeral=true. [M1]
  • 🟦 X2 — Terratest harness (tests/terratest/): Go-based, parallel-safe, sandbox-scoped; mandatory per F module. Currently 35 top-level tests. [M1]
  • 🟩 X3 — Conftest test suite for D3 (policy/opa/tests/). [M1]
  • 🟦 X4 — Kyverno test framework for D4 (policy/kyverno/tests/). 5 sub-suites / 21 assertions + run-tests.sh wrapper + pre-push hook. [M2a]
  • X5 — Pipeline integration tests (test consumer repos for reusable workflows). [M2b] ⏸️ postponed
  • X6 — Manual test runbooks (docs/runbooks/test/<asset_id>.md). [ongoing]
  • 🟦 X7 — sandbox/cleanup/ + .github/workflows/sandbox-cleanup.yml — nightly cleanup of ephemeral=true RGs. Three guards: (1) ephemeral=true tag only, (2) protected-name globs, (3) --min-age-hours (default 6). 12 offline assertions. The 03:17 UTC cron run always deletes; manual dispatch defaults to dry-run. [M2a][SO]
  • X8 — Synthetic monitoring. [M2b] ⏸️ postponed
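
The three X7 guards compose into a single eligibility check, sketched below under stated assumptions. `ResourceGroup`, `CleanupOptions`, `isEligible`, and the `createdAt` field are hypothetical, the glob support is limited to `*`, and tag matching is exact. The actual tool in sandbox/cleanup/ may determine resource-group age and match names differently.

```typescript
// Hypothetical view of a resource group; createdAt is an assumed field.
interface ResourceGroup {
  name: string;
  tags: { [key: string]: string };
  createdAt: Date;
}

interface CleanupOptions {
  protectedGlobs: string[]; // e.g. names that must never be deleted
  minAgeHours: number; // X7 documents a default of 6
}

// Convert a glob supporting only "*" into an anchored regular expression.
function globToRegExp(glob: string): RegExp {
  const escaped = glob
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&")
    .replace(/\*/g, ".*");
  return new RegExp(`^${escaped}$`);
}

// A group is deletable only if all three guards pass.
function isEligible(rg: ResourceGroup, now: Date, opts: CleanupOptions): boolean {
  // Guard 1: only groups explicitly tagged ephemeral=true.
  if (rg.tags["ephemeral"] !== "true") return false;
  // Guard 2: never touch protected names.
  for (const glob of opts.protectedGlobs) {
    if (globToRegExp(glob).test(rg.name)) return false;
  }
  // Guard 3: leave recently created groups alone.
  const ageHours = (now.getTime() - rg.createdAt.getTime()) / 3600000;
  return ageHours >= opts.minAgeHours;
}
```

Ordering the guards from cheapest to most specific keeps the failure reason easy to report in a dry-run listing.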

Y. Go-To-Market & Sales Engine [X][SO] — all 🟦

See docs/context/04-asset-status.md § Y for full table. All 14 assets (Y0–Y13) drafted under gtm/. Human sign-offs pending.

Z. Reference Architectures [X][SO→CO] — all 🟦

See docs/context/04-asset-status.md § Z for full table. All 4 assets (Z0–Z3) drafted under gtm/z/. Human sign-offs pending.