SnowOps — Full Asset Catalog
Status: ⬜ Not Started · 🟨 Scaffolding · 🟧 In Progress · 🟦 Code Complete · 🟩 Shipped
For compact status table: docs/context/04-asset-status.md
For full version history: docs/context/05-history.md
A. Sales & CRM Automation (HubSpot) [X][SO]
- 🟦 A1 — HubSpot Private App + lead-enrichment Custom Code Action (Clearbit/Apollo → contact props). [M1][CA] → Test: Clearbit/Apollo lookups enriching contact properties. Manual runbook: docs/runbooks/test/A1.md.
- ⬜ A2 — ICP scoring Custom Code Action (TS) → routes to Sagar/Nidhi. [M2b] ⏸️ postponed → Test: fixture leads covering ICP / near-ICP / out-of-ICP; assert routing.
- ⬜ A3 — Proposal generator (Deal stage → templated PDF; DocRaptor or Google Docs API). [M2b] ⏸️ postponed → Test: golden-file diff on rendered PDF text content for 3 sample deals.
- ⬜ A4 — Project kickoff webhook (Closed Won → triggers B1 + Notion + Slack). [M2b] ⏸️ postponed → Test: simulate webhook payload → assert repo created + Slack channel.
- 🟦 A5 — Discovery trigger Custom Code Action (Qualified → dispatches G4 + sends offer email). [M1][CA] → Test: Deal stage change → G4 dispatched, offer email sent. Manual runbook: docs/runbooks/test/A5.md.
B. Client Onboarding [B]
- 🟦 B1 — GitHub App snowops-onboarder (Probot/TS): repo + branch protection + CODEOWNERS + checks + env vars + federated OIDC to client Azure AD. [M1][SO→CO] → Test: install on test org, trigger create → assert all settings via GH API. Manual runbook: docs/runbooks/test/B1.md.
- 🟦 B2 — modules/azure/client-bootstrap/ — Azure AD application + service principal + federated identity credentials (GitHub OIDC only, repo: prefix validated) + role assignments (sub/ACR/KV/freeform). No client secret ever created. [M2a][SO→CO] → Test: Terratest validates SP + fed creds + role-assignment IDs. Manual runbook: docs/runbooks/test/B2.md.
- 🟦 B3 — modules/azure/subscription-baseline/ — composes F1 + group RBAC (6 named shortcuts: Owner/Contributor/Reader/Security Admin/Security Reader/UAA + freeform escape hatch) + MCSB regulatory-compliance initiative (system-assigned identity, audit-only until remediation role granted out-of-band). Defender ON by default. azurerm-only (RBAC binds by group object ID). [M2a][CO] → Test: Terratest validates (offline composition + RBAC flatten + MCSB block) + integration applies to sandbox asserts LAW + SnowOps Standard set + MCSB assignment + system-assigned identity GUID + 3 RBAC role-assignment IDs. Manual runbook: docs/runbooks/test/B3.md.
- 🟦 B4 — modules/azure/client-state-backend/ — wraps F6 + Blob Data RBAC (Contributor/Reader/Owner, principal_type unset for mixed SP+group) + optional Private Endpoint + optional network-rule lockdown (standalone azurerm_storage_account_network_rules, default-deny; precondition requires reach-path) + optional diagnostics. backend_config forces use_azuread_auth = true. azurerm-only. [M2a][CO] → Test: Terratest validates (offline composition + RBAC flatten) + integration asserts SA ARM ID + backend_config + Blob Data role-assignment IDs + contract versioning_enabled. Manual runbook: docs/runbooks/test/B4.md.
- 🟦 B5 — modules/azure/pim-azure-resources/ — PIM for Azure resource roles. Tier-0 (Owner + UAA) → MFA + justification + ticketing + approval + max 8h; tier-1 (Contributor) → MFA + justification + max 4h. Uses azurerm's native azurerm_pim_eligible_role_assignment + azurerm_role_management_policy. Eligibility permanent; time-box is on activation. Requires Entra ID P2 at apply. Precondition: ≥1 break-glass Owner so PIM can't lock the sub out. [M2a][CO] → Test: Terratest validates (offline). Live activation drill is manual (can't Terratest MFA/approval flow). Manual runbook: docs/runbooks/test/B5.md.
- 🟦 B6 — apps/client-bootstrap/ — client self-service bootstrap: prerequisite checker + Azure permission validator a prospective client runs in their OWN tenant pre-engagement. Pure evaluator over an EnvironmentSnapshot behind a Collector seam (FixtureCollector tests/offline; AzureCliCollector live az). Read-only, no secrets. Checks tooling (az ≥ 2.50, terraform ≥ 1.6), auth, permissions (assign RBAC = Owner/UAA not Contributor; create Entra apps+SPs; required providers registered), Entra ID P2 (warn). READY only when every required check passes; summary.md ends in a remediation list; status.json carries ready/blockers/warnings. Single entrypoint bootstrap.sh (exit 0/2). [M3][CA] (14-asset M2b core) → Test: examples/snapshot.ok.json → READY/exit 0; examples/snapshot.restricted.json (restricted SP) → NOT READY + clear remediation/exit 2. Manual runbook: docs/runbooks/test/B6.md.
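The B6 entry describes a pure evaluator over an EnvironmentSnapshot behind a Collector seam. A minimal TypeScript sketch of that pattern follows. The snapshot field names, the helpers versionAtLeast, evaluateReadiness and exitCodeFor, and the remediation strings are illustrative assumptions; only the type names, the version floors, and the 0/2 exit codes come from the entry.

```typescript
// Hypothetical field names: the catalog names EnvironmentSnapshot, Collector
// and FixtureCollector, but does not describe their shape.
interface EnvironmentSnapshot {
  azVersion: string;
  terraformVersion: string;
  canAssignRbac: boolean;      // true for Owner/UAA, false for Contributor
  canCreateEntraApps: boolean;
  hasEntraP2: boolean;         // warn-only per the catalog entry
}

interface Collector {
  collect(): EnvironmentSnapshot;
}

// Offline collector: returns a canned snapshot, so the evaluator can be tested
// without touching Azure.
class FixtureCollector implements Collector {
  private readonly snapshot: EnvironmentSnapshot;
  constructor(snapshot: EnvironmentSnapshot) {
    this.snapshot = snapshot;
  }
  collect(): EnvironmentSnapshot {
    return this.snapshot;
  }
}

interface ReadinessStatus {
  ready: boolean;
  blockers: string[];
  warnings: string[];
}

// Numeric, segment-wise comparison so that "2.9.0" sorts below "2.50.0".
function versionAtLeast(actual: string, minimum: string): boolean {
  const a = actual.split(".").map(Number);
  const m = minimum.split(".").map(Number);
  for (let i = 0; i < Math.max(a.length, m.length); i++) {
    const diff = (a[i] ?? 0) - (m[i] ?? 0);
    if (diff !== 0) return diff > 0;
  }
  return true;
}

// Pure function: READY only when no required check produced a blocker.
function evaluateReadiness(snap: EnvironmentSnapshot): ReadinessStatus {
  const blockers: string[] = [];
  const warnings: string[] = [];
  if (!versionAtLeast(snap.azVersion, "2.50")) blockers.push("Upgrade az to 2.50 or later");
  if (!versionAtLeast(snap.terraformVersion, "1.6")) blockers.push("Upgrade terraform to 1.6 or later");
  if (!snap.canAssignRbac) blockers.push("Grant Owner or User Access Administrator (Contributor cannot assign RBAC)");
  if (!snap.canCreateEntraApps) blockers.push("Grant permission to create Entra apps and service principals");
  if (!snap.hasEntraP2) warnings.push("Entra ID P2 not detected");
  return { ready: blockers.length === 0, blockers, warnings };
}

// Exit codes as listed for bootstrap.sh: 0 when READY, 2 otherwise.
function exitCodeFor(status: ReadinessStatus): number {
  return status.ready ? 0 : 2;
}
```

Because the evaluator never calls Azure itself, swapping FixtureCollector for a live collector would change only where the snapshot comes from, not how it is judged.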
C. CI/CD & Delivery Pipelines [B]
- 🟦 C1 — terraform-plan-apply.yml GitHub Actions reusable workflow (Azure OIDC, plan-on-PR comment, apply-on-merge, environment gates, conftest OPA check post-plan). [M2a][CA] ← KEYSTONE → Test: create test repo consuming workflow; open PR → plan comment; merge → apply against sandbox. Manual runbook: docs/runbooks/test/C1.md.
- 🟦 C2 — container-build-sign.yml (build → ACR push by digest → Notation v2 sign via AKV plugin → Syft SBOM → Grype scan with severity cutoff). Signing by digest not tag. 13 inputs; grype_severity_cutoff defaults critical; fail_on_scan_findings defaults true. [M2a][CA] → Test: clean image passes; planted-CVE image fails at critical cutoff; notation verify succeeds. Manual runbook: docs/runbooks/test/C2.md.
- 🟦 C3 — aks-deploy.yml (ArgoCD image override → sync → wait Healthy → smoke probe → optional rollback drill; Kustomize XOR Helm modes, rejected if both/neither set). ArgoCD-token-only auth. [M2a][CA] → Test: happy-path Kustomize deploy + smoke green; rollback drill exercises revert + roll-forward; ambiguous + empty image-set inputs both rejected. Manual runbook: docs/runbooks/test/C3.md.
- 🟩 C4 — GitOps branching standard doc + client-repo template + branch protection rules. [M1][QW][CA] → Test: branching standard doc + template structure + protection validation. Manual runbook: docs/runbooks/test/C4.md. Signed off 26/05 (Sagar).
- 🟦 C5 — pipelines/azure-devops/ — ADO Pipeline templates: terraform-plan-apply.yml (mirrors C1), container-build-sign.yml (C2), aks-deploy.yml (C3), quality-gates.yml (D2) + caller examples + README. Same underlying tools as the GH Actions workflows; only the CI wrapper changes. Callers reference via resources: repositories:. [M3][CA] 🟦 code-complete (commit 51c7fc4) → Test: ADO test project consuming pipeline; PR triggers plan; stage approval gates apply on merge. Runbook: docs/runbooks/test/C5.md (pending).
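C3 accepts exactly one of its Kustomize and Helm modes and rejects the call when both or neither is set. The check can be sketched in TypeScript as below. The input names kustomizeImage and helmImageTag and the function resolveDeployMode are hypothetical, and treating an empty string as unset is an assumption based on how workflow inputs usually arrive.

```typescript
type DeployMode = "kustomize" | "helm";

// Hypothetical input names; the catalog only states that the modes are XOR.
interface ImageOverrideInputs {
  kustomizeImage?: string;
  helmImageTag?: string;
}

// Assumption: unset workflow inputs arrive as undefined or as empty strings.
function isSet(value: string | undefined): boolean {
  return value !== undefined && value.trim() !== "";
}

// Exactly one mode must be selected; both or neither is an input error.
function resolveDeployMode(inputs: ImageOverrideInputs): DeployMode {
  const kustomize = isSet(inputs.kustomizeImage);
  const helm = isSet(inputs.helmImageTag);
  if (kustomize === helm) {
    throw new Error(
      kustomize
        ? "ambiguous image-set inputs: both Kustomize and Helm are set"
        : "empty image-set inputs: neither Kustomize nor Helm is set",
    );
  }
  return kustomize ? "kustomize" : "helm";
}
```

The two error branches correspond to the "ambiguous" and "empty" cases that the C3 test line says must both be rejected.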
D. Quality & Security Gates (PR-time) [B]
- 🟩 D1 — .pre-commit-config.yaml (tflint, fmt, checkov, tfsec, gitleaks, trivy fs, conftest verify on pre-push). [M1][QW][CA] → Test: planted bad commits blocked locally. Manual runbook: docs/runbooks/test/D1.md. Signed off 26/05 (Sagar).
- 🟦 D2 — PR-blocking GH Actions mirroring D1 (.github/workflows/quality-gates.yml). [M1][QW][CA] → Test: planted bad PRs blocked; clean PRs pass. Manual runbook: docs/runbooks/test/D2.md.
- 🟩 D3 — Conftest/OPA policy bundle for terraform plan JSON (encryption, tags, no public network, allowed regions, cost caps). [M1][QW][CA] → Test: conftest verify suite (X3) — every rule has pass + fail fixture. Manual runbook: docs/runbooks/test/D3.md. Signed off 26/05.
- 🟦 D4 — Kyverno policy bundle for AKS — 5 ClusterPolicies (disallow-latest-tag, require-signed-images, require-pod-labels, disallow-privileged-containers, require-network-policy with generate of default-deny NetPol). All Enforce. System namespaces excluded via exclude.any.resources.namespaces. [M2a][CO] → Test: kyverno test per rule (X4) — 5 suites / 21 assertions. Live admission round-trip is manual. Manual runbook: docs/runbooks/test/D4.md.
- 🟦 D5 — waivers/ + policy/opa/rules/main.rego waiver engine — time-boxed OPA exception records (waivers/exceptions.yaml: rule_prefix + resource_address + expiry_date + owner + justification) + CI enforcement. The D3 rule files (tags/locations/network/encryption/cost) now emit raw_violation; main.rego filters them through has_active_waiver (suppress matching, non-expired) and hard-denies expired waivers (snowops.waiver_expired) so they fail the pipeline. Wired into terraform-plan-apply.yml via conftest test plan.json --data waivers/exceptions.yaml. [M2b][CA] 🟦 code-complete (external, gemini-work PR #13). Runbook: docs/runbooks/test/D5.md. → Test: unexpired waiver suppresses D3 finding; expired waiver causes CI failure.
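The D5 waiver engine itself is written in Rego. The sketch below restates the filtering it describes in TypeScript, purely as an illustration of the logic. The waiver field names come from the entry; everything else is assumed: the RawViolation shape, the ISO YYYY-MM-DD date format, prefix matching on the rule id, a waiver staying active on its expiry date, and an expired waiver denying even when no violation matches it.

```typescript
interface Waiver {
  rule_prefix: string;
  resource_address: string;
  expiry_date: string; // assumed ISO "YYYY-MM-DD"
  owner: string;
  justification: string;
}

// Hypothetical shape for what the D3 rule files emit as raw_violation.
interface RawViolation {
  rule: string;
  resource_address: string;
  msg: string;
}

// ISO dates order correctly as plain strings.
// Assumption: a waiver is still active on its expiry date itself.
function isExpired(waiver: Waiver, today: string): boolean {
  return waiver.expiry_date < today;
}

// A violation is suppressed only by a matching, non-expired waiver.
function hasActiveWaiver(v: RawViolation, waivers: Waiver[], today: string): boolean {
  return waivers.some(
    (w) =>
      !isExpired(w, today) &&
      v.rule.startsWith(w.rule_prefix) &&
      v.resource_address === w.resource_address,
  );
}

// Final deny set: unwaived violations plus one hard deny per expired waiver.
function denyMessages(violations: RawViolation[], waivers: Waiver[], today: string): string[] {
  const unwaived = violations
    .filter((v) => !hasActiveWaiver(v, waivers, today))
    .map((v) => `${v.rule}: ${v.msg} (${v.resource_address})`);
  const expired = waivers
    .filter((w) => isExpired(w, today))
    .map((w) => `snowops.waiver_expired: ${w.rule_prefix} on ${w.resource_address} expired ${w.expiry_date} (owner ${w.owner})`);
  return [...unwaived, ...expired];
}
```

Under these assumptions an expired waiver fails the run twice over: the finding it used to suppress comes back, and the waiver record is itself reported, which pushes the owner to renew or remove it.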
E. Automated Evidence Collection [A]
- 🟦 E0 — apps/evidence-collector/ — Lightweight compliance snapshot [B]: read-only TS tool collecting Azure Policy compliance state (summarize) + Defender secure score (secureScores/ascScore) into a versioned JSON artifact (schemaVersion 1.0). diffSnapshots is the regression signal. Wired into C1 as continue-on-error post-apply step. Reader + Security Reader only. [M2b][SH] ⬜ in scope (14-asset core) → Test: jest (3 suites / 18 tests) — required-field validation, seeded-policy-violation diff, markdown render. Manual runbook: docs/runbooks/test/E0.md.
- ⬜ E1 — EvidencePlatform TS interface (apps/evidence-collector/src/platforms/). [M4][SO] ⏸️ postponed
- ⬜ E2 — VantaAdapter. [M4][SO] ⏸️ postponed
- ⬜ E3 — DrataAdapter stub. [M4][SO] ⏸️ postponed
- ⬜ E4 — Azure Resource Graph query library (SOC2 CC + ISO27001 A.x). [M4][SH] ⏸️ postponed
- ⬜ E5 — Defender for Cloud → Vanta scheduled sync. [M4][SO] ⏸️ postponed
- ⬜ E6 — Quarterly access review automation → CSV → ticket via E7. [M4][SO] ⏸️ postponed
- 🟦 E7 — apps/ticket-platform/ (@snowops/ticket-platform) — platform-neutral TicketPlatform interface + adapters (GitHub Issues, Jira REST v2, Linear GraphQL, Azure DevOps Boards) + DryRunTicketPlatform + selectPlatform factory + snowops-ticket CLI. One shared marker-based upsertByMarker (idempotent create-or-update via an HTML-comment dedupe marker); each adapter implements the same MarkerUpsertApi seam (listOpen/create/update) with an injectable fetch. Generalizes the S1 seed (D15/D39); closes G8. [M3][CO/CA] 🟦 code-complete (v0.54) → Test: npm test (6 suites / 26 — shared upsert + each adapter's HTTP mapping + factory) + dry-run CLI. Runbook: docs/runbooks/test/E7.md.
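E7's idempotent create-or-update can be sketched as follows. The names upsertByMarker, MarkerUpsertApi, listOpen, create and update come from the entry; the method signatures, the Ticket shape, the marker text produced by dedupeMarker, and the InMemoryTicketApi stand-in are assumptions made for illustration.

```typescript
interface Ticket {
  id: string;
  title: string;
  body: string;
}

// The three method names come from the catalog; their signatures are assumed.
interface MarkerUpsertApi {
  listOpen(): Promise<Ticket[]>;
  create(title: string, body: string): Promise<Ticket>;
  update(id: string, title: string, body: string): Promise<Ticket>;
}

// Hypothetical marker format: an HTML comment, invisible when rendered.
function dedupeMarker(key: string): string {
  return `<!-- snowops-dedupe:${key} -->`;
}

// Create the ticket on the first call, update the same ticket on later calls.
async function upsertByMarker(
  api: MarkerUpsertApi,
  key: string,
  title: string,
  body: string,
): Promise<{ action: "created" | "updated"; ticket: Ticket }> {
  const marker = dedupeMarker(key);
  const fullBody = `${body}\n\n${marker}`;
  const existing = (await api.listOpen()).find((t) => t.body.includes(marker));
  if (existing) {
    return { action: "updated", ticket: await api.update(existing.id, title, fullBody) };
  }
  return { action: "created", ticket: await api.create(title, fullBody) };
}

// In-memory stand-in for a real adapter, for offline use only.
class InMemoryTicketApi implements MarkerUpsertApi {
  private tickets: Ticket[] = [];

  async listOpen(): Promise<Ticket[]> {
    return [...this.tickets];
  }

  async create(title: string, body: string): Promise<Ticket> {
    const ticket: Ticket = { id: String(this.tickets.length + 1), title, body };
    this.tickets.push(ticket);
    return ticket;
  }

  async update(id: string, title: string, body: string): Promise<Ticket> {
    const ticket = this.tickets.find((t) => t.id === id);
    if (!ticket) throw new Error(`no open ticket with id ${id}`);
    ticket.title = title;
    ticket.body = body;
    return ticket;
  }
}
```

Keeping the dedupe key inside the ticket body means the upsert needs no external state store, which is why the same function can sit in front of every adapter.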
F. SnowOps Module Library (Azure-First, Cloud-Agnostic Contracts) [B]/[A]
F0 sequencing constraint: F0 must land before any new F-module. F1 and F6 retrofitted in v0.17.
- 🟦 F0 — modules/_contracts/ — 7 contracts: network, identity, cluster, registry, kv, observability, object_store. Each: typed variable "candidate" + echoing output "candidate" + no providers. [M2a][B][SO] → Test: every contract validates standalone; F1/F6 conformance tests; 4 negative-literal tests. Manual runbook: docs/runbooks/test/F0.md.
- 🟦 F1 — modules/azure/baseline/ (Mgmt Group, Subs, Policy, Defender, Log Analytics, Activity Log). Emits identity_contract + observability_contract. [M2a][B][CO] → Test: Terratest applies to sandbox; asserts policy assignment + Defender plans. Manual runbook: docs/runbooks/test/F1.md.
- 🟦 F2 — modules/azure/network-hub/ (hub-spoke vNets, optional Azure Firewall, optional Private DNS zones, NSG flow logs to F1, per-spoke route-table forcing 0.0.0.0/0 through firewall). Emits F0 spoke_network_contracts map. [M2a][B][CO] → Test: Terratest validates topology + routing. Manual runbook: docs/runbooks/test/F2.md.
- 🟦 F3 — modules/azure/aks-secure/ (private AKS, Workload Identity, OIDC issuer, AAD-RBAC + local accounts disabled, Azure CNI Overlay + Calico NetworkPolicy, Defender for Containers, KEDA, Image Cleaner, AKV CSI driver, system + user node pools across 3 AZs on AzureLinux+Ephemeral OS). Emits F0 cluster_contract. [M2a][B][CO] → Test: Terratest provisions cluster; AAD-only kubectl smoke; private API confirmed. Manual runbook: docs/runbooks/test/F3.md.
- 🟦 F4 — modules/azure/acr/ (Premium SKU, Private Endpoint + auto-A-record in privatelink.azurecr.io, AAD-only auth, public access disabled, optional geo-replication + AcrPull bindings + Defender scanning passthrough). Emits F0 registry_contract. [M2a][B][CO] → Test: Terratest provisions; push + sign sample image; pull from private endpoint only. Manual runbook: docs/runbooks/test/F4.md.
- 🟦 F5 — modules/azure/key-vault/ (Premium SKU default, RBAC mode enforced, purge protection enforced, default-deny network ACLs + AzureServices bypass, Private Endpoint + auto-A-record in privatelink.vaultcore.azure.net, optional role bindings across 5 built-in KV roles, optional diag forward). Emits F0 kv_contract. [M2a][B][CO] → Test: Terratest; secret CRUD via Workload Identity; public access denied. Manual runbook: docs/runbooks/test/F5.md.
- 🟦 F6 — modules/azure/state-backend/ (state SA + container per env, used by B4). Emits object_store_contract. [M2a][B][CO] → Test: Terratest applies; init a dummy Terraform stack against it; lease lock observed. Manual runbook: docs/runbooks/test/F6.md.
- 🟦 F7 — live/ — Terragrunt live-infra reference: root.hcl (remote state in the F6 backend, generated OIDC azurerm provider, common §3 tags) + DRY _envcommon/ templates (F1 baseline, F2 network-hub, F5 key-vault, F4 acr) + bootstrap/ (F6 state account, local state — breaks the chicken-and-egg) + per-env/per-region units (prod eastus full chain + westus2, staging, sandbox). Real dependency DAG: baseline → network-hub/key-vault/acr via dependency.baseline.outputs.log_analytics_workspace_id (mock_outputs for pre-apply plan). Variance isolated in env.hcl / region.hcl; units are 3-line includes. In-repo source via get_repo_root(); F11 registry-pin form documented for external use. Offline live/validate.sh structural gate + terragrunt hcl validate. [M2b][B][CO] 🟦 code-complete (v0.55) → Test: live/validate.sh (offline structural gate, A) + terragrunt hcl validate/hclfmt --check (B); live: run-all plan/apply against sandbox (C). Runbook: docs/runbooks/test/F7.md.
- 🟦 F8 — gitops/ — K8s reference manifests bundle as ArgoCD app-of-apps (cert-manager + Kyverno + ESO wave 0 → ingress-nginx wave 1 → D4 policies + ClusterSecretStore wave 2). D4 reused not forked. [M2b][B][CO] → Test: gitops/validate.sh (offline — 13 files / 7 Applications) + kyverno test. Live kind-cluster bootstrap is runbook. Manual runbook: docs/runbooks/test/F8.md.
- ⬜ F9 — modules/aws/* parity. [M5][B]/[A] ⏸️ deferred
- ⬜ F10 — modules/gcp/* parity. [M5][B]/[A] ⏸️ deferred
- 🟦 F11 — apps/module-registry/ + modules/registry.json + per-module CHANGELOG.md + .github/workflows/module-release.yml — module versioning + private Terraform registry. Private registry = the monorepo itself: modules publish as git tags <module>/v<version>, consumers pin via source = "git::…//<path>?ref=<module>/vX.Y.Z" (no hosted service). Manifest is the source of truth (10 modules: F0 0.1.0; F1–F6 + J1/J2/J6 1.0.0). TS tool (B6/L4 mold; pure core over a RegistrySnapshot behind a Collector seam): validate (unique names/paths, strict semver, CHANGELOG top == manifest version, no version-regression), buildIndex, planReleases, auditPins (flags unpinned/ref-mismatch/unknown-version in a consumer tree). module-release workflow tags + GitHub-Releases pending modules on merge to main (CHANGELOG section as body, idempotent). 3 jest suites / 27 tests incl. a guard over the real manifest+CHANGELOGs. Convention: docs/conventions/module-versioning.md. [M3][SO] 🟦 code-complete (v0.51) → Test: npm test (27); --manifest modules/registry.json --fail-on-issues → OK/exit 0; --consumer-dir examples/consumer-unpinned --fail-on-issues → exit 2. Manual runbook: docs/runbooks/test/F11.md.
- 🟦 F12 — modules/azure/import-blocks/ — brownfield import library [B]: config-driven Terraform import {} blocks (one <module>.tf per module) that adopt pre-existing Azure resources into the F-modules — covers F1 baseline, F2 network-hub, F3 aks-secure, F4 acr, F5 key-vault, F6 state-backend + J1 log-analytics, J2 policy-diagnostics, J6 audit-log-archive (9 modules). Each file pairs the import blocks with a placeholder module call so the whole directory is self-validating — terraform validate confirms every to = address resolves (incl. count [0] + for_each ["key"] instances); for_each key schemes are derived from source + documented per file. Offline TestImportBlocksValidate gate. Each covered module's README brownfield section now points at its real file; adoption procedure in docs/runbooks/import/F12.md. [M3][B][CO] 🟦 code-complete (v0.49) → Test: docs/runbooks/import/F12.md — offline validate/fmt + TestImportBlocksValidate (A+B); live: adopt a real sandbox resource with a zero-change plan (C).
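Three of the F11 validate rules (strict semver, CHANGELOG top equal to the manifest version, no version regression) and the <module>/v<version> tag form can be sketched in TypeScript as below. The ManifestEntry shape, the function names, and the example module name aks-secure used as a manifest name are illustrative assumptions, not the tool's actual API.

```typescript
// Strict MAJOR.MINOR.PATCH: no leading zeros, no pre-release or build suffix.
const STRICT_SEMVER = /^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)$/;

function parseStrictSemver(version: string): [number, number, number] {
  const m = STRICT_SEMVER.exec(version);
  if (!m) throw new Error(`not strict semver: ${version}`);
  return [Number(m[1]), Number(m[2]), Number(m[3])];
}

// Returns -1, 0 or 1, comparing major, then minor, then patch.
function compareSemver(a: string, b: string): number {
  const pa = parseStrictSemver(a);
  const pb = parseStrictSemver(b);
  for (let i = 0; i < 3; i++) {
    const diff = (pa[i] ?? 0) - (pb[i] ?? 0);
    if (diff !== 0) return diff < 0 ? -1 : 1;
  }
  return 0;
}

// Tag form from the catalog: <module>/v<version>.
function releaseTag(moduleName: string, version: string): string {
  parseStrictSemver(version); // reject malformed versions before tagging
  return `${moduleName}/v${version}`;
}

// Hypothetical manifest entry shape.
interface ManifestEntry {
  moduleName: string;
  version: string;
  changelogTopVersion: string;
  previousVersion?: string;
}

// Collect issues instead of throwing, so a caller can report all of them.
function validateEntry(entry: ManifestEntry): string[] {
  const issues: string[] = [];
  if (!STRICT_SEMVER.test(entry.version)) {
    issues.push(`${entry.moduleName}: version ${entry.version} is not strict semver`);
    return issues;
  }
  if (entry.changelogTopVersion !== entry.version) {
    issues.push(`${entry.moduleName}: CHANGELOG top ${entry.changelogTopVersion} != manifest ${entry.version}`);
  }
  if (
    entry.previousVersion !== undefined &&
    STRICT_SEMVER.test(entry.previousVersion) &&
    compareSemver(entry.version, entry.previousVersion) < 0
  ) {
    issues.push(`${entry.moduleName}: version regressed from ${entry.previousVersion} to ${entry.version}`);
  }
  return issues;
}
```

Comparing versions numerically rather than as strings matters here: a plain string comparison would rank 1.10.0 below 1.9.0 and report a false regression.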
G. Pre-Sales Discovery & Audit Automation [X][SH]
Azure-only through M3. AWS discovery mode (G7) lands in M4.
- 🟦 G0 — Client-side scoped Reader + Security Reader SP bootstrap script + Bicep alt; time-boxed federated cred; no secrets leave client tenant. [M1]
- 🟦 G1 — apps/discovery-auditor/ (Node/TS) collectors: Resource Graph KQL, Defender REST, Azure Policy state, AAD audit logs, Cost Mgmt. [M1]
- 🟦 G2 — YAML rule pack mapped to SOC2 CC + ISO27001 A.x + CIS Azure Benchmark; severity/evidence/remediation/effort. Each finding includes remediation_asset_id. 11 rules / 22 fixtures. [M1]
- 🟦 G3 — Report renderer (Markdown → PDF via Pandoc/Playwright; branded cover, exec summary, control table, prioritized roadmap). [M1]
- 🟦 G4 — .github/workflows/discovery-run.yml (manual dispatch with tenant_id + sub_id; artifact upload; Slack notify with reviewer checklist). [M1]
- 🟦 G5 — HubSpot integration (A5 → Deal property discovery_report_url). [M1]
- 🟦 G6 — Immutable run audit log (client/scope/timestamp/operator/findings hash → WORM blob, SHA-256 hash chain). [M1]
- ⬜ G7 — AWS discovery mode. [M4][X][SH] ⏸️ postponed
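G6 records each discovery run in a SHA-256 hash chain. A minimal TypeScript sketch of such a chain follows, using Node's built-in crypto module. The record fields mirror the ones listed in the entry; the all-zero genesis value, the JSON array serialization, and the names appendRun and verifyChain are assumptions. Writing the entries to a WORM blob is out of scope here.

```typescript
import { createHash } from "node:crypto";

interface RunRecord {
  client: string;
  scope: string;
  timestamp: string;
  operator: string;
  findingsHash: string;
}

interface ChainEntry {
  record: RunRecord;
  prevHash: string;
  entryHash: string;
}

// Assumed starting value for the first entry's prevHash.
const GENESIS_HASH = "0".repeat(64);

function sha256Hex(input: string): string {
  return createHash("sha256").update(input, "utf8").digest("hex");
}

// A fixed field order keeps the serialization deterministic.
function entryHashFor(record: RunRecord, prevHash: string): string {
  return sha256Hex(
    JSON.stringify([
      prevHash,
      record.client,
      record.scope,
      record.timestamp,
      record.operator,
      record.findingsHash,
    ]),
  );
}

// Returns a new chain with the record linked to the previous entry's hash.
function appendRun(chain: ChainEntry[], record: RunRecord): ChainEntry[] {
  const last = chain[chain.length - 1];
  const prevHash = last ? last.entryHash : GENESIS_HASH;
  return [...chain, { record, prevHash, entryHash: entryHashFor(record, prevHash) }];
}

// Recomputes every link; any edited, reordered or removed entry breaks it.
function verifyChain(chain: ChainEntry[]): boolean {
  let expectedPrev = GENESIS_HASH;
  for (const entry of chain) {
    if (entry.prevHash !== expectedPrev) return false;
    if (entry.entryHash !== entryHashFor(entry.record, entry.prevHash)) return false;
    expectedPrev = entry.entryHash;
  }
  return true;
}
```

Because each entry hashes its predecessor's hash, altering an old record forces every later hash to change, so tampering is detectable from the latest entry alone.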
H. Identity & Access Management [B]/[A]
- 🟦 H1 — modules/azure/aad-baseline/ (IP + country named locations via azuread_named_location, custom Authentication Strength Policy for phishing-resistant MFA, password protection + tenant branding emitted as *_patch_body JSON for az rest PATCH). Precondition: verified custom domain required. [M2a][B][CO] → Test: Terratest validate; H1 runbook applies in sandbox + applies Graph PATCHes.
- 🟦 H2 — modules/azure/conditional-access/ (6 SnowOps CA policies: MFA Mandatory / Tier-0 Phishing-Resistant+Compliant Device / Block Legacy Auth / Geo-Block / High-Risk Block / Medium-Risk MFA; every policy excludes break-glass group; risk policies gated on P2). [M2a][B][CO] → Test: Terratest validate; H2 runbook applies in report-only → CA What-If → enforce + live sign-in.
- 🟦 H3 — modules/azure/pim-templates/ (tier-0 + tier-1 AAD role eligibility via azuread_directory_role_eligibility_schedule_request; activation rule bodies emitted as JSON for Graph az rest PATCH since roleManagementPolicies has no TF resource). Precondition: ≥1 permanent break-glass tier-0 holder. [M2a][B][CO] → Test: Terratest validate; H3 runbook applies eligibility + Graph PATCHes + live activation drill.
- ⬜ H4 — SCIM provisioning from Azure AD to SaaS. [M4][A][CO] ⏸️ postponed
- 🟦 H5 — apps/sp-inventory/ (read-only-Graph TS, Application.Read.All) + .github/workflows/sp-inventory-rotation.yml (scheduled reusable workflow). Inventories app registration credentials; flags aged (≥ threshold_days=90) / expiring-soon (within expiry_warning_days=30) / expired. Opens/idempotently updates rotation PR. Never rotates a secret itself. Federated-OIDC-only SPs never stale. 2 test suites / 19 tests. [M2a][B][CA] → Test: jest unit suite covers stale SP path + federated-OIDC-only SP path. Live tenant read + PR drill is manual.
- ⬜ H6 — Access review automation. [M4][A][CO] ⏸️ postponed
- 🟦 H7 — modules/azure/break-glass/ — dual-provider (azuread + azurerm). Role-assignable group + azuread_group_member per member + permanent (active, non-PIM) Global Administrator + severity-0 sign-in alert (azurerm_monitor_scheduled_query_rules_alert_v2, KQL on UserId, threshold=0). Producer of break-glass group H2/B3/B5 consume. Takes existing account object IDs as input (no account or password creation — Identity > Secrets). [M2a][B][CO] → Test: Terratest validate (offline — 4 preconditions). Live sign-in drill is manual (needs P1 + real LAW).
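H5 flags credentials as aged, expiring-soon, or expired. The TypeScript sketch below shows one way to apply the two thresholds from the entry (threshold_days=90, expiry_warning_days=30). The precedence order (expired, then expiring-soon, then aged), the helper names, and the use of startDateTime/endDateTime as field names are assumptions; an SP with no secret or certificate credentials yields no flags, matching the "federated-OIDC-only SPs never stale" rule.

```typescript
type CredentialFlag = "expired" | "expiring-soon" | "aged" | "ok";

// Assumed field names for a secret or certificate credential.
interface AppCredential {
  startDateTime: string; // ISO 8601
  endDateTime: string;   // ISO 8601
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Assumed precedence: expired, then expiring-soon, then aged.
function classifyCredential(
  cred: AppCredential,
  now: Date,
  thresholdDays: number = 90,
  expiryWarningDays: number = 30,
): CredentialFlag {
  const ageDays = (now.getTime() - Date.parse(cred.startDateTime)) / MS_PER_DAY;
  const daysLeft = (Date.parse(cred.endDateTime) - now.getTime()) / MS_PER_DAY;
  if (daysLeft < 0) return "expired";
  if (daysLeft <= expiryWarningDays) return "expiring-soon";
  if (ageDays >= thresholdDays) return "aged";
  return "ok";
}

// A service principal with only federated credentials has an empty list here,
// so it produces no flags and is never reported as stale.
function flagsForServicePrincipal(creds: AppCredential[], now: Date): CredentialFlag[] {
  return creds.map((c) => classifyCredential(c, now)).filter((f) => f !== "ok");
}
```

The function only reports; consistent with the entry, nothing in it rotates or deletes a credential.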
I. Vulnerability & Patch Management [B]/[A]
- 🟦 I1 — .github/workflows/image-scan.yml — reusable (workflow_call) Trivy image scan; fails on High/Critical OS+library CVEs, SARIF → Code Scanning, optional registry login, ignore_unfixed / fail_on_findings / severity_cutoff inputs. Closes G6 (container security for non-K8s clients). Distinct from C2 (build-time grype): I1 scans an arbitrary image ref. [M2a][B][CA] 🟦 code-complete (v0.53) → Test: YAML lint (offline) + dispatch scan of an old image fails / current passes. Runbook: docs/runbooks/test/I1.md.
- 🟦 I2 — .github/dependabot.yml (4 ecosystems) + .github/workflows/dependency-review.yml (PR-blocking SCA gate, fail-on-severity: high + licence deny-list) + .github/workflows/dependency-digest.yml (weekly idempotent Dependabot-alert digest issue). [M2a][B][CA] 🟦 code-complete (v0.53) → Test: config lint (offline) + PR introduces vuln dep → review fails; digest run upserts one rolling issue. Runbook: docs/runbooks/test/I2.md.
- 🟦 I3 — .github/workflows/codeql.yml — CodeQL SAST over javascript-typescript (apps/) + go (terratest), security-extended, security-and-quality queries, PR + push + weekly schedule, SARIF → Code Scanning. [M2a][B][CA] 🟦 code-complete (v0.53) → Test: YAML lint (offline) + PR with a planted CWE finding surfaces in Code Scanning. Runbook: docs/runbooks/test/I3.md.
- ⬜ I4–I7 — DAST, Defender→ticket, Azure Update Manager, CVE triage. [M4][A] ⏸️ postponed
J. Logging, Monitoring & SIEM [B]/[A]
- 🟦 J1 — modules/azure/log-analytics/ — standalone hardened LAW: per-table retention (30-730d interactive + archive, total >= retention validated), CanNotDelete management lock, scoped RBAC (Log Analytics Reader/Contributor + Monitoring Reader + freeform), self-audit azurerm_monitor_diagnostic_setting (who ran KQL queries). AAD-only by default. Optional daily_quota_gb cost cap. Emits F0 observability_contract. azurerm-only. [M2a][B][CO] → Test: Terratest validate + TestJ1ObservabilityContractConformance + build-tagged integration (~$0) asserts workspace + self-audit diag + contract shape.
- 🟦 J2 — modules/azure/policy-diagnostics/ — custom azurerm_policy_set_definition (DINE initiative, not Deny) bundling built-in DeployIfNotExists diagnostic policies. GUID-agnostic (caller supplies GUIDs via diagnostic_policies input map sourced from az policy definition list). Sub- or MG-scope. System-assigned identity + remediation roles. Emits az policy remediation create command. Validate-only in CI. [M2a][B][CO] → Test: Terratest validate (offline). Live apply + DINE remediation drill is manual.
- ⬜ J3 — Microsoft Sentinel deployment. [M4][A][CO] ⏸️ postponed
- ⬜ J4 — Alert rule pack. [M2b][B][CO] ⏸️ postponed (outside 14-asset core)
- ⬜ J5 — Managed Grafana dashboards-as-code. [M4][A][CO] ⏸️ postponed
- 🟦 J6 — modules/azure/audit-log-archive/ — RA-GZRS StorageV2 with account-level time-based immutability (allow_protected_append_writes = true, state defaults Unlocked for teardown safety). Forwards subscription Activity Log; optional Log Analytics data export; optional Storage Blob Data Reader grants. shared_access_key_enabled defaults true (platform diagnostic writer requires it). Distinct from F6 (state backend). [M2a][B][CO] → Test: Terratest validate + build-tagged integration (~$0, immutability OFF). WORM mutation-refused drill is manual.
- ⬜ J7 — Cost-controlled log strategy. [M4][A][CO] ⏸️ postponed
K. Incident Response & SecOps [B]/[A]
- 🟦 K1 — IR runbook library (docs/runbooks/incident/: compromise, ransomware, data leak, DDoS, vendor breach). [M2b][B][CO] 🟦 code-complete (external, gemini-work PR #12)
- 🟦 K2 — modules/azure/oncall-integration/ — PagerDuty/Opsgenie + Slack (Sentinel incidents → on-call). [M2b][B][CO] 🟦 code-complete (external, gemini-work PR #12)
- ⬜ K3–K5 — Sentinel SOAR playbooks / Post-incident review / Tabletop exercise. [M4][A] ⏸️ postponed
L. Backup & Disaster Recovery [B]/[A]
- 🟦 L1 — modules/azure/backup-policy/ — Azure Backup policy module [B]: creates (toggleable) a GeoRedundant Recovery Services vault + a Data Protection Backup vault and the four per-env-retention backup policies — VM (azurerm_backup_policy_vm), Azure Files/"Storage" (azurerm_backup_policy_file_share), SQL-in-VM (azurerm_backup_policy_vm_workload), AKS (azurerm_data_protection_backup_policy_kubernetes_cluster). Per-env profiles (dev 7d / staging 14d+5w / prod 30d+12w+12m+7y) expand daily/weekly/monthly/yearly tiers via dynamic blocks; plan-time preconditions enforce CRR⇒GeoRedundant and the yearly⇒monthly⇒weekly nesting. Defines reusable policies (not per-instance bindings); vault MIs exported for consumers. GeoRedundant + cross_region_restore_enabled = the L2 on-ramp. Offline TestBackupPolicyValidate gate. [M2b][B][CO] 🟦 code-complete (v0.46) → Test: docs/runbooks/test/L1.md — fmt/validate + Terratest validate (A+B); live: apply both vaults + four policies to sandbox, assert redundancy/retention, destroy (C).
- 🟦 L2 — modules/azure/cross-region-replication/ — cross-region replication wiring [B]: blob object replication (azurerm_storage_object_replication, source→DR account, rule per container mapping; optionally creates the destination containers) + geo-redundant SQL failover group (azurerm_mssql_failover_group, primary↔partner server, per-env failover posture). Consumes existing accounts/servers by ARM ID (brownfield-safe wiring, not resource creation — same stance as L1). Per-env SQL failover: dev Manual / staging Automatic 60m / prod Automatic 120m; preconditions enforce cross-region locations differ, distinct accounts/servers, and Automatic⇔grace / Manual⇔no-grace coherence. The active-replication half of DR; L1 is the recoverability half; L4 is the drill. Offline TestCrossRegionReplicationValidate gate. [M2b][B][CO] 🟦 code-complete (v0.47) → Test: docs/runbooks/test/L2.md — fmt/validate + Terratest validate (A+B); live: apply two storage accounts + two SQL servers + the links to sandbox, assert failover group Automatic/120m + cross-region, destroy (C).
- ⬜ L3 — DR runbook templates. [M4][A][CO] ⏸️ postponed
- 🟦 L4 — apps/restore-drill/ — automated restore drill [B]: standalone TS tool (pure offline logic + thin executor seam + jest, same mold as E0/S1/S2) that restores an L1 backup (or fails over an L2 SQL failover group) into an ephemeral sandbox RG → validates → tears down → records a versioned RestoreDrillReport (schemaVersion 1.0). Outcome classified passed/partial/failed (partial = recovered but RTO missed or teardown failed); measured RTO = restore+validate duration; diffReports is the recoverability-regression signal. Executors: DryRunExecutor (deterministic — tests/demos/workflow rehearsal) + AzureCliExecutor (live az). Reports land in the compliance/restore-drills/ evidence store; S2 gains an additive --restore-drills-dir "DR restore drills" panel (gated, so its golden output is unchanged) — that's how pass/fail reaches the dashboard. Scheduled via .github/workflows/restore-drill.yml (monthly cron; dispatch defaults to dry-run, schedule runs live; commits the report). Teardown always runs (X7 backstop). [M2b][B][CO] 🟦 code-complete (v0.48) → Test: docs/runbooks/test/L4.md — offline classify/orchestrate/render + S2 panel wiring (A+B, 17 + 34 tests); live: real restore→validate→teardown in the sandbox, dated report to the evidence store (C).
- ⬜ L5 — RTO/RPO doc generator. [M4][A][CO] ⏸️ postponed
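The L4 outcome rule (passed/partial/failed, with measured RTO = restore + validate duration) is small enough to state directly. A TypeScript sketch follows, with a hypothetical DrillMeasurement shape and the assumption that "RTO missed" means the measured value strictly exceeds the target.

```typescript
type DrillOutcome = "passed" | "partial" | "failed";

// Hypothetical measurement shape; only the classification rule is from the catalog.
interface DrillMeasurement {
  recovered: boolean;
  restoreSeconds: number;
  validateSeconds: number;
  rtoTargetSeconds: number;
  teardownSucceeded: boolean;
}

// Measured RTO is the restore duration plus the validation duration.
function measuredRtoSeconds(m: DrillMeasurement): number {
  return m.restoreSeconds + m.validateSeconds;
}

// failed: nothing recovered.
// partial: recovered, but the RTO was missed or teardown failed.
// passed: recovered within the RTO and torn down cleanly.
function classifyDrill(m: DrillMeasurement): DrillOutcome {
  if (!m.recovered) return "failed";
  const rtoMissed = measuredRtoSeconds(m) > m.rtoTargetSeconds; // assumption: strict ">"
  return rtoMissed || !m.teardownSucceeded ? "partial" : "passed";
}
```

Keeping teardown failure in the partial bucket, rather than failed, separates "the data could not be recovered" from "the drill left resources behind", which call for different follow-up.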
M. Data Protection & Privacy [B]/[A]
- 🟦 M1 — modules/azure/encryption-policy/ — custom azurerm_policy_set_definition Deny initiative: encryption-at-rest built-ins (storage infrastructure encryption, SQL CMK, managed-disk double-encryption + CMK). Initiative-level effect parameter (Audit/Deny/Disabled). No system-assigned identity (Deny effect, not DINE). GUIDs caller-overridable. [M2a][B][CO] → Test: Terratest validate (offline). Live Audit→Deny rollout + "unencrypted create denied" is manual.
- 🟦 M2 — modules/azure/cmk/ — Customer-Managed Key: HSM-backed azurerm_key_vault_key (RSA-HSM/EC-HSM only, software keys rejected) + auto-rotation policy (rotate_before_expiry_days < expire_after_days precondition) in an EXISTING F5 Premium RBAC-mode vault. Optional user-assigned identity auto-granted Crypto Service Encryption User. Consumers wire to versionless key ID for transparent rotation. [M2a][B][CO] → Test: Terratest validate + TestCMKModule integration (~$1 — Premium vault + deployer Crypto Officer + HSM key, asserts versionless ID + rotation policy).
- 🟦 M3 — modules/azure/tls-policy/ — custom Deny initiative: secure-transport built-ins (storage secure-transfer, storage min-TLS, App Service + Function HTTPS-only). Two initiative parameters: effect + minimumTlsVersion. The storage_min_tls reference threads BOTH via explicit parameter_values. [M2a][B][CO] → Test: Terratest validate (offline). Live Audit→Deny + "HTTP/TLS<1.2 create denied" is manual.
- ⬜ M4–M5, M7 — Purview baseline / DLP policies / GDPR-CCPA evidence. [M4][A] ⏸️ postponed
- 🟦 M6 — modules/azure/data-residency-policy/ — custom Deny initiative: Allowed-locations built-ins (resources + optional resource groups) with listOfAllowedLocations. No effect parameter (Allowed-locations is intrinsic-Deny); rollout uses enforce=false→true. Standalone residency boundary distinct from F1's bundled allowed-locations. [M2a][B][CO] → Test: Terratest validate (offline). enforce=false→true + "out-of-region create denied" is manual.
N. Network Security [B]/[A]
- ⬜ N1 — Landing-zone hub-spoke (extends F2). [M2a][B][CO] ⏸️ postponed
- ⬜ N2–N4 — Azure Firewall Premium / WAF / DDoS. [M2b/M4] ⏸️ postponed
- 🟦 N5 — modules/azure/private-endpoint-policy/ — custom Deny initiative: "disable public network access" built-ins (storage / Key Vault / Cosmos DB / SQL). Initiative-level effect parameter. No system-assigned identity. Pairs with F4/F5 (PEs) + F2 (Private DNS). Curated, caller-overridable GUIDs. [M2a][B][CO] → Test: Terratest validate (offline). Audit→Deny rollout + "public PaaS create denied" is manual.
- 🟦 N6 — modules/azure/nsg-baseline/ — hardened NSG (dynamic security_rule over merged baseline_rules + custom_rules map; curated defaults deny SSH/RDP/Internet-inbound; cross-map key collision = plan-time merge error) + optional subnet associations + optional NSG flow logs + Traffic Analytics (10-min interval, gated on BOTH workspace ID AND flow-log storage account ID). Standalone counterpart to F2's bundled NSG. [M2a][B][CO] → Test: Terratest validate + TestNSGBaselineModule integration (~$0, flow logs OFF). "Flow logs landing within 10 min" is manual Part D.
- ⬜ N7 — Zero-trust reference architecture. [M4][A][CO] ⏸️ postponed
O–Q. Endpoint, Vendor Risk, HR Security [A]
All items O1–O4, P1–P4, Q1–Q5 are ⏸️ postponed to M4.
R. Change Management [B]/[A]
- 🟩 R1 — PR template enforcement + required-fields validation workflow. [M1][QW][CA] → Signed off 26/05 (Sagar Chhabra).
- ⬜ R2–R4 — Production change log / Emergency change / CAB automation via E7. ⏸️ postponed
S. Continuous Compliance Monitoring & Drift [B]/[A]
- 🟦 S1 — apps/drift-detector/ — Drift detection [B]: read-only TS tool that turns terraform show -json into a versioned DriftReport (schemaVersion 1.0), classifies managed-resource changes (create/update/delete/replace; no-op + data reads excluded), and files/updates one ticket per stack. diffReports is the change signal (mirrors E0's diffSnapshots). Ships the TicketPlatform interface (D15 — E7 seed) with GitHub Issues + dry-run adapters; idempotent upsert via an embedded dedupe marker. Scheduled via .github/workflows/drift-detection.yml (daily cron, per-stack matrix). Plans only, never applies. [M2b][B][CO] 🟦 code-complete (v0.44) → Test: docs/runbooks/test/S1.md — offline classifier/diff/ticket (A+B); live: mutate sandbox resource, next run opens/updates the issue (C).
- 🟦 S2 — apps/compliance-dashboard/ — Azure Policy compliance dashboard [B]: fully offline TS tool that renders a history of E0 ComplianceSnapshots into a versioned ComplianceDashboard (schemaVersion 1.0) plus a self-contained static HTML page + markdown summary. Current posture, latest-vs-previous regression delta (vendors E0's diffSnapshots), a trend line, and a best-effort name-based framework rollup (SOC 2 / ISO 27001 / CIS Azure / HIPAA; MCSB fans out; unmatched → "Unmapped"). Reads the compliance/snapshots/ evidence store (fed by E0); never touches Azure. Scheduled via .github/workflows/compliance-dashboard.yml (E0 collect → S2 render → artifact; Pages opt-in). [M2b][B][CO] 🟦 code-complete (v0.45) → Test: docs/runbooks/test/S2.md — offline aggregation/framework/render golden (A+B); live: workflow collects a snapshot and uploads a renderable dashboard (C).
- ⬜ S3–S4 — Auto-remediation playbooks / Compliance scorecard. [M4][A] ⏸️ postponed
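The S1 classification rule (managed resources only; no-op and data reads excluded; delete plus create treated as replace) can be sketched as follows. The input fields resource_changes, mode, and change.actions come from Terraform's JSON plan representation. The DriftItem and DriftReport field names and the dedupe marker format are assumptions for illustration, not the tool's actual schema.

```typescript
// Sketch only. DriftItem/DriftReport shapes and the marker format are assumed.
type ChangeKind = "create" | "update" | "delete" | "replace";

interface ResourceChange {
  address: string;
  mode: "managed" | "data";
  change: { actions: string[] };
}

interface PlanJson {
  resource_changes?: ResourceChange[];
}

interface DriftItem {
  address: string;
  kind: ChangeKind;
}

interface DriftReport {
  schemaVersion: "1.0";
  stack: string;
  items: DriftItem[];
}

// Maps Terraform's action list to one change kind; null means "not drift".
function classifyActions(actions: string[]): ChangeKind | null {
  const has = (a: string): boolean => actions.indexOf(a) !== -1;
  if (has("create") && has("delete")) return "replace";
  if (has("create")) return "create";
  if (has("update")) return "update";
  if (has("delete")) return "delete";
  return null; // "no-op" and "read"
}

function buildDriftReport(stack: string, plan: PlanJson): DriftReport {
  const items: DriftItem[] = [];
  for (const rc of plan.resource_changes ?? []) {
    if (rc.mode !== "managed") continue; // data sources are excluded
    const kind = classifyActions(rc.change.actions);
    if (kind !== null) items.push({ address: rc.address, kind });
  }
  // Sort by address so two reports for the same state compare cleanly.
  items.sort((a, b) => (a.address < b.address ? -1 : a.address > b.address ? 1 : 0));
  return { schemaVersion: "1.0", stack, items };
}

// Hypothetical marker embedded in the ticket body so a later run can find
// and update the existing ticket for a stack instead of opening a new one.
function dedupeMarker(stack: string): string {
  return `<!-- drift-detector:stack=${stack} -->`;
}

const samplePlan: PlanJson = {
  resource_changes: [
    { address: "azurerm_storage_account.logs", mode: "managed", change: { actions: ["update"] } },
    { address: "azurerm_resource_group.core", mode: "managed", change: { actions: ["no-op"] } },
    { address: "data.azurerm_client_config.current", mode: "data", change: { actions: ["read"] } },
    { address: "azurerm_key_vault.main", mode: "managed", change: { actions: ["delete", "create"] } },
  ],
};
const driftReport = buildDriftReport("sandbox-core", samplePlan);
console.log(JSON.stringify(driftReport, null, 2));
console.log(dedupeMarker(driftReport.stack));
```

Keeping the classifier a pure function of the plan JSON is what would let it be tested offline, which is how the catalog describes the A+B test tiers.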
T. Trust Center & Customer-Facing [A]
T1–T4 all ⏸️ postponed to M4.
U. Cost Governance [B]/[A]
- 🟦 U1 — modules/azure/budget-alert/ — azurerm_consumption_budget_subscription (Monthly, configurable thresholds) + dynamic notification blocks (actual + forecasted) + optional dedicated azurerm_monitor_action_group (email/sms/webhook, reusing H7 pattern) + optional filter_tag / filter_resource_groups. 4 preconditions (≤5 notifications per Azure cap, ≥1 recipient). Build-tagged integration test (~$0). [M2a][B][CO] → Test: Terratest validate + TestBudgetAlertModule integration asserts budget ARM ID + action-group ID + 4 notification keys + recipient_count.
- 🟦 U2 — modules/azure/tag-policy/ — mandatory-tag Deny initiative: Require a tag on resources + Require a tag on resource groups per tag. Per-reference literal tagName (initiative has no parameters — each reference needs a distinct value, unlike M6's shared parameter). Default §8 set minus ManagedBy. enforce=false→true rollout (Deny is intrinsic to these built-ins). [M2a][B][CO] → Test: Terratest validate (offline). enforce=false→true + "untagged create denied" is manual.
- ⬜ U3–U5 — Idle resource cleanup / FinOps dashboard / Cost anomaly. [M4/M5] ⏸️ postponed
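U2's "per-reference literal tagName" point is easier to see with a small model of the initiative's reference list: every tag yields two policy references, and each carries its own literal tagName value rather than pointing at a shared initiative parameter. This is a TypeScript illustration of the data shape only. The module itself is Terraform, the definition IDs are placeholders, and the tag names and referenceId scheme are hypothetical.

```typescript
// Sketch only. Placeholder IDs, not real built-in policy GUIDs.
const REQUIRE_TAG_ON_RESOURCES_ID = "<id of built-in 'Require a tag on resources'>";
const REQUIRE_TAG_ON_RESOURCE_GROUPS_ID = "<id of built-in 'Require a tag on resource groups'>";

interface TagPolicyReference {
  referenceId: string; // must be unique within the initiative
  policyDefinitionId: string;
  parameters: { tagName: { value: string } }; // literal value per reference
}

function buildTagReferences(tags: string[]): TagPolicyReference[] {
  const refs: TagPolicyReference[] = [];
  const seen: string[] = [];
  for (const tag of tags) {
    if (seen.indexOf(tag) !== -1) continue; // ignore duplicate tag names
    seen.push(tag);
    refs.push({
      referenceId: `require-${tag}-on-resources`,
      policyDefinitionId: REQUIRE_TAG_ON_RESOURCES_ID,
      parameters: { tagName: { value: tag } },
    });
    refs.push({
      referenceId: `require-${tag}-on-resource-groups`,
      policyDefinitionId: REQUIRE_TAG_ON_RESOURCE_GROUPS_ID,
      parameters: { tagName: { value: tag } },
    });
  }
  return refs;
}

// Hypothetical tag names; the real default set is defined in §8.
const tagRefs = buildTagReferences(["Owner", "Environment", "Owner"]);
console.log(tagRefs.map((r) => r.referenceId).join("\n"));
```

With N distinct tags the initiative ends up with 2N references, which is the trade-off for having no initiative-level parameters.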
V. Documentation & Policy Management [B]/[A]
- ⬜ V1 — Policy repo template (InfoSec, AUP, IR, BCP, Change Mgmt, Vendor). [M4][A][CO] ⏸️ postponed
- 🟦 V2 — apps/diagram-generator/ — zero-cloud TS tool: terraform output -json → F0 contracts → cloud-neutral StackModel → d2lang architecture diagram. Shape-based contract detection (name-agnostic; classifies by fields not output names). Deterministic renderer (stable slug() node ids + sorted emission → clean golden-file diffs). [M2b][B][CO] ⬜ in scope (14-asset core) → Test: jest (2 suites / 9 tests) — adapter normalizes all F0 contracts from sample stack + golden-file byte match + determinism + empty-stack safety. Manual runbook: docs/runbooks/test/V2.md.
- 🟦 V3 — apps/runbook-generator/ — zero-cloud TS CLI: terraform output -json → shape-based adapter (all 7 F0 contracts) → per-domain operational runbooks (infrastructure.md index + identity/network/compute/registry/secrets/storage/observability) with key-facts tables, Day-Zero Hardening posture digest (✅/⚠️), ops CLI + failure modes. Mirrors V2's adapt→model→render pattern. Legacy Handlebars mode kept via --templates. [M2b][B][CO] ⬜ in scope (14-asset core) → Test: jest (4 suites / 16 tests) — parser + legacy Handlebars + shape-based adapt + model-driven render. Manual runbook: docs/runbooks/test/V3.md.
- ⬜ V4 — Compliance manual generator. [M4][A][SH] ⏸️ postponed
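V2's determinism claim (stable slug() ids plus sorted emission) can be illustrated with a minimal renderer. The StackModel shape and this slug() are stand-ins written for the example, not the tool's real types. The output uses basic d2 syntax: `id: "label"` for nodes and `a -> b` for connections.

```typescript
// Sketch only. StackModel here is a hypothetical, much-reduced model.
interface StackNode {
  name: string;
  kind: string;
}

interface StackEdge {
  from: string;
  to: string;
}

interface StackModel {
  nodes: StackNode[];
  edges: StackEdge[];
}

// Stable id: lowercase, non-alphanumeric runs collapsed to "_", ends trimmed.
function slug(value: string): string {
  return value.toLowerCase().replace(/[^a-z0-9]+/g, "_").replace(/^_+|_+$/g, "");
}

function compareStrings(a: string, b: string): number {
  return a < b ? -1 : a > b ? 1 : 0;
}

// Emits nodes, then edges, each group sorted, so input order never affects
// the output. Labels are not escaped here; a real renderer would need to
// handle quotes inside names.
function renderD2(model: StackModel): string {
  const nodeLines = model.nodes
    .map((n) => `${slug(n.name)}: "${n.name} (${n.kind})"`)
    .sort(compareStrings);
  const edgeLines = model.edges
    .map((e) => `${slug(e.from)} -> ${slug(e.to)}`)
    .sort(compareStrings);
  const lines = nodeLines.concat(edgeLines);
  return lines.length > 0 ? lines.join("\n") + "\n" : "";
}

const sampleModel: StackModel = {
  nodes: [
    { name: "Key Vault: kv-prod", kind: "secrets" },
    { name: "AKS Cluster", kind: "compute" },
  ],
  edges: [{ from: "AKS Cluster", to: "Key Vault: kv-prod" }],
};
console.log(renderD2(sampleModel));
```

Because the text depends only on the set of nodes and edges, a golden-file test can compare bytes directly, and a diff in the golden file reflects a real model change rather than iteration order.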
W. Multi-Tenant Client Management [X][SO] ⏸️ ALL POSTPONED (D35)
W1–W3 are marked [M2a→postponed] and are NOT part of the M2a-complete bar. Pull up only when a second concurrent client forces the multi-tenant isolation question, or after M2b/M3 are underway.
- ⬜⏸️ W1 — Client repo template + provisioning (extends B1).
- ⬜⏸️ W2 — Per-client state backend (extends F6).
- ⬜⏸️ W3 — Per-client secret scoping in SnowOps GH org (environments).
- ⬜ W4 — Client offboarding playbook. [M3]
- ⬜ W5 — SnowOps internal client dashboard. [M5]
X. Testing Framework & Sandbox [X]
Nothing in F/B/E/etc. can be marked 🟩 Shipped without X1, X2, X6 in place.
- 🟦 X1 — SnowOps Azure sandbox subscription (Terraform-managed): isolated tenant or sub, budget-capped, auto-cleanup tag ephemeral=true. [M1]
- 🟦 X2 — Terratest harness (tests/terratest/): Go-based, parallel-safe, sandbox-scoped; mandatory per F module. Currently 35 top-level tests. [M1]
- 🟩 X3 — Conftest test suite for D3 (policy/opa/tests/). [M1]
- 🟦 X4 — Kyverno test framework for D4 (policy/kyverno/tests/). 5 sub-suites / 21 assertions + run-tests.sh wrapper + pre-push hook. [M2a]
- ⬜ X5 — Pipeline integration tests (test consumer repos for reusable workflows). [M2b] ⏸️ postponed
- ⬜ X6 — Manual test runbooks (docs/runbooks/test/<asset_id>.md). [ongoing]
- 🟦 X7 — sandbox/cleanup/ + .github/workflows/sandbox-cleanup.yml — nightly cleanup of ephemeral=true RGs. Three guards: (1) ephemeral=true tag only, (2) protected-name globs, (3) --min-age-hours (default 6). 12 offline assertions. Cron 03:17 UTC (always deletes); dispatch (defaults dry-run). [M2a][SO]
- ⬜ X8 — Synthetic monitoring. [M2b] ⏸️ postponed
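X7's three guards compose naturally as a single eligibility check, sketched below. The ResourceGroup shape (in particular the createdAt field), the glob handling (only `*` is treated as a wildcard), and the protected-name pattern are assumptions for illustration; the actual script under sandbox/cleanup/ may be structured differently.

```typescript
// Sketch only. Shapes, glob semantics, and the sample pattern are assumed.
interface ResourceGroup {
  name: string;
  tags: Record<string, string>;
  createdAt: Date; // assumed to be available to the cleanup job
}

interface CleanupOptions {
  protectedGlobs: string[];
  minAgeHours: number; // mirrors --min-age-hours (default 6)
}

interface CleanupDecision {
  eligible: boolean;
  reason: string;
}

// Converts a simple glob to an anchored RegExp; "*" matches any run of characters.
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+^${}()|[\]\\?]/g, "\\$&").replace(/\*/g, ".*");
  return new RegExp("^" + escaped + "$");
}

function evaluateCleanup(rg: ResourceGroup, now: Date, opts: CleanupOptions): CleanupDecision {
  // Guard 1: only groups explicitly tagged ephemeral=true are candidates.
  if (rg.tags["ephemeral"] !== "true") {
    return { eligible: false, reason: "not tagged ephemeral=true" };
  }
  // Guard 2: protected names are never deleted, whatever their tags say.
  for (const glob of opts.protectedGlobs) {
    if (globToRegExp(glob).test(rg.name)) {
      return { eligible: false, reason: "matches protected glob " + glob };
    }
  }
  // Guard 3: groups younger than the minimum age are left alone.
  const ageHours = (now.getTime() - rg.createdAt.getTime()) / 3600000;
  if (ageHours < opts.minAgeHours) {
    return { eligible: false, reason: "younger than " + opts.minAgeHours + "h" };
  }
  return { eligible: true, reason: "all guards passed" };
}

const cleanupNow = new Date("2024-01-02T03:17:00Z");
const cleanupOpts: CleanupOptions = { protectedGlobs: ["snowops-core-*"], minAgeHours: 6 };
const sampleGroup: ResourceGroup = {
  name: "rg-terratest-abc123",
  tags: { ephemeral: "true" },
  createdAt: new Date("2024-01-01T00:00:00Z"),
};
console.log(evaluateCleanup(sampleGroup, cleanupNow, cleanupOpts).reason);
```

Returning a reason alongside the verdict suits a dry-run mode, where the job would log what it would delete and why without touching anything.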
Y. Go-To-Market & Sales Engine [X][SO] — all 🟦
See docs/context/04-asset-status.md § Y for full table.
All 14 assets (Y0–Y13) drafted under gtm/. Human sign-offs pending.
Z. Reference Architectures [X][SO→CO] — all 🟦
See docs/context/04-asset-status.md § Z for full table.
All 4 assets (Z0–Z3) drafted under gtm/z/. Human sign-offs pending.