Manual Test Runbook — J2: Enforce Diagnostic Settings (Azure Policy)

Owner: Sagar  |  Time: ~10 min (Parts A + B) · +15 min (optional Part C live apply) · +30 min (optional Part D DINE remediation drill)  |  Sandbox: snowops-sandbox-01

Promotes J2 (modules/azure/policy-diagnostics/) from 🟦 Code Complete → 🟩 Shipped. Part C/D cost ~$0 (policy assignment is free; a throwaway Key Vault is pennies). J2 is validate-only in the automated suite — a live apply needs built-in diagnostic policy GUIDs sourced per-tenant, so the apply + remediation drill are manual here (same pattern as H3 / B5).


Prerequisites

  • Sandbox subscription access active (PIM activated if required)
  • az login done; az account show confirms the sandbox subscription is selected
  • Identity has Owner OR Contributor + User Access Administrator on the sandbox sub (UAA needed because J2 grants the assignment identity remediation roles)
  • A central Log Analytics workspace exists (F1's or a J1 workspace) — note its ARM ID
  • Local tooling: terraform >= 1.6, go >= 1.22, az CLI >= 2.50, jq
  • Working directory: repo root
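
The tooling minimums above can be checked mechanically before starting. The following is a minimal sketch, assuming a `sort` that supports `-V` for version ordering; `version_ge` is a hypothetical helper name, not something shipped in the repo.

```shell
# Hypothetical helper: succeeds when version $1 >= minimum $2.
# Assumes sort(1) supports -V (GNU coreutils and recent macOS).
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Example checks against the minimums listed above (run manually):
#   version_ge "$(terraform version -json | jq -r .terraform_version)" 1.6 || echo "terraform too old"
#   version_ge "$(az version --query '"azure-cli"' -o tsv)" 2.50 || echo "az CLI too old"
#   version_ge "$(go env GOVERSION | sed 's/^go//')" 1.22 || echo "go too old"
```

Comparing with `sort -V` rather than string comparison matters here because `2.9` would otherwise sort above `2.50`.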

Steps

Part A — terraform fmt + validate (offline, ~3 min)

  1. Module + example:
terraform -chdir=modules/azure/policy-diagnostics fmt -recursive -check
terraform -chdir=modules/azure/policy-diagnostics init -backend=false -input=false
terraform -chdir=modules/azure/policy-diagnostics validate

terraform -chdir=modules/azure/policy-diagnostics/examples/basic init -backend=false -input=false
terraform -chdir=modules/azure/policy-diagnostics/examples/basic validate

Expected: fmt -check exits 0 with no output; both validate runs print "Success! The configuration is valid."

  2. Offline Terratest case:
cd tests/terratest
go test -v -timeout 5m ./modules/azure/... -run TestPolicyDiagnosticsValidate

Expected: PASS. Exercises the initiative + dynamic policy references (default + explicit parameter_values) + the system-assigned-identity assignment + remediation role grants.


Part B — full Terratest suite (offline, ~3 min)

  1. Confirm no regression:
cd tests/terratest
go test -count=1 -timeout 12m ./...

Expected: full offline suite green (27 top-level tests across all packages).
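
The test count can be confirmed rather than eyeballed. This is a rough sketch, assuming the suite is re-run with `-v` so that Go prints per-test result lines: top-level results start at column 0, while subtest results are indented. `count_top_level_passes` is a hypothetical helper name.

```shell
# Hypothetical helper: counts top-level "--- PASS:" lines read from stdin.
# Subtest results are indented by `go test -v`, so the ^ anchor skips them.
count_top_level_passes() {
  grep -c '^--- PASS: '
}

# Usage (from tests/terratest):
#   go test -count=1 -timeout 12m -v ./... 2>&1 | count_top_level_passes
```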


Part C — live apply (sandbox, ~15 min, ~$0)

  1. Source real built-in diagnostic policy GUIDs for your tenant:
az policy definition list \
  --query "[?policyType=='BuiltIn' && contains(displayName,'diagnostic settings') && contains(displayName,'Log Analytics')].{name:name, display:displayName}" \
  -o table

Pick (for example) the Key Vault + Storage diagnostic policies and build their full ARM IDs: /providers/Microsoft.Authorization/policyDefinitions/<name>.
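
Assembling the IDs by hand is error-prone, so this step can be scripted. The following is a minimal sketch; `policy_def_id` and `ids_var` are hypothetical helper names, `KV_GUID` and `SA_GUID` are placeholder variables, and the map keys `keyvault` and `storage` are the ones this runbook uses for the example.

```shell
# Hypothetical helper: turns a built-in policy definition name (GUID)
# into its full ARM ID.
policy_def_id() {
  printf '/providers/Microsoft.Authorization/policyDefinitions/%s' "$1"
}

# Hypothetical helper: renders the map literal for the
# diagnostic_policy_definition_ids variable from two GUIDs
# (first = Key Vault policy, second = Storage policy).
ids_var() {
  printf '{"keyvault"="%s","storage"="%s"}' \
    "$(policy_def_id "$1")" "$(policy_def_id "$2")"
}

# Usage, with KV_GUID / SA_GUID set to names picked from the table:
#   terraform apply ... -var "diagnostic_policy_definition_ids=$(ids_var "$KV_GUID" "$SA_GUID")"
```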

  2. Apply the example, passing the sourced GUIDs + a real workspace:
cd modules/azure/policy-diagnostics/examples/basic
terraform init -input=false
terraform apply -auto-approve \
  -var "subscription_id=$SNOWOPS_SANDBOX_SUBSCRIPTION_ID" \
  -var "tenant_id=$SNOWOPS_SANDBOX_TENANT_ID" \
  -var "workspace_id=<central-LAW-ARM-ID>" \
  -var 'diagnostic_policy_definition_ids={"keyvault"="/providers/Microsoft.Authorization/policyDefinitions/<kv-guid>","storage"="/providers/Microsoft.Authorization/policyDefinitions/<sa-guid>"}'

  3. Confirm the assignment + identity + role grants:
ASSIGN_ID=$(terraform output -raw assignment_id)
PRINCIPAL=$(terraform output -raw assignment_principal_id)
echo "assignment=$ASSIGN_ID principal=$PRINCIPAL"

az policy assignment show --name snowops-diagnostics --query "{name:name, identity:identity.type, enforcement:enforcementMode}" -o json
# --all includes grants made below subscription scope (for example, on the workspace itself)
az role assignment list --assignee "$PRINCIPAL" --all --query "[].roleDefinitionName" -o tsv

Expected: assignment exists with SystemAssigned identity + Default enforcement; the principal holds Log Analytics Contributor + Monitoring Contributor.
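
The role check can be turned into a pass/fail result instead of a visual comparison. This is a minimal sketch, assuming the two role names given in the expected result; `missing_roles` is a hypothetical helper that takes the newline-separated `-o tsv` output as its argument.

```shell
# Hypothetical helper: prints each expected role that is absent from the
# newline-separated list passed as $1; prints nothing when both are present.
missing_roles() {
  for role in "Log Analytics Contributor" "Monitoring Contributor"; do
    printf '%s\n' "$1" | grep -Fxq "$role" || printf '%s\n' "$role"
  done
}

# Usage:
#   missing_roles "$(az role assignment list --assignee "$PRINCIPAL" --all \
#     --query "[].roleDefinitionName" -o tsv)"
```

Empty output means both remediation roles are granted; any line printed names a role still missing.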


Part D — DINE remediation drill (optional, ~30 min)

Proves the catalog criterion: "provision PaaS without diag → policy remediates." DINE (deployIfNotExists) auto-deploys onto resources created after the assignment is in place.

  1. Create a fresh Key Vault with no diagnostic setting:
az group create -n j2-probe-rg -l eastus
VAULT="j2probe$RANDOM"   # fix the name up front so later steps reference the same vault
az keyvault create -n "$VAULT" -g j2-probe-rg -l eastus

  2. Trigger evaluation + remediation. DINE deploys automatically only onto resources created or updated after the assignment exists; resources that predate it need a one-shot remediation task. For the brand-new vault you can either wait for the automatic evaluation cycle (~15–30 min) or force it:
az policy state trigger-scan --resource-group j2-probe-rg   # speeds evaluation
# Then remediate the keyvault reference of the initiative:
az policy remediation create \
  --name j2-probe-remediation \
  --resource-group j2-probe-rg \
  --policy-assignment snowops-diagnostics \
  --definition-reference-id keyvault

  3. After a few minutes, confirm a diagnostic setting was deployed onto the vault forwarding to the central workspace:
VAULT_ID=$(az keyvault show -n "$VAULT" -g j2-probe-rg --query id -o tsv)
az monitor diagnostic-settings list --resource "$VAULT_ID" \
  --query "[].{name:name, workspace:workspaceId}" -o json

Expected: at least one diagnostic setting whose workspaceId is the central workspace. If none appears, check the remediation task status (az policy remediation show ...) and the identity's role grants.
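
Since "a few minutes" varies, the check can be polled instead of re-run by hand. The following is a rough sketch under stated assumptions: `retry_until` and `has_diag_setting` are hypothetical helpers, and the JMESPath query assumes `az monitor diagnostic-settings list` returns a plain list, as the query in step 3 does.

```shell
# Hypothetical helper: runs a command up to $1 times, sleeping $2 seconds
# between attempts; succeeds as soon as the command succeeds.
retry_until() {
  attempts="$1"; delay="$2"; shift 2
  n=1
  while [ "$n" -le "$attempts" ]; do
    "$@" && return 0
    n=$((n + 1))
    if [ "$n" -le "$attempts" ]; then sleep "$delay"; fi
  done
  return 1
}

# Hypothetical check: succeeds once the resource in $1 has at least one
# diagnostic setting with a workspaceId set.
has_diag_setting() {
  count=$(az monitor diagnostic-settings list --resource "$1" \
    --query "length([?workspaceId])" -o tsv) || return 1
  [ "${count:-0}" -gt 0 ]
}

# Usage: poll every 30 s, 20 attempts (roughly 10 minutes).
#   retry_until 20 30 has_diag_setting "$VAULT_ID"
```

This only confirms that some workspace-bound setting exists; comparing `workspaceId` against the central workspace is still done with the query in step 3.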

  4. Clean up the probe:
az group delete -n j2-probe-rg --yes --no-wait

Pass criteria

  • Part A — module + example validate; TestPolicyDiagnosticsValidate passes
  • Part B — full offline Terratest suite passes
  • (Part C) assignment created with SystemAssigned identity; both remediation roles granted
  • (Part D) a fresh Key Vault gets a diagnostic setting deployed to the central workspace via DINE remediation
  • All test resources removed; no orphaned probe RG

Teardown

cd modules/azure/policy-diagnostics/examples/basic
terraform destroy -auto-approve \
  -var "subscription_id=$SNOWOPS_SANDBOX_SUBSCRIPTION_ID" \
  -var "tenant_id=$SNOWOPS_SANDBOX_TENANT_ID" \
  -var "workspace_id=<central-LAW-ARM-ID>" \
  -var 'diagnostic_policy_definition_ids={"keyvault"="...","storage"="..."}'

az group delete -n j2-probe-rg --yes --no-wait   # if Part D ran

Destroying the assignment stops future remediation but leaves diagnostic settings already deployed onto resources in place (they belong to the target resources). Remove the probe vault's RG to clean those up.
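
Because `--no-wait` returns before the deletion finishes, the "no orphaned probe RG" criterion is worth confirming afterwards. This is a minimal sketch using `az group exists`, which prints `true` or `false`; `rg_gone` is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds when the resource group named in $1
# no longer exists in the selected subscription.
rg_gone() {
  [ "$(az group exists --name "$1" -o json)" = "false" ]
}

# Usage, a few minutes after the delete was issued:
#   rg_gone j2-probe-rg && echo "probe RG removed" || echo "probe RG still present"
```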


Sign-off

  • Tester: _  |  Date: _  |  Result: PASS / FAIL / N/A
  • Notes: