Manual Test Runbook — J2: Enforce Diagnostic Settings (Azure Policy)
Owner: Sagar | Time: ~10 min (Parts A + B) · +15 min (optional Part C live apply) · +30 min (optional Part D DINE remediation drill) | Sandbox: snowops-sandbox-01
Promotes J2 (modules/azure/policy-diagnostics/) from 🟦 Code Complete → 🟩 Shipped. Part C/D cost ~$0 (policy assignment is free; a throwaway Key Vault costs pennies). J2 is validate-only in the automated suite: a live apply needs built-in diagnostic policy GUIDs sourced per tenant, so the apply + remediation drill are manual here (same pattern as H3 / B5).
Prerequisites
- Sandbox subscription access active (PIM activated if required)
- az login done; az account show confirms the sandbox subscription is selected
- Identity has Owner OR Contributor + User Access Administrator on the sandbox sub (UAA needed because J2 grants the assignment identity remediation roles)
- A central Log Analytics workspace exists (F1's or a J1 workspace) — note its ARM ID
- Local tooling: terraform >= 1.6, go >= 1.22, az CLI >= 2.50, jq
- Working directory: repo root
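The tooling and subscription prerequisites can be spot-checked before starting. The sketch below is one possible preflight, not something shipped in the repo: the helper names version_ge and sandbox_selected are hypothetical, a sort that supports -V (version sort) is assumed, and SNOWOPS_SANDBOX_SUBSCRIPTION_ID is assumed to be exported as it is in Part C.

```shell
# Hypothetical preflight helpers; not part of the repo.

# version_ge HAVE WANT: succeeds when HAVE >= WANT in version order.
# Assumes a sort that supports -V (GNU coreutils or a recent BSD sort).
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# sandbox_selected: succeeds when the active az subscription matches
# $SNOWOPS_SANDBOX_SUBSCRIPTION_ID (compared case-insensitively).
sandbox_selected() {
  current=$(az account show --query id -o tsv) || return 1
  [ "$(printf '%s' "$current" | tr 'A-Z' 'a-z')" = \
    "$(printf '%s' "$SNOWOPS_SANDBOX_SUBSCRIPTION_ID" | tr 'A-Z' 'a-z')" ]
}
```

For example, version_ge "$(terraform version -json | jq -r .terraform_version)" 1.6 checks the Terraform floor, and sandbox_selected || echo "wrong subscription selected" guards against applying to the wrong subscription.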
Steps
Part A — terraform fmt + validate (offline, ~3 min)
- Module + example:
terraform -chdir=modules/azure/policy-diagnostics fmt -recursive -check
terraform -chdir=modules/azure/policy-diagnostics init -backend=false -input=false
terraform -chdir=modules/azure/policy-diagnostics validate
terraform -chdir=modules/azure/policy-diagnostics/examples/basic init -backend=false -input=false
terraform -chdir=modules/azure/policy-diagnostics/examples/basic validate
Expected: "Success! The configuration is valid." for both the module and the example.
- Offline Terratest case:
Expected: PASS. Exercises the initiative + dynamic policy references (default
+ explicit parameter_values) + the system-assigned-identity assignment +
remediation role grants.
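A single Terratest case can be selected with go test -run. The sketch below assumes the Go tests live in a test/ directory under the repo root, which this runbook does not state, and takes the test name TestPolicyDiagnosticsValidate from the pass criteria. run_case and TEST_DIR are hypothetical names.

```shell
# Hypothetical helper: run one Terratest case by exact name.
# TEST_DIR is an assumption; point it at wherever the Go tests live.
run_case() {
  (
    cd "${TEST_DIR:-test}" || exit 1
    # -count=1 bypasses the Go test cache; -run takes a regexp, so anchor it.
    go test -count=1 -timeout 10m -run "^$1\$" ./...
  )
}
```

Usage: run_case TestPolicyDiagnosticsValidate. Anchoring the pattern keeps similarly named tests out of the run.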
Part B — full Terratest suite (offline, ~3 min)
- Confirm no regression:
Expected: full offline suite green (27 top-level tests across all packages).
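The top-level test count can be checked mechanically from verbose output. In go test -v output, result lines for top-level tests start at column 0 while subtest results are indented, so counting unindented "--- PASS:" lines gives the top-level total. A minimal sketch, where count_top_level is a hypothetical helper and the exact go test invocation for this repo is an assumption:

```shell
# Hypothetical helper: count top-level results of a given status
# (PASS, FAIL or SKIP) in `go test -v` output read from stdin.
# Subtest result lines are indented, so the ^ anchor excludes them.
count_top_level() {
  # grep -c exits non-zero on zero matches but still prints 0.
  grep -c "^--- $1: " || true
}
```

For example, save the run with go test -count=1 -v ./... 2>&1 | tee /tmp/j2-suite.log, then compare count_top_level PASS < /tmp/j2-suite.log against the expected 27 and confirm count_top_level FAIL < /tmp/j2-suite.log prints 0.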
Part C — live apply (sandbox, ~15 min, ~$0)
- Source real built-in diagnostic policy GUIDs for your tenant:
az policy definition list \
--query "[?policyType=='BuiltIn' && contains(displayName,'diagnostic settings') && contains(displayName,'Log Analytics')].{name:name, display:displayName}" \
-o table
Pick (for example) the Key Vault + Storage diagnostic policies and build their
full ARM IDs: /providers/Microsoft.Authorization/policyDefinitions/<name>.
- Apply the example, passing the sourced GUIDs + a real workspace:
cd modules/azure/policy-diagnostics/examples/basic
terraform init -input=false
terraform apply -auto-approve \
-var "subscription_id=$SNOWOPS_SANDBOX_SUBSCRIPTION_ID" \
-var "tenant_id=$SNOWOPS_SANDBOX_TENANT_ID" \
-var "workspace_id=<central-LAW-ARM-ID>" \
-var 'diagnostic_policy_definition_ids={"keyvault"="/providers/Microsoft.Authorization/policyDefinitions/<kv-guid>","storage"="/providers/Microsoft.Authorization/policyDefinitions/<sa-guid>"}'
- Confirm the assignment + identity + role grants:
ASSIGN_ID=$(terraform output -raw assignment_id)
PRINCIPAL=$(terraform output -raw assignment_principal_id)
echo "assignment=$ASSIGN_ID principal=$PRINCIPAL"
az policy assignment show --name snowops-diagnostics --query "{name:name, identity:identity.type, enforcement:enforcementMode}" -o json
az role assignment list --assignee "$PRINCIPAL" --query "[].roleDefinitionName" -o tsv
Expected: assignment exists with SystemAssigned identity + Default
enforcement; the principal holds Log Analytics Contributor + Monitoring
Contributor.
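Two parts of Part C lend themselves to small helpers: building the full ARM IDs for the -var map, and checking the role list against the expected pair. The sketch below is illustrative only; policy_def_id, policy_map and has_remediation_roles are hypothetical names, not part of the module.

```shell
# Hypothetical helpers for Part C; names are illustrative, not from the repo.

# policy_def_id NAME: full ARM ID of a built-in policy definition.
policy_def_id() {
  printf '/providers/Microsoft.Authorization/policyDefinitions/%s' "$1"
}

# policy_map KEY=NAME...: map literal of KEY => full ARM ID, in the same
# shape as the diagnostic_policy_definition_ids value passed to terraform.
policy_map() {
  out=''
  for pair in "$@"; do
    key=${pair%%=*}    # text before the first '='
    name=${pair#*=}    # text after the first '='
    out="${out:+$out,}\"$key\"=\"$(policy_def_id "$name")\""
  done
  printf '{%s}' "$out"
}

# has_remediation_roles: reads role names (one per line) on stdin and
# succeeds only when both expected roles are present.
has_remediation_roles() {
  roles=$(cat)
  for want in 'Log Analytics Contributor' 'Monitoring Contributor'; do
    printf '%s\n' "$roles" | grep -qxF -- "$want" || return 1
  done
}
```

Usage: pass -var "diagnostic_policy_definition_ids=$(policy_map keyvault=<kv-guid> storage=<sa-guid>)" to terraform apply, and pipe the az role assignment list output above into has_remediation_roles. If the module scopes its grants below the subscription, az role assignment list needs --all to show them.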
Part D — DINE remediation drill (optional, ~30 min)
Proves the catalog criterion: "provision PaaS without diag → policy remediates." DINE (the DeployIfNotExists policy effect) auto-deploys on new resources.
- Create a fresh Key Vault with no diagnostic setting:
az group create -n j2-probe-rg -l eastus
az keyvault create -n j2probe$RANDOM -g j2-probe-rg -l eastus
VAULT=$(az keyvault list -g j2-probe-rg --query "[0].name" -o tsv)
- Trigger evaluation + remediation. DINE on existing resources needs a one-shot remediation task; for the brand-new vault you can either wait for the automatic evaluation cycle (~15–30 min) or force it:
az policy state trigger-scan --resource-group j2-probe-rg # speeds evaluation
# Then remediate the keyvault reference of the initiative:
az policy remediation create \
--name j2-probe-remediation \
--resource-group j2-probe-rg \
--policy-assignment snowops-diagnostics \
--definition-reference-id keyvault
- After a few minutes, confirm a diagnostic setting was deployed onto the vault forwarding to the central workspace:
VAULT_ID=$(az keyvault show -n "$VAULT" -g j2-probe-rg --query id -o tsv)
az monitor diagnostic-settings list --resource "$VAULT_ID" \
--query "[].{name:name, workspace:workspaceId}" -o json
Expected: at least one diagnostic setting whose workspaceId is the central
workspace. If none appears, check the remediation task status
(az policy remediation show ...) and the identity's role grants.
- Clean up the probe by deleting the j2-probe-rg resource group (the az group delete command is listed under Teardown).
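The Part D confirmation (a diagnostic setting on the vault that points at the central workspace) can be polled instead of rechecked by hand while the remediation task runs. A minimal sketch under stated assumptions: retry, forwards_to and vault_forwards are hypothetical helpers, and WORKSPACE_ID is assumed to hold the central workspace ARM ID.

```shell
# Hypothetical helpers for the Part D confirmation step.

# retry ATTEMPTS DELAY CMD...: rerun CMD until it succeeds or attempts run out.
retry() {
  attempts=$1; delay=$2; shift 2
  i=1
  while ! "$@"; do
    [ "$i" -ge "$attempts" ] && return 1
    i=$((i + 1))
    sleep "$delay"
  done
}

# forwards_to WORKSPACE_ID: reads workspace IDs (one per line) on stdin and
# succeeds when one equals WORKSPACE_ID. ARM IDs are case-insensitive, so -i.
forwards_to() {
  grep -qixF -- "$1"
}

# vault_forwards: one check of the probe vault (uses VAULT_ID from the
# step above; WORKSPACE_ID is an assumed variable).
vault_forwards() {
  az monitor diagnostic-settings list --resource "$VAULT_ID" \
    --query "[].workspaceId" -o tsv | forwards_to "$WORKSPACE_ID"
}
```

Usage: retry 10 30 vault_forwards polls every 30 seconds for up to about five minutes. Run it before deleting the probe resource group.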
Pass criteria
- Part A: module + example validate; TestPolicyDiagnosticsValidate passes
- Part B: full offline Terratest suite passes
- (Part C) assignment created with SystemAssigned identity; both remediation roles granted
- (Part D) a fresh Key Vault gets a diagnostic setting deployed to the central workspace via DINE remediation
- All test resources removed; no orphaned probe RG
Teardown
cd modules/azure/policy-diagnostics/examples/basic
terraform destroy -auto-approve \
-var "subscription_id=$SNOWOPS_SANDBOX_SUBSCRIPTION_ID" \
-var "tenant_id=$SNOWOPS_SANDBOX_TENANT_ID" \
-var "workspace_id=<central-LAW-ARM-ID>" \
-var 'diagnostic_policy_definition_ids={"keyvault"="...","storage"="..."}'
az group delete -n j2-probe-rg --yes --no-wait # if Part D ran
Destroying the assignment stops future remediation but leaves diagnostic settings already deployed onto resources in place (they belong to the target resources). Remove the probe vault's RG to clean those up.
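Because the resource-group delete runs with --no-wait, the "no orphaned probe RG" criterion is only met once the deletion actually finishes. One way to confirm it, sketched with a hypothetical helper name probe_rg_gone:

```shell
# Hypothetical helper: succeeds once the probe resource group no longer exists.
# `az group exists` prints "true" or "false".
probe_rg_gone() {
  [ "$(az group exists --name "${1:-j2-probe-rg}")" = "false" ]
}
```

Usage: until probe_rg_gone; do sleep 30; done blocks until the group is gone. A one-off probe_rg_gone && echo "probe RG removed" is enough at sign-off.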
Sign-off
- Tester: _ | Date: _ | Result: PASS / FAIL / N/A
- Notes: