Manual Test Runbook — X1: SnowOps Sandbox Subscription

Owner: Sagar  |  Time: ~25 min  |  Sandbox: the very subscription this stack is provisioning (chicken-and-egg — first run is against a freshly empty sub)

Purpose

Validate that the Terraform stack in sandbox/ delivers everything X1 promises: budget alerts, mandatory-tag enforcement, an allowed-locations restriction, the ephemeral-audit signal, a Defender free-tier baseline, and Activity Log forwarding.

Prerequisites

  • Azure subscription exists; subscription ID + tenant ID captured
  • You have Owner on that subscription (PIM-activated if applicable)
  • Local tooling: az v2.50+, terraform v1.6+
  • az login --tenant <TENANT_ID> completed; az account show returns the sandbox sub
  • Budget alert email recipients confirmed with Nidhi
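
The tooling and subscription prerequisites can be checked in one pass before starting. The following is a minimal sketch, not part of the repo: version_ge and check_prereqs are hypothetical helper names, and the version comparison assumes GNU sort (for -V).

```shell
# Sketch: confirm local tooling and the active subscription before step 1.
# version_ge and check_prereqs are illustrative names, not part of the repo.

# True when version $1 >= version $2 (assumes GNU `sort -V`).
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

check_prereqs() {
  local expected_sub="$1" az_ver tf_ver active_sub
  az_ver="$(az version --query '"azure-cli"' -o tsv)"
  tf_ver="$(terraform version | head -n1 | sed 's/^Terraform v//')"
  active_sub="$(az account show --query id -o tsv)"

  version_ge "$az_ver" 2.50 || { echo "az $az_ver is older than 2.50"; return 1; }
  version_ge "$tf_ver" 1.6 || { echo "terraform $tf_ver is older than 1.6"; return 1; }
  [ "$active_sub" = "$expected_sub" ] || { echo "active subscription is $active_sub"; return 1; }
  echo "prerequisites OK"
}
```

Usage: check_prereqs <SUBSCRIPTION_ID>. Owner role and PIM activation still need to be confirmed by hand.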

Steps

1. Bootstrap the state backend (one-time)

cd sandbox/bootstrap
./bootstrap.sh <SUBSCRIPTION_ID> eastus
  • Script completes without error
  • Resource group snowops-sandbox-state-rg exists in the portal
  • Storage account snowopssandboxstate* exists with RA-GZRS replication, minimum TLS version 1.2, and blob public access disabled
  • Container sandbox-tfstate exists
  • Copy the printed backend block into sandbox/backend.hcl
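
The storage account checks above can also be read from the CLI instead of the portal. This is a sketch under assumptions: check_state_backend is a hypothetical helper, and it expects the SKU to be reported as Standard_RAGZRS and the TLS setting as TLS1_2, which is how the Azure CLI normally renders them.

```shell
# Sketch: check the step 1 storage account settings from the CLI.
# check_state_backend is an illustrative helper name, not part of the repo.
check_state_backend() {
  local name sku tls public
  # One tab-separated row per account: name, SKU, minimum TLS, public blob access.
  az storage account list --resource-group snowops-sandbox-state-rg \
    --query "[].[name, sku.name, minimumTlsVersion, allowBlobPublicAccess]" -o tsv |
  while IFS="$(printf '\t')" read -r name sku tls public; do
    case "$sku/$tls/$public" in
      Standard_RAGZRS/TLS1_2/[Ff]alse) echo "OK $name" ;;
      *) echo "FAIL $name sku=$sku tls=$tls publicBlob=$public" ;;
    esac
  done
}
```

The container can be checked separately with az storage container exists --account-name <ACCOUNT> --name sandbox-tfstate --auth-mode login, which assumes the caller also holds a blob data-plane role on the account.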

2. Configure and apply

cd sandbox
cp terraform.tfvars.example terraform.tfvars
# Edit subscription_id, tenant_id, budget_alert_emails
terraform init -backend-config=backend.hcl
terraform plan -out tfplan
terraform apply tfplan
  • terraform init connects to the remote backend successfully
  • terraform plan shows expected resources (RG + workspace + budget + ~7 policy assignments + ~10 Defender pricing entries + 1 diagnostic setting + 1 custom policy definition)
  • terraform apply completes without error
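
The pass criteria also require that a second plan immediately after apply shows no drift. One way to check this without reading the plan output by eye is terraform plan -detailed-exitcode. The sketch below wraps it in a hypothetical helper named check_no_drift.

```shell
# Sketch: fail loudly if a second plan right after apply proposes changes.
# check_no_drift is an illustrative helper name, not part of the repo.
check_no_drift() {
  local rc=0
  # -detailed-exitcode: 0 = no changes, 1 = error, 2 = changes present.
  terraform plan -detailed-exitcode -input=false >/dev/null || rc=$?
  case "$rc" in
    0) echo "no drift" ;;
    2) echo "drift detected"; return 2 ;;
    *) echo "plan failed (exit $rc)"; return 1 ;;
  esac
}
```

Run it from sandbox/ right after terraform apply tfplan. If it reports drift, rerun terraform plan without the redirect to see which resources differ.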

3. Budget verification

  • Portal → Cost Management → Budgets shows snowops-sandbox-monthly-budget
  • Amount matches var.budget_amount_monthly_usd
  • Three notifications visible (50%, 80%, 100%) with the correct emails
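
The same budget details can be pulled through the management API if the portal view is slow to update. This is a sketch only: show_budget is a hypothetical helper, and the api-version shown is an assumption that may need adjusting.

```shell
# Sketch: print the budget amount and each notification threshold with its recipients.
# show_budget is an illustrative helper; the api-version is an assumption.
show_budget() {
  local sub_id="$1" name="snowops-sandbox-monthly-budget"
  az rest --method get \
    --url "https://management.azure.com/subscriptions/${sub_id}/providers/Microsoft.Consumption/budgets/${name}?api-version=2021-10-01" \
    --query "{amount: properties.amount, notifications: properties.notifications.*.{threshold: threshold, emails: contactEmails}}" \
    -o json
}
```

Usage: show_budget <SUBSCRIPTION_ID>. The output should list three notifications with thresholds 50, 80, and 100 and the recipients confirmed with Nidhi.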

4. Policy enforcement verification

  • Portal → Policy → Assignments at subscription scope shows all six SnowOps assignments
  • Try to create an untagged storage account:
    az storage account create --name testuntagged$RANDOM --resource-group snowops-sandbox-observability-rg --location eastus
    
    Result: denied with a policy violation (error code RequestDisallowedByPolicy)
  • Try to create a resource in a non-allowed region (e.g., southeastasia): Result: denied by Allowed Locations
  • Create a fully-tagged but non-ephemeral storage account:
    az storage account create --name testephem$RANDOM --resource-group snowops-sandbox-observability-rg --location eastus \
      --tags Environment=sandbox Owner=test ManagedBy=manual Client=snowops-internal CostCenter=snowops-rnd ephemeral=false
    
    Result: created, but Policy compliance view shows it as non-compliant against snowops-audit-ephemeral after a few minutes
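
The checks in this step can be scripted roughly as follows. The helper names are hypothetical. The region test reuses the tag set from the command above so that a deny can only come from Allowed Locations, and it treats a failure as a pass only when the error mentions RequestDisallowedByPolicy.

```shell
# Sketch: CLI equivalents for the step 4 portal checks. Helper names are illustrative.

# Lists policy assignments made at subscription scope.
list_sub_assignments() {
  az policy assignment list --scope "/subscriptions/$1" \
    --query "[].{name: name, displayName: displayName}" -o table
}

# Attempts a fully tagged create in a non-allowed region; a policy deny is the expected outcome.
expect_region_denied() {
  local name="testregion$RANDOM" out
  if out="$(az storage account create --name "$name" \
       --resource-group snowops-sandbox-observability-rg --location southeastasia \
       --tags Environment=sandbox Owner=test ManagedBy=manual \
              Client=snowops-internal CostCenter=snowops-rnd ephemeral=true 2>&1)"; then
    echo "UNEXPECTED: $name was created; delete it and check Allowed Locations"
    return 1
  fi
  case "$out" in
    *RequestDisallowedByPolicy*) echo "denied by policy" ;;
    *) echo "failed for another reason: $out"; return 1 ;;
  esac
}

# Requests a fresh compliance evaluation, then lists non-compliant resources.
list_noncompliant() {
  az policy state trigger-scan --no-wait
  az policy state list --filter "complianceState eq 'NonCompliant'" \
    --query "[].{resource: resourceId, assignment: policyAssignmentName}" -o table
}
```

The scan runs asynchronously, so list_noncompliant may need to be repeated until the non-ephemeral test account shows up.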

5. Defender baseline

  • Portal → Defender for Cloud → Environment settings → sandbox sub: all plans show Free tier (no surprise Standard-tier billing)
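
A CLI cross-check of the Defender tiers might look like the sketch below. check_defender_free is a hypothetical helper. It assumes a CLI version where az security pricing list wraps its results in a "value" array; on versions that return a bare list, the query would start at "[]" instead.

```shell
# Sketch: flag any Defender plan that is not on the Free tier.
# check_defender_free is an illustrative helper; the "value" wrapper is an assumption.
check_defender_free() {
  local rows non_free
  rows="$(az security pricing list --query "value[].[name, pricingTier]" -o tsv)"
  # An empty result most likely means the query shape does not match this CLI version.
  [ -n "$rows" ] || { echo "no plans returned; check the query shape for your CLI version"; return 1; }
  non_free="$(printf '%s\n' "$rows" | awk -F'\t' '$2 != "Free" {print $1}')"
  if [ -z "$non_free" ]; then
    echo "all plans Free"
  else
    echo "non-Free plans:"
    echo "$non_free"
    return 1
  fi
}
```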

6. Activity Log forwarding

  • Trigger any control-plane action (e.g., the policy test in step 4)
  • Wait ~10 min, then in Log Analytics workspace snowops-sandbox-law:
    AzureActivity | order by TimeGenerated desc | take 20
    
    Result: entries from the last 10 minutes appear
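
The query can also be run from the CLI. This sketch assumes the workspace lives in snowops-sandbox-observability-rg and that az monitor log-analytics query is available, which may require installing the log-analytics extension. recent_activity is a hypothetical helper name.

```shell
# Sketch: run the step 6 query from the CLI, limited to the last 15 minutes.
# recent_activity is an illustrative helper; the resource group is an assumption.
recent_activity() {
  local workspace_id
  # The query command takes the workspace GUID (customerId), not the resource name.
  workspace_id="$(az monitor log-analytics workspace show \
    --resource-group snowops-sandbox-observability-rg \
    --workspace-name snowops-sandbox-law --query customerId -o tsv)"
  az monitor log-analytics query --workspace "$workspace_id" \
    --analytics-query "AzureActivity | where TimeGenerated > ago(15m) | project TimeGenerated, OperationNameValue, ActivityStatusValue, Caller | order by TimeGenerated desc | take 20" \
    -o table
}
```

An empty table after 15 minutes points at the diagnostic setting rather than the workspace.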

7. Throwaway resource lifecycle (X1 acceptance criterion)

  • Create a throwaway resource tagged ephemeral=true plus mandatory tags
  • Confirm it appears in the workspace's activity log and is compliant with the audit-ephemeral policy
  • Delete it manually (X7 will do this nightly once shipped) — confirm clean delete
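
The lifecycle above could be driven with two small helpers, sketched here under the assumption that a storage account in snowops-sandbox-observability-rg is an acceptable throwaway resource. create_throwaway and delete_throwaway are hypothetical names.

```shell
# Sketch: create and later delete a throwaway storage account tagged ephemeral=true.
# Helper names are illustrative; the resource type and group are assumptions.
THROWAWAY_RG="snowops-sandbox-observability-rg"

# Creates the account and prints its name so it can be deleted later.
create_throwaway() {
  local name="testthrow$RANDOM"
  az storage account create --name "$name" --resource-group "$THROWAWAY_RG" --location eastus \
    --tags Environment=sandbox Owner=test ManagedBy=manual \
           Client=snowops-internal CostCenter=snowops-rnd ephemeral=true \
    >/dev/null || return 1
  echo "$name"
}

# Deletes the account and confirms it no longer appears in the resource group.
delete_throwaway() {
  local remaining
  az storage account delete --name "$1" --resource-group "$THROWAWAY_RG" --yes || return 1
  remaining="$(az storage account list --resource-group "$THROWAWAY_RG" \
    --query "[?name=='$1'].name" -o tsv)"
  if [ -z "$remaining" ]; then echo "deleted $1"; else echo "still present: $1"; return 1; fi
}
```

Usage: name="$(create_throwaway)", then run the compliance and activity log checks, then delete_throwaway "$name".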

8. Teardown

terraform destroy
# Optional: also delete the state backend
az group delete --name snowops-sandbox-state-rg --yes --no-wait
  • terraform destroy removes all sandbox-managed resources cleanly
  • State storage account deleted (if desired)
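
To confirm the subscription is clean after teardown, a sketch like the following lists anything SnowOps-related that is still present. leftovers is a hypothetical helper, and the 'snowops' name prefix on policy assignments is an assumption based on snowops-audit-ephemeral.

```shell
# Sketch: list SnowOps resource groups and policy assignments left after step 8.
# leftovers is an illustrative helper; the assignment name prefix is an assumption.
leftovers() {
  local sub_id="$1"
  az group list --query "[?starts_with(name, 'snowops-sandbox')].name" -o tsv
  az policy assignment list --scope "/subscriptions/${sub_id}" \
    --query "[?starts_with(name, 'snowops')].name" -o tsv
}
```

Empty output means a clean subscription. snowops-sandbox-state-rg is expected to appear if the state backend was kept, and it can linger briefly after az group delete --no-wait; az group wait --name snowops-sandbox-state-rg --deleted blocks until it is gone.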

Pass criteria

  • All six policy assignments visible at sub scope; deny-on-untagged + allowed-location-restriction observed
  • Budget present with three notifications at correct thresholds + recipients
  • Defender all-plans = Free tier
  • Activity Log entries land in the workspace within 15 min
  • Ephemeral audit signal correctly fires on a non-ephemeral resource
  • terraform destroy returns the sub to a clean state
  • No drift on a second terraform plan immediately after apply

Failure modes & escalation

Symptom  |  Likely cause  |  Action
Policy definition not found on apply  |  Tenant lacks built-in policy  |  Verify that az policy definition list --query "[?displayName=='Require a tag on resources']" returns a row; raise with Azure support if not
Authorization failed on policy assignment  |  Caller not Owner on sub  |  Activate Owner via PIM and re-apply
Budget API rate-limited  |  Created during a billing-cycle boundary  |  Re-run apply 5 min later
Storage account name collision  |  Random suffix collided with an existing account (storage account names are globally unique)  |  Re-run bootstrap (idempotent; generates a fresh suffix)

Sign-off

  • Tester: ___  |  Date: ___  |  Result: PASS / FAIL / N/A
  • Notes: