
Manual Test Runbook — X8: Synthetic Monitoring

Owner: Sagar  |  Time: ~8 min (Parts A + B offline) · +15 min (optional Part C integration apply)  |  Sandbox: snowops-sandbox-01

Promotes X8 (modules/azure/synthetic-monitoring/) from 🟦 Code Complete → 🟩 Shipped. Parts A + B are offline ($0). Part C applies an Application Insights component, two availability tests, and their alerts to the sandbox (~$0–$1, billed per execution), then destroys them.


Prerequisites

  • Sandbox subscription access active (PIM activated if required)
  • az login done; sandbox subscription selected
  • Identity has Contributor on the sandbox sub
  • A sandbox Log Analytics workspace ARM ID (for the workspace-based AI component)
  • SNOWOPS_SANDBOX_SUBSCRIPTION_ID + SNOWOPS_SANDBOX_TENANT_ID exported
  • Local tooling: terraform >= 1.6, go >= 1.22, az CLI >= 2.50
  • Working directory: repo root
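
The environment and tooling items above can be checked up front with a small preflight. This is a minimal sketch: the function names are hypothetical, it only checks that the two exported variables are non-empty and that the three CLIs are on PATH, and it does not verify tool versions, the active az login, or role assignments.

```shell
# Preflight sketch for the Prerequisites list. Function names are hypothetical.

# Print the name of each required variable that is unset or empty.
missing_env() {
  for name in SNOWOPS_SANDBOX_SUBSCRIPTION_ID SNOWOPS_SANDBOX_TENANT_ID; do
    eval "value=\${$name:-}"
    [ -n "$value" ] || printf '%s\n' "$name"
  done
}

# Print the name of each required CLI that is not on PATH.
missing_tools() {
  for tool in terraform go az; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Return non-zero and list the gaps if anything is missing.
preflight() {
  problems="$(missing_env; missing_tools)"
  if [ -n "$problems" ]; then
    printf 'Missing prerequisites:\n%s\n' "$problems" >&2
    return 1
  fi
  echo "Prerequisites look OK"
}
```

Source the snippet and run preflight from the repo root before starting Part A.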

Steps

Part A — terraform fmt + validate (offline, ~3 min)

  1. Module + example:
terraform -chdir=modules/azure/synthetic-monitoring fmt -recursive -check
terraform -chdir=modules/azure/synthetic-monitoring init -backend=false -input=false
terraform -chdir=modules/azure/synthetic-monitoring validate

terraform -chdir=modules/azure/synthetic-monitoring/examples/basic init -backend=false -input=false
terraform -chdir=modules/azure/synthetic-monitoring/examples/basic validate

Expected: fmt -check prints nothing and exits 0; both validate runs print "Success! The configuration is valid."

  2. Offline Terratest case (from the repo root):
cd tests/terratest
go test -v -timeout 5m ./modules/azure/... -run TestSyntheticMonitoringValidate

Expected: PASS — exercises the workspace-based AI creation path, a GET test (content match + SSL check) and a POST test (headers + SSL off), the dynamic header/content blocks, and the per-test availability alerts.

Part B — full Terratest suite (offline, ~5 min)

  3. Run the full suite (from the repo root):
cd tests/terratest && go test -count=1 -timeout 15m ./...

Expected: the full suite green (the new TestSyntheticMonitoringValidate included).
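
Parts A and B can also be run as one scripted pass that records a PASS or FAIL per step. The sketch below assumes it is sourced from the repo root; run_step and run_offline_parts are hypothetical helper names, and the commands inside are the ones listed in the steps above.

```shell
failures=0

# Run one command, print PASS/FAIL with a label, and count failures.
run_step() {
  label="$1"; shift
  if "$@"; then
    echo "PASS: $label"
  else
    echo "FAIL: $label"
    failures=$((failures + 1))
  fi
}

# Hypothetical wrapper for Parts A and B. Each go test runs in its own
# child shell so the cd does not leak into later steps.
run_offline_parts() {
  mod=modules/azure/synthetic-monitoring
  run_step "fmt" terraform -chdir="$mod" fmt -recursive -check
  run_step "module init" terraform -chdir="$mod" init -backend=false -input=false
  run_step "module validate" terraform -chdir="$mod" validate
  run_step "example init" terraform -chdir="$mod/examples/basic" init -backend=false -input=false
  run_step "example validate" terraform -chdir="$mod/examples/basic" validate
  run_step "validate test" sh -c 'cd tests/terratest && go test -v -timeout 5m ./modules/azure/... -run TestSyntheticMonitoringValidate'
  run_step "full suite" sh -c 'cd tests/terratest && go test -count=1 -timeout 15m ./...'
  [ "$failures" -eq 0 ]
}
```

Calling run_offline_parts returns non-zero if any step failed, which makes it easy to gate Part C on a clean offline run.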

Part C — integration apply (sandbox, ~15 min, ~$0–$1)

  4. Apply the fixture against a real sandbox workspace. From the repo root, set the fixture's workspace_id to a real sandbox LAW ARM ID (and optionally point a web test at a URL you control), then:
cd tests/terratest/fixtures/synthetic-monitoring
terraform init -input=false
terraform apply -auto-approve \
  -var "subscription_id=$SNOWOPS_SANDBOX_SUBSCRIPTION_ID" \
  -var "tenant_id=$SNOWOPS_SANDBOX_TENANT_ID"
  5. Confirm the component, web tests, and alerts exist:
RG=snowops-x8-test-rg
az monitor app-insights component show -g "$RG" --app snowops-x8-test-ai \
  --query "{name:name, kind:kind, workspace:workspaceResourceId}" -o json
az resource list -g "$RG" --resource-type "Microsoft.Insights/webtests" \
  --query "[].name" -o table
az monitor metrics alert list -g "$RG" \
  --query "[].{name:name, severity:severity, enabled:enabled}" -o table

Expected: the AI component is workspace-based (workspace is the sandbox LAW ARM ID, not null); two web tests (snowops-x8-test-home, snowops-x8-test-health-api); two availability alerts.
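
The counts in the expectation above can be asserted rather than read off the tables. This is a sketch under assumptions: expect_count and verify_part_c are hypothetical helpers, the JMESPath length(@) query is used to count results, and verify_part_c assumes every metric alert in the resource group belongs to this fixture.

```shell
# Compare an observed count with the expected one; return non-zero on mismatch.
expect_count() {
  label="$1"; expected="$2"; actual="$3"
  if [ "$actual" = "$expected" ]; then
    echo "OK: $label = $actual"
  else
    echo "FAIL: $label = $actual (expected $expected)" >&2
    return 1
  fi
}

# Hypothetical wrapper around the az queries from this step.
# Needs az, an active login, and an applied fixture.
verify_part_c() {
  rg="${1:-snowops-x8-test-rg}"
  web_tests="$(az resource list -g "$rg" \
    --resource-type "Microsoft.Insights/webtests" --query "length(@)" -o tsv)"
  alerts="$(az monitor metrics alert list -g "$rg" --query "length(@)" -o tsv)"
  status=0
  expect_count "web tests" 2 "$web_tests" || status=1
  expect_count "availability alerts" 2 "$alerts" || status=1
  return "$status"
}
```

Run verify_part_c after the apply; it does not check the component's workspace binding, so the az monitor app-insights component show output still needs a manual look.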

  6. (Optional, signal test) Point a test at a URL that returns a non-200 response (or at an unreachable host), wait two evaluation cycles (~5–10 min), and confirm the availability alert fires (it appears in the Alerts blade and the action group notifies).

  7. Destroy (from the same fixture directory):

terraform destroy -auto-approve \
  -var "subscription_id=$SNOWOPS_SANDBOX_SUBSCRIPTION_ID" \
  -var "tenant_id=$SNOWOPS_SANDBOX_TENANT_ID"

Expected: clean destroy — web tests, alerts, the AI component, and the RG are removed. (A real sandbox workspace passed by ID is untouched.)


Pass criteria

  • Part A — module + example validate; TestSyntheticMonitoringValidate passes
  • Part B — full offline suite passes
  • (Part C) fixture applies the AI component + 2 web tests + 2 alerts; destroys clean
  • (Part C, optional) a down endpoint fires the availability alert
  • All test resources removed

Failure mode

An availability alert can never fire if failed_location_count exceeds the number of geo_locations, and an alert configured with too few locations can flap. This is documented in the README; a plan-time precondition rejects a failed_location_count greater than the number of geo_locations.
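
The rule behind that precondition can be illustrated with a small check. This is only a sketch of the comparison in shell: the module's actual precondition lives in its Terraform code and is not reproduced here, alert_can_fire is a hypothetical name, and the example values are assumptions rather than module defaults.

```shell
# Return 0 when the alert threshold is reachable: the number of failing
# locations required cannot exceed the number of locations running the test.
alert_can_fire() {
  failed_location_count="$1"
  geo_location_count="$2"
  [ "$failed_location_count" -le "$geo_location_count" ]
}

# Example values are assumptions, not the module defaults.
if alert_can_fire 3 2; then
  echo "reachable"
else
  echo "unreachable: 3 failed locations required, only 2 configured"
fi
```

With two locations, at most two can fail at once, so a threshold of three is never met and the alert stays silent even during a full outage.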

Cost impact

Per-execution billing: a few cents to low single-digit dollars per test per month (locations × runs) at the default 5-minute frequency, plus availability-result ingestion. Metric alerts cost ~$0.10 each per month. The Part C apply costs ~$0–$1.
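
The locations × runs term can be worked out directly. The sketch below only counts executions; it assumes a 30-day month and, in the example, five locations per test, neither of which comes from the module. Per-execution prices are deliberately left out.

```shell
# Executions per month for one availability test:
#   locations * (seconds in the month / frequency in seconds)
# The 30-day default is an assumption for illustration.
monthly_executions() {
  locations="$1"
  frequency_seconds="$2"
  days="${3:-30}"
  echo $(( locations * days * 86400 / frequency_seconds ))
}

# Example: 5 locations (an assumed count) at the default 5-minute frequency.
monthly_executions 5 300   # prints 43200
```

Each location contributes 8,640 runs per 30-day month at a 5-minute frequency, so the count scales linearly with the number of locations and tests.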

Removal path

terraform destroy (Part C step 7) removes every resource and the RG. Verified clean in Part C.


Sign-Off

Field                         Value
Part A (validate)             ☐ PASS
Part B (offline suite)        ☐ PASS
Part C (integration apply)    ☐ PASS / ☐ skipped
Part C signal test            ☐ PASS / ☐ skipped
Tester
Date
Result                        ☐ PASS