Manual Test Runbook — X8: Synthetic Monitoring
Owner: Sagar | Time: ~8 min (Parts A + B offline) · +15 min (optional Part C integration apply) | Sandbox: snowops-sandbox-01
Promotes X8 (modules/azure/synthetic-monitoring/) from 🟦 Code Complete → 🟩 Shipped. Parts A + B are offline ($0). Part C applies an Application Insights component + two availability tests + alerts to the sandbox (~$0–$1, per-execution billing) and destroys.
Prerequisites
- Sandbox subscription access active (PIM activated if required)
- az login done; sandbox subscription selected
- Identity has Contributor on the sandbox sub
- A sandbox Log Analytics workspace ARM ID (for the workspace-based AI component)
- SNOWOPS_SANDBOX_SUBSCRIPTION_ID + SNOWOPS_SANDBOX_TENANT_ID exported
- Local tooling: terraform >= 1.6, go >= 1.22, az CLI >= 2.50
- Working directory: repo root
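The prerequisites above can be checked mechanically before starting. The following is a minimal sketch, not part of the module: the helper names (require_env, version_ge, is_law_arm_id) are hypothetical, and version_ge assumes a sort that supports -V (GNU coreutils and recent BSD/macOS sort do).

```bash
#!/bin/sh
# Preflight sketch for the prerequisites. All helper names are hypothetical.

# require_env NAME...: succeed only if every named variable is set and non-empty.
require_env() {
  missing=0
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing env var: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# version_ge ACTUAL MINIMUM: succeed if ACTUAL >= MINIMUM.
# Assumption: sort supports -V (version sort).
version_ge() {
  lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
  [ "$lowest" = "$2" ]
}

# is_law_arm_id ID: succeed if ID has the shape of a Log Analytics
# workspace ARM ID. This checks the format only, not that the workspace exists.
is_law_arm_id() {
  printf '%s\n' "$1" | grep -Eiq \
    '^/subscriptions/[^/]+/resourceGroups/[^/]+/providers/Microsoft\.OperationalInsights/workspaces/[^/]+$'
}

# Example usage (values are placeholders):
#   require_env SNOWOPS_SANDBOX_SUBSCRIPTION_ID SNOWOPS_SANDBOX_TENANT_ID
#   version_ge 1.7.5 1.6 && echo "terraform version OK"
#   is_law_arm_id "$WORKSPACE_ID" || echo "workspace_id does not look like a LAW ARM ID"
```

The version strings still have to be read off each tool's own version output; the sketch only compares them.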
Steps
Part A — terraform fmt + validate (offline, ~3 min)
- Module + example:
terraform -chdir=modules/azure/synthetic-monitoring fmt -recursive -check
terraform -chdir=modules/azure/synthetic-monitoring init -backend=false -input=false
terraform -chdir=modules/azure/synthetic-monitoring validate
terraform -chdir=modules/azure/synthetic-monitoring/examples/basic init -backend=false -input=false
terraform -chdir=modules/azure/synthetic-monitoring/examples/basic validate
Expected: Success! for both.
- Offline Terratest case:
Expected: PASS — exercises the workspace-based AI creation path, a GET test (content match + SSL check) and a POST test (headers + SSL off), the dynamic header/content blocks, and the per-test availability alerts.
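No command is shown for the offline Terratest case above. A plausible invocation, assuming the case is the TestSyntheticMonitoringValidate test named in Part B and that it lives in tests/terratest with the rest of the suite, is sketched below. The helper name run_terratest_case and the DRY_RUN switch are hypothetical.

```bash
#!/bin/sh
# Hypothetical helper: run one Terratest case by name from the suite directory.
# Assumption: the test lives under tests/terratest, as in Part B.
# Set DRY_RUN=1 to print the command instead of executing it.
run_terratest_case() {
  pattern="^$1\$"   # anchored so only the named test matches
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "go test -count=1 -timeout 15m -run '$pattern' ./..."
  else
    (cd tests/terratest && go test -count=1 -timeout 15m -run "$pattern" ./...)
  fi
}

# Usage, from the repo root:
#   run_terratest_case TestSyntheticMonitoringValidate
```

Anchoring the -run pattern keeps any similarly named tests from running in the same invocation.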
Part B — full Terratest suite (offline, ~5 min)
cd tests/terratest && go test -count=1 -timeout 15m ./...
Expected: the full suite green (the new TestSyntheticMonitoringValidate included).
Part C — integration apply (sandbox, ~15 min, ~$0–$1)
- Apply the fixture against a real sandbox workspace. Edit the fixture's workspace_id to a real sandbox Log Analytics workspace (LAW) ARM ID (and optionally point a web test at a URL you control), then:
cd tests/terratest/fixtures/synthetic-monitoring
terraform init -input=false
terraform apply -auto-approve \
-var "subscription_id=$SNOWOPS_SANDBOX_SUBSCRIPTION_ID" \
-var "tenant_id=$SNOWOPS_SANDBOX_TENANT_ID"
- Confirm the component, web tests, and alerts exist:
RG=snowops-x8-test-rg
az monitor app-insights component show -g "$RG" --app snowops-x8-test-ai \
--query "{name:name, kind:kind, workspace:workspaceResourceId}" -o json
az resource list -g "$RG" --resource-type "Microsoft.Insights/webtests" \
--query "[].name" -o table
az monitor metrics alert list -g "$RG" \
--query "[].{name:name, severity:severity, enabled:enabled}" -o table
Expected: the AI component is workspace-based; two web tests
(snowops-x8-test-home, snowops-x8-test-health-api); two availability
alerts.
- (Optional, signal test) Point a test at a URL that returns a non-200 status (or at an unreachable host), wait two evaluation cycles (~5–10 min), and confirm the availability alert fires (check the Alerts blade, or confirm the action group sends a notification).
- Destroy:
terraform destroy -auto-approve \
-var "subscription_id=$SNOWOPS_SANDBOX_SUBSCRIPTION_ID" \
-var "tenant_id=$SNOWOPS_SANDBOX_TENANT_ID"
Expected: clean destroy — web tests, alerts, the AI component, and the RG are removed. (A real sandbox workspace passed by ID is untouched.)
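The confirm and destroy expectations in Part C can be turned into pass/fail checks. This is a sketch under stated assumptions: the helper names (expect_equal, check_applied, check_destroyed) are hypothetical, the resource group name is the one used in the confirm step, and the expected counts (2 web tests, 2 alerts) come from the expectations above. check_applied is meant to run after apply and before destroy; check_destroyed runs after destroy.

```bash
#!/bin/sh
# expect_equal LABEL EXPECTED ACTUAL: print a PASS/FAIL line, return 0 or 1.
expect_equal() {
  if [ "$2" = "$3" ]; then
    echo "PASS: $1 = $3"
  else
    echo "FAIL: $1 expected $2, got $3"
    return 1
  fi
}

RG=snowops-x8-test-rg

# Run after apply, before destroy: count web tests and metric alerts in the RG.
check_applied() {
  status=0
  web_tests=$(az resource list -g "$RG" \
    --resource-type "Microsoft.Insights/webtests" --query "length(@)" -o tsv)
  alerts=$(az monitor metrics alert list -g "$RG" --query "length(@)" -o tsv)
  expect_equal "web tests" 2 "$web_tests" || status=1
  expect_equal "availability alerts" 2 "$alerts" || status=1
  return "$status"
}

# Run after destroy: with JSON output, az group exists prints true or false.
check_destroyed() {
  expect_equal "resource group exists" false \
    "$(az group exists --name "$RG" -o json)"
}
```

Neither function touches the sandbox Log Analytics workspace, which lives outside the test resource group.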
Pass criteria
- Part A: module + example validate; TestSyntheticMonitoringValidate passes
- Part B: full offline suite passes
- (Part C) fixture applies the AI component + 2 web tests + 2 alerts; destroys clean
- (Part C, optional) a down endpoint fires the availability alert
- All test resources removed
Failure mode
An availability alert can never fire when failed_location_count exceeds the
number of geo_locations, and a test with too few locations can flap. Documented in the
README; a plan-time precondition rejects failed_location_count > the number of geo_locations.
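The rule behind that precondition can be illustrated outside Terraform. The sketch below mirrors the comparison in shell; it is not the module's own code, the function name check_alert_threshold is hypothetical, and the location names in the usage comment are placeholders rather than real Azure location IDs.

```bash
#!/bin/sh
# check_alert_threshold FAILED_LOCATION_COUNT LOCATION...
# Succeeds when the threshold can be reached with the given locations.
check_alert_threshold() {
  threshold=$1
  shift   # the remaining arguments are the locations
  if [ "$threshold" -gt "$#" ]; then
    echo "failed_location_count ($threshold) exceeds geo_locations ($#): alert can never fire" >&2
    return 1
  fi
}

# Usage (placeholder location names):
#   check_alert_threshold 2 loc-a loc-b loc-c   # succeeds
#   check_alert_threshold 4 loc-a loc-b loc-c   # fails with a message on stderr
```

The check covers only the can-never-fire case; flapping from too few locations is a tuning question that a simple comparison does not catch.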
Cost impact
Per-execution billing: a few cents to low single-dollars per test per month (locations × runs) at the default 5-minute frequency, plus availability-result ingestion. Metric alerts ~$0.10 each/month. Part C apply is ~$0–$1.
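The locations × runs factor can be worked out directly. The sketch below counts executions per test per month at a given frequency; it assumes a 30-day month, and the location count of 5 in the example is illustrative only. No price is applied, since the per-execution rate is not stated here.

```bash
#!/bin/sh
# executions_per_month LOCATIONS FREQUENCY_SECONDS [DAYS]
# Each location runs the test once per frequency interval.
executions_per_month() {
  days=${3:-30}   # assumption: 30-day month unless overridden
  echo $(( $1 * ($days * 24 * 60 * 60 / $2) ))
}

# One test, 5 locations (illustrative), default 5-minute (300 s) frequency:
executions_per_month 5 300   # prints 43200
```

At the 5-minute default, each location contributes 8640 executions over 30 days, so the total scales linearly with the number of locations.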
Removal path
terraform destroy (Part C step 7) removes every resource and the RG. Verified
clean in Part C.
Sign-Off
| Field | Value |
|---|---|
| Part A (validate) | ☐ PASS |
| Part B (offline suite) | ☐ PASS |
| Part C (integration apply) | ☐ PASS / ☐ skipped |
| Part C signal test | ☐ PASS / ☐ skipped |
| Tester | |
| Date | |
| Result | ☐ PASS |