Manual Test Runbook — F4: Azure Container Registry

Owner: Sagar  |  Time: ~35 min (Parts A to C) · +20 min (optional Part D push-pull)  |  Sandbox: snowops-sandbox-01

Promotes F4 (modules/azure/acr/) from 🟦 Code Complete → 🟩 Shipped. Part C costs ~$5 (Premium ACR hourly + PE). Skip Part C / D if not iterating on the registry data plane.


Prerequisites

  • Sandbox subscription access active (PIM activated if required)
  • az login done; az account show confirms the sandbox subscription is selected
  • Identity has Contributor + User Access Administrator on the sandbox subscription
  • Local tooling: terraform >= 1.6, go >= 1.22, az CLI >= 2.50, docker, notation (for Part D only)
  • SNOWOPS_SANDBOX_SUBSCRIPTION_ID and SNOWOPS_SANDBOX_TENANT_ID env vars set
  • Working directory: repo root
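
The prerequisites above can be checked before starting. The following is a minimal preflight sketch, not part of the repo: the helper names (version_ge, require_env, require_tools) are hypothetical, and version_ge assumes GNU `sort -V` is available.

```shell
# version_ge HAVE WANT: succeeds when version HAVE is >= WANT (relies on GNU `sort -V`).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# require_env NAME...: reports every unset or empty variable, fails if any is missing.
require_env() {
  rc=0
  for name in "$@"; do
    eval "value=\${$name:-}"
    [ -n "$value" ] || { echo "missing env var: $name" >&2; rc=1; }
  done
  return "$rc"
}

# require_tools NAME...: reports every command that is not on PATH, fails if any is missing.
require_tools() {
  rc=0
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || { echo "missing tool: $tool" >&2; rc=1; }
  done
  return "$rc"
}
```

Possible usage: `require_tools terraform go az docker`, then `require_env SNOWOPS_SANDBOX_SUBSCRIPTION_ID SNOWOPS_SANDBOX_TENANT_ID`, then compare `az account show --query id -o tsv` against SNOWOPS_SANDBOX_SUBSCRIPTION_ID. Each tool's version can be passed to version_ge alongside the minimum listed above, for example `version_ge "$(az version --query '"azure-cli"' -o tsv)" 2.50`.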

Steps

Part A — terraform fmt + validate (offline, ~1 min)

  1. Confirm formatting + structural validity of the module on its own:
terraform -chdir=modules/azure/acr fmt -check
terraform -chdir=modules/azure/acr init -backend=false -input=false
terraform -chdir=modules/azure/acr validate

Expected: Success! The configuration is valid.

  2. From the repo root, run the F4-relevant offline Terratest tests across the azure/ tree:
cd tests/terratest
go test -v -timeout 5m ./modules/azure/... -run 'TestACRValidate|TestF4RegistryContractConformance|TestContractsRejectBadLiterals'

Expected: 3 top-level tests pass; the registry-missing-login-server sub-test under TestContractsRejectBadLiterals is the F4-relevant negative case.


Part B — full Terratest suite (offline, ~3 min)

  1. From the repo root, run the whole offline suite to confirm F4 hasn't regressed F1/F2/F6:
cd tests/terratest
go test -v -timeout 10m ./...

Expected: ~11 top-level tests pass (TestNoopHarness, TestBaselineValidate, TestStateBackendValidate, TestSandboxValidate, TestF1ContractConformance, TestF6ObjectStoreContractConformance, TestF2NetworkContractConformance, TestNetworkHubValidate, TestACRValidate, TestF4RegistryContractConformance, TestContractsRejectBadLiterals with 6 sub-tests).
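
The "N top-level tests pass" expectations in Parts A and B can be checked mechanically from saved `go test -v` output. This is a rough sketch with hypothetical helper names. It relies on `go test -v` printing top-level results as unindented `--- PASS:` lines and sub-test results as indented ones.

```shell
# count_top_level_passes FILE: counts unindented "--- PASS:" lines in `go test -v` output.
# Sub-test results are indented, so they are not counted.
count_top_level_passes() {
  grep -c '^--- PASS: ' "$1"
}

# has_failures FILE: succeeds if any top-level test or sub-test reported FAIL.
has_failures() {
  grep -q '^ *--- FAIL: ' "$1"
}
```

Possible usage: append `2>&1 | tee /tmp/f4-offline.log` (an arbitrary path) to the go test command, then run `count_top_level_passes /tmp/f4-offline.log` and compare against the expected count (3 for Part A, ~11 for Part B).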


Part C — integration test (real Azure apply + destroy, ~30 min, ~$5)

Skip if iterating on offline changes only. The dominant cost is the Premium ACR ($0.92/hr) — keep the test short.

  1. Export sandbox env vars (same as F1 / F2 / F6):
export SNOWOPS_SANDBOX_SUBSCRIPTION_ID="<sandbox-subscription-guid>"
export SNOWOPS_SANDBOX_TENANT_ID="<sandbox-tenant-guid>"
  2. From the repo root, run the F4 integration test:
cd tests/terratest
go test -v -tags integration -timeout 60m ./modules/azure/... -run TestACRModule
  3. Watch for key milestones:
    • Plan: ~15 to add, 0 to change, 0 to destroy. Breakdown: F2 (RG + hub vnet + 1 hub subnet + 1 spoke vnet + 2 spoke subnets + 2 NSGs + 2 NSG associations + 2 peerings + 1 Private DNS zone + 2 vnet links) + F4 (RG + ACR + PE + PE psc + PE dns zone group) = ~15 resources.
    • azurerm_container_registry.this: Creation complete after ~30s (fast).
    • azurerm_private_endpoint.this: Still creating... (typically ~2 min).
    • All output assertions PASS.
    • Destroy complete! (clean teardown).
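
The plan milestone above can be checked from the test log rather than by eye. The sketch below uses hypothetical helper names. It parses a Terraform plan summary line and confirms that the run only adds resources, which is what a fresh sandbox apply should show.

```shell
# plan_counts LINE: prints "ADD CHANGE DESTROY" parsed from a Terraform plan summary line.
# Prints nothing when the line carries no summary.
plan_counts() {
  printf '%s\n' "$1" |
    sed -n 's/.*Plan: \([0-9][0-9]*\) to add, \([0-9][0-9]*\) to change, \([0-9][0-9]*\) to destroy.*/\1 \2 \3/p'
}

# plan_is_additive LINE: succeeds only when the plan adds resources and changes or destroys none.
plan_is_additive() {
  set -- $(plan_counts "$1")
  [ "$#" -eq 3 ] && [ "$1" -gt 0 ] && [ "$2" -eq 0 ] && [ "$3" -eq 0 ]
}
```

Possible usage: `plan_is_additive "$(grep 'Plan: ' <saved-test-log> | head -n 1)"`, where the log path is whatever the go test output was saved to.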

Part D — synthetic push + verify-from-PE (optional, ~20 min)

Verifies the data plane (private DNS resolution + PE pull) works end-to-end. Requires docker + (for signature verification) notation CLI.

  1. After the Part C apply but before destroy, log in to the ACR over its private endpoint. The integration fixture keeps public_network_access_enabled = false, so the push must originate inside the network: either from a peered VM, or from any host where az acr login resolves the registry to the PE. Cheapest path: deploy a temp VM into the apps/workload subnet of the F2 spoke.
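# The system-assigned identity with AcrPush lets `az login --identity` push in the next step.
az vm create --resource-group "<f2-net-rg>" \
  --name "f4-push-vm" \
  --image "Ubuntu2204" \
  --vnet-name "<f2-spoke-vnet>" \
  --subnet "workload" \
  --public-ip-address "" \
  --admin-username "snowops" \
  --generate-ssh-keys \
  --size "Standard_B2s" \
  --assign-identity "[system]" \
  --role "AcrPush" \
  --scope "<acr-resource-id>"
# Install docker and the az CLI (both are used from the VM in the next step).
az vm run-command invoke --resource-group "<f2-net-rg>" --name "f4-push-vm" \
  --command-id "RunShellScript" \
  --scripts "sudo apt-get update -y && sudo apt-get install -y docker.io && sudo usermod -aG docker snowops && curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash"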
  2. From the VM, push a sample image to the ACR via its private endpoint:
az ssh vm --resource-group "<f2-net-rg>" --name "f4-push-vm" -- \
  -t "az login --identity && \
      az acr login --name <registry-name> && \
      docker pull alpine:3.20 && \
      docker tag alpine:3.20 <registry-name>.azurecr.io/snowops/alpine:3.20 && \
      docker push <registry-name>.azurecr.io/snowops/alpine:3.20"

Expected: push succeeds. DNS lookup on the VM resolves <registry-name>.azurecr.io to the PE's private IP (10.10.2.x in the F2 fixture's address space) — confirm via:

az ssh vm --resource-group "<f2-net-rg>" --name "f4-push-vm" -- \
  -t "dig +short <registry-name>.azurecr.io"
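
The dig check above can be turned into a pass/fail test. This is a minimal sketch with hypothetical helper names. It assumes `dig +short` prints any CNAME hops first and the A record last, and it takes the expected prefix (10.10.2. in the F2 fixture, per the text above) as an argument.

```shell
# last_ipv4 TEXT: prints the last line of TEXT that looks like a dotted-quad IPv4 address.
last_ipv4() {
  printf '%s\n' "$1" | grep -E '^[0-9]{1,3}(\.[0-9]{1,3}){3}$' | tail -n 1
}

# in_prefix IP PREFIX: succeeds when IP starts with the dotted PREFIX, e.g. "10.10.2.".
# Keep the trailing dot so "10.10.2." does not also match 10.10.20.x.
in_prefix() {
  case "$1" in
    "$2"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

Possible usage on the VM: `in_prefix "$(last_ipv4 "$(dig +short <registry-name>.azurecr.io)")" "10.10.2."`. A public IP here would mean the lookup bypassed the private DNS zone.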
  3. Pull the image back in a separate SSH session to confirm the PE resolves consistently:
az ssh vm --resource-group "<f2-net-rg>" --name "f4-push-vm" -- \
  -t "docker pull <registry-name>.azurecr.io/snowops/alpine:3.20"

Expected: pull succeeds from the same PE.

  4. Optional: sign with Notation v2 and verify (full D4 + F4 loop):

    # On the VM, after installing `notation` (see the Notary Project docs at notaryproject.dev):
    notation cert generate-test --default "snowops-test"
    notation sign --signature-format jws <registry-name>.azurecr.io/snowops/alpine:3.20
    # `notation verify` needs a trust policy that trusts the test certificate;
    # configure one first (see the Notary Project trust policy docs).
    notation verify <registry-name>.azurecr.io/snowops/alpine:3.20
    

    Expected: signature operations succeed; ACR stores the signature manifest alongside the image manifest in the same repository.

  5. Delete the probe VM; terraform destroy then tears down the rest.


Pass criteria

  • Part A — terraform validate passes for the module
  • Part B — full offline Terratest suite passes (11+ top-level tests)
  • Part C — TestACRModule integration test passes end-to-end
  • Registry created at Premium SKU
  • public_network_access_enabled = false confirmed via portal or az acr show --query publicNetworkAccess
  • admin_enabled = false confirmed via portal or az acr show --query adminUserEnabled
  • Private Endpoint reachable; private IP non-empty; A-record present in F2's privatelink.azurecr.io zone
  • (Part D) Push + pull from VM in F2 spoke succeeds over Private Endpoint
  • (Part D) dig resolves <registry-name>.azurecr.io to the PE private IP
  • (Part D) Notation v2 sign + verify round-trip succeeds
  • All Destroy calls complete without error
  • No orphaned resource groups remain (verify with az group list -o table)
  • All test resources tagged ephemeral = true (X7 cleanup safety net)
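
The three registry-setting criteria above (Premium SKU, public access disabled, admin user disabled) can be checked in one pass. The sketch below uses hypothetical helper names and takes the three values as arguments, so it makes no assumptions about how they were fetched. It compares case-insensitively because the casing of the CLI's boolean output is an assumption here.

```shell
# lower TEXT: lowercases TEXT so "False" and "false" compare equal.
lower() { printf '%s' "$1" | tr '[:upper:]' '[:lower:]'; }

# check_registry SKU PUBLIC_ACCESS ADMIN_ENABLED: prints one line per failed
# criterion and fails if any criterion is not met.
check_registry() {
  rc=0
  [ "$(lower "$1")" = "premium" ]  || { echo "SKU is '$1', expected Premium"; rc=1; }
  [ "$(lower "$2")" = "disabled" ] || { echo "publicNetworkAccess is '$2', expected Disabled"; rc=1; }
  [ "$(lower "$3")" = "false" ]    || { echo "adminUserEnabled is '$3', expected false"; rc=1; }
  return "$rc"
}
```

Possible usage: fetch each value with `az acr show --name <registry-name> --query sku.name -o tsv` (and likewise for publicNetworkAccess and adminUserEnabled), then pass the three results to check_registry.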

Teardown

The integration test runs terraform destroy automatically. If a failure mid-run orphans resources, clean up manually:

# Two RGs: <name_prefix>-net-rg (F2) and <name_prefix>-acr-rg (F4)
az group delete --name "<name_prefix>-net-rg" --yes --no-wait
az group delete --name "<name_prefix>-acr-rg" --yes --no-wait

Premium ACR destroy takes ~30 seconds; the PE detach can stretch to ~2 minutes.
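
Because `--no-wait` returns before the deletes finish, a small polling loop helps confirm that both resource groups are really gone. This is a sketch with a hypothetical helper name. It re-runs whatever command it is given until that command prints "false", which matches the output of `az group exists` for a deleted group.

```shell
# wait_until_false ATTEMPTS DELAY CMD...: re-runs CMD until it prints "false",
# sleeping DELAY seconds between tries; fails after ATTEMPTS tries.
wait_until_false() {
  attempts=$1; delay=$2; shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    [ "$("$@")" = "false" ] && return 0
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

Possible usage: `wait_until_false 30 10 az group exists --name "<name_prefix>-acr-rg"`, repeated for the `-net-rg` group. The 30 attempts and 10-second delay are arbitrary starting values.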


Sign-off

  • Tester: _  |  Date: _  |  Result: PASS / FAIL / N/A
  • Notes: