Manual Test Runbook — C1: terraform-plan-apply reusable workflow

Owner: Sagar  |  Time: ~30 min  |  Sandbox: X1

Purpose

Validate that .github/workflows/terraform-plan-apply.yml (C1) correctly:

  1. Runs the terraform fmt check on every event.
  2. Plans against the X1 sandbox stack and posts a PR comment.
  3. Gates apply behind the sandbox GitHub Environment and runs it only on merge to main when the plan reported changes.
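
The apply gate in item 3 can be read as a small predicate. The sketch below is illustrative only: the function name gate_decision and its arguments are hypothetical. They mirror GitHub's event name and ref values plus the exit code of terraform plan -detailed-exitcode (0 = no changes, 1 = error, 2 = changes). The real workflow presumably expresses this as job-level conditions in YAML, which this runbook does not reproduce, and the environment approval is enforced by GitHub rather than by any script.

```shell
# Hypothetical helper (not part of the workflow): prints "apply" only when
# every gate condition from the Purpose list holds, otherwise "skip".
#   $1 = event name (e.g. push, pull_request)
#   $2 = git ref    (e.g. refs/heads/main)
#   $3 = exit code of `terraform plan -detailed-exitcode`
gate_decision() {
  if [ "$1" = "push" ] && [ "$2" = "refs/heads/main" ] && [ "$3" -eq 2 ]; then
    echo "apply"
  else
    echo "skip"
  fi
}

# Example: a merge to main whose plan reported changes.
gate_decision push refs/heads/main 2
```

A PR event, a push to another branch, or a plan with exit code 0 all yield "skip", which is what steps 2 to 4 below verify by hand.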

Prerequisites

  • X1 sandbox subscription provisioned and sandbox/ applied at least once (DoD criterion 4 for X1 may still be pending; that's OK — C1 only needs the state backend to exist).
  • GitHub repo has these repository variables set (Settings → Secrets and variables → Actions → Variables):
      • SANDBOX_STATE_RG = snowops-sandbox-state-rg
      • SANDBOX_STATE_SA = snowopssandboxstate<random> (from bootstrap.sh output)
      • SANDBOX_STATE_KEY = sandbox.tfstate
  • GitHub repo has these repository secrets:
      • AZURE_CLIENT_ID = the Azure AD app registration consuming OIDC (created out-of-band; B2 will codify this)
      • AZURE_TENANT_ID
      • AZURE_SUBSCRIPTION_ID
  • GitHub Environment sandbox exists. Optionally configure:
      • Required reviewers (Sagar, Nidhi)
      • Wait timer
      • Deployment branch rule: main only
  • Azure AD federated credential on the app registration permits this repo. Two subjects, one each for plan and apply:
      • repo:<org>/snowops-automation:pull_request (for plan-on-PR)
      • repo:<org>/snowops-automation:environment:sandbox (for apply)
  • Repo branch protection: main requires the sandbox-plan-apply / terraform / terraform plan check before merge.
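
Subject mismatches are a common source of login failures, so it can help to generate the two expected subject strings and compare them character by character with what is configured. This is a minimal sketch: the helper name oidc_subject and the org value acme are hypothetical placeholders, and the subject formats are taken from the prerequisite list above.

```shell
# Hypothetical helper: prints the federated credential subject expected for
# a given job kind, using the formats listed in the prerequisites.
#   $1 = GitHub org, $2 = repo name, $3 = "plan" or "apply"
oidc_subject() {
  case "$3" in
    plan)  printf 'repo:%s/%s:pull_request\n' "$1" "$2" ;;
    apply) printf 'repo:%s/%s:environment:sandbox\n' "$1" "$2" ;;
    *)     echo "unknown kind: $3" >&2; return 1 ;;
  esac
}

# "acme" is a placeholder org, not the real one.
oidc_subject acme snowops-automation plan
oidc_subject acme snowops-automation apply
```

The printed values can be compared against the subjects shown for the app registration in the portal, or in the output of az ad app federated-credential list --id <app-id>.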

Steps

1. Confirm local prerequisites

terraform -chdir=sandbox fmt -check -recursive
terraform -chdir=sandbox init -backend=false -input=false
terraform -chdir=sandbox validate
  • All three commands pass cleanly.

2. Open a no-op PR

Create a branch with a comment-only change in sandbox/:

git checkout -b test/c1-no-op
printf '\n# c1 test\n' >> sandbox/README.md
git commit -am "test: trigger C1 with no-op change"
git push -u origin test/c1-no-op
gh pr create --title "Test C1: no-op" --body "Verifies terraform-plan-apply on a no-op."
  • sandbox-plan-apply workflow runs.
  • fmt job succeeds.
  • plan job succeeds with exit code 0 (no infra changes).
  • A PR comment titled "Terraform plan — sandbox" appears reading "✅ No changes — plan matches state."
  • No apply job runs (PR event).
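
The exit codes checked in this step and in step 4 follow the terraform plan -detailed-exitcode convention: 0 means the plan succeeded with an empty diff, 1 means an error, and 2 means the plan succeeded with changes present. The sketch below assumes the workflow passes that flag (the runbook implies it but does not show it); plan_status is a hypothetical name.

```shell
# Hypothetical helper: maps a `terraform plan -detailed-exitcode` exit code
# to a short status label.
#   $1 = exit code from the plan command
plan_status() {
  case "$1" in
    0) echo "no-changes" ;;  # plan succeeded, diff is empty
    2) echo "changes" ;;     # plan succeeded, diff is non-empty
    *) echo "error" ;;       # 1 (or anything unexpected) is a failure
  esac
}

plan_status 0
```

Without -detailed-exitcode, a successful plan returns 0 whether or not it contains changes, so the no-op and change-bearing PRs would be indistinguishable by exit code.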

3. Open a change-bearing PR

git checkout -b test/c1-budget-bump
# Bump the sandbox budget cap in sandbox/terraform.tfvars by 50, or
# add a placeholder tag to extra_tags via tfvars.
git commit -am "test: bump sandbox budget for C1 verification"
git push -u origin test/c1-budget-bump
gh pr create --title "Test C1: changes" --body "Verifies plan comment + apply gate."
  • PR comment is updated in place (not duplicated) on subsequent pushes.
  • Comment shows "📝 Changes proposed" with the expected diff in the <details> block.
  • Workflow run shows tfplan-<run-id> artifact uploaded.
  • No apply yet (PR event).
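
Updating the comment in place depends on finding the previous plan comment by a stable marker such as its title. The workflow does this inside an actions/github-script step; the shell sketch below only illustrates the idea for manual debugging. The helper name find_plan_comment and the tab-separated input format are assumptions.

```shell
# Hypothetical helper: reads "<comment-id><TAB><first line of body>" rows on
# stdin and prints the id of the first row whose body starts with the marker.
# Prints nothing when no row matches.
#   $1 = marker text to look for at the start of the comment body
find_plan_comment() {
  awk -F '\t' -v marker="$1" 'index($2, marker) == 1 { print $1; exit }'
}

# Example with made-up comment ids and bodies.
printf '11\tLGTM\n22\tTerraform plan for sandbox\n' | find_plan_comment 'Terraform plan'
```

Rows in this shape could be produced with something like gh api "repos/<org>/snowops-automation/issues/<pr-number>/comments" --jq '.[] | [.id, (.body | split("\n")[0])] | @tsv'. If the marker matches more than one comment on the PR, the detection has already failed at least once and duplicates exist.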

4. Merge to main and watch the gate

  • Merge the PR (squash). Confirm the push-triggered run starts.
  • plan job succeeds with exit code 2 (changes present).
  • apply job enters "Waiting for review" if required reviewers are configured on the sandbox environment; otherwise it runs immediately.
  • After approval, terraform apply consumes the saved tfplan.binary and completes without re-planning.
  • Azure portal reflects the new state (e.g., budget amount updated).
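
The "no re-planning" check relies on Terraform's saved-plan behaviour: when terraform apply is given a plan file, it executes exactly that plan and does not prompt for confirmation. A minimal sketch of the two commands follows; the wrapper names plan_saved and apply_saved are hypothetical, and the exact flags used by the workflow may differ.

```shell
# Hypothetical wrappers; $1 is the stack directory (sandbox in this runbook).
plan_saved() {
  # Writes the plan to tfplan.binary inside the stack directory.
  # Exit code: 0 = no changes, 1 = error, 2 = changes present.
  terraform -chdir="$1" plan -input=false -detailed-exitcode -out=tfplan.binary
}

apply_saved() {
  # Applies the saved plan file as-is: no new plan is computed and no
  # interactive approval is requested.
  terraform -chdir="$1" apply -input=false tfplan.binary
}
```

In the workflow the two commands run in separate jobs, so the plan file has to travel between them as the tfplan-<run-id> artifact. Terraform rejects a saved plan as stale if the state has changed since the plan was created, which is an additional safeguard.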

5. Drift test (manual mutation)

# Manually edit the budget in the portal (or via az CLI) to a different amount.
az consumption budget list --query "[?name=='snowops-sandbox-monthly-budget']"
  • Open a no-op PR. Plan now reports the manual drift as a change to revert.
  • This validates that state is being read from the remote backend, not from a local file.

6. Failure-mode probe

  • Temporarily introduce a syntax error in sandbox/budget.tf. Push to a PR.
  • fmt job fails fast, before any Azure auth is attempted.
  • Revert the change afterwards.

Pass criteria

  • Plan comment renders on PRs and updates in place on subsequent pushes.
  • No apply runs on PR events.
  • Apply runs only when all of these hold: push to main, plan exit code 2, and environment approval (when required).
  • Plan binary artifact is the one consumed by apply (no replan between approval and apply).
  • fmt failure blocks the workflow before any auth attempt.
  • Drift surfaces on the next no-op PR.

Failure modes & escalation

Symptom | Likely cause | Action
azure/login fails with AADSTS70025 / 70021 | Federated credential subject mismatch | Verify subject strings exactly match repo:<org>/<repo>:pull_request and …:environment:sandbox
Plan succeeds but apply hangs at "Waiting" | Reviewers not set on environment | Edit environment → Required reviewers, or remove the requirement
Apply re-plans from scratch | tfplan.binary artifact missing or mismatched run | Check artifact upload step ran; ensure same github.run_id between jobs
Plan comment posts twice | Existing-comment detection broke (e.g., header changed) | Inspect actions/github-script step in workflow logs
Backend reinit required | Lock file drifted | Re-run; cache key includes lock-file hash and refreshes

Sign-off

  • Tester: ___  |  Date: ___  |  Result: PASS / FAIL / N/A
  • Notes: