Manual Test Runbook — X5: Pipeline Integration Tests
Owner: Sagar | Time: ~4 min (Part A offline) · +20–40 min (Part C live consumers) | Cloud: none for Part A · sandbox for Part C
Promotes X5 (tests/pipeline-integration/ + .github/workflows/pipeline-integration.yml + it-* consumers) from 🟦 Code Complete → 🟩 Shipped. Part A is the offline reusable-workflow contract gate ($0). Part C runs the live it-* consumers against the X1 sandbox.
Prerequisites
- Local tooling: python3 + PyYAML (pip install pyyaml)
- (Part C only) gh authenticated; sandbox secrets/vars configured on the repo:
  - vars: SANDBOX_ACR_LOGIN_SERVER, SANDBOX_NOTATION_CERT_KEY_ID, SANDBOX_STATE_RG, SANDBOX_STATE_SA, SANDBOX_ARGOCD_SERVER
  - secrets: AZURE_CLIENT_ID, AZURE_TENANT_ID, AZURE_SUBSCRIPTION_ID, ARGOCD_AUTH_TOKEN
- Working directory: repo root
Steps
Part A — offline contract gate (~4 min, $0)
- Run the full offline gate, validate.sh (unit tests + repo contract check):
Expected: the run ends with "==> OK: X5 offline gate passed." All 11 unit
tests pass and the repo contract check reports "OK: every caller honours its
reusable-workflow contract." (with one expected WARN for the client-only
image-scan/I1 workflow).
- (Optional) Prove the gate actually catches drift. Temporarily break a caller and confirm a FAIL, then revert:
# add a bogus input to a template caller
sed -i.bak 's/ image_name:/ bogus_input: x\n image_name:/' \
templates/client-repo/.github/workflows/container-build-sign.yml
python3 tests/pipeline-integration/contract_check.py; echo "exit: $?" # expect FAIL + exit 1
mv templates/client-repo/.github/workflows/container-build-sign.yml.bak \
templates/client-repo/.github/workflows/container-build-sign.yml
Expected: FAIL …: unknown input(s) not declared by the reusable workflow: bogus_input, exit 1.
- Confirm the CI gate wiring parses:
python3 -c "import yaml; yaml.safe_load(open('.github/workflows/pipeline-integration.yml')); print('OK')"
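The drift check in step 2 comes down to comparing the keys a caller passes under `with:` against the inputs the reusable workflow declares under `on.workflow_call.inputs`. The following is a minimal sketch of that comparison, not the actual contents of contract_check.py. It works on already-parsed dicts (what `yaml.safe_load` would return), the function names are hypothetical, and the "missing required input(s)" message is an assumption; only the "unknown input(s)" message is taken from the expected output above.

```python
from typing import Any, Dict, List


def declared_inputs(reusable: Dict[Any, Any]) -> Dict[str, Any]:
    """Return the inputs declared under on.workflow_call of a parsed workflow."""
    # PyYAML follows YAML 1.1 and loads the bare key `on` as the boolean True,
    # so the trigger block may sit under either key.
    triggers = reusable.get("on", reusable.get(True)) or {}
    if not isinstance(triggers, dict):
        return {}
    call = triggers.get("workflow_call") or {}
    return call.get("inputs") or {}


def check_caller(with_block: Dict[str, Any], reusable: Dict[Any, Any]) -> List[str]:
    """Compare a caller's `with:` keys against the declared inputs."""
    inputs = declared_inputs(reusable)
    problems: List[str] = []

    unknown = sorted(set(with_block) - set(inputs))
    if unknown:
        problems.append(
            "unknown input(s) not declared by the reusable workflow: "
            + ", ".join(unknown)
        )

    # Assumption: the real checker may or may not enforce required inputs.
    missing = sorted(
        name
        for name, spec in inputs.items()
        if (spec or {}).get("required") and name not in with_block
    )
    if missing:
        problems.append("missing required input(s): " + ", ".join(missing))

    return problems


# Illustrative reusable-workflow definition, keyed by True as PyYAML would load it.
REUSABLE = {
    True: {
        "workflow_call": {
            "inputs": {
                "image_name": {"required": True, "type": "string"},
                "fail_on_scan_findings": {"required": False, "type": "boolean"},
            }
        }
    }
}

for problem in check_caller({"image_name": "demo", "bogus_input": "x"}, REUSABLE):
    print("FAIL:", problem)
```

Handling both `"on"` and `True` as the trigger key matters when the YAML is loaded with PyYAML: without it, every reusable workflow would appear to declare no inputs and every caller would be flagged.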
Part C — live consumers against the sandbox (~20–40 min)
Each it-* workflow is dispatch-only and runs the corresponding reusable workflow end-to-end. Run the ones whose sandbox dependencies are available.
- C2 — container build/sign/scan (needs sandbox ACR + AKV cert):
gh workflow run it-container-build-sign.yml -f fixture=clean
gh workflow run it-container-build-sign.yml -f fixture=vulnerable -f fail_on_scan_findings=true
Expected: clean succeeds (image pushed + signed + Grype passes);
vulnerable fails at the Grype gate (proves fail-on-findings). Clean up the
pushed repos per container-build-sign/README.md.
- C3 — AKS deploy (needs sandbox AKS + ArgoCD). Install the fixture app once:
kubectl apply -f tests/pipeline-integration/aks-deploy/argocd-app.yaml
gh workflow run it-aks-deploy.yml
gh workflow run it-aks-deploy.yml -f rollback_drill=true
Expected: deploy syncs + smoke probe returns 200; the rollback drill deploys
a bad image and asserts auto-rollback. Tear down per aks-deploy/README.md.
- C1 — terraform plan (needs sandbox state backend):
Expected: OIDC login + backend init + fmt/validate + OPA post-plan gate all
pass in plan-only mode; a compliance snapshot artifact is emitted. Nothing is
applied (plan_only: true).
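Because the it-* workflows are dispatch-only, each Part C check is a dispatch followed by waiting for the resulting run. The sketch below only builds the gh command lines for that sequence, so it can be read or adapted without touching the sandbox. The helper names are hypothetical; the gh subcommands and flags used (`workflow run -f`, `run list --workflow --limit --json --jq`, `run watch --exit-status`) are standard GitHub CLI options.

```python
import shlex
from typing import Dict, List, Optional


def dispatch_cmd(workflow: str, inputs: Optional[Dict[str, str]] = None) -> List[str]:
    """Build `gh workflow run <workflow> -f key=value ...`."""
    cmd = ["gh", "workflow", "run", workflow]
    for key, value in (inputs or {}).items():
        cmd += ["-f", f"{key}={value}"]
    return cmd


def latest_run_id_cmd(workflow: str) -> List[str]:
    """Build the command that prints the id of the most recent run of a workflow."""
    return [
        "gh", "run", "list",
        "--workflow", workflow,
        "--limit", "1",
        "--json", "databaseId",
        "--jq", ".[0].databaseId",
    ]


def watch_cmd(run_id: int) -> List[str]:
    """Build the command that blocks until the run ends; non-zero exit if it failed."""
    return ["gh", "run", "watch", str(run_id), "--exit-status"]


# Print the C2 negative-path dispatch from the steps above as a shell line.
print(shlex.join(dispatch_cmd(
    "it-container-build-sign.yml",
    {"fixture": "vulnerable", "fail_on_scan_findings": "true"},
)))
```

To execute the commands, pass each list to `subprocess.run(cmd, check=True)`. For the vulnerable fixture a non-zero exit from the watch command is the expected result, so that case should be run without `check=True` and the return code inspected instead.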
Pass criteria
- Part A: validate.sh passes (11 unit tests + clean repo contract check)
- Part A: the drift-injection check (step 2) produces a FAIL + exit 1
- (Part C) at least one it-* consumer runs green against the sandbox
- (Part C) the C2 vulnerable fixture fails the scan gate (negative path)
- All sandbox test artifacts cleaned up
Failure mode
The failure this gate targets: a reusable workflow's workflow_call interface
changes and a caller is not updated. contract_check.py (the PR gate) detects
this offline, rather than it surfacing at run time in a client repo. One
caveat: a caller targeting a workflow at a pinned @ref is checked against the
current definition, not the pinned one, so bump the pin when the interface
changes (noted in the README).
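The pinned-@ref caveat is easier to see once a caller's `uses:` value is split into its parts. The sketch below assumes the two forms GitHub Actions accepts for reusable workflows, `{owner}/{repo}/.github/workflows/{file}@{ref}` and `./.github/workflows/{file}`, and shows how a checker could map either form to a file in the current checkout. The names and the example organisation are hypothetical; this is not taken from contract_check.py.

```python
from typing import NamedTuple, Optional


class UsesRef(NamedTuple):
    target: str          # everything before the "@"
    ref: Optional[str]   # tag, branch, or SHA; None for local "./" references
    is_local: bool


def parse_uses(uses: str) -> UsesRef:
    """Split a `uses:` value into its target and optional @ref."""
    target, sep, ref = uses.partition("@")
    return UsesRef(target, ref if sep else None, target.startswith("./"))


def repo_relative_path(uses: str) -> str:
    """Path of the referenced workflow file inside its repository."""
    parsed = parse_uses(uses)
    if parsed.is_local:
        return parsed.target[2:]
    # owner/repo/<path inside the repo>
    _owner, _repo, path = parsed.target.split("/", 2)
    return path


# Hypothetical pinned caller, for illustration only.
pinned = "example-org/platform/.github/workflows/container-build-sign.yml@v1"
parsed = parse_uses(pinned)
print(repo_relative_path(pinned), "pinned at", parsed.ref)
```

A checker built this way reads the file at the returned path from the working tree and ignores the ref, which is why a caller pinned to an older ref is still compared with the current interface.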
Cost impact
Part A is $0 (pure parsing). Part C costs sandbox ACR storage (a few cents, purged by X7) + transient AKS workload; C1 is plan-only ($0).
Removal path
Delete tests/pipeline-integration/ (checker + fixtures), the it-* consumer
workflows, and .github/workflows/pipeline-integration.yml. No infra is created
by the offline gate; Part C artifacts are sandbox-only and X7-cleaned.
Sign-Off
| Field | Value |
|---|---|
| Part A (offline gate) | ☐ PASS |
| Part A (drift detection) | ☐ PASS |
| Part C (live consumers) | ☐ PASS / ☐ skipped |
| Tester | |
| Date | |
| Result | ☐ PASS |