Skip to content

0001. Azure-first, cloud-agnostic by contracts

  • Status: Accepted
  • Date: 2026-05-27
  • Formalizes: D6, D18, D19
  • Deciders: Sagar, SnowOps engineering

Context

SnowOps's ICP is predominantly Azure today, but a meaningful share of prospects run AWS or GCP (see G13). We need to ship real, opinionated infrastructure now without painting ourselves into a single-cloud corner that makes AWS/GCP parity a rewrite.

Two failure modes to avoid: (1) a premature lowest-common-denominator abstraction that makes the Azure modules weaker than hand-written Azure; (2) deeply Azure-coupled modules with no seam, forcing a fork-and-rewrite for every new cloud.

Decision

We will be Azure-first but cloud-agnostic by construction, enforced through explicit interface contracts rather than a runtime abstraction layer:

  • modules/_contracts/ (F0) defines 7 cloud-agnostic contracts: network, identity, cluster, registry, key-vault, observability, and object_store. (object_store was added so F6 has a conforming shape — D18.)
  • Conformance is checked at terraform validate time via the candidate pattern (D19): each contract module exposes a typed variable "candidate" and an output "candidate" that echoes it. A provider module pipes its output through the contract module; mismatches fail validation. Strict on literals, permissive on inter-module unknowns.
  • Azure implementations (modules/azure/) are the deep, opinionated default. AWS/GCP (modules/aws/, modules/gcp/) are placeholders until M5 and must satisfy the same contracts.

Consequences

  • Easier: adding a second cloud is "implement the contract", not "redesign". The contract surface is small and explicit, and CI catches drift from it.
  • Easier: Azure modules stay first-class — the contract constrains the output shape, not the internal richness.
  • Harder: every new capability that crosses a contract boundary needs a contract change first, which is a deliberate speed bump.
  • Risk accepted: the contracts were defined after F1–F6 already shipped (gap G3), so some retrofitting was required; F0 is now sequenced first within M2a.

Alternatives considered

  • Runtime abstraction layer (e.g. Crossplane / Pulumi multi-cloud component): rejected — adds an operational dependency and dilutes Azure depth for a multi-cloud story we don't need until M5.
  • No abstraction, fork per cloud later: rejected — guarantees a rewrite and divergent behavior across clouds.

References

  • modules/_contracts/ (F0), modules/azure/, modules/aws/, modules/gcp/
  • docs/context/11-patterns.md (cloud-agnostic module pattern)
  • Decisions D6, D18, D19 in docs/context/07-decisions.md