Why Snap CD: Event-driven Continuous Deployment
Infrastructure deployment usually starts simple: someone runs terraform apply on their laptop, eyeballs the plan, and hits yes. That works fine with a small team and a handful of resources. But as the infrastructure grows — more states, more teams, more environments — the question shifts from "how do I apply this" to "how do I make sure the right things deploy at the right time, in the right order, without someone babysitting the process."
This guide walks through the common approaches to automating Terraform deployments, where each one breaks down, and how Snap CD's event-driven model addresses the gaps.
The manual era
Every Terraform project starts here:
cd infra/networking
terraform plan -out=plan.tfplan
# read the output carefully...
terraform apply plan.tfplan
This is fine until it isn't. The problems are well-known:
- No audit trail. Who applied what, when? You'd need to grep shell history or hope someone wrote it down.
- No ordering guarantee. If networking needs to be applied before compute, that lives in someone's head. A new team member doesn't know.
- Drift between environments. Dev gets the latest change; prod doesn't, because someone forgot.
- Stale plans. You run plan at 2pm, get distracted, and apply at 5pm. The world may have changed in between.
Most teams move away from manual applies within months of going to production.
Scheduled CI pipelines
The natural next step is to put terraform apply in CI. A pipeline runs on every merge to main, or on a cron schedule:
# Typical CI approach
on:
  push:
    branches: [main]
    paths: ['infra/networking/**']
jobs:
  apply:
    steps:
      - run: terraform init
      - run: terraform apply -auto-approve
This solves the audit trail (CI logs everything) and drift (cron catches config drift eventually). But it introduces new problems:
- No dependency awareness. You can trigger networking's pipeline on a path filter, but compute doesn't know to re-run when networking's outputs change. You end up writing brittle pipeline glue: "after networking finishes, trigger compute, then trigger DNS."
- Wasted runs. A cron-based pipeline runs every 15 minutes whether anything changed or not. Most runs produce empty plans.
- Blast radius of -auto-approve. If the pipeline auto-applies, a bad commit deploys immediately. If it doesn't, someone still has to watch it and click approve: you've automated the init and plan but not the decision.
- Cross-repo coordination. If networking and compute live in different repos, the path filter approach doesn't help. You need webhook chains or a shared orchestration layer.
Teams at this stage typically spend significant time maintaining CI configuration that is, in effect, a hand-rolled deployment orchestrator.
GitOps-style operators
Tools like Atlantis and similar Terraform GitOps operators move the trigger model closer to what you want: watch a repo, run plan on PR, apply on merge. This is a genuine improvement over raw CI — the plan is visible in the PR, approvals happen in the code review flow.
But the model has limits:
- Single-state focus. Most GitOps operators work within one repository or one state. They don't model the relationship between your networking state and your compute state.
- No cascading. When networking outputs change, nothing tells the compute operator to re-plan. You're back to manual coordination or webhook scripts.
- Approval is binary. You can approve a PR, but you can't say "this plan needs two approvals before apply" or "destroy plans need a different approval threshold than regular changes."
Event-driven deployment with Snap CD
Snap CD's approach is different: instead of triggering on CI events and bolting on dependency management after the fact, it models the dependency graph as a first-class concept and triggers deployments based on changes to that graph.
Source watching
Every Snap CD Module points at a source — a Git repository at a specific revision:
resource "snapcd_module" "networking" {
  name            = "networking"
  namespace_id    = snapcd_namespace.platform.id
  source_url      = "https://github.com/myorg/infra-networking.git"
  source_revision = "main"
  runner_id       = data.snapcd_runner.platform.id
}
Snap CD periodically checks the source for new commits. When it finds one, it triggers a plan. No CI pipeline configuration, no webhooks, no path filters.
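The polling idea can be illustrated with a small sketch. This is a conceptual model only, not Snap CD's implementation: the `SourceWatcher` class and the `get_head` callable are hypothetical names introduced here, and the sketch assumes the first poll counts as a change.

```python
from typing import Callable, Optional


class SourceWatcher:
    """Remembers the last commit seen and reports when the head moves."""

    def __init__(self, get_head: Callable[[], str]) -> None:
        # Hypothetical callable that returns the current commit SHA.
        self._get_head = get_head
        self._last_seen: Optional[str] = None

    def poll(self) -> bool:
        """Return True when a plan should be triggered."""
        head = self._get_head()
        if head == self._last_seen:
            return False  # nothing new, so no plan is started
        self._last_seen = head
        return True


# Simulated sequence of heads returned by three successive polls.
heads = iter(["a1b2c3", "a1b2c3", "d4e5f6"])
watcher = SourceWatcher(lambda: next(heads))
results = [watcher.poll() for _ in range(3)]
print(results)
```

Unlike a cron pipeline that plans on every tick, a poll that finds the same commit does no further work.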
If you prefer version-based releases over branch tracking, use semantic version ranges:
resource "snapcd_module" "networking" {
  name                 = "networking"
  namespace_id         = snapcd_namespace.platform.id
  source_url           = "https://github.com/myorg/infra-networking.git"
  source_revision      = "v2.*"
  source_revision_type = "SemanticVersionRange"
  runner_id            = data.snapcd_runner.platform.id
}
This tells Snap CD to resolve the latest v2.x.y tag. When you push v2.4.0, Snap CD picks it up and triggers a plan. Tags outside the range (like v3.0.0) are ignored.
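To make the range resolution concrete, here is a minimal sketch of picking the latest tag within a major version. It assumes tags of the form vMAJOR.MINOR.PATCH and is not Snap CD's actual resolver; `parse_tag` and `resolve_latest` are hypothetical helpers.

```python
import re
from typing import Iterable, Optional, Tuple

TAG_RE = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")


def parse_tag(tag: str) -> Optional[Tuple[int, ...]]:
    """Turn 'v2.4.0' into (2, 4, 0); return None for anything else."""
    match = TAG_RE.match(tag)
    return tuple(int(part) for part in match.groups()) if match else None


def resolve_latest(tags: Iterable[str], major: int) -> Optional[str]:
    """Return the highest vMAJOR.x.y tag, or None if no tag is in range."""
    best = None
    for tag in tags:
        version = parse_tag(tag)
        if version is None or version[0] != major:
            continue  # malformed tags and other majors are ignored
        if best is None or version > best[0]:
            best = (version, tag)
    return best[1] if best else None


tags = ["v1.9.0", "v2.3.1", "v2.4.0", "v2.10.0", "v3.0.0", "nightly"]
latest = resolve_latest(tags, major=2)
print(latest)
```

Comparing versions as integer tuples rather than strings is what makes v2.10.0 rank above v2.4.0.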
Dependency cascading
The real power shows up when you wire Modules together. Suppose your compute Module needs the VPC ID and subnet IDs from networking:
resource "snapcd_module" "compute" {
  name            = "compute"
  namespace_id    = snapcd_namespace.platform.id
  source_url      = "https://github.com/myorg/infra-compute.git"
  source_revision = "main"
  runner_id       = data.snapcd_runner.platform.id
}

resource "snapcd_module_input_from_output" "vpc_id" {
  input_kind       = "Param"
  module_id        = snapcd_module.compute.id
  name             = "vpc_id"
  output_module_id = snapcd_module.networking.id
  output_name      = "vpc_id"
}

resource "snapcd_module_input_from_output" "private_subnet_ids" {
  input_kind       = "Param"
  module_id        = snapcd_module.compute.id
  name             = "private_subnet_ids"
  output_module_id = snapcd_module.networking.id
  output_name      = "private_subnet_ids"
}
Now Snap CD knows: compute depends on networking. When networking applies and its outputs change — say you added a new subnet — compute automatically re-plans with the updated values. No webhook. No CI trigger. No glue script.
This cascading is transitive. If DNS depends on compute, and compute depends on networking, a change to networking ripples through:
networking outputs change
→ compute re-plans and applies
→ compute outputs change
→ dns re-plans and applies
Independent Modules run in parallel. If both compute and database depend on networking but not on each other, they re-plan simultaneously.
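The ordering and parallelism described here amount to a topological sort of the dependency graph. The sketch below uses Python's standard graphlib to group modules into waves that could run side by side. It is a simplified model, not Snap CD's scheduler: the `depends_on` mapping and `plan_waves` function are illustrative, and a real scheduler would not need to hold dns back until database finishes.

```python
from graphlib import TopologicalSorter

# Each module maps to the set of modules whose outputs it consumes.
depends_on = {
    "networking": set(),
    "compute": {"networking"},
    "database": {"networking"},
    "dns": {"compute"},
}


def plan_waves(graph):
    """Group modules into waves; modules in one wave are independent."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything unblocked right now
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave as finished
    return waves


waves = plan_waves(depends_on)
for number, wave in enumerate(waves, start=1):
    print(number, wave)
```

Compute and database land in the same wave because neither consumes the other's outputs.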
Configuration-driven triggers
Source changes aren't the only trigger. If you update a Module's definition — change an input value, reassign it to a different Runner, modify a hook — Snap CD detects the configuration change and triggers a re-plan.
This makes the Terraform code that defines your Snap CD resources the single source of truth. Changing a variable in your Snap CD configuration:
resource "snapcd_module_input_from_literal" "cluster_version" {
  input_kind    = "Param"
  module_id     = snapcd_module.compute.id
  name          = "kubernetes_version"
  literal_value = "1.30" # was "1.28"
  type          = "String"
}
…triggers a re-plan of the compute Module with the new value, the same way a commit to the source repo would.
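The source does not describe how configuration changes are detected. One common way to do it, shown here purely as an assumption, is to fingerprint a canonical form of the Module definition and compare fingerprints. The dictionary layout and `config_fingerprint` are hypothetical.

```python
import hashlib
import json


def config_fingerprint(definition: dict) -> str:
    """Hash a canonical JSON form so that key order does not matter."""
    canonical = json.dumps(definition, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


before = {"name": "compute", "inputs": {"kubernetes_version": "1.28"}}
after = {"name": "compute", "inputs": {"kubernetes_version": "1.30"}}
reordered = {"inputs": {"kubernetes_version": "1.28"}, "name": "compute"}

# A changed input value yields a different fingerprint: re-plan.
needs_replan = config_fingerprint(before) != config_fingerprint(after)
# The same definition with keys in another order is not a change.
unchanged = config_fingerprint(before) == config_fingerprint(reordered)
print(needs_replan, unchanged)
```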
Approval gates
Not every plan should auto-apply. Snap CD lets you set approval thresholds at the Module or Namespace level:
resource "snapcd_module" "database" {
  name                       = "database"
  namespace_id               = snapcd_namespace.platform.id
  source_url                 = "https://github.com/myorg/infra-database.git"
  source_revision            = "main"
  runner_id                  = data.snapcd_runner.platform.id
  apply_approval_threshold   = 1
  destroy_approval_threshold = 2
}
With apply_approval_threshold = 1, Snap CD pauses after planning and waits for at least one principal to approve before applying. Destroy operations require two separate approvals.
You can set defaults at the Namespace level so all Modules within it inherit the same policy:
resource "snapcd_namespace" "production" {
  name                               = "production"
  stack_id                           = data.snapcd_stack.main.id
  default_apply_approval_threshold   = 1
  default_destroy_approval_threshold = 2
  default_approval_timeout_minutes   = 60
}
Individual Modules can override the Namespace defaults. A low-risk monitoring Module might not need approval; a database Module might need two.
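The inheritance rule can be modelled in a few lines. This sketch reuses the attribute names from the examples above but makes two assumptions the source does not confirm: an unset Module threshold means "inherit from the Namespace", and a threshold of 0 means "no approval needed". The classes and functions are illustrative, not Snap CD's API.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Namespace:
    default_apply_approval_threshold: int = 0
    default_destroy_approval_threshold: int = 0


@dataclass
class Module:
    namespace: Namespace
    apply_approval_threshold: Optional[int] = None    # None = inherit (assumed)
    destroy_approval_threshold: Optional[int] = None  # None = inherit (assumed)


def required_approvals(module: Module, is_destroy: bool) -> int:
    """Module-level value wins; otherwise fall back to the Namespace default."""
    if is_destroy:
        override = module.destroy_approval_threshold
        default = module.namespace.default_destroy_approval_threshold
    else:
        override = module.apply_approval_threshold
        default = module.namespace.default_apply_approval_threshold
    return default if override is None else override


def may_proceed(module: Module, is_destroy: bool, approvers: Set[str]) -> bool:
    """A set of principals means each approver counts only once."""
    return len(approvers) >= required_approvals(module, is_destroy)


production = Namespace(default_apply_approval_threshold=1,
                       default_destroy_approval_threshold=2)
database = Module(namespace=production)                                # inherits 1 / 2
monitoring = Module(namespace=production, apply_approval_threshold=0)  # overrides apply

print(may_proceed(monitoring, False, set()))
print(may_proceed(database, True, {"alice"}))
print(may_proceed(database, True, {"alice", "bob"}))
```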
Putting it together
Consider an infrastructure setup with four states: networking, compute, database, and DNS. In a traditional CI setup, you'd maintain four separate pipelines with webhook triggers, shell scripts to pass outputs between them, and manual ordering logic scattered across CI configuration files.
With Snap CD, the same setup is four Modules with explicit dependency wiring:
networking (watches main branch)
├── compute (takes vpc_id, subnet_ids from networking)
│ └── dns (takes load_balancer_ip from compute)
└── database (takes subnet_ids, security_group_id from networking)
A commit to infra-networking that changes a subnet:
1. Networking re-plans and applies.
2. Snap CD detects networking's outputs changed.
3. Compute and database re-plan in parallel (independent of each other).
4. Compute applies. Its outputs change (new load balancer IP).
5. DNS re-plans and applies with the new IP.
6. Database applies. No downstream dependents, cascade stops.
All of this happens without any CI configuration. The dependency graph lives in Terraform code (the snapcd_module_input_from_output resources), not in CI pipeline YAML.
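The walkthrough can be simulated with a breadth-first traversal that only continues past a Module when its outputs changed. This is a toy model under stated assumptions, not Snap CD's engine: the `dependents` map mirrors the four-Module tree above, `outputs_changed` is a hypothetical predicate standing in for a real output comparison, and the traversal does not handle a Module waiting on several upstreams.

```python
from collections import deque
from typing import Callable, Dict, List

# Downstream view of the four-Module graph: module -> modules that consume it.
dependents: Dict[str, List[str]] = {
    "networking": ["compute", "database"],
    "compute": ["dns"],
    "database": [],
    "dns": [],
}


def cascade(start: str, outputs_changed: Callable[[str], bool]) -> List[str]:
    """Return the modules that re-plan, in the order they are reached."""
    replanned = []
    queue = deque([start])
    seen = {start}
    while queue:
        module = queue.popleft()
        replanned.append(module)
        if not outputs_changed(module):
            continue  # same outputs: nothing downstream needs to re-plan
        for child in dependents[module]:
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return replanned


# The scenario above: networking and compute both produce new outputs.
full = cascade("networking", lambda m: m in {"networking", "compute"})
# Variation: compute's outputs stay the same, so dns is never touched.
short = cascade("networking", lambda m: m == "networking")
print(full)
print(short)
```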
When to use what
Event-driven deployment isn't always necessary. Here's a rough guide:
- Single state, single team: manual applies or a simple CI pipeline are fine. You don't need an orchestrator.
- Multiple states, one team: a CI pipeline with some output-passing glue works, but starts to get brittle. Snap CD simplifies the wiring.
- Multiple states, multiple teams: this is where event-driven deployment pays for itself. The dependency graph is explicit, ordering is automatic, and approval gates let each team control their own blast radius.
- Version-based releases: if you tag infrastructure modules with semantic versions and want controlled rollouts, Snap CD's version range tracking is built for this.
Tips
- Start with one Namespace. Put your first few Modules in a single Namespace to learn the trigger model. Split into multiple Namespaces later when you need different default approval policies or Runner assignments.
- Use source_revision_type = "SemanticVersionRange" for production. Tracking main is fine for dev, but production Modules should pin to a version range so you control exactly when changes roll out.
- Set approval thresholds on destructive Modules first. Databases and DNS are the obvious candidates: the resources where a bad apply is hardest to undo.
- Don't over-wire dependencies. Only create snapcd_module_input_from_output resources for values that actually flow between Modules. Not every Module needs to depend on every other Module.
- Watch the cascade. When networking changes, the cascade might touch five downstream Modules. That's the point, but make sure those Modules have appropriate approval thresholds if you want a human in the loop.
See also
- Modular Deployments — how the Module, Namespace, and Stack hierarchy works in detail
- Self-Hosted Terraform Runners with Credential Isolation — how Runners execute plans triggered by events
- A Permission System Built for Infrastructure — approval gates and RBAC for event-driven workflows
- Detecting and Managing Terraform Drift — scheduled drift checks as another event trigger
- Non-invasive Orchestration — how Snap CD runs standard Terraform without wrappers