Why Snap CD: A Permission System Built for Infrastructure

Karl Schriek · March 01, 2026

Most infrastructure teams handle access control in one of two places: the CI/CD layer or the cloud provider's IAM layer. Neither maps well to how infrastructure is actually structured.

CI permissions are usually binary — you can trigger a pipeline or you can't. There's no concept of "this person can deploy networking but not databases." Cloud IAM is more granular, but it governs what credentials can do, not what people can do within your deployment workflow. You end up with a gap: the system that understands your infrastructure topology has no permission model, and the system that has a permission model doesn't understand your infrastructure topology.

Snap CD sits in that gap. It provides a hierarchical role-based access control system that maps directly to the way you organise your infrastructure (Stacks, Namespaces, Modules, and Runners) and enforces it uniformly whether actions come through the web dashboard, the API, or the Terraform provider.

The two common approaches and where they break down

CI/CD gating

The simplest form of infrastructure access control: who can trigger the pipeline?

Most CI systems give you repository-level permissions. If you have write access to the repo, you can trigger the workflow. Some offer environment-level protection rules — require approval from a specific team before deploying to prod.

This works until your infrastructure spans multiple repositories, or until you need more granularity than "can deploy to this environment." Can this person create new modules but not delete existing ones? Can they approve a plan but not trigger an apply? CI systems don't model these distinctions.

There's also the backdoor problem. Protection rules only apply to CI-triggered runs. Anyone with the right credentials can run terraform apply from their laptop and bypass every gate you've set up.

Cloud IAM

Cloud providers have sophisticated permission systems — Azure RBAC, AWS IAM, GCP IAM. These control what API calls a principal can make against cloud resources. But they operate at the wrong abstraction level for deployment workflows.

Cloud IAM doesn't know that your VPC, subnets, and route tables form a logical "networking" group that one team owns. It doesn't know that module-compute depends on module-networking and should only be deployable after networking is stable. It can tell you whether a service principal can create an EC2 instance, but it can't tell you whether a human should be allowed to approve the plan that creates it.

You end up encoding deployment permissions across multiple systems — repo access in GitHub, environment protection rules in Actions, IAM policies in AWS — with no single place to answer "who can do what to which part of my infrastructure?"

Snap CD's permission model

Snap CD's permission system is built around two ideas: roles describe what you can do, and scope determines where you can do it.

Principals

Three types of identity can hold role assignments:

Users — human operators, authenticated via the identity provider.
Service principals — machine identities for automation, CI pipelines, and API integrations.
Groups — collections of users or service principals, for managing permissions at team scale.

Roles

Roles define a set of allowed operations. The same role names appear across different scope levels, with context-appropriate permissions:

Owner — full control, including the ability to delete the resource and manage role assignments on it.
Contributor — create, update, and manage child resources, but cannot delete the resource itself or manage role assignments.
Reader — read-only access.
IdentityAccessManager — can manage role assignments on this resource without having full Owner control.

Additional roles exist at specific scope levels:

StackCreator (organization) — can create new Stacks.
NamespaceCreator (Stack) — can create new Namespaces within the Stack.
ModuleCreator (Namespace) — can create new Modules within the Namespace.
Approver — can approve deployment plans.
JobManager — can manage deployment jobs (cancel, retry).
SourceChangeNotifier — can notify the system of source changes (used by webhooks and CI integrations).
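
As a rough illustration of how the more specialised roles might be assigned, the sketch below reuses the role assignment resource types that appear in the examples later in this post. It assumes that IdentityAccessManager and NamespaceCreator are accepted as role_name values on snapcd_stack_role_assignment, which this post does not confirm. snapcd_group.security_team is a hypothetical group.

```hcl
# Assumption: IdentityAccessManager can be assigned at Stack scope through
# the same resource type used for Owner, Contributor, and Reader.
resource "snapcd_stack_role_assignment" "security_manages_prod_access" {
  stack_id                = snapcd_stack.prod.id
  principal_id            = snapcd_group.security_team.id # hypothetical group
  principal_discriminator = "Group"
  role_name               = "IdentityAccessManager"
}

# Assumption: NamespaceCreator is assigned on the Stack that will contain
# the new Namespaces, matching the scope given in the list above.
resource "snapcd_stack_role_assignment" "appdev_creates_test_namespaces" {
  stack_id                = snapcd_stack.test.id
  principal_id            = snapcd_group.app_developers.id
  principal_discriminator = "Group"
  role_name               = "NamespaceCreator"
}
```

If these assumptions hold, the security team could manage who has access to prod without holding Owner, and application developers could add Namespaces in test without being able to delete the Stack.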

Scope hierarchy

Role assignments are scoped to a specific level in the hierarchy. Permissions granted at a higher level flow down to all children:

Organization
  └── Stack (e.g. "prod", "test")
        └── Namespace (e.g. "prod/networking", "prod/application")
              └── Module (e.g. "prod/networking/vpc")

Runners sit outside this hierarchy — they have their own scope. A Runner's Owner controls which Modules are allowed to execute on it.

A role assigned at the organization level applies everywhere. A role assigned at a specific Module applies only to that Module. This means you can express both broad policies ("the platform team is Reader on the entire organization") and narrow exceptions ("except they're Owner on the networking Namespace").

Concrete examples

Platform team owns networking, reads everything else

The platform team manages all networking infrastructure but should only observe application deployments:

# Broad grant: read-only visibility across the whole organization.
resource "snapcd_organization_role_assignment" "platform_reader" {
  principal_id            = snapcd_group.platform_team.id
  principal_discriminator = "Group"
  role_name               = "Reader"
}

# Narrow grant: full control, but only within the networking Namespace.
resource "snapcd_namespace_role_assignment" "platform_owns_networking" {
  namespace_id            = snapcd_namespace.networking.id
  principal_id            = snapcd_group.platform_team.id
  principal_discriminator = "Group"
  role_name               = "Owner"
}

The platform team gets Reader at the org level (they can see everything) and Owner on the networking Namespace (they can deploy, approve, and manage Modules within it). They cannot modify or deploy anything in the application Namespace.

Junior engineer approves test but not prod

A junior team member should be able to approve deployment plans in the test environment but only observe production:

resource "snapcd_stack_role_assignment" "junior_test_contributor" {
  stack_id                = snapcd_stack.test.id
  principal_id            = snapcd_user.junior_engineer.id
  principal_discriminator = "User"
  role_name               = "Contributor"
}

resource "snapcd_stack_role_assignment" "junior_prod_reader" {
  stack_id                = snapcd_stack.prod.id
  principal_id            = snapcd_user.junior_engineer.id
  principal_discriminator = "User"
  role_name               = "Reader"
}

They can trigger plans, approve, and deploy in test. In prod, they can see what's happening but can't change anything.
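
If the intent is to grant approval rights and nothing more, the Approver role listed earlier may be a tighter fit than Contributor. The following is only a sketch: it assumes Approver can be assigned at Stack scope through snapcd_stack_role_assignment, which this post does not confirm.

```hcl
# Assumption: "Approver" is a valid role_name at Stack scope.
# This would stand in for the Contributor assignment on the test Stack.
resource "snapcd_stack_role_assignment" "junior_test_approver" {
  stack_id                = snapcd_stack.test.id
  principal_id            = snapcd_user.junior_engineer.id
  principal_discriminator = "User"
  role_name               = "Approver"
}
```

The Reader assignment on prod stays as it is, so the engineer can still see production without being able to change it.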

CI service principal scoped to a single module

An automated deployment pipeline that should only be able to deploy one specific Module:

resource "snapcd_module_role_assignment" "ci_deploys_api" {
  module_id               = snapcd_module.api_gateway.id
  principal_id            = snapcd_service_principal.ci_pipeline.id
  principal_discriminator = "ServicePrincipal"
  role_name               = "Contributor"
}

The service principal can trigger plans and applies on the API gateway Module, but has no access to anything else in the organization. If the pipeline is compromised, the blast radius is limited to a single Module.
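
A narrower variant of the same idea is a webhook identity that only reports source changes and cannot deploy. The sketch below assumes SourceChangeNotifier is assignable at Module scope, and it uses a hypothetical service principal named webhook. Neither is confirmed by this post.

```hcl
# Assumption: "SourceChangeNotifier" is a valid role_name at Module scope.
resource "snapcd_module_role_assignment" "webhook_notifies_api" {
  module_id               = snapcd_module.api_gateway.id
  principal_id            = snapcd_service_principal.webhook.id # hypothetical
  principal_discriminator = "ServicePrincipal"
  role_name               = "SourceChangeNotifier"
}
```

Splitting "notify" from "deploy" across two service principals keeps a leaked webhook secret from being enough to trigger an apply.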

Runner access control

Runners execute the actual Terraform commands, so controlling which Modules can use which Runners is a security boundary. A Runner deployed in your production Azure subscription should only execute production Modules:

resource "snapcd_runner_role_assignment" "prod_modules_use_prod_runner" {
  runner_id               = snapcd_runner.azure_prod.id
  principal_id            = snapcd_group.prod_deployers.id
  principal_discriminator = "Group"
  role_name               = "Runner"
}

The Runner role on a Runner resource grants the ability to execute jobs on that Runner. Without it, a Module assignment pointing to this Runner will be rejected.
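
The same pattern can keep environments apart. As a sketch, a second Runner for test gets its own assignment. snapcd_runner.azure_test and snapcd_group.test_deployers are hypothetical names, not resources defined elsewhere in this post.

```hcl
# Hypothetical test Runner with its own, separate group of deployers.
resource "snapcd_runner_role_assignment" "test_modules_use_test_runner" {
  runner_id               = snapcd_runner.azure_test.id    # hypothetical Runner
  principal_id            = snapcd_group.test_deployers.id # hypothetical group
  principal_discriminator = "Group"
  role_name               = "Runner"
}
```

With the Runner role granted separately on each Runner, and assuming no broader assignment covers both, test deployers would have no path to the Runner that holds production credentials.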

No backdoors

The backdoor problem described earlier is a common failure mode of CI-based access control: the gates apply to only one path, so someone with the right cloud credentials can bypass CI entirely and run terraform apply from their laptop.

Snap CD's permission model applies to every interaction path. Whether you click "Approve" in the web dashboard, call the REST API from a script, or manage resources through the Terraform provider, the same role assignments are evaluated. There is no unenforced path.

This also means your access control configuration is auditable in one place. Instead of piecing together GitHub team permissions, CI environment protection rules, and cloud IAM policies to understand who can deploy what, you query Snap CD's role assignments.

Managing permissions as code

Because every role assignment is a Terraform resource, your permission model lives in version control alongside the rest of your infrastructure configuration. Changes go through the same review process as any other infrastructure change — pull request, review, approve, apply.

resource "snapcd_stack" "prod" {
  name            = "prod"
  organization_id = snapcd_organization.main.id
}

resource "snapcd_namespace" "prod_networking" {
  name     = "networking"
  stack_id = snapcd_stack.prod.id
}

resource "snapcd_namespace" "prod_application" {
  name     = "application"
  stack_id = snapcd_stack.prod.id
}

resource "snapcd_stack_role_assignment" "sre_owns_prod" {
  stack_id                = snapcd_stack.prod.id
  principal_id            = snapcd_group.sre.id
  principal_discriminator = "Group"
  role_name               = "Owner"
}

resource "snapcd_namespace_role_assignment" "appdev_contributes_app" {
  namespace_id            = snapcd_namespace.prod_application.id
  principal_id            = snapcd_group.app_developers.id
  principal_discriminator = "Group"
  role_name               = "Contributor"
}

The SRE team owns the entire prod Stack. Application developers can deploy within the application Namespace but cannot touch networking. Both constraints are declared, version-controlled, and enforced at every interaction point.
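
As the number of teams grows, one resource block per assignment gets repetitive. One option, sketched here with Terraform's standard for_each meta-argument, is to drive the assignments from a map. The names prod_stack_roles and prod_teams are illustrative, and the block is meant as an alternative to sre_owns_prod rather than an addition to it.

```hcl
locals {
  # Illustrative map: one entry per group that holds a role on the prod Stack.
  # Keys are static strings, so Terraform can resolve them at plan time.
  prod_stack_roles = {
    sre      = { principal_id = snapcd_group.sre.id, role_name = "Owner" }
    platform = { principal_id = snapcd_group.platform_team.id, role_name = "Reader" }
  }
}

# Creates one Stack-level role assignment per map entry.
resource "snapcd_stack_role_assignment" "prod_teams" {
  for_each = local.prod_stack_roles

  stack_id                = snapcd_stack.prod.id
  principal_id            = each.value.principal_id
  principal_discriminator = "Group"
  role_name               = each.value.role_name
}
```

Adding or removing a team then becomes a one-line change to the map, which keeps the pull request diff small and easy to review.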

Tips

Start broad, narrow later. Begin by giving your team Contributor at the organization level. As you identify boundaries (different teams, different environments, different risk levels), add scoped assignments and remove the broad one.
Use groups, not individual users. Assigning roles to groups means onboarding a new team member is a single group membership change, not a dozen role assignments.
Scope Runners to environments. A Runner with production credentials should only accept jobs from production Modules. Use the Runner role on Runner resources to enforce this.
Treat permissions as infrastructure. Define all role assignments in Terraform. If a role assignment isn't in code, it shouldn't exist.
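Audit regularly. Because role assignments are Terraform resources, terraform plan will show drift between the assignments declared in code and what Snap CD actually holds. Assignments created outside Terraform are not tracked in state, so they will not appear in a plan unless they are imported first.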
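
As a starting point for the first tip, a single organization-level assignment might look like the following. It mirrors the snapcd_organization_role_assignment resource used earlier. snapcd_group.engineering is a hypothetical group, and the sketch assumes Contributor is accepted at organization scope.

```hcl
# Broad starting grant. Remove it once scoped assignments are in place.
resource "snapcd_organization_role_assignment" "engineering_contributor" {
  principal_id            = snapcd_group.engineering.id # hypothetical group
  principal_discriminator = "Group"
  role_name               = "Contributor"
}
```

Keeping the broad grant in its own clearly named resource makes it easy to find and delete when the narrower assignments replace it.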