Your engineers define infrastructure as code, version-control it, test it automatically, and deploy it through pipelines. They would not manage server configurations in a Word document, and they would probably find the suggestion more than mildly offensive.
Your security governance, the invariants that constrain what the entire operation must and must not do, likely lives in prose documents on SharePoint or Confluence that a human has to read, interpret, and manually verify against operational reality. That gap is worth pausing on. While it is absurd, it is almost certainly not explained by laziness or lack of tooling.
The gap exists because governance documentation has not historically been structured in a way that makes machine-readable expression possible, and the conventional wisdom about what governance documentation should look like has reinforced the problem. You cannot express “we are committed to protecting customer data” as code; aspiration has no operational meaning a system can check. You can express “customer payment data must never be stored unencrypted at rest” as code, because an invariant is, one way or another, a testable assertion.
The previous article introduced a four-layer model with invariants at every level, conformance criteria that point downward through the hierarchy, and a process layer that instruments operational boundaries rather than prescribing methods.
That structure is the prerequisite for what follows. Once governance documentation expresses invariants rather than aspirations, the entire hierarchy becomes expressible as code: machine-readable, version-controlled, and automatically verifiable, with traceability from governance intent to operational evidence built into the system rather than assembled manually for each audit.
Everything is code except governance
Infrastructure as code is mature, and so is policy as code. Compliance as code exists in several forms, and controls as structured data exist through projects like NIST’s OSCAL, even though almost nobody’s using it in anger yet. Each of these generally operates at a specific layer of the stack, and each is useful within its scope.
All of these work bottom-up. Open Policy Agent defines infrastructure rules and enforces them at deployment, AWS Config Rules check system state against configuration baselines, and Chef InSpec tests compliance assertions against running systems. NIST’s OSCAL project provides machine-readable formats for control catalogues and baselines. None of these starts from governance intent and works down; the governance assertions that all these rules ultimately implement are in most cases somewhere in a Word document that nobody has read since the last audit. In my darker moments, I speculate that that Word document hasn’t been read since it was written.
As far as I can tell, nobody is working from governance invariants downward through standards through process criteria to operational verification in a machine-readable chain. The bottom-up tooling is mature. What does not yet exist in a machine-readable form is the top-down expression of what all that tooling should be checking against. You know, our goals.
Why the gap exists
The reason is more likely structural than technical: you cannot express governance as code if governance is aspirational prose, and “high-level statements of management intent” cannot be parsed by a machine in any meaningful way. Invariants can.
The four-layer model from the previous article may well be what makes governance as code tractable, because every layer expresses testable assertions: policy invariants are organisational constraints that must hold, standard invariants are precise implementation specifications, and process invariants define input and output boundaries. Each of these is a statement that can be evaluated as true or false against operational state, which is exactly what code does.
Once you have made that move, the question changes from “how do we express governance as code?” (which sounds like a tooling problem) to “what format should the assertions take?” (which is an engineering problem with known solutions, after all: if you’ve got JSON schemas, everything must be possible).
The conformance chain
For the purposes of governance as code, the most useful concept from the i2i four-layer model is the conformance chain, where each layer’s conformance criteria generally point to the layer below for evidence.
Policy conformance is demonstrated by the existence of standards that cover the policy’s scope and by passing conformance results from those standards. The policy does not know about specific tools or configurations. It knows that standards should exist below it and that those standards should be verifiable.
Standard conformance asserts three things, and this distinction matters for automation. Design conformance verifies that architecture artefacts specify the standard’s invariants, configuration conformance verifies that recorded system state matches the design, and functional conformance verifies that actual behaviour matches recorded state. I’ve seen organisations where configuration conformance passes on every check while functional conformance would fail if anyone tested it, because a misconfiguration at a different layer silently overrides what the recorded state reports.
Automation characteristics of the three conformance types
Design conformance can be automated if both the design artefacts and the standard invariants are machine-readable: it is a comparison of one structured document against another. Configuration conformance is already widely automated (AWS Config, OPA, Sentinel). Functional conformance is the hardest to automate (although much of what passes for it in practice is a glorified scanner running on demand or on a schedule) and is where penetration testing, red team exercises, and chaos engineering sit. Governance as code can automate the first two types comprehensively and define the assertions for the third, even where the third still might require human execution.

Process conformance is verified against operational systems: queries against ticketing systems for input invariant violations, stalled work items for output invariant violations, and timestamp calculations for performance expectation attainment.
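To make the design-conformance case concrete, here is a minimal sketch of comparing one structured document against another. The field names (`required_properties`, `ebs_encryption`, `kms_key_type`) are illustrative assumptions, not a proposed schema:

```python
# Minimal design-conformance check: verify that a design artefact
# specifies every invariant the standard requires.
# All field names here are illustrative, not a real schema.

def design_conforms(standard: dict, design: dict) -> list[str]:
    """Return a list of violations; an empty list means conformant."""
    violations = []
    for key, required in standard["required_properties"].items():
        actual = design.get(key)
        if actual != required:
            violations.append(f"{key}: expected {required!r}, got {actual!r}")
    return violations

# A standard invariant rendered as structured data (hypothetical)
standard = {
    "id": "STD-ENC-003",
    "required_properties": {
        "ebs_encryption": True,
        "kms_key_type": "customer-managed",
    },
}
# A design artefact that drifted: it specifies AWS-managed keys
design = {"ebs_encryption": True, "kms_key_type": "aws-managed"}

print(design_conforms(standard, design))
```

The check itself is trivial; the point is that it only becomes possible once both sides are structured data rather than prose.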
In practice, the chain might look something like this:
# Policy invariant (governance layer)
- id: POL-DATA-001
  assertion: "Customer payment data must never be stored
              unencrypted at rest"
  scope: "All systems processing payment data"

# Standard invariant (refinement)
- id: STD-ENC-003
  implements: POL-DATA-001
  assertion: "EBS volumes attached to payment-processing instances
              must use customer-managed KMS encryption"
  target: 100%  # Absolute invariant, no tolerance

# Config rule (operational check)
- rule: ebs-encryption-customer-kms
  implements: STD-ENC-003
  check: "aws configservice get-compliance-details-by-config-rule"
This is illustrative, not a proposed standard. The point is that each level references the one above, the traceability is a property of the format rather than something reconstructed for audit, and any engineer who has worked with Terraform or Kubernetes manifests will recognise the pattern immediately.
When the entire chain is machine-readable, traceability is a property of the system rather than something assembled manually for audit. Each infrastructure-level check traces to a standard invariant, each standard invariant traces to a policy invariant, and each process metric traces to a performance expectation. The mapping is built in, and the fundamental audit question (“does the organisation do what it says it does?”) becomes something you can in principle answer continuously rather than annually.
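A useful property of a machine-readable chain is that the traceability itself can be tested. A hedged sketch, using the ids from the illustrative example above represented as plain dicts (in practice they would be parsed from the structured files):

```python
# Traceability check over the chain: every `implements` reference
# must resolve to an artefact that actually exists one layer up.
# Ids come from the illustrative example; the structure is an assumption.

artefacts = [
    {"id": "POL-DATA-001", "layer": "policy"},
    {"id": "STD-ENC-003", "layer": "standard", "implements": "POL-DATA-001"},
    {"id": "ebs-encryption-customer-kms", "layer": "check",
     "implements": "STD-ENC-003"},
]

def broken_links(artefacts: list[dict]) -> list[str]:
    """Ids whose `implements` target is missing: drift an audit would find."""
    known = {a["id"] for a in artefacts}
    return [a["id"] for a in artefacts
            if "implements" in a and a["implements"] not in known]

print(broken_links(artefacts))  # [] means every reference resolves
```

Run in CI, a check like this turns “the mapping exists only in institutional memory” into a build failure the moment a governance artefact is renamed or deleted.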
The payoff is a view that no prose-document approach can produce:
# "Does the organisation do what it says it does?"
# Query: show conformance status for POL-DATA-001

POL-DATA-001: "Customer payment data must never be stored
               unencrypted at rest"
├─ STD-ENC-003: "EBS volumes use customer-managed KMS"
│  ├─ config check: 142/142 compliant (last run: 2h ago)
│  └─ functional test: passed (last run: 14 days ago)
└─ STD-ENC-004: "RDS instances use AES-256 encryption"
   ├─ config check: 38/38 compliant (last run: 2h ago)
   └─ functional test: passed (last run: 14 days ago)
This is what governance as code produces. A live, queryable, automatically maintained chain from governance intent to operational evidence. Nobody has this today, even organisations whose OPA rules and Config checks are correct, because the governance layer that gives those checks meaning is still a Word document.
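The roll-up behind that view is mechanical once the chain is structured. A sketch, with the check results inlined as dicts (a real system would pull them from the tooling that produced them; the status values are assumptions):

```python
# Rolling operational check results up to policy status: a policy is
# conformant when every standard implementing it has passing checks.
# Data is inlined for illustration; values mirror the example tree.

check_results = {
    "STD-ENC-003": {"config_compliant": True, "functional_passed": True},
    "STD-ENC-004": {"config_compliant": True, "functional_passed": True},
}
standards_for = {"POL-DATA-001": ["STD-ENC-003", "STD-ENC-004"]}

def policy_status(policy_id: str) -> str:
    """Conformant only if every implementing standard passes both checks."""
    ok = all(check_results[s]["config_compliant"]
             and check_results[s]["functional_passed"]
             for s in standards_for[policy_id])
    return "conformant" if ok else "non-conformant"

print(policy_status("POL-DATA-001"))
```

The logic is deliberately boring; what is new is that the question it answers used to require a human with a spreadsheet and a fortnight.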
Two verification paths
The four-layer model establishes two independent verification paths, and both can become automatically traceable when governance is expressed as code.
Configuration conformance can and probably should be owned by the engineering teams, as discussed in the previous article. It is self-verification: the team checks that its own deployment meets the standard, typically through automated tooling integrated into the deployment pipeline. When the standard invariants are machine-readable, the pipeline gates can reference them directly as their source of truth. Someone still has to write the enforcement logic, but they are writing it against a precise machine-readable specification rather than interpreting a prose document, and it is plausible that before long the logic will be generated from the invariants themselves. If your standard says “Customer-tagged EBS volumes must always use customer-managed KMS encryption” in a machine-readable format, the Config rule that checks it can reference the standard invariant it implements, and the traceability is automatic.
Functional conformance is owned by the security function. It verifies that controls actually work as designed, through penetration testing, adversarial simulation, and control validation exercises. When the standard invariants are machine-readable, the security function can derive its test scope from the same source that the engineering team’s configuration checks reference, even though the test methods themselves may require expertise the invariants don’t encode.
Process boundaries as structured data
The process layer’s five elements, as defined in the previous article, are all reasonably expressible as structured data, and this is where the error budget connection becomes automatic.
Input criteria become schema definitions: here is what a well-formed vulnerability report looks like, expressed as a data structure that a system can validate against. Output criteria become validation rules: here is what a completed remediation record must contain. Performance expectations become something very much like SLO definitions that monitoring systems can consume directly: 72 hours, 95% attainment, measured weekly. Measurement definitions become monitoring configurations: clock starts on this ticket state transition, clock stops on that one, report monthly, alert at this threshold.
In structured form, a process definition might look like this:
# Process: Vulnerability Management
- id: PROC-VULN-001
  implements: STD-VULN-007
  input_criteria:
    required_fields: [asset_id, severity, identifier, discovered_at]
    acceptance: "All well-formed inputs must be accepted"
  output_criteria:
    final:
      required_fields: [remediation_evidence, verified_at, closed_by]
      alternatives: [documented_exception, reclassification_record]
  performance:
    timeliness:
      critical: { target_hours: 72, attainment: 0.95 }
      high: { target_hours: 168, attainment: 0.90 }
    quality:
      reopen_rate: { target: 0.001, window_days: 30 }
  measurement:
    clock_start: "ticket.status == 'confirmed_critical'"
    clock_stop: "ticket.status == 'verified_deployed'"
    reporting: monthly
    alert_threshold: "attainment < 0.90 for 2 consecutive periods"
If you have this and a ticketing system, the attainment calculation is mechanical and the error budget is just the distance between the attainment line and the target. The quality measurement is a query against reopened tickets. None of this requires a separate programme; it requires structured data and a monitoring system you almost certainly already have.
The error budget measurement falls out more or less automatically. If your standard expresses a target and tolerance in machine-readable form (72 hours, 95% attainment) and your process layer expresses the measurement definition in machine-readable form (which clock, which data source, which calculation), the error budget is just the computation of attainment against target. You may not need a separate programme to calculate the error budget; the computation is a consequence of having governance as code with a properly defined process layer. What the computation doesn’t give you is organisational agreement about what happens when the budget is spent, which is a governance decision that remains human.
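The attainment computation really is that mechanical. A sketch against the critical-severity timeliness expectation from the process definition above (72 hours, 95% attainment), with invented ticket timestamps standing in for a ticketing-system query:

```python
# Attainment and error budget for the critical timeliness SLO
# (72h target, 95% attainment). Timestamps are invented; in practice
# clock_start/clock_stop come from ticket state transitions.
from datetime import datetime, timedelta

TARGET_HOURS = 72
ATTAINMENT_TARGET = 0.95

def attainment(tickets: list[dict]) -> float:
    """Fraction of tickets remediated within the target window."""
    within = sum(
        1 for t in tickets
        if t["clock_stop"] - t["clock_start"] <= timedelta(hours=TARGET_HOURS)
    )
    return within / len(tickets)

start = datetime(2025, 1, 1)
tickets = (
    [{"clock_start": start, "clock_stop": start + timedelta(hours=48)}] * 18
    + [{"clock_start": start, "clock_stop": start + timedelta(hours=200)}] * 2
)

a = attainment(tickets)
budget_remaining = a - ATTAINMENT_TARGET  # negative means the budget is spent
print(f"attainment={a:.0%}, budget={budget_remaining:+.1%}")
```

With 2 of 20 tickets over the window, attainment is 90% and the budget is overspent, which is exactly the signal the `alert_threshold` in the process definition consumes. What to do about it remains the human decision.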
Quality expectations work the same way. “Fewer than 0.1% of vulnerability closures reopened within 30 days” is a quality SLO. Express it as structured data, point your monitoring at the ticketing system, and you have continuous quality measurement without building a separate quality programme.
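The reopen-rate query is even smaller. A sketch with an invented closure list (a real implementation would query the ticketing system for closures reopened within the 30-day window):

```python
# The reopen-rate quality SLO as a query sketch: fraction of closures
# subsequently reopened, checked against the 0.1% target from the
# process definition. Data is invented for illustration.

REOPEN_TARGET = 0.001  # fewer than 0.1% reopened within the window

def reopen_rate(closures: list[dict]) -> float:
    return sum(1 for c in closures if c["reopened"]) / len(closures)

closures = [{"reopened": False}] * 999 + [{"reopened": True}]
print(reopen_rate(closures) <= REOPEN_TARGET)
```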
What stays prose
Guidelines are perhaps the one artefact type that remains prose permanently. They express how to exercise judgment within constraints, and judgment is not something a machine can parse. “When evaluating a new third-party integration, here’s how to think about the data residency invariant in a multi-region architecture” requires human reasoning about context, tradeoffs, and organisational priorities that do not reduce to boolean assertions.
This is worth stating because it defines the boundary of governance as code and makes the claim more credible. The argument is not that everything becomes machine-readable. The argument is that invariants, performance expectations, conformance criteria, and measurement definitions become machine-readable, while guidelines remain human-readable companions that help people exercise judgment within machine-verifiable constraints.
Work instructions will also tend to remain primarily prose in many cases, because they describe how humans do work. But where work instructions map to automated processes (CI/CD pipelines, deployment workflows, provisioning scripts), they are already expressed as code. The team’s Terraform module is both the work instruction and the implementation, and if it references the standard invariants it satisfies, the traceability is complete.
Why this matters for growth and scaling companies specifically
Larger organisations have GRC teams, audit preparation functions, and compliance tooling that compensate (imperfectly) for the prose-document problem through brute-force human effort. A 200-person company preparing for an ISO 27001 audit typically spends weeks assembling evidence: screenshots of configurations, exports from ticketing systems, attestations from process owners, all manually mapped to policy statements in documents that may not precisely match operational reality.
A company of that size cannot afford a dedicated compliance team to do this manually on an ongoing basis. But a company of that size can often express governance invariants as code without much difficulty, because that is already how its engineers think about every other system constraint.
A reasonable objection: if your OPA policies enforce the right things and you pass your audits, why does the governance layer need to be machine-readable? The answer is maintenance. The OPA rules were probably written by someone who read the governance document and translated it into Rego. When that person leaves, or when the governance document changes, or when a new system is added, the mapping between governance intent and operational enforcement exists only in institutional memory. Machine-readable governance invariants make that mapping explicit and survivable.
Governance as code is not automating away governance decisions
Expressing governance as code means expressing governance decisions in a format that machines can verify. The decisions still require human judgment: which invariants to adopt, what targets to set, how much risk to accept. The verification of whether those decisions are being honoured in practice is what becomes automatic.

The practical starting point for most organisations is relatively modest, and I’ve found that teams which already think in code often need less convincing about the approach than about the governance model that makes it possible. Express your most important policy and standard invariants in a structured format (even a well-defined YAML file is a reasonable beginning). Connect those invariants to the infrastructure-level checks you already have, so the traceability exists. Define your process layer’s input and output criteria and performance expectations as structured data, and point your monitoring at them. The tooling to do each of these things exists today; what has been missing is the governance layer that gives all of it coherence.
The relationship to continuous assurance
Governance as code is a foundation layer for (modern, high-automation and change-resilient) continuous assurance, and the distinction between the two is worth being precise about.
Continuous monitoring already exists in most organisations: systems that check configuration state, alert on anomalies, and report on operational metrics. Continuous assurance is different. It is continuous verification that operational state satisfies governance intent, with built-in traceability from each operational check to the governance assertion it implements.
Governance as code provides the assertion layer that continuous assurance verifies. Without machine-readable governance assertions, continuous assurance has nothing to verify against except manually maintained mappings that drift from both governance documentation and operational reality. With machine-readable governance assertions, the entire verification chain from policy invariant through standard invariant through process measurement through operational evidence can largely be automated, traced, and reported continuously.
The Continuous Assurance series that follows builds on this foundation: governance as code provides the assertions, the process layer provides much of the measurement surface, and continuous assurance provides the verification loop. Together they answer the audit question (“does the organisation do what it says it does?”) as a continuous data feed rather than a periodic exercise, which is probably how it should have worked from the very beginning.
