For years, Solsta has solved a specific problem in live service game operations: how do you control what a production system is actually running, across multiple environments, when the artifacts that define its behavior are distributed and changing constantly?
Live service games are unforgiving in this regard. A bad release reaches millions of players immediately. Rolling it back requires knowing exactly what state you're returning to. Knowing that requires a system that treats environment state as a first-class object: versioned, immutable, and promoted through explicit, auditable transitions. Not just by policy, but through an enforceable structure.
That's precisely what Solsta is: a deployment control plane that governs what is deployed, where, and by whom. Independently evolving artifacts are published together as one immutable release. Environments resolve to snapshots, each a point-in-time record of complete system state. The only way an environment can change is through a gated promotion. Every promotion produces a new snapshot. Rollback is just promotion to a prior one.
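As a minimal sketch of the model just described (hypothetical names, not Solsta's actual API): environments resolve to immutable snapshots, promotion is the only way state changes, and rollback is simply promotion to a prior snapshot.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Snapshot:
    """Point-in-time record of complete environment state: immutable."""
    version: int
    artifacts: tuple  # (name, digest) pairs, frozen at publish time

class Environment:
    """An environment changes only through gated promotions."""
    def __init__(self, name: str):
        self.name = name
        self.history = []  # append-only audit record of every promotion

    def promote(self, snapshot: Snapshot, approved: bool):
        if not approved:
            raise PermissionError("promotion gate: approval required")
        self.history.append(snapshot)  # every promotion yields a new entry

    def rollback_to(self, version: int):
        # Rollback is promotion to a prior snapshot, not reconstruction.
        prior = next(s for s in self.history if s.version == version)
        self.promote(prior, approved=True)

    @property
    def current(self) -> Snapshot:
        return self.history[-1]
```

Because snapshots are frozen and the history is append-only, "what was running last Tuesday" is a lookup, not an investigation.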
We built Solsta for game development because game studios experienced the pain first. It turns out the problem it solves is not specific to games.
The Missing Layer
AI agents introduce a version of this problem that existing deployment infrastructure was not designed to handle. Their behavior doesn't live in code alone. It emerges from the composition of models, prompts, tools, retrieval pipelines, and configuration. Change any one component and the system's effective behavior can change — sometimes in ways that are subtle but consequential. None of those changes are required, by default, to pass through a formal declaration and review before reaching production.
Modern deployment infrastructure handles this well for code. Pipelines verify builds. Version control tracks changes. IAM restricts who can invoke what at runtime. These controls are mature and they work for what they were designed to govern.
But none of them own the question that becomes critical once an autonomous system is operating in production: what is this agent supposed to do, and was that behavior reviewed before it was deployed?
A prompt change that expands an agent's effective scope may look like a minor text edit in a diff. Connecting a new tool may not touch application code at all. A model upgrade may alter reasoning capability without changing any configuration file. Each of these is a behavioral change.
Call this the agent's behavioral capability surface: the complete set of things it is permitted and technically able to do. Today that surface exists only implicitly, distributed across repositories and configuration layers, with no single artifact that answers the foundational questions: what was declared, what was reviewed, what was approved, and what is actually running?
When something goes wrong, reconstructing that picture is forensic work. When an audit requires it, that work becomes expensive. When you need to roll back, the absence of a canonical record of prior states makes the decision harder than it should be.
A Control Plane for Agent Behavior
Governance of agent behavioral capability needs the same structural properties that Solsta was built to enforce for game deployments.
Behavioral capability must be declared as an explicit artifact before deployment into any controlled environment. That declaration must be human-authored and reviewable. It must be versioned so that changes to declared capability are visible as changes, not buried in diffs of implementation detail.
During a promotion event, that declared intent must be resolved to the exact runtime artifacts that implement it (model versions, tool release identifiers, configuration digests, and so on), and that binding must be immutable. The reviewed and approved artifact configuration must be identical to what runs in production. Any deviation invalidates the governance guarantee.
Every promotion into a controlled environment must require that both artifacts exist, that required approvals are durably recorded, and that the binding is cryptographically stable.
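A hedged sketch of what such a gate might check (illustrative only; the field names and hashing scheme are assumptions, not BehaviorSpec's actual schema). The binding is "cryptographically stable" in the sense that recomputing a digest over the intent and the resolved artifacts must reproduce the digest that was approved.

```python
import hashlib
import json

def lock_digest(intent: dict, artifacts: dict) -> str:
    """Deterministic digest over the declared intent and the exact
    runtime artifacts it resolved to (canonical JSON, sorted keys)."""
    canonical = json.dumps({"intent": intent, "artifacts": artifacts},
                           sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

def promotion_gate(intent: dict, lock: dict, approvals: list) -> bool:
    # 1. Both artifacts must exist.
    if intent is None or lock is None:
        raise ValueError("missing behavior.intent or behavior.lock")
    # 2. Required approvals must be durably recorded.
    if not approvals:
        raise PermissionError("no recorded approvals")
    # 3. The binding must be stable: recomputing the digest from the
    #    artifacts being promoted must match the approved lock.
    if lock_digest(intent, lock["artifacts"]) != lock["digest"]:
        raise ValueError("binding drifted from the approved state")
    return True
```

Any drift between what was approved and what is being promoted, however it was introduced, fails the third check rather than slipping through.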
A declaration without evidence is a statement of intent, not a validated contract. Promotion policy should require evaluation evidence commensurate with the agent's risk. At minimum, structured validation that observed behavior stays within declared scope.
Rollback must be deterministic: redeployment of a prior lock, not reconstruction from memory.
BehaviorSpec and Solsta for AI Agents
This week we published BehaviorSpec, a declarative governance model for managing agent behavioral capability at promotion boundaries. It defines two mandatory artifacts: a behavior.intent file that declares the agent's authorized purpose, tool permissions, model policy, and constraints; and a behavior.lock generated at promotion time that binds the approved intent to immutable artifact identities.
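To make the two artifacts concrete, here is one hypothetical shape they could take, written as Python literals. The field names and values are illustrative assumptions; the actual schemas are defined in the paper.

```python
# Hypothetical behavior.intent: human-authored and reviewed before promotion.
behavior_intent = {
    "agent": "support-triage-agent",
    "purpose": "Classify and route inbound support tickets.",
    "tool_permissions": ["ticket_db.read", "router.assign"],  # nothing else
    "model_policy": {"family": "frontier-llm", "fallback": "deny"},
    "constraints": ["no outbound email", "no PII retention"],
}

# Hypothetical behavior.lock: generated at promotion time, binding the
# approved intent to immutable artifact identities.
behavior_lock = {
    "intent_digest": "sha256:1f3a...",   # digest of the approved intent
    "model": "model-registry/frontier-llm@sha256:9c2e...",
    "tools": {"ticket_db.read": "v3.2.1+sha256:77ab..."},
    "prompt_bundle": "sha256:b04d...",
    "approvals": [{"by": "release-manager", "recorded": True}],
}
```

The division of labor matters: the intent is the reviewable statement of authorized behavior; the lock is machine-generated evidence that this exact statement was bound to these exact artifacts.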
The promotion invariant BehaviorSpec establishes is direct: no agent may be deployed into a controlled environment unless its declared behavioral intent has been reviewed, approved, and cryptographically bound to the exact runtime artifacts. The staging lock carries forward to production unchanged. The production gate verifies and appends to it. Environment parity is structural, not procedural.
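One way to picture that parity property, as a self-contained sketch (function and field names are assumptions for illustration): the production gate never rewrites the binding it received from staging; it verifies it and appends its own evidence.

```python
import hashlib
import json

def digest(obj) -> str:
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode()).hexdigest()

def promote_to_production(staging_lock: dict, prod_approval: dict) -> dict:
    """The staging lock carries forward unchanged; the production gate
    verifies it and appends a production approval. Parity is structural:
    the bound artifacts are identical in both environments."""
    if digest(staging_lock["artifacts"]) != staging_lock["artifact_digest"]:
        raise ValueError("staging lock does not match its artifacts")
    # Append-only: production adds evidence, never rewrites the binding.
    return {**staging_lock,
            "approvals": staging_lock["approvals"] + [prod_approval]}
```

Because the artifact binding is copied forward byte for byte, "it worked in staging" and "it is what runs in production" refer to the same object, not two records that must be reconciled.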
Solsta is extending its deployment control plane to enforce this model natively. BehaviorSpec artifacts become first-class objects in the same promotion infrastructure we built for game releases. The primitives are the same: immutable releases, environment snapshots, promotion as the exclusive path to state change, and an append-only audit record. What's new is that behavioral capability is now explicitly governed within that structure, alongside every other artifact the system depends on.
This means teams can manage agent deployments with the same operational discipline they apply to software releases (declared intent, reviewed capability, immutable bindings, deterministic rollback) without adopting a new deployment system or abandoning existing infrastructure. Solsta integrates with CI/CD promotion workflows. BehaviorSpec defines what those workflows must require.
Why This Matters at Scale
One or two agents in a low-risk internal workflow can probably be governed adequately by existing review processes. The problem compounds as systems scale.
When the number of agents increases, when agents operate across multiple environments, when ownership is distributed across teams, and when regulatory or audit obligations require structured evidence of behavioral change control, the absence of a canonical behavioral artifact creates compounding liability. Accountability for what any given agent is authorized to do becomes distributed across whoever wrote the prompts, whoever configured the tools, whoever approved the last deployment, and whoever maintains the infrastructure. When an incident occurs, that accountability has to be reconstructed rather than looked up.
The organizations moving agents into regulated, customer-facing, and business-critical environments are encountering this now. The infrastructure layer that governs how behavioral capability is introduced, reviewed, and promoted into production systems does not yet exist in a mature form. That's the layer Solsta is building.
Read the BehaviorSpec Paper
This post is deliberately high-level. The full governance model — the promotion invariant, artifact schemas, compliance framework alignment, and the boundary between artifact binding and behavioral compliance — is in the paper.
If your organization is moving agents into production and you don't yet have a clear answer to what was declared, what was approved, and what is running, the paper defines the model for closing that gap.
Read the BehaviorSpec paper → [link to paper]
Learn more about Solsta for Agentic AI → [solsta.io/agentic-ai]