System Specification

Open Agent Trust Stack

A system specification for zero-trust AI agent execution. Define what is permitted and make everything else structurally inexpressible.

Version: 0.1.0-draft
Status: Draft
Authors: Jascha Wanger / ThirdKey AI
Date: 2026-03-12
License: MIT
Abstract

Zero-Trust Agent Execution Through Structural Enforcement

As AI systems evolve from assistants into autonomous agents executing consequential actions, the security boundary shifts from model outputs to tool execution. Traditional security paradigms — log aggregation, perimeter defense, post-hoc forensics, and even runtime interception of fully-formed actions — cannot adequately protect systems where AI-driven actions are irreversible, execute at machine speed, and originate from potentially compromised orchestration layers.

The fundamental problem is architectural: when the policy gate can be influenced by the LLM it governs, when enforcement correctness is verified only at runtime, and when identity is self-asserted rather than cryptographically verified, security guarantees degrade under adversarial pressure.

Conviction 01

Allow-List Enforcement

Constrain what actions can be expressed through declarative tool contracts, making dangerous actions structurally inexpressible.

Conviction 02

Compile-Time Enforcement

The ORGA reasoning loop uses typestate programming so that skipping the policy gate is a type error, not a runtime bug.

Conviction 03

Structural Independence

The Gate phase operates outside LLM influence by construction, not by trust assumption.

OATS specifies five layers: (1) the ORGA reasoning loop with compile-time phase enforcement, (2) declarative tool contracts with typed parameter validation, (3) a cryptographic identity stack providing bidirectional trust between agents and tools, (4) a formally verifiable policy engine operating on structured inputs, and (5) hash-chained cryptographic audit journals with Ed25519 signatures for tamper-evident forensic reconstruction.

OATS is model-agnostic, framework-agnostic, and vendor-neutral. It defines what a compliant agent runtime must enforce, not how it must be implemented. The architecture specified here has been validated through approximately eight months of autonomous operation in a production runtime, moving beyond theoretical frameworks to specify requirements derived from operational experience.

Section 01

Introduction

1.1 The Problem

AI agents now execute consequential actions across enterprise systems: querying databases, sending communications, modifying files, invoking cloud services, and managing credentials. These actions are irreversible, execute at machine speed, originate from potentially compromised orchestration layers, and compose into violation patterns invisible when evaluated in isolation.

The security community has correctly identified the action layer as the stable enforcement boundary. Regardless of how agent frameworks, model architectures, or orchestration patterns evolve, actions on tools and APIs remain the point where AI decisions materialize as real-world effects. Security must be enforced at this boundary.

However, identifying the right boundary is necessary but not sufficient. The critical question is how enforcement occurs at that boundary, and current approaches have structural weaknesses that undermine their guarantees.

1.2 The Allow-List Thesis

Existing runtime security approaches operate on a deny-list model: the agent formulates an action, the security system intercepts it, evaluates it against policy and context, and decides whether to allow or block it. This model has a fundamental problem: it requires enumerating dangerous behavior. Every deny-list is incomplete by definition. Novel attacks, unanticipated compositions, and edge cases slip through because the system only blocks what it has been told to block.

OATS inverts this model. Instead of intercepting arbitrary actions and deciding which to block, an OATS-compliant runtime constrains what actions can be expressed in the first place. The agent fills typed parameters defined by a declarative tool contract. The runtime validates those parameters against the contract, constructs the invocation from a template, and executes it. The agent never generates raw commands, never constructs API calls directly, never formulates unconstrained actions. Dangerous actions cannot be expressed because the interface does not permit them.

Core Thesis

Define what is permitted and make everything else structurally inexpressible, rather than trying to enumerate and block what is dangerous.

1.3 Structural Enforcement

A second structural weakness in existing approaches is that enforcement correctness is verified only at runtime. The policy gate is a software component that can be misconfigured, bypassed through bugs, or circumvented through unanticipated code paths. If the gate fails to execute, actions proceed unguarded.

OATS addresses this through compile-time enforcement of the reasoning loop. The Observe-Reason-Gate-Act (ORGA) cycle uses type-level programming (typestates) so that skipping the Gate phase, dispatching tools without reasoning first, or observing results without dispatching are compile-time errors, not runtime bugs. The type system guarantees that every action passes through policy evaluation.

1.4 Gate Independence

The third structural weakness is that existing approaches evaluate actions within systems that process untrusted LLM outputs. The policy engine shares context, memory, and execution environment with the orchestration layer it is supposed to govern. An LLM that has been compromised through prompt injection can potentially influence the evaluation of its own actions.

In OATS, the Gate phase operates outside LLM influence. The Gate receives a structured action proposal and evaluates it against policy using a formally verifiable policy engine. The LLM cannot modify, bypass, or influence the Gate's evaluation. Policy denial is fed back to the LLM as an observation, allowing it to adjust its approach, but the denial itself is not negotiable.

1.5 Contributions

  1. Typestate-enforced reasoning loop. The ORGA (Observe-Reason-Gate-Act) cycle with compile-time phase enforcement, ensuring that policy evaluation cannot be skipped, circumvented, or reordered.
  2. Allow-list tool contracts. A declarative tool contract format that constrains agent-tool interaction to typed, validated parameters, making dangerous actions structurally inexpressible.
  3. Layered cryptographic identity. A bidirectional identity stack: tool integrity verification and agent identity verification, providing mutual authentication between agents and tools.
  4. Hash-chained audit journals. Cryptographically signed, hash-chained event journals that provide tamper-evident forensic reconstruction with offline verification.
  5. Conformance requirements. Minimum requirements for OATS-compliant systems, enabling objective evaluation of implementations and preventing category dilution.
Section 03

Threat Model

Fundamental Assumption

The AI orchestration layer cannot be trusted as a security boundary. The model processes untrusted inputs through opaque reasoning, producing actions that may serve attacker goals rather than user intent.

3.1 Threats Addressed

Prompt Injection (Direct and Indirect)

Adversaries embed instructions in user input, documents, tool outputs, or multimedia that override the agent's intended behavior. OATS mitigates this at two layers. At the tool contract layer, injected instructions cannot produce arbitrary tool invocations because the contract does not expose parameters that accept raw commands. At the policy layer, actions are evaluated against accumulated session context regardless of how the agent was instructed.

Confused Deputy

A privileged agent is tricked into misusing its authority through ambiguous or deceptive instructions. OATS mitigates this through bidirectional identity verification: before an agent invokes a tool, the tool's integrity is verified cryptographically; before a tool accepts an invocation, the agent's identity is verified cryptographically.

Action Composition / Data Exfiltration

Individual actions may each satisfy policy while their composition constitutes a breach. OATS tracks data classification across actions within a session through context accumulation. When sensitive data is accessed, subsequent external communications are evaluated against this context.

Intent Drift

The agent's actions gradually diverge from the user's original request through its own reasoning process. OATS tracks the chain of intent from original request through each action via context accumulation and semantic distance measurement. When cumulative drift exceeds configured thresholds, the Gate triggers deferral, step-up authorization, or denial.

Malicious Tool Outputs

Compromised or adversarial tools return outputs designed to manipulate subsequent agent behavior. OATS tracks tool outputs as part of session state and restricts what actions are permissible after specific tool calls.

Over-Privileged Credentials

Agents provisioned with credentials exceeding operational requirements. OATS supports least-privilege enforcement through just-in-time credential issuance and operation-specific token scoping.

Goal Hijacking and Memory Poisoning

Adversaries alter the agent's objectives or corrupt persistent memory. OATS operates at the action level: regardless of what objective the agent believes it is pursuing, each action must satisfy policy and align with accumulated context.

3.2 Tool Supply Chain Attacks

When agents use tools provided by third parties (MCP servers, API integrations, plugin ecosystems), those tools may be tampered with, impersonated, or silently modified. OATS mitigates tool supply chain attacks through cryptographic tool integrity verification. Tool contracts are signed by their publishers. The runtime verifies signatures before registering tools, rejecting any contract that fails verification.

3.3 Trust Assumptions

Trusted
  • The OATS runtime (ORGA loop, policy engine, tool contract executor, journal, identity verifier)
  • Cryptographic primitives and key management infrastructure
  • The policy store and policy authoring process
  • The underlying infrastructure (OS, network, hardware)
  • The compiler and type system
Untrusted
  • The AI model and its outputs
  • The orchestration layer
  • User inputs and prompts
  • Tool outputs and retrieved data
  • External documents, emails, web content
  • Agent memory and conversation history
  • Tool contracts from unverified publishers
Partially Trusted
  • Tool implementations (OATS constrains invocation but cannot prevent bugs within tools)
  • Human approvers (OATS routes step-up authorization but cannot prevent social engineering)
  • Verified tool contracts (verified as untampered, but the tool itself may have vulnerabilities)
Section 04

Core Architecture: The ORGA Loop

The ORGA (Observe-Reason-Gate-Act) loop is the core execution engine for OATS-compliant agent runtimes. It drives a multi-turn cycle between an LLM, a policy gate, and external tools through four mandatory phases.

4.1 Phase Definitions

Observe

Collect results from previous tool executions. Incorporate tool outputs, error messages, policy denial feedback, and environmental signals into the agent's context. This phase also integrates knowledge retrieval (RAG-enhanced context) when available.

Reason

The LLM processes accumulated context and produces proposed actions (tool calls or text responses). The LLM sees tool definitions but never sees raw invocation details. The LLM's output is a structured proposal, not an executable action.

Gate

The policy engine evaluates each proposed action. This phase operates entirely outside LLM influence. The Gate receives the proposed action, the accumulated session context, and the agent's identity, and evaluates them against organizational policy. The Gate produces one of five decisions: Allow, Deny, Modify, Step-Up (pause for human approval), or Defer (temporarily suspend pending additional context).

Act

Approved actions are dispatched to tool executors. The tool contract executor validates parameters against the contract's type system, constructs the invocation from the contract's template, executes with timeout enforcement, captures output in a structured evidence envelope, and records the execution in the audit journal.

4.2 Typestate Enforcement

Phase transitions MUST be enforced at compile time using type-level programming (typestates). Each phase is a distinct type. The transition from Reason to Act without passing through Gate MUST be a type error, not a runtime check.

AgentLoop<Reasoning>        -- produce_output() -->  AgentLoop<PolicyCheck>
AgentLoop<PolicyCheck>      -- check_policy()   -->  AgentLoop<ToolDispatching>
AgentLoop<ToolDispatching>  -- dispatch()       -->  AgentLoop<Observing>
AgentLoop<Observing>        -- observe()        -->  AgentLoop<Reasoning> | LoopResult

The following are compile-time errors:

  • Skipping the policy check (Reasoning to ToolDispatching)
  • Skipping tool dispatch after the policy check (PolicyCheck to Observing)
  • Observing results without gating or dispatching (Reasoning to Observing)

Implementations in languages without native typestate support MUST provide equivalent guarantees through runtime enforcement with 100% path coverage testing and formal verification that all tool dispatch paths pass through the Gate.
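The runtime-enforcement alternative described above can be approximated by giving each phase object only its single legal transition method, so an out-of-order call fails immediately rather than silently proceeding. A minimal Python sketch (class and method names mirror the diagram; the string results are invented for illustration, and in a typestate language such as Rust these misuses would be compile errors rather than attribute errors):

```python
# Runtime analogue of the ORGA typestates: each phase class exposes ONLY
# its legal transition, so illegal orderings fail at call time.
class Reasoning:
    def produce_output(self, proposal):
        # Reason -> Gate: the proposal is handed to the policy check.
        return PolicyCheck(proposal)

class PolicyCheck:
    def __init__(self, proposal):
        self.proposal = proposal

    def check_policy(self, allow):
        # Gate -> Act: denied proposals carry no payload into dispatch.
        return ToolDispatching(self.proposal if allow else None)

class ToolDispatching:
    def __init__(self, approved):
        self.approved = approved

    def dispatch(self):
        # Act -> Observe: only Gate-approved actions execute.
        result = f"ran {self.approved}" if self.approved else "denied"
        return Observing(result)

class Observing:
    def __init__(self, result):
        self.result = result

    def observe(self):
        # Observe -> Reason: loop back with the new observation.
        return Reasoning(), self.result

loop = Reasoning()
observed = loop.produce_output("list_files").check_policy(allow=True).dispatch()
next_loop, obs = observed.observe()

# Skipping the Gate is structurally impossible: Reasoning has no dispatch().
assert not hasattr(loop, "dispatch")
```

Note the design choice: dispatch exists only on the object that the policy check returns, so there is no code path from Reasoning to execution that does not pass through check_policy.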

4.3 Dynamic Branching

The only point where the ORGA loop branches dynamically is after the Observe phase: the loop either continues (returning to Reason for another iteration) or completes (producing a final result). This branching is a standard pattern match on a concrete type, not dynamic dispatch. All other phase transitions are strictly linear.

4.4 Loop Termination

The loop terminates when:

  • The LLM produces a final text response (no tool calls proposed)
  • Iteration limits are reached (configurable per deployment)
  • Token budget is exhausted
  • Time budget is exhausted
  • A circuit breaker trips (configurable failure thresholds on tool calls)

4.5 Policy Denial Feedback

When the Gate denies an action, the denial reason MUST be fed back to the LLM as an observation in the next Observe phase. The LLM may propose alternative actions that satisfy policy, but the Gate evaluates each proposal independently. The denial is not negotiable; only the LLM's subsequent proposals can change.

Section 05

Tool Contract Layer

5.1 The Allow-List Principle

An OATS-compliant runtime MUST support declarative tool contracts that define the complete behavioral contract for each tool. The security model inverts the sandbox approach:

  • Sandbox (deny-list): LLM generates an arbitrary action. The security system intercepts, evaluates, and decides whether to allow or block.
  • Tool contract (allow-list): LLM fills typed parameters constrained by the contract. The executor validates, constructs from template, and executes. The LLM never generates or sees the raw invocation.

5.2 Contract Requirements

A tool contract MUST define:

  1. Typed parameters. Each parameter has a declared type with validation constraints. The type system MUST include at minimum: string (with injection sanitization, optional regex pattern), integer (with optional min/max), boolean, enum (value from declared allow-list), and a target type for scope-checked values. All string-based types MUST reject shell metacharacters (; & | $ ` ( ) { } [ ] < > !) by default.
  2. Invocation mechanism. The contract declares how the tool is invoked: command template, HTTP request template, protocol server address, or interactive session definition.
  3. Output schema. The contract declares the expected structure of tool output. The executor validates parsed output against this schema before returning results to the agent.
  4. Policy metadata. The contract declares the policy resource and action for the tool, enabling the policy engine to evaluate authorization without parsing tool-specific details.
  5. Risk tier. The contract declares a risk classification (e.g., low, medium, high, critical) that informs default policy generation and step-up authorization thresholds.
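The requirements above can be sketched as contract-side validators plus template-based invocation construction. This is an illustrative sketch, not a normative API: the function names, the example parameters, and the command template are invented, but the metacharacter set matches the contract requirement.

```python
import re

# Hypothetical sketch of contract-side parameter validation: the executor,
# never the LLM, enforces types, bounds, enums, and metacharacter rejection.
SHELL_METACHARACTERS = set(';&|$`(){}[]<>!')

def validate_string(value, pattern=None):
    if set(value) & SHELL_METACHARACTERS:
        raise ValueError(f"shell metacharacter in {value!r}")
    if pattern and not re.fullmatch(pattern, value):
        raise ValueError(f"{value!r} does not match {pattern}")
    return value

def validate_integer(value, minimum=None, maximum=None):
    if not isinstance(value, int):
        raise ValueError("not an integer")
    if minimum is not None and value < minimum:
        raise ValueError("below minimum")
    if maximum is not None and value > maximum:
        raise ValueError("above maximum")
    return value

def validate_enum(value, allowed):
    if value not in allowed:
        raise ValueError(f"{value!r} not in allow-list {allowed}")
    return value

def build_invocation(template, params):
    # The invocation is constructed from the contract's template; the LLM
    # only ever supplied the already-validated parameter values.
    return template.format(**params)

params = {
    "path": validate_string("reports/q3.txt", pattern=r"[\w./-]+"),
    "lines": validate_integer(40, minimum=1, maximum=1000),
    "mode": validate_enum("head", allowed=("head", "tail")),
}
command = build_invocation("{mode} -n {lines} {path}", params)
# An injected value such as "q3.txt; rm -rf /" fails validation outright.
```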

5.3 Execution Modes

An OATS-compliant tool contract format SHOULD support multiple execution modes:

  • Oneshot: Execute a single invocation and return results.
  • Session: Maintain a running process where each interaction is independently validated and policy-gated.
  • Browser: Maintain a governed browser session where navigation, form submission, and script execution are typed, scoped, and policy-gated.

5.4 Contract Integrity

Tool contracts MUST support cryptographic integrity verification. The signature MUST cover the entire contract (parameters, validation rules, invocation templates, output schemas, scope constraints). Partial signatures are insufficient because the invocation template and scope constraints are security-critical.

5.5 MCP Schema Generation

Tool contracts SHOULD support automatic generation of protocol-compatible schemas (e.g., MCP inputSchema and outputSchema) from the contract definition.
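One way to realize this is a mechanical translation from the contract's parameter table to a JSON Schema object of the shape MCP expects for inputSchema. The contract field names below (min, max, values) are invented for the sketch; only the emitted JSON Schema keywords (minimum, maximum, enum, pattern) are standard.

```python
import json

# Sketch: derive an MCP-style inputSchema (JSON Schema) from a
# hypothetical declarative parameter table.
contract_params = {
    "path":  {"type": "string", "pattern": r"[\w./-]+"},
    "lines": {"type": "integer", "min": 1, "max": 1000},
    "mode":  {"type": "enum", "values": ["head", "tail"]},
}

def to_input_schema(params):
    props = {}
    for name, spec in params.items():
        if spec["type"] == "string":
            prop = {"type": "string"}
            if "pattern" in spec:
                prop["pattern"] = spec["pattern"]
        elif spec["type"] == "integer":
            prop = {"type": "integer"}
            if "min" in spec:
                prop["minimum"] = spec["min"]
            if "max" in spec:
                prop["maximum"] = spec["max"]
        elif spec["type"] == "enum":
            prop = {"type": "string", "enum": spec["values"]}
        else:
            raise ValueError(f"unknown type {spec['type']!r}")
        props[name] = prop
    return {"type": "object", "properties": props,
            "required": sorted(params)}

schema = to_input_schema(contract_params)
print(json.dumps(schema, indent=2))
```

Because the schema is generated rather than hand-written, it cannot drift from the contract that the executor actually enforces.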

Section 06

Identity Layer

6.1 The Identity Problem

When AI agents interact with tools, services, and other agents, identity is typically self-asserted. An agent claims to be "Scout v2 from Tarnover LLC" with no way for the receiving party to verify that claim. Self-asserted identity provides no security guarantee: agents can be impersonated, tools can be spoofed, and delegation claims cannot be verified.

OATS specifies a two-layer cryptographic identity stack that addresses both directions of the trust problem.

6.2 Tool Integrity Verification

An OATS-compliant runtime MUST support cryptographic verification of tool schemas and contracts:

  • Domain-anchored discovery. Tool publishers host public keys at well-known endpoints (e.g., /.well-known/ URIs per RFC 8615). No centralized registry required.
  • Signature verification. Tool contracts and schemas are signed with ECDSA P-256 (or equivalent). The runtime verifies signatures before registering tools.
  • Trust-On-First-Use (TOFU) key pinning. On first encounter, the runtime pins the publisher's key. Subsequent key changes require explicit trust decisions.
  • Revocation support. Publishers can revoke keys and schemas. The runtime checks revocation status before accepting tools.
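The TOFU pinning decision can be sketched independently of the signature machinery. In the sketch below, real signature verification (ECDSA P-256) is deliberately out of scope since it needs a crypto library; only the pinning logic is shown, using SHA-256 key fingerprints, and all names are illustrative.

```python
import hashlib

# Sketch of Trust-On-First-Use key pinning for tool publishers.
class PinStore:
    def __init__(self):
        self._pins = {}  # domain -> pinned key fingerprint

    @staticmethod
    def fingerprint(public_key_bytes):
        return hashlib.sha256(public_key_bytes).hexdigest()

    def check(self, domain, public_key_bytes):
        fp = self.fingerprint(public_key_bytes)
        pinned = self._pins.get(domain)
        if pinned is None:
            self._pins[domain] = fp  # first encounter: pin the key
            return "pinned"
        if pinned == fp:
            return "match"           # same key as before: proceed
        return "conflict"            # key changed: require an explicit trust decision

store = PinStore()
assert store.check("tools.example.com", b"publisher-key-A") == "pinned"
assert store.check("tools.example.com", b"publisher-key-A") == "match"
assert store.check("tools.example.com", b"publisher-key-B") == "conflict"
```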

6.3 Agent Identity Verification

An OATS-compliant runtime SHOULD support cryptographic agent identity verification:

  • Domain-anchored agent identity. Organizations publish verifiable identity documents for their agents at well-known endpoints.
  • Short-lived credentials. Agents are issued time-limited signed credentials (e.g., ES256 JWTs) declaring their identity, capabilities, and delegation chain.
  • Delegation chains. Agent credentials support maker-deployer delegation, where the organization that builds agent software and the organization that deploys it are independently verifiable.
  • Capability scoping. Agent credentials declare specific capabilities (e.g., read:data, write:reports), enabling verifiers to enforce least-privilege access.

6.4 Bidirectional Trust

The two identity layers create a bidirectional trust model:

  1. Agent verifies tool. Before invoking a tool, the agent's runtime verifies the tool's contract integrity.
  2. Tool verifies agent. Before accepting an invocation, the tool verifies the agent's identity.
  3. Policy evaluation. The runtime evaluates whether the verified agent's capabilities authorize it to use the verified tool.
  4. Audit recording. Both verifications and the policy decision are recorded in the cryptographic audit journal.
Section 07

Policy Enforcement Layer

7.1 Policy Engine Requirements

An OATS-compliant runtime MUST include a policy engine that evaluates the tuple (action, context, identity) and produces an authorization decision. The policy engine:

  • MUST support five authorization decisions: Allow, Deny, Modify, Step-Up, Defer.
  • MUST evaluate both static policy and accumulated session context.
  • MUST operate outside LLM influence.
  • SHOULD support a formally verifiable policy language (Cedar, OPA, or equivalent).
  • MUST default to deny.
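The MUST requirements above can be sketched as a tiny default-deny evaluator over (action, identity) tuples. The rule table, resource names, and capability strings are invented for illustration, context evaluation is omitted for brevity, and a real deployment would use a verifiable policy language such as Cedar or OPA rather than a Python list.

```python
from enum import Enum

# Hypothetical default-deny policy engine producing the five OATS decisions.
class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    MODIFY = "modify"
    STEP_UP = "step_up"
    DEFER = "defer"

RULES = [
    # (resource, action, required capability, decision on match)
    ("db",    "read",  "read:data",     Decision.ALLOW),
    ("db",    "write", "write:data",    Decision.STEP_UP),
    ("email", "send",  "send:external", Decision.DEFER),
]

def evaluate(resource, action, identity_caps):
    for r, a, cap, decision in RULES:
        if (r, a) == (resource, action) and cap in identity_caps:
            return decision
    return Decision.DENY  # default stance MUST be deny

assert evaluate("db", "read", {"read:data"}) is Decision.ALLOW
assert evaluate("db", "write", {"write:data"}) is Decision.STEP_UP
assert evaluate("db", "drop", {"read:data"}) is Decision.DENY   # no matching rule
assert evaluate("db", "read", set()) is Decision.DENY           # missing capability
```

The key property is the final return: anything not explicitly matched by a rule, including a valid action attempted without the required capability, falls through to Deny.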

7.2 Action Classification

  • Structurally forbidden. Actions that cannot be expressed through the tool contract layer. Eliminated before reaching the policy engine.
  • Policy-forbidden. Actions expressible through tool contracts but always blocked by policy regardless of context.
  • Context-dependent deny. Actions allowed by static policy but blocked when session context reveals inconsistency with stated intent.
  • Context-dependent allow. Actions denied by default policy but permitted when context demonstrates clear alignment with legitimate intent.
  • Context-dependent defer. Actions whose risk cannot be conclusively determined.

7.3 Context Accumulation

An OATS-compliant runtime MUST accumulate session context across actions. The context accumulator maintains:

  • Original request. The user's initial instruction establishing intent.
  • Action history. The sequence of actions proposed, approved, denied, deferred, and executed.
  • Data classification. The sensitivity level of information accessed.
  • Tool outputs. Results returned from previous actions.
  • Semantic distance. How far the current action has drifted from the original request.
  • Identity context. Verified identities of the agent, user, and tools involved.

7.4 Semantic Distance Tracking

An OATS-compliant runtime SHOULD compute semantic distance between actions and stated intent to detect intent drift. Cumulative drift SHOULD be tracked across action sequences, not only per-action.
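Cumulative tracking can be sketched with cosine distance over embedding vectors. The two-dimensional vectors and the threshold below are toy values invented for the sketch; a real implementation would use a learned embedding model and a calibrated threshold.

```python
import math

# Illustrative cumulative drift tracking over toy embedding vectors.
def cosine_distance(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm

DRIFT_THRESHOLD = 0.5  # invented for the sketch

def check_drift(intent_vec, action_vecs):
    # Sum per-action distances so slow, steady drift also trips the check.
    cumulative = sum(cosine_distance(intent_vec, v) for v in action_vecs)
    return cumulative, cumulative > DRIFT_THRESHOLD

intent = [1.0, 0.0]
on_task = [[0.9, 0.1], [0.8, 0.2]]   # actions close to the stated intent
drifted = on_task + [[0.0, 1.0]]     # a later action orthogonal to it

total, tripped = check_drift(intent, on_task)
assert not tripped
total, tripped = check_drift(intent, drifted)
assert tripped  # cumulative drift exceeded: defer, step up, or deny
```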

7.5 Step-Up Authorization

For ambiguous cases, the runtime MUST support step-up authorization workflows. Action execution MUST block until an approval decision is received. Full action context MUST be available to approvers. Configurable timeouts MUST be enforced; deny-on-timeout is the default.

7.6 Deferral

Deferred actions remain paused without producing effects. The runtime tracks deferred actions and maintains their execution order. Cascading deferrals are bounded: when concurrently deferred actions exceed a configurable limit, subsequent actions are denied.

Section 08

Audit Layer

8.1 Journal Requirements

An OATS-compliant runtime MUST maintain a cryptographic audit journal recording all events in the ORGA loop. The journal is the authoritative record of what happened, when, why, and by whose authority.

8.2 Event Types

  • LoopStarted: emitted when the loop begins; records configuration, agent identity, and the original request.
  • ReasoningComplete: emitted after the LLM response, before the Gate; records proposed actions and token usage.
  • PolicyEvaluated: emitted after the Gate decision; records the actions evaluated, decisions, matching policies, and reasons.
  • ToolsDispatched: emitted after tool execution; records the tools invoked, parameters, duration, and evidence hashes.
  • ObservationsCollected: emitted after results are collected; records the observation count and context size.
  • LoopTerminated: emitted when the loop ends; records the termination reason, iteration count, total usage, and duration.
  • RecoveryTriggered: emitted on tool failure; records the recovery strategy and error context.

8.3 Cryptographic Properties

Each journal entry MUST include:

  • Ed25519 signature (or equivalent; ECDSA P-256 also acceptable). The signature covers the canonical serialization of the entry contents.
  • Hash chain link. Each entry includes the cryptographic hash of the previous entry, forming an append-only chain that detects retroactive modification.
  • Timestamp. Cryptographic timestamp for temporal ordering.

Journal entries MUST be verifiable offline.
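The hash-chain and offline-verification properties can be sketched compactly. Signatures (Ed25519) are omitted here because they require a crypto library; only the chain linkage and tamper detection are shown, with invented event fields.

```python
import hashlib
import json

# Sketch of a hash-chained journal: each entry embeds the SHA-256 of the
# previous entry, so retroactive edits break the chain on offline replay.
GENESIS = "0" * 64

def entry_hash(entry):
    # Canonical serialization (sorted keys) so hashing is deterministic.
    return hashlib.sha256(json.dumps(entry, sort_keys=True).encode()).hexdigest()

def append(journal, event):
    prev = entry_hash(journal[-1]) if journal else GENESIS
    journal.append({"prev": prev, "event": event})

def verify(journal):
    # Offline verification: replay the chain from genesis.
    prev = GENESIS
    for entry in journal:
        if entry["prev"] != prev:
            return False
        prev = entry_hash(entry)
    return True

journal = []
append(journal, {"type": "LoopStarted", "agent": "agent-01"})
append(journal, {"type": "PolicyEvaluated", "decision": "allow"})
append(journal, {"type": "ToolsDispatched", "tool": "list_files"})
assert verify(journal)

journal[1]["event"]["decision"] = "deny"  # retroactive tampering...
assert not verify(journal)                # ...is detected on replay
```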

8.4 Evidence Envelopes

Tool executions MUST produce structured evidence envelopes containing: tool name and version, validated parameters, constructed invocation, duration and exit status, output hash (SHA-256), policy decision that authorized execution, and agent and user identity at time of execution.

8.5 Compliance Properties

The journal provides: HIPAA audit trail, SOC2 evidence, SOX audit trail, and GDPR accountability records.

Section 09

Sandboxing and Isolation

9.1 Multi-Tier Sandboxing

An OATS-compliant runtime SHOULD support multiple sandboxing tiers:

  • Tier 1: Container isolation. Agent execution within container boundaries with resource limits, network restrictions, and filesystem isolation.
  • Tier 2: Kernel-level isolation. Agent execution within a user-space kernel (e.g., gVisor) providing syscall filtering without full virtualization overhead.
  • Tier 3: Microkernel isolation. Agent execution within a lightweight VM (e.g., Firecracker) providing hardware-level isolation with minimal overhead.

9.2 Resource Limits

Regardless of sandboxing tier, agent execution MUST support configurable resource limits: token budget, time budget, iteration budget, tool call budget, network restrictions, and filesystem restrictions.

9.3 Circuit Breakers

Tool executions SHOULD be protected by circuit breakers. When a tool fails repeatedly, the circuit breaker trips and subsequent calls are rejected without execution until the circuit resets.
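A minimal circuit breaker illustrating this behavior (the class, threshold, and reset policy are a sketch; real implementations typically add a time-based half-open state rather than the manual reset shown here):

```python
# Minimal circuit breaker: after `threshold` consecutive failures the
# breaker opens and calls are rejected without executing the tool.
class CircuitBreaker:
    def __init__(self, threshold=3):
        self.threshold = threshold
        self.failures = 0
        self.open = False

    def call(self, tool):
        if self.open:
            raise RuntimeError("circuit open: call rejected without execution")
        try:
            result = tool()
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.open = True
            raise
        self.failures = 0  # any success resets the failure count
        return result

    def reset(self):
        self.failures, self.open = 0, False

breaker = CircuitBreaker(threshold=2)

def flaky():
    raise IOError("tool backend unreachable")

for _ in range(2):
    try:
        breaker.call(flaky)
    except IOError:
        pass
assert breaker.open  # two consecutive failures tripped the breaker

try:
    breaker.call(lambda: "ok")
    rejected = False
except RuntimeError:
    rejected = True
assert rejected  # rejected without execution while open

breaker.reset()
assert breaker.call(lambda: "ok") == "ok"
```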

Section 10

Inter-Agent Communication

10.1 Communication Governance

When agents communicate with other agents, all inter-agent messages MUST pass through a communication policy gate. The gate evaluates authorization rules on communication primitives (ask, delegate, send, parallel, race) before execution.

10.2 Message Security

Inter-agent messages MUST be cryptographically signed (Ed25519 or equivalent), encrypted (AES-256-GCM or equivalent), and attributed to verified agent identities.

10.3 Delegation Constraints

Delegation chains MUST be bounded: maximum delegation depth (configurable), capability narrowing (a delegated agent cannot exceed the delegating agent's capabilities), and blast-radius containment.
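The narrowing and depth constraints above can be sketched as a subset check over capability sets plus a chain-length bound. The capability strings, depth limit, and function name are invented for illustration:

```python
# Sketch of delegation-chain checks: capability narrowing (a delegate's
# capabilities MUST be a subset of the delegator's) and bounded depth.
MAX_DELEGATION_DEPTH = 3  # illustrative; configurable per deployment

def delegate(chain, delegate_caps):
    if len(chain) >= MAX_DELEGATION_DEPTH:
        raise PermissionError("delegation depth exceeded")
    if not delegate_caps <= chain[-1]:  # set subset check
        raise PermissionError("delegate exceeds delegator capabilities")
    return chain + [delegate_caps]

root = [{"read:data", "write:reports", "send:internal"}]
tier1 = delegate(root, {"read:data", "write:reports"})   # narrowed: ok
tier2 = delegate(tier1, {"read:data"})                   # narrowed: ok

# Escalation attempt: a delegate asking for a capability its delegator lacks.
try:
    delegate(tier1, {"read:data", "send:external"})
    escalated = True
except PermissionError:
    escalated = False
assert not escalated

# Depth bound: the chain already holds MAX_DELEGATION_DEPTH links.
try:
    delegate(tier2, {"read:data"})
    deepened = True
except PermissionError:
    deepened = False
assert not deepened
```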

10.4 Cross-Agent Context

When an agent delegates to another agent, the session context SHOULD be propagated to the downstream agent, enabling the downstream agent's Gate to evaluate actions against the original intent.

Section 11

Conformance Requirements

11.1 Conformance Levels

OATS Core (satisfies all MUST requirements): Provides baseline zero-trust agent execution.

OATS Extended (satisfies all MUST and SHOULD requirements): Provides comprehensive zero-trust agent execution with identity, sandboxing, and advanced policy features.

11.2 Core Requirements

  • C1: ORGA Loop Enforcement. The runtime MUST implement the four-phase ORGA loop. The Gate phase MUST execute before every tool dispatch.
  • C2: Tool Contract Support. The runtime MUST support declarative tool contracts with typed parameter validation. The LLM MUST NOT generate raw tool invocations.
  • C3: Policy Evaluation. The runtime MUST evaluate actions against policy before execution. Default stance MUST be deny.
  • C4: Context Accumulation. The runtime MUST accumulate session context across actions.
  • C5: Cryptographic Audit Journal. The runtime MUST maintain a hash-chained, cryptographically signed audit journal.
  • C6: Gate Independence. The Gate phase MUST operate on structured inputs only.
  • C7: Evidence Envelopes. Tool executions MUST produce structured evidence envelopes.

11.3 Extended Requirements

  • E1: Tool Integrity Verification. SHOULD verify tool contract signatures with domain-anchored cryptographic verification.
  • E2: Agent Identity Verification. SHOULD verify agent identity with domain-anchored credentials.
  • E3: Semantic Distance Tracking. SHOULD compute and track semantic distance between actions and intent.
  • E4: Multi-Tier Sandboxing. SHOULD support configurable sandboxing tiers.
  • E5: Inter-Agent Governance. SHOULD enforce authorization policies on inter-agent communication.
  • E6: Telemetry Export. SHOULD export structured telemetry to security platforms.
  • E7: Formally Verifiable Policies. SHOULD use a formally verifiable policy language.
  • E8: Least-Privilege Credential Scoping. SHOULD support just-in-time credential issuance.

11.4 Verification Methodology

For each Core requirement, the specification defines a verification procedure:

  • C1: Attempt to construct a code path from Reason to Act that bypasses Gate. Must be a compile error (typestate) or caught by verified test suite.
  • C2: Submit tool parameters containing shell metacharacters. Verify rejection. Verify LLM never receives raw invocation strings.
  • C3: Configure Deny policy. Submit matching action. Verify no effects and denial recorded.
  • C4: Execute a sequence of actions. Verify policy engine receives accumulated context for each.
  • C5: Generate journal entries. Verify fields, signatures, hash chain. Tamper with entry and verify detection.
  • C6: Inspect Gate implementation. Verify no natural language parsing, no shared mutable state with LLM.
  • C7: Execute a tool. Verify evidence envelope completeness.
Section 12

Implementation Architectures

OATS does not mandate a specific implementation architecture. The specification defines what a compliant runtime must do, not how it must be implemented.

12.1 Self-Hosted Runtimes

For organizations that control their agent infrastructure, the full OATS stack can be deployed as a single runtime. This provides the strongest guarantees: compile-time enforcement, full context visibility, cryptographic identity, and multi-tier isolation.

12.2 Plugin/Extension Model

For agents that run inside third-party platforms, OATS compliance can be achieved through a layered approach:

  • Inner layer (awareness): A plugin or extension running inside the agent platform provides tool discovery, audit logging, and advisory policy evaluation.
  • Outer layer (enforcement): The agent platform runs inside an OATS-compliant runtime. The outer runtime's ORGA Gate provides hard enforcement that the inner plugin cannot bypass.

12.3 Gateway Architecture

For protocol-based tool invocations (MCP, REST APIs), an OATS-compliant gateway can intercept all traffic between agents and tools. The gateway implements the Gate phase, context accumulation, and audit journaling without modifying agent code.

12.4 Vendor Integration

For SaaS agents where organizations control none of the infrastructure, OATS conformance requires vendor cooperation. Vendors must provide synchronous pre-execution hooks, decision enforcement, context availability, and receipt export.

Section 13

Research Directions

13.1 Typestate in Non-Rust Languages

The compile-time enforcement guarantee is strongest in languages with typestate support (Rust, potentially Haskell, Scala with phantom types). Providing equivalent guarantees in Python, JavaScript, and Go requires either runtime enforcement with formal verification, or code generation from a verified specification.

13.2 Data Flow Through Context Windows

Tracking data lineage through LLM context windows remains an open challenge. Data may be transformed, summarized, or paraphrased. Information-theoretic approaches (taint analysis, watermarking) are active research areas.

13.3 Multi-Agent Trust Coordination

As agents delegate across organizational boundaries, maintaining coherent trust chains requires distributed tracing standards, federated receipt verification, and cross-domain policy negotiation.

13.4 Formal Verification of the ORGA Loop

Formal verification of the entire loop (including policy engine correctness, context accumulator completeness, and journal integrity) would provide stronger assurance than typestate enforcement alone.

13.5 Approval Fatigue and Deferral Resolution

Balancing security against usability remains an open design challenge. ML-based approval recommendation, batch approval, and progressive autonomy are active research directions.

13.6 Vector Embedding Security

When semantic distance tracking uses vector embeddings, the embeddings themselves become a security surface. Research into watermarking, steganographic attack detection, and quantization-robust integrity verification is needed.

Section 14

Conclusion

Principle

Allow-List Over Deny-List

Tool contracts make dangerous actions structurally inexpressible.

Principle

Compile-Time Over Runtime

The ORGA typestate makes invalid phase transitions compile errors.

Principle

Structural Over Assumed

The Gate operates outside LLM influence through structural isolation.

These convictions are not theoretical. The architecture specified here has been validated through approximately eight months of autonomous operation in a production runtime (Symbiont, by ThirdKey AI), including a catastrophic disaster recovery scenario where the runtime rebuilt itself using its own agent infrastructure after a total codebase loss.

By publishing this specification as an open standard, we aim to establish baseline requirements that preserve interoperability and buyer choice. The goal is not to build OATS, but to define what an OATS-compliant system must do, enabling the market to compete on implementation quality rather than category definition.

References

  • Anthropic. "Model Context Protocol Specification." 2024. modelcontextprotocol.io
  • Amazon Web Services. "Cedar: A Language for Defining Permissions as Policies." 2023. cedarpolicy.com
  • Chuvakin, A. "Cloud CISO Perspectives: How Google secures AI Agents." Google Cloud Blog, June 2025.
  • Debenedetti, E. et al. "AgentDojo: A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents." arXiv:2406.13352, 2024.
  • Errico, H. "Autonomous Action Runtime Management (AARM)." arXiv:2602.09433v1, 2026.
  • Gaire, S. et al. "Systematization of Knowledge: Security and Safety in the MCP Ecosystem." arXiv:2512.08290, 2025.
  • Gregg, B. "BPF Performance Tools." Addison-Wesley, 2019.
  • Greshake, K. et al. "Not What You've Signed Up For: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection." AISec Workshop at ACM CCS, 2023.
  • Hardy, N. "The Confused Deputy: (or why capabilities might have been invented)." ACM SIGOPS, 1988.
  • Microsoft. "Governance and security for AI agents across the organization." Cloud Adoption Framework, 2024.
  • Miller, M. S. "Robust Composition: Towards a Unified Approach to Access Control and Concurrency Control." Ph.D. dissertation, Johns Hopkins University, 2006.
  • National Institute of Standards and Technology. "AI Risk Management Framework (AI RMF 1.0)." 2023.
  • Open Policy Agent. "OPA: Policy-based control for cloud native environments." 2024. openpolicyagent.org
  • OWASP Foundation. "OWASP Top 10 for Large Language Model Applications." 2024.
  • Raza, S. et al. "TRiSM for Agentic AI." arXiv:2506.04133, 2025.
  • Reber, D. "The Agentic AI Security Scoping Matrix." AWS Security Blog, November 2024.
  • Ruan, Y. et al. "The Emerged Security and Privacy of LLM Agent: A Survey with Case Studies." arXiv:2407.19354, 2024.
  • Su, H. et al. "A Survey on Autonomy-Induced Security Risks in Large Model-Based Agents." arXiv:2506.23844, 2025.
  • Wanger, J. "AgentPin Technical Specification v0.2.0." ThirdKey AI, 2026. agentpin.org
  • Wanger, J. "SchemaPin Protocol Specification." ThirdKey AI, 2025. schemapin.org
  • Wanger, J. "Symbiont Runtime Architecture." ThirdKey AI, 2026. symbiont.dev
  • Wanger, J. "ToolClad: Declarative Tool Interface Contracts for Agentic Runtimes v0.5.1." ThirdKey AI, 2026.
  • Wu, Q. et al. "Security of AI Agents." arXiv:2406.08689, 2024.
  • Ye, Q. et al. "ToolEmu: Identifying Risky Real-World Agent Failures with a Language Model Emulator." ICLR, 2024.