Behind every powerful AI agent lies sophisticated infrastructure that makes it accessible, reliable, and useful across different platforms and use cases. OpenAI has published detailed insights into the Codex App Server, the critical piece of infrastructure that exposes the Codex harness as a stable, client-friendly protocol. This transparency not only helps developers integrate Codex into their own products but also advances the broader conversation about standardizing agent-to-application protocols.
The Origins of the App Server
The App Server began as a practical solution to a specific problem: how to reuse the Codex harness across multiple products without reimplementing core functionality. Initially, Codex CLI was built as a terminal user interface (TUI), providing command-line access to Codex agents. When OpenAI developed the VS Code extension to offer more IDE-friendly interaction with Codex, the team needed a way to drive the same agent loop from an IDE UI without duplicating the underlying logic.
What started as an internal tool gradually evolved into OpenAI's standard protocol for agent interaction. The journey from experimental interface to production-grade platform reflects lessons learned about what it takes to make powerful agents broadly accessible and usable.
Inside the Codex Harness
The Codex harness encompasses more than just the core agent loop. It includes:
Thread Lifecycle and Persistence: Creating, resuming, forking, and archiving conversation threads, with persistent event history so clients can reconnect and render consistent timelines.
Configuration and Authentication: Loading configuration, managing defaults, and running authentication flows like "Sign in with ChatGPT," including credential state management.
Tool Execution and Extensions: Executing shell and file tools in sandboxed environments and wiring up integrations like MCP servers and skills to participate in the agent loop under consistent policy models.
All this agent logic lives in "Codex core," a component of the Codex CLI codebase that functions both as a library containing agent code and as a runtime that can be instantiated to run the agent loop and manage the persistence of individual Codex threads.
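To make the thread lifecycle concrete, here is a minimal sketch of what thread-management requests might look like on the wire. The method names and parameter shapes are illustrative assumptions, not the actual protocol surface:

```typescript
// Hypothetical JSON-RPC requests for thread lifecycle operations.
// Method names and parameter shapes are assumptions for illustration.
type JsonRpcRequest = {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
};

const startThread: JsonRpcRequest = {
  jsonrpc: "2.0",
  id: 1,
  method: "thread/start",
  params: { cwd: "/path/to/workspace" },
};

// Resuming replays the persisted event history so the client can
// render a consistent timeline after reconnecting.
const resumeThread: JsonRpcRequest = {
  jsonrpc: "2.0",
  id: 2,
  method: "thread/resume",
  params: { threadId: "thread-123" },
};
```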
App Server Architecture
The App Server consists of both a JSON-RPC protocol and a long-lived process that hosts Codex core threads. The architecture includes four main components:
Stdio Reader: Handles communication over standard input/output streams.
Codex Message Processor: Translates between client JSON-RPC requests and Codex core operations.
Thread Manager: Spins up one core session for each conversation thread.
Core Threads: The actual agent instances executing work.
The message processor serves as a critical translation layer, converting client JSON-RPC requests into Codex core operations and transforming low-level internal events into a small set of stable, UI-ready JSON-RPC notifications that clients can easily consume.
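As a rough illustration of that translation layer, the sketch below maps a few hypothetical low-level core events onto stable client-facing notifications. The event and notification names are assumptions, not the real protocol vocabulary:

```typescript
// Illustrative sketch of the message processor's job: converting
// low-level core events into a small set of stable, UI-ready
// notifications. All names here are assumptions.
type CoreEvent =
  | { kind: "agent_text_delta"; threadId: string; text: string }
  | { kind: "exec_begin"; threadId: string; command: string[] }
  | { kind: "task_complete"; threadId: string };

type ClientNotification = { method: string; params: Record<string, unknown> };

function toNotification(event: CoreEvent): ClientNotification {
  switch (event.kind) {
    case "agent_text_delta":
      // Streamed text becomes an incremental item update.
      return { method: "item/delta", params: { threadId: event.threadId, delta: event.text } };
    case "exec_begin":
      // A tool execution starting becomes a new item the UI can render.
      return { method: "item/started", params: { threadId: event.threadId, itemType: "toolExecution", command: event.command } };
    case "task_complete":
      // The end of a unit of agent work closes out the turn.
      return { method: "turn/completed", params: { threadId: event.threadId } };
  }
}
```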
The JSON-RPC Protocol
The protocol between clients and the App Server is fully bidirectional JSON-RPC. A typical turn involves a client request followed by many server notifications. The server can also initiate requests when the agent needs input, such as approval for a potentially sensitive action, pausing the turn until the client responds.
This bidirectional nature is essential for rich agent interactions. Simple request-response patterns cannot adequately capture the complexity of agent work: incremental progress updates, intermediate artifacts like code diffs, approval requests, and streaming responses all require a more sophisticated protocol.
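A hedged reconstruction of what one turn might look like on the wire, with method names and payloads that are illustrative rather than taken from the protocol:

```typescript
// client -> server: start a turn with user input
const request = {
  jsonrpc: "2.0",
  id: 7,
  method: "turn/start",
  params: { threadId: "t1", input: "Fix the failing test" },
};

// server -> client: streamed notifications as items progress
const notifications = [
  { jsonrpc: "2.0", method: "item/started", params: { itemType: "agentMessage" } },
  { jsonrpc: "2.0", method: "item/delta", params: { delta: "Looking at the test..." } },
  { jsonrpc: "2.0", method: "item/completed", params: { itemType: "agentMessage" } },
];

// server -> client: a server-initiated request that pauses the turn
// until the client answers, e.g. approval for a sensitive command
const approvalRequest = {
  jsonrpc: "2.0",
  id: 8,
  method: "approval/request",
  params: { command: ["rm", "-rf", "build/"] },
};

// client -> server: the response to the server's request (matching id)
// resumes the paused turn
const approvalResponse = { jsonrpc: "2.0", id: 8, result: { decision: "approved" } };
```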
Conversation Primitives
Designing an API for an agent loop presents unique challenges because user-agent interaction isn't a simple request-response pattern. One user request can unfold into a structured sequence of actions that clients need to represent faithfully. OpenAI landed on three core primitives:
Item: The atomic unit of input/output in Codex. Items are typed (user message, agent message, tool execution, approval request, diff) and each has an explicit lifecycle: started, optional delta events for streaming, and completed when finalized. This lifecycle enables clients to start rendering immediately, stream incremental updates, and finalize when done.
Turn: One unit of agent work initiated by user input. A turn begins when a client submits input and ends when the agent finishes producing outputs. It contains a sequence of items representing intermediate steps and outputs.
Thread: The durable container for an ongoing Codex session between user and agent, containing multiple turns. Threads can be created, resumed, forked, and archived, with persistent history enabling clients to reconnect and render consistent timelines.
These primitives provide clear boundaries and lifecycles that make agent interactions predictable and manageable across different client implementations.
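Expressed as TypeScript types, the three primitives and their relationships might look like the following. The field names are inferred from the description above rather than taken from the generated protocol definitions:

```typescript
// Sketch of the three conversation primitives; field names are
// assumptions based on the prose description.
type ItemType =
  | "userMessage"
  | "agentMessage"
  | "toolExecution"
  | "approvalRequest"
  | "diff";

interface Item {
  id: string;
  type: ItemType;
  // Explicit lifecycle: started -> (optional streaming deltas) -> completed
  status: "started" | "inProgress" | "completed";
}

interface Turn {
  id: string;
  items: Item[]; // intermediate steps and outputs, in order
}

interface Thread {
  id: string;
  turns: Turn[]; // durable, persisted history enables reconnects
}
```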
Client Integration Patterns
Different client surfaces integrate with the App Server in distinct ways:
Local Apps and IDEs
Local clients like VS Code extensions and desktop applications typically bundle or fetch a platform-specific App Server binary, launch it as a long-running child process, and maintain a bidirectional stdio channel for JSON-RPC. The shipped artifact includes the platform-specific Codex binary pinned to a tested version, ensuring the client always runs validated bits.
Some partners like Xcode decouple release cycles by keeping the client stable while pointing to newer App Server binaries when needed. This allows adopting server-side improvements and bug fixes without waiting for client releases. The JSON-RPC surface is designed to be backward compatible, enabling older clients to communicate safely with newer servers.
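A minimal sketch of this local integration pattern, assuming a bundled binary named codex-app-server and an initialize handshake (both placeholders for illustration):

```typescript
// Launch a bundled App Server binary as a long-running child process
// and exchange JSON-RPC messages as JSONL over stdio.
import { spawn } from "node:child_process";
import { createInterface } from "node:readline";

// Binary path is a placeholder; real clients ship a pinned,
// platform-specific binary.
const server = spawn("./bin/codex-app-server");

// JSONL framing: one JSON message per line in each direction.
const lines = createInterface({ input: server.stdout });
lines.on("line", (line) => {
  const message = JSON.parse(line);
  console.log("server ->", message.method ?? `response ${message.id}`);
});

function send(message: object): void {
  server.stdin.write(JSON.stringify(message) + "\n");
}

// The handshake method name is an assumption for illustration.
send({ jsonrpc: "2.0", id: 1, method: "initialize", params: {} });
```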
Codex Web Runtime
Codex Web runs the harness in containerized environments. A worker provisions a container with the checked-out workspace, launches the App Server binary inside it, and maintains a long-lived JSON-RPC channel. The web app running in the user's browser communicates with the Codex backend over HTTP and Server-Sent Events (SSE), which streams task events from the worker.
Keeping state and progress on the server means work continues even if browser tabs close or networks disconnect. The streaming protocol and saved thread sessions enable new sessions to reconnect, pick up where they left off, and catch up without rebuilding state client-side.
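On the browser side, consuming that stream could look roughly like this; the endpoint path and payload shape are assumptions:

```typescript
// Subscribe to task events streamed from the worker over SSE.
// The URL is a hypothetical endpoint, not the real Codex backend route.
const events = new EventSource("/api/tasks/task-123/events");

events.addEventListener("message", (e: MessageEvent) => {
  const update = JSON.parse(e.data);
  // State lives server-side, so a freshly opened tab can reconnect
  // to this stream and catch up without rebuilding state locally.
  render(update);
});

function render(update: unknown): void {
  // Update the timeline UI from the streamed event.
  console.log(update);
}
```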
Terminal User Interface (TUI)
Historically, the TUI was a "native" client running in the same process as the agent loop, talking directly to Rust core types rather than using the App Server protocol. OpenAI plans to refactor the TUI to use the App Server, making it behave like any other client. This will unlock workflows where the TUI connects to a Codex server running on a remote machine, keeping the agent close to compute while delivering live updates and controls locally.
Choosing the Right Integration Method
While the App Server will be OpenAI's first-class integration method going forward, other options exist for specific use cases:
MCP Server: Run "codex mcp-server" and connect from any MCP client that supports stdio servers. Good for existing MCP workflows that want to invoke Codex as a callable tool. The limitation is that it only exposes what the MCP endpoints provide.
Protocol Adapters: Some ecosystems offer portable interfaces targeting multiple model providers. Good for coordinating multiple agents, but these protocols often converge on common capability subsets, making richer interactions harder to represent.
App Server: Choose this when you want the full Codex harness exposed as a stable, UI-friendly event stream. It provides complete functionality, including Sign in with ChatGPT, model discovery, and configuration management. The main cost is the integration work of building a client-side JSON-RPC binding.
CLI Mode: A lightweight, scriptable mode for one-off tasks and CI runs. Good for automation and pipelines that need a single command to run to completion non-interactively.
Agent SDK: A TypeScript library for programmatically controlling local Codex agents from your own application. Best when you want a native library interface for server-side tools and workflows without building a separate JSON-RPC client (see the sketch below).
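As one example, driving Codex through the TypeScript Agent SDK can be as short as the sketch below. It is based on the published @openai/codex-sdk package, but treat the exact API surface as indicative rather than authoritative:

```typescript
// Minimal sketch of programmatic control via the Agent SDK; the SDK
// manages the local agent so no JSON-RPC client code is needed.
import { Codex } from "@openai/codex-sdk";

const codex = new Codex();
const thread = codex.startThread();

// Run one turn of agent work to completion and inspect the result.
const result = await thread.run("Diagnose the failing CI job and propose a fix");
console.log(result);
```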
Technical Accessibility and Open Standards
The transport is JSON-RPC over stdio using JSONL (JSON Lines format). JSON-RPC makes it straightforward to build client bindings in any language. Codex surfaces and partner integrations have implemented App Server clients in Go, Python, TypeScript, Swift, and Kotlin.
For TypeScript, definitions can be generated directly from the Rust protocol. For other languages, a JSON Schema bundle can be generated and fed into preferred code generators. This accessibility is intentional—making it easier for the ecosystem to build on Codex accelerates innovation and expands the agent's utility.
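For instance, a generated schema bundle could be fed to an off-the-shelf generator such as json-schema-to-typescript; the schema filename here is a placeholder:

```typescript
// Turn a generated JSON Schema bundle into TypeScript definitions.
// json-schema-to-typescript is one of several generators that accept
// such a bundle; the input filename is hypothetical.
import { compileFromFile } from "json-schema-to-typescript";
import { writeFile } from "node:fs/promises";

const types = await compileFromFile("app-server-protocol.schema.json");
await writeFile("protocol.d.ts", types);
```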
Looking Forward
OpenAI's detailed documentation of the App Server architecture represents more than technical transparency—it's an invitation to the developer community to build on a stable foundation. As AI agents become more capable and widespread, standardized protocols for agent interaction will become increasingly important.
The App Server approach—emphasizing clear primitives, backward compatibility, and open standards—provides a model for how agent infrastructure can be both powerful and accessible. By sharing their learnings and opening their implementation, OpenAI is contributing to the broader conversation about how to make agents useful across the diverse landscape of development tools, platforms, and use cases.
For developers interested in integrating Codex into their workflows, the App Server provides a production-grade path forward. All source code lives in the Codex CLI open-source repository, and OpenAI welcomes feedback and feature requests as the protocol continues to evolve based on real-world usage.
Source: Unlocking the Codex harness: how we built the App Server - OpenAI Blog