OpenAI Codex

The most capable autonomous coding agent on benchmarks, with both cloud and local modes, but it's locked inside ChatGPT's pricing and the cloud agent can't be steered mid-task.

Agentic · DevTool · LLM · Infrastructure

openai.com

Our Take

What It Is

OpenAI Codex is a two-part coding agent: a cloud-based system that runs tasks in isolated sandbox environments preloaded with your repository, and an open-source CLI (built in Rust) for local terminal workflows. It's powered by GPT-5.3-Codex, a model trained via reinforcement learning on real-world coding tasks. The model goes beyond writing code into full computer use, setting industry highs on SWE-Bench Pro, Terminal-Bench, OSWorld, and GDPval.

Why It Matters

The cloud sandbox architecture sets Codex apart from local coding agents: each task runs in its own isolated container, and you can fire off multiple tasks in parallel. The model is trained to run tests repeatedly until they pass, which closes the write-test-fix loop autonomously. With GPT-5.4, announced in March 2026, combining Codex capabilities with stronger reasoning, OpenAI is pushing the boundary of what autonomous coding agents can handle.
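The write-test-fix loop is easier to picture with a toy example. The sketch below is not how Codex is implemented; it only illustrates the general pattern of running tests, feeding the failure back into a fix step, and repeating until the tests pass or a budget runs out. The names `fix_until_passing`, `run_tests`, and `propose_fix` are hypothetical, and the "model" here is a hard-coded string replacement.

```python
from typing import Callable, Optional, Tuple

# Toy source with a deliberate bug (subtraction instead of addition).
BUGGY_SOURCE = "def add(a, b):\n    return a - b\n"


def run_tests(source: str) -> Optional[str]:
    """Run a single toy test. Return None on success, or a failure message."""
    namespace: dict = {}
    exec(source, namespace)  # acceptable for a toy; never exec untrusted code
    result = namespace["add"](2, 3)
    if result == 5:
        return None
    return f"add(2, 3) returned {result}, expected 5"


def propose_fix(source: str, failure: str) -> str:
    """Stand-in for the model: a hard-coded patch, purely illustrative."""
    return source.replace("a - b", "a + b")


def fix_until_passing(
    source: str,
    run_tests: Callable[[str], Optional[str]],
    propose_fix: Callable[[str, str], str],
    max_rounds: int = 5,
) -> Tuple[str, bool, int]:
    """Return (final_source, passed, number_of_fixes_applied)."""
    for fixes_applied in range(max_rounds + 1):
        failure = run_tests(source)
        if failure is None:
            return source, True, fixes_applied
        if fixes_applied < max_rounds:
            # Feed the failure message back into the next fix attempt.
            source = propose_fix(source, failure)
    return source, False, max_rounds


if __name__ == "__main__":
    fixed, passed, fixes = fix_until_passing(BUGGY_SOURCE, run_tests, propose_fix)
    print(passed, fixes)
```

The `max_rounds` cap is the design choice worth noting: without a budget, a loop like this can spin indefinitely on a test it cannot satisfy.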

Key Developments

Mar 2026: GPT-5.4 announced, combining GPT-5.3-Codex capabilities with stronger reasoning and tool use.
Feb 2026: Codex macOS desktop app launched; double rate limits offered temporarily for all paid subscribers.
Feb 2026: GPT-5.3-Codex released with computer-use capability, setting new highs on SWE-Bench Pro and Terminal-Bench.
Late 2025: GPT-5.2-Codex released with software engineering optimisation via reinforcement learning.
Mid 2025: Codex cloud agent launched with parallel task execution in sandboxed environments.

What to Watch

The fire-and-forget nature of the cloud agent is the biggest limitation. You can't course-correct it mid-task, which means a wrong approach on a 20-minute task wastes all that time. Watch whether OpenAI adds mid-task intervention. Also track the pricing: Codex requires a ChatGPT subscription ($20-$200/month), and rate limits vary unpredictably (30-150 messages per 5 hours on Plus). The open-source Codex CLI is a hedge worth considering.
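The rate-limit range and the cost of a failed run reduce to simple arithmetic. This is a minimal sketch, assuming the quoted 30 to 150 messages per 5-hour window on Plus and a 20-minute task; the function names are illustrative and not part of any OpenAI tooling.

```python
WINDOW_HOURS = 5                       # rate-limit window quoted above
PLUS_MESSAGES_PER_WINDOW = (30, 150)   # low and high ends of the quoted range


def messages_per_hour(messages_per_window: int, window_hours: int = WINDOW_HOURS) -> float:
    """Average hourly message budget implied by a per-window limit."""
    return messages_per_window / window_hours


def wasted_minutes(task_minutes: int, failed_runs: int) -> int:
    """Time lost when runs cannot be steered and must be restarted from scratch."""
    return task_minutes * failed_runs


if __name__ == "__main__":
    low, high = (messages_per_hour(m) for m in PLUS_MESSAGES_PER_WINDOW)
    print(f"Plus budget: {low:.0f} to {high:.0f} messages per hour on average")
    print(f"Two failed 20-minute runs: {wasted_minutes(20, 2)} minutes lost")
```

The fivefold spread between the low and high ends is what makes capacity planning on Plus difficult.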

Strengths

Benchmark leader: GPT-5.3-Codex sets industry highs on SWE-Bench Pro and Terminal-Bench for complex, multi-step engineering tasks.
Cloud sandbox architecture: Each task runs in its own isolated container. Parallel task execution lets you fire off multiple bugs or features simultaneously.
Open-source CLI: Built in Rust, runs locally, works with your files directly. Different trust model from the cloud version.
End-to-end task completion: GPT-5.3-Codex can operate a computer, deploy software, and complete multi-step professional workflows beyond pure coding.

Considerations

Locked to ChatGPT subscriptions: No standalone Codex plan. Need ChatGPT Plus ($20/mo) for basic access or Pro ($200/mo) for serious usage.
Fire-and-forget cloud agent: Can't course-correct while it's working. Wrong approach on a 20-minute task means waiting and starting over.
No image input: The cloud agent can't process screenshots or mockups. Meaningful gap for frontend-heavy work.
macOS only for desktop app: Windows support planned but unscheduled. CLI works cross-platform.

Resources

Repositories

Codex CLI GitHub Repository (github.com)

Open-source Rust-based terminal agent with installation instructions

Documentation

Codex Pricing (developers.openai.com)

Official pricing page covering plan requirements and rate limits

Articles

Introducing Codex (openai.com)

Launch post explaining cloud agent architecture, sandbox model, and workflows

Introducing GPT-5.3-Codex (openai.com)

Announcement of the computer-use capable model with benchmark results
