OpenAI GPT-5.5 + Codex Launch Day — 400K Context, Fast Mode, 2× API Pricing, and NVIDIA GB200 Under the Hood

GPT-5.5 + Codex Launch Day — What OpenAI Actually Announced

Today (Apr 23, 2026) OpenAI officially announced GPT-5.5, shipping it simultaneously across ChatGPT and Codex, positioning it as “a new class of intelligence for real work” — with emphasis on multi-step tasks, planning, tool calling, and self-verification.

This isn’t a hands-on review. The API isn’t open to general developers yet (OpenAI only said “coming very soon”). What we can do right now is analyze the specs, pricing, and market positioning from the announced data. If you want a hands-on take, check back once we’ve had real time with it through the API.

TL;DR

Who can use it now: Plus / Pro / Business / Enterprise via ChatGPT UI and Codex — API not yet open for general developers
What’s new in Codex: 400K token context window, Fast mode 1.5× faster (2.5× the cost), and expanded browser use that clicks through real web apps, takes screenshots, and iterates based on what it sees
API pricing at launch: $5/M input, $0.50/M cached input, $30/M output — roughly 2× GPT-5.4 — OpenAI says improved token efficiency will partially offset the difference
Hardware side: Runs on NVIDIA GB200 NVL72 rack-scale systems — 35× lower cost per million tokens, 50× higher output per second per megawatt vs. the previous generation

Where OpenAI Is Positioning GPT-5.5

GPT-5.5 isn’t a clean-sweep replacement for GPT-5.4. It’s positioned as the flagship for serious work — long continuous coding, agents that need multiple tools, and workflows that require self-directed planning.

On the ChatGPT side, Plus ($20/month) gets access immediately. GPT-5.5 Pro is limited to Pro / Business / Enterprise only. For Codex, coverage also extends to Go and Edu tiers.

Worth noting: OpenAI didn’t release precise benchmark numbers (SWE-bench, Terminal-Bench, MMLU) in the day-one announcement — leaning instead on qualitative use cases like agentic coding, multi-step tasks, and self-verification. If you need hard numbers for comparison, you’ll have to wait for official results.

GPT-5.5 vs GPT-5.4 — Announced Specs Compared

Factor	GPT-5.4	GPT-5.5
Context (Codex)	unchanged	400K tokens
Fast mode	none	1.5× faster (2.5× the cost)
Browser use	limited	expanded — click, screenshot, iterate
API input	baseline	$5 / M tokens (2× GPT-5.4)
API cached input	baseline	$0.50 / M tokens
API output	baseline	$30 / M tokens (2× GPT-5.4)
API availability	available	coming very soon

Read the pricing row straight: 2× more expensive, not 30% or 4×. OpenAI confirms the new model uses fewer tokens for the same work, so the real-money delta may be smaller — but that still needs to be proven in production, not taken on faith from a launch doc.

The cached input at $0.50/M is worth flagging for anyone running agents with long, repeated system prompts — you can save significantly if you design your prompts to be cache-friendly.

4 New Things in Codex Worth Knowing

400K context window — large enough to load a mid-size repo in one shot. Compare to Claude Sonnet at roughly 200K and Gemini Pro at 2M via Vertex AI (check each provider’s access terms). GPT-5.5 isn’t the biggest, but it’s sufficient for most real-world codebases.

Fast mode — generates tokens 1.5× faster at 2.5× the cost. Right for interactive sessions where someone is waiting on a response, wrong for overnight batch jobs.

Expanded browser use — Codex can now click through real web apps, take screenshots, and iterate based on what it sees. This is a meaningful step for agents that need to test UI or work with services that have no API.

Positioned as an agent that “does real work” — OpenAI’s emphasis is on multi-step tasks, self-verification, and tool use. No official benchmark numbers on launch day.

vs. Claude Sonnet / Gemini Pro — Qualitative Comparison

Factor	GPT-5.5 (Codex)	Claude Sonnet	Gemini Pro (Vertex)
Context	400K	~200K	2M (via Vertex)
Agent / Tool use	agent-first, browser use built-in	strong across chat + tool use	workflow-level only
API availability	coming very soon	available	available via Vertex / AI Studio
Pricing	$5 / $30 per M	varies by tier	varies by tier

This table is intentionally qualitative — an apples-to-apples comparison on launch day is risky before official benchmarks drop, and competitor pricing shifts by tier and region. If you’re building production workloads, run your own prompts across all three after the GPT-5.5 API opens.

Pros / Cons — From the Announced Data

Pros

+400K context in Codex — fits a mid-size repo in a single pass, no chunking
+Fast mode 1.5× throughput, right for interactive coding sessions
+Expanded browser use — clicks real UI and takes screenshots
+OpenAI claims improved token efficiency partially offsets the price jump
+Cached input at $0.50/M — 10× cheaper than standard input for repeated prompts

Cons

−API pricing 2× GPT-5.4 — both input and output
−API not yet open to general developers ('coming very soon')
−Requires Plus/Pro or higher ($20/month+) to use via UI right now
−No official benchmark numbers released on launch day
−GPT-5.5 Pro limited to Pro / Business / Enterprise only

Cost Math You Should Do First

Three questions to answer before moving any workload to GPT-5.5 API.

One — does your workflow actually benefit from cached input? If your system prompt is long and reused across every call, $0.50/M could save more than you’d expect. If your prompt changes every request, you won’t see that discount at all.

Two — is Fast mode worth 2.5× the cost? Only if someone is waiting on the response in real time. For overnight batch jobs, standard mode is fine.

Three — when are you ready to migrate? The API isn’t open yet, and GPT-5.4 is still fully functional. If you don’t have a use case that genuinely needs 400K context, wait for the API launch before making any decisions.

Who Should Use It / Who Can Wait

✓

Made for

Plus / Pro subscribers who use ChatGPT / Codex daily — you get it at no extra cost within your existing subscription
Teams building agents that click through web apps or test UIs — the expanded browser use is exactly what you need
Mid-to-large codebases you want Codex to read whole without chunking

Think twice

Teams currently dependent on the API — wait for the API launch announcement before planning a migration
Startups watching their API budget closely — the 2× price jump is real; test token efficiency against your own workloads before committing

Skip this one

Solo devs / hobby projects — GPT-5.4 or open-weight models are still capable and meaningfully cheaper
Anyone who needs official benchmark numbers before spending — wait a month for third-party results
Workloads that don't involve agent / tool use — that's GPT-5.5's primary value prop; without it, you're overpaying

Verdict

GPT-5.5 looks like an agent capability upgrade more than a general model improvement — 400K context, Fast mode, expanded browser use. All of it targets people actually running Codex or writing serious agents. Chat quality isn’t the story here.

On pricing, 2× is a steeper jump than the market is used to. But if the token efficiency gains OpenAI is claiming are real, the final bill may not look that different. You’ll have to measure it yourself.

The NVIDIA GB200 NVL72 hardware detail is worth paying attention to if you care about infrastructure — 35× lower cost per token and 50× higher throughput per megawatt is the reason OpenAI can ship a larger model at a price that’s still in the conversation.

For devs already on Plus — you can open GPT-5.5 in ChatGPT / Codex today. For teams planning a production API migration, wait for OpenAI to announce the API date, then test it against your actual workloads before flipping the switch.

This article is written from specs and launch-day announcements only. We’ll update it once the API is open and we’ve had enough time with it on a real codebase.