Codex Pricing and Usage Limits: Token Rate Card Explained

OpenAI has updated the Codex rate card. Most Plus, Pro, Business, Enterprise, Edu, Health, Gov, and ChatGPT for Teachers users should now refer to the new token-based usage rules.

The key change is simple: Codex usage is no longer estimated mainly by how many messages you send. Instead, the rate card meters input tokens, cached input tokens, and output tokens separately. Longer tasks, longer outputs, and stronger models usually consume more credits.

Quick Answer

Codex usage depends on the model, input tokens, cached input, output length, and the plan or workspace rate card. Check the current rate card shown by OpenAI or your organization before budgeting, because model availability and credit rates can change. Cached context and shorter outputs generally consume fewer credits than repeatedly sending large uncached prompts.

Which Plans Use the New Rate Card

The new token-based rate card applies to:

New and existing ChatGPT Plus and Pro users;
New and existing ChatGPT Business users;
New and existing Enterprise, Edu, Gov, Health, and ChatGPT for Teachers users.

OpenAI says that on April 2, 2026, Codex pricing began moving from per-message billing to billing based on API token usage for Plus, Pro, ChatGPT Business, and new ChatGPT Enterprise plans. On April 23, 2026, the update expanded to existing ChatGPT Enterprise plans, including Edu, Health, Gov, and ChatGPT for Teachers.

A small number of Enterprise customers may still temporarily use the legacy rate card. If your organization has not migrated yet, follow the information shown in your workspace or confirmed by OpenAI sales.

New Codex Token Rate Card

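The new rate card prices usage in credits per 1 million tokens and separates input, cached input, and output. For every model with a published rate, output is the most expensive category and cached input is the cheapest. For the GPT-5.x text models in the table, output costs roughly six to eight times as much as input, and cached input costs one tenth as much.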

Model	Input (per 1M tokens)	Cached Input (per 1M tokens)	Output (per 1M tokens)
GPT-5.5	125 credits	12.50 credits	750 credits
GPT-5.4	62.50 credits	6.250 credits	375 credits
GPT-5.4-Mini	18.75 credits	1.875 credits	113 credits
GPT-5.3-Codex	43.75 credits	4.375 credits	350 credits
GPT-5.2	43.75 credits	4.375 credits	350 credits
GPT-5.3-Codex-Spark	Research preview	Research preview	Research preview
GPT-Image-2.0 (image)	200 credits	50 credits	750 credits
GPT-Image-2.0 (text)	125 credits	31.25 credits	250 credits

The “per 1 million tokens” part matters. The final cost of a Codex task depends not only on how many sentences you typed, but also on how much context it read, how much cached input it reused, how long the output was, and whether a faster mode was enabled.
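As a rough illustration of the per-million arithmetic, the sketch below multiplies token counts by the published rates for the GPT-5.x text models. The function name, the sample token counts, and the assumption that cached tokens are counted separately from fresh input tokens are all illustrative. Fast mode is not modeled because the table gives no rate for it.

```python
# Credits per 1 million tokens, copied from the rate card table above.
# Tuple order: (input, cached input, output).
RATES = {
    "GPT-5.5": (125.0, 12.5, 750.0),
    "GPT-5.4": (62.5, 6.25, 375.0),
    "GPT-5.4-Mini": (18.75, 1.875, 113.0),
    "GPT-5.3-Codex": (43.75, 4.375, 350.0),
    "GPT-5.2": (43.75, 4.375, 350.0),
}


def estimate_credits(model, input_tokens, cached_tokens, output_tokens):
    """Estimate credits for one task.

    Assumption: input_tokens counts only uncached input, and
    cached_tokens counts the input served from cache.
    """
    rate_in, rate_cached, rate_out = RATES[model]
    total = (
        input_tokens * rate_in
        + cached_tokens * rate_cached
        + output_tokens * rate_out
    )
    return total / 1_000_000  # rates are quoted per 1 million tokens


# Hypothetical task: 200k fresh input, 800k cached input, 50k output.
for model in RATES:
    credits = estimate_credits(model, 200_000, 800_000, 50_000)
    print(f"{model}: {credits:.2f} credits")
```

In this made-up task, the 50k output tokens cost more than the 800k cached input tokens on every model, which is why output length is worth watching.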

OpenAI also notes that code review uses GPT-5.3-Codex by default. GPT-5.3-Codex-Spark may appear in Codex as a research preview, and its rates have not been finalized.

Why Token-Based Billing Matters

The legacy rules used average estimates such as “one message” or “one Pull Request.” That was useful for rough budgeting, but it did not explain why different tasks could cost very different amounts. The new rules connect usage more directly to real model activity.

For example, requests that each count as a single message can differ widely in token usage:

Asking it to explain a small function keeps both input and output short, so usage is relatively low;
Asking it to read multiple files, run tools, and generate a large patch increases both input and output;
Repeatedly adding requirements during a long task makes the context keep growing;
Generating lots of code, reports, or review comments can make output tokens the main cost.

So controlling Codex cost is not just about sending fewer messages. It is more about reducing irrelevant context, splitting tasks, controlling output size, and making each request clear.
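To see why growing context matters, here is a minimal sketch of a multi-turn session in which every turn re-reads everything that came before. It uses the GPT-5.4 rates from the rate card. The session model is an assumption made for illustration: it treats earlier context as either fully cached or fully uncached, and the per-turn token counts are invented. Real sessions will behave less neatly.

```python
# GPT-5.4 rates from the rate card, in credits per 1 million tokens.
RATE_INPUT, RATE_CACHED, RATE_OUTPUT = 62.5, 6.25, 375.0


def session_credits(turns, new_tokens_per_turn, output_per_turn, cache_hits=True):
    """Estimate credits for a session where each turn re-reads prior context.

    Simplifying assumption: prior context is billed entirely at the cached
    rate (cache_hits=True) or entirely at the full input rate (False).
    """
    context = 0  # tokens accumulated from earlier turns
    credits = 0.0
    prior_rate = RATE_CACHED if cache_hits else RATE_INPUT
    for _ in range(turns):
        credits += context * prior_rate / 1_000_000
        credits += new_tokens_per_turn * RATE_INPUT / 1_000_000
        credits += output_per_turn * RATE_OUTPUT / 1_000_000
        # New input and the model's reply both become context for the next turn.
        context += new_tokens_per_turn + output_per_turn
    return credits


one_long = session_credits(turns=20, new_tokens_per_turn=5_000, output_per_turn=2_000)
four_short = 4 * session_credits(turns=5, new_tokens_per_turn=5_000, output_per_turn=2_000)
no_cache = session_credits(20, 5_000, 2_000, cache_hits=False)

print(f"one 20-turn session, cached context:   {one_long:.2f} credits")
print(f"four 5-turn sessions, cached context:  {four_short:.2f} credits")
print(f"one 20-turn session, uncached context: {no_cache:.2f} credits")
```

Under these assumptions the same 20 turns cost less when split into four shorter sessions, and far more when the accumulated context is not cached. The sketch ignores the cost of re-establishing context at the start of each new session, so it overstates the benefit of splitting when tasks share a lot of background.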

The Legacy Rate Card Still Exists

OpenAI still keeps the legacy rate card mainly for a small number of Enterprise customers that have not migrated yet. The old table estimates average usage by message or Pull Request.

Billing Unit	GPT-5.5	GPT-5.4	GPT-5.3-Codex	GPT-5.1-Codex-mini
1 local task message	About 14 credits	About 7 credits	About 5 credits	About 2 credits
1 cloud task message	Not available	About 34 credits	About 25 credits	Not available
1 code review Pull Request	Not available	About 34 credits	About 25 credits	Not available

OpenAI says the legacy average credits also apply to older GPT-5.2, GPT-5.2-Codex, GPT-5.1, GPT-5.1-Codex-Max, GPT-5, GPT-5-Codex, and GPT-5-Codex-Mini models.

If your account has already migrated to the new rules, budget with the token rate card rather than the old per-message averages.
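For workspaces still on the legacy card, a rough estimate is just counts multiplied by the approximate per-unit credits in the legacy table. The sketch below is one way to do that. The dictionary keys, function name, and sample counts are illustrative, and the rates are the "about" figures from the table, so the result is only an approximation.

```python
# Approximate credits per unit, from the legacy table above.
# None marks combinations the table lists as "Not available".
LEGACY_RATES = {
    "local_message": {
        "GPT-5.5": 14, "GPT-5.4": 7, "GPT-5.3-Codex": 5, "GPT-5.1-Codex-mini": 2,
    },
    "cloud_message": {
        "GPT-5.5": None, "GPT-5.4": 34, "GPT-5.3-Codex": 25, "GPT-5.1-Codex-mini": None,
    },
    "code_review_pr": {
        "GPT-5.5": None, "GPT-5.4": 34, "GPT-5.3-Codex": 25, "GPT-5.1-Codex-mini": None,
    },
}


def legacy_estimate(model, counts):
    """Sum approximate legacy credits for a dict of {billing unit: count}."""
    total = 0
    for unit, count in counts.items():
        rate = LEGACY_RATES[unit][model]
        if rate is None:
            raise ValueError(f"{unit} is not available for {model} on the legacy card")
        total += count * rate
    return total


# Hypothetical month: 100 local messages, 10 cloud messages, 4 reviewed PRs.
usage = {"local_message": 100, "cloud_message": 10, "code_review_pr": 4}
print(legacy_estimate("GPT-5.3-Codex", usage))
```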

How This Affects Real Usage

The impact depends on your workload. Tasks with long output, large context, many automation steps, or frequent fast-mode use may consume noticeably more than the legacy per-message averages suggested. Lightweight edits, short Q&A, and small code explanations may consume less.

OpenAI’s reference estimate is that the average Codex cost per developer is about 100 to 200 USD per month, but real usage varies widely. Model choice, parallel instances, automation features, and fast mode can all change the final amount.

Where to Check Remaining Usage

Users can check usage limits and remaining credits in the Usage panel inside Codex settings. Depending on the plan and workspace role, some users can also buy credits directly or manage auto-recharge.

If you are in a team or enterprise workspace and cannot add credits yourself, you usually need to contact the workspace owner or administrator.

Practical Tips

For everyday Codex use, these habits can help control usage:

Set one clear goal for each task;
Split large tasks by file, module, or feature;
Avoid repeatedly carrying irrelevant logs, old outputs, and large context blocks;
When you need long output, specify the scope and format first;
For frequent reviews or long-running automation, pay attention to output tokens and fast mode;
In team environments, confirm whether your workspace has migrated to the new token-based rate card.

The core shift in the new rate card is that Codex usage is now closer to real token cost. When a task costs more than expected, do not look only at message count. Look at input, cached input, and output together.

Source: Codex rate card