Reading the Official Codex Article: How to Get the Most Out of Codex

Most developers start using Codex with code tasks: reading a repository, editing a diff, running tests, and opening a pull request.

That is still Codex’s core use case. But a lot of work on a computer is already surrounded by code and tools: running shell commands, browsing web pages, calling APIs, exporting documents, responding to messages, and triggering automation. As these capabilities gradually connect to Codex, it becomes less a narrow coding assistant and more a system that helps you complete work on a computer.

The Codex app makes this shift more concrete. A thread can preserve context, call tools, display artifacts, and keep moving across multiple rounds of prompts instead of restarting every conversation from scratch.

To use Codex more fully, the key is combining these capabilities:

Durable threads for preserving long-term context
Voice input, steering, and queuing, so the user still controls the process
Browser, computer use, MCP servers, and connectors, so Codex can move beyond the repository
Thread automations and Goals, so tasks can keep progressing after the user leaves
The sidebar for reviewing code, documents, slides, web pages, and other artifacts
Shared memory, which writes important context outside the thread

Durable threads

Durable threads are long-running threads that can preserve work context across multiple sessions.

Pinned threads are a very practical entry point. They are a good place for workflows you return to repeatedly, such as:

A Chief of Staff thread
A release thread
A document review thread
An external monitoring thread

These are not temporary chats, but persistent workspaces. Codex can return to the same thread later and reuse prior decisions, preferences, and background information, avoiding the need to rebuild context from zero every time.

Keyboard shortcuts also make this smoother. Command-1 through Command-9 can jump directly to saved threads.

Voice input

The value of voice input is that it captures ideas before they have been organized into formal text.

Codex has built-in voice input. It is especially useful for fuzzy starting points that feel natural to say but awkward to type:

I remember someone named Ben mentioning this in Slack.
I don't remember the details.
Go find it for me.

For an agent that can search, organize context, and report back, this is often enough to get started.

Voice is also useful for two- or three-minute thought dumps. Meeting transcripts, dictated planning notes, and unorganized raw records are often more useful than a one-sentence summary, because they preserve uncertainty, emphasis, and unfinished lines of thought.

Steering and queuing

Voice becomes more useful when combined with explicit control.

Steering means inserting a new direction while a Codex task is running, so it can change course before the current step finishes.

For example, while reviewing a web page, the user can annotate in the sidebar and interrupt the current task at the same time:

Make this a bit smaller.
The spacing between these two elements is off.
This copy is wrong.

Queuing is different. It does not interrupt the current task; it places the next piece of work in the queue:

When this work is done, send the preview link to the reviewer in Slack.

Steering changes what Codex is doing right now. Queuing changes what it should do next. Both keep the user close to the work as the task unfolds.
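
The distinction can be modeled in a few lines. The sketch below is a toy model, not Codex's implementation: the `Thread` class and its `steer`, `queue`, and `finish_current` methods are hypothetical names chosen only to illustrate the two kinds of control.

```python
from collections import deque
from typing import Optional


class Thread:
    """Toy model of a thread that accepts steering and queued work.

    Hypothetical illustration only; this is not a Codex API.
    """

    def __init__(self, task: str) -> None:
        self.current = task      # what is being worked on right now
        self.notes = []          # steering notes applied to the current task
        self.pending = deque()   # work to start after the current task

    def steer(self, note: str) -> None:
        # Steering changes the task that is already in progress.
        self.notes.append(note)

    def queue(self, task: str) -> None:
        # Queuing leaves the current task alone and adds follow-up work.
        self.pending.append(task)

    def finish_current(self) -> Optional[str]:
        # Complete the current task and move to the next queued one, if any.
        self.current = self.pending.popleft() if self.pending else None
        self.notes = []
        return self.current


thread = Thread("Review the landing page")
thread.steer("Make this a bit smaller.")
thread.queue("Send the preview link to the reviewer in Slack.")
```

In this model the steering note attaches to the task in progress, while the queued item only becomes current once `finish_current()` is called.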

Tools and reachable scope

Once threads have continuity, the next question is: what can they operate?

Codex can expand outward layer by layer:

$browser: good for web page inspection, annotation, and review in the sidebar
@chrome: good for browser workflows that depend on the user’s Chrome login state
@computer: good for tasks that can only be completed through a desktop GUI

MCP servers and connectors extend the same idea into more workflows. Slack, Gmail, and Calendar matter because many tasks do not first appear as code. They appear as messages, emails, and calendar problems.

Skills are good for solidifying repeated work. Once a process has proven useful, it can be packaged as a skill so Codex does not need to relearn the same steps next time.

Continue working from anywhere

The Codex mobile app changes how long the user has to stay in front of a computer.

A task can start on a Mac because the files, permissions, and local environment are there. Later, the user can leave the desktop and continue confirming, adding details, or changing direction from a phone.

This is valuable in many small scenarios: while Codex runs a long task, the user can leave their desk; if it needs confirmation, they can respond while away; if the direction is wrong, they can redirect it in time. What truly stays in place is the local environment, not the user.

Automation

Automations can run Codex work on a schedule.

If a recurring task should restart from a specific workspace, such as a daily report or routine repository check, scheduled automation is a good fit. If the schedule should return to an existing conversation and reuse its context, thread automation is better.

Thread automations are more like heartbeat wake-ups: they return to the same Codex thread on a fixed rhythm.

Pinned threads require the user to come back actively, while thread automation can check every few minutes or every few hours, keep running until a condition is satisfied, and adjust its rhythm over time.
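
As a rough mental model only, a heartbeat that keeps checking until a condition holds and slows down when nothing has changed could look like the sketch below. The names `heartbeat` and `check`, the interval parameters, and the backoff policy are all assumptions for illustration; nothing here reflects how Codex schedules thread automations. The sleep function is injectable so the example runs instantly.

```python
import time
from typing import Callable


def heartbeat(
    check: Callable[[], bool],
    interval: float = 60.0,
    max_interval: float = 3600.0,
    backoff: float = 2.0,
    max_beats: int = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call check() on a rhythm until it returns True.

    Returns the number of beats used, or -1 if max_beats ran out.
    The wait grows by `backoff` after each unsuccessful beat,
    capped at max_interval.
    """
    for beat in range(1, max_beats + 1):
        if check():
            return beat
        sleep(interval)
        interval = min(interval * backoff, max_interval)
    return -1


# Usage: a fake condition that becomes true on the third check.
results = iter([False, False, True])
waits = []  # records the requested waits instead of actually sleeping
beats = heartbeat(lambda: next(results), interval=1800, sleep=waits.append)
```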

For example, a Chief of Staff thread could run every 30 minutes:

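Every 30 minutes, check Slack and Gmail for messages that need my attention but that I have not replied to yet.
Help me judge which ones matter most.
If someone has asked me a question, research the answer as deeply as you can and draft a reply for me, but do not send it.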

When the user returns, the most time-consuming context gathering is often already done. The actual decision about whether to send still belongs to a human.

Thread automations are also useful for feedback loops. They can periodically check pull request comments, Google Docs comments, or Slack replies, continuing adjacent work while the user is away.

For example, in an animation workflow, a reviewer sends video feedback in Slack, and a thread automation checks the thread on a schedule. If there is a new comment, it re-renders the version and replies to the reviewer in the same Slack thread. If an integration cannot complete the final upload, desktop automation can still fill in the last step through the GUI.

This loop may cross Slack, the codebase, and desktop applications, but to the user it still stays inside one workflow.

Goals

Goals are best suited to tasks that have a clear endpoint and can be pushed forward continuously by an agent.

A weaker goal might be:

Implement the plan in this Markdown file.

A stronger goal has measurable completion criteria.

For example, when migrating an internal tool from Python to Rust, you can first create the new directory and then define the target clearly: the new implementation is only complete once the unit tests pass.

A Goal is essentially continuous execution plus a verifier. The user needs to define the outcome, the stopping condition, and the signals that indicate whether Codex is getting closer to the goal.

Common verifiers include:

Test suites
Benchmarks
Bug reproductions
Validation matrices
End-to-end workflows that must keep passing

A task can be ambitious, but without verification criteria, it is more like a wish than a goal.
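
The "continuous execution plus a verifier" framing can be sketched as a simple loop. This is an illustrative model, not how Codex implements Goals: `pursue_goal`, `step`, `verify`, and `max_steps` are hypothetical names, and in real use the verifier would be something like a test suite run rather than a dictionary lookup.

```python
from typing import Callable


def pursue_goal(
    step: Callable[[], None],
    verify: Callable[[], bool],
    max_steps: int = 20,
) -> bool:
    """Repeat step() until verify() passes or the step budget runs out."""
    for _ in range(max_steps):
        if verify():
            return True   # stopping condition met
        step()            # do one more unit of work
    return verify()       # final check after the last step


# Usage: a toy "migration" that is done when every test passes.
tests_passing = {"parse": False, "render": False, "export": False}


def fix_one_test() -> None:
    # Stand-in for real work: make the first failing test pass.
    for name, ok in tests_passing.items():
        if not ok:
            tests_passing[name] = True
            return


done = pursue_goal(fix_one_test, lambda: all(tests_passing.values()))
```

Without the `verify` callable the loop has no way to know when to stop, which is the point of the distinction between a wish and a goal.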

Sidebar

The sidebar places work artifacts next to the conversation that generated them. The user does not need to export files, switch context, and then describe the problem afterward. The artifact might be code, but it could also be a deck, PDF, web page, spreadsheet, or anything else generated during the work.

It is especially useful for four types of work:

Inspecting an artifact
Annotating places that need changes
Operating a web interface
Reviewing changes

Markdown, spreadsheets, data tables, documents, and slides can all be viewed directly in the sidebar. The user can inspect, annotate, and revise them without turning the process into another handoff.

If it is a deck or PDF, it can stay beside the thread that produced it and accept review and fixes at any time.

The browser is a similar work surface. Codex can open a rendered page, inspect it, respond to user annotations on the page, and continue fixing the same object. A web page is both the output and the control surface.

These surfaces are especially good fits for the sidebar:

Lightweight static artifacts such as index.html
Storybook
Remotion Studio
Browser slides
Data analysis applications

A standalone index.html file can become a long-lived interactive artifact without necessarily requiring a server. Thread automations can also refresh static artifacts on a schedule, so the user sees new results when they return.
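
As a minimal sketch of such an artifact, the script below renders a self-contained index.html from a small data set using only the Python standard library. The function name `render_report`, the sample rows, and the page layout are assumptions made for illustration; a scheduled run could regenerate the file with fresh data.

```python
import html
from datetime import datetime, timezone


def render_report(title: str, rows: list[tuple[str, str]]) -> str:
    """Return a self-contained HTML page with a two-column status table."""
    body = "\n".join(
        f"<tr><td>{html.escape(name)}</td><td>{html.escape(status)}</td></tr>"
        for name, status in rows
    )
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
    return (
        "<!doctype html>\n"
        f'<html><head><meta charset="utf-8"><title>{html.escape(title)}</title></head>\n'
        f"<body><h1>{html.escape(title)}</h1>\n"
        f"<table>\n{body}\n</table>\n"
        f"<p>Generated {stamp}</p></body></html>\n"
    )


# Sample data is made up for the example.
page = render_report(
    "Release status",
    [("Build", "passing"), ("Docs", "needs <review>")],
)
# To publish the artifact, write it to disk, for example:
# pathlib.Path("index.html").write_text(page, encoding="utf-8")
```

Because the page has no external assets or server-side logic, it can be opened directly from disk.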

Shared memory

Long threads are useful, but important context should not exist only in the conversation history.

Shared memory means storing durable context outside the thread, so future work can continue from an explicit, reviewable place.

One stable practice is to anchor durable threads in an Obsidian vault. In practice, this can be very simple: a set of ordinary files that are easy to inspect, edit, move, and preserve long term. Teams can put it in cloud storage, Git, Dropbox, Google Drive, or another sync layer.

A vault might look like this:

vault/
├── TODO.md
├── people/
├── projects/
├── agent/
└── notes/

A top-level AGENTS.md can explain how Codex should maintain this workspace: what information should be written down, where it should go, and when not to create noise.

A practical AGENTS.md might look like this:

- Treat ~/vault as durable work memory.
- Prefer canonical notes over note sprawl.
- Route TODOs, people, projects, daily summaries, and scratch notes explicitly.
- Preserve decisions, blockers, owners, dates, and useful links.
- If nothing meaningful changed, do not churn the vault.
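
The last rule, avoiding churn, is easy to enforce in any helper script that writes to the vault. The sketch below is one possible approach, assuming plain UTF-8 Markdown notes; `write_note_if_changed` is a hypothetical helper and not part of Codex, and the note path and content are made up.

```python
import tempfile
from pathlib import Path


def write_note_if_changed(path: Path, content: str) -> bool:
    """Write content to path only when it differs from what is on disk.

    Returns True if the file was written, False if it was left untouched.
    """
    if path.exists() and path.read_text(encoding="utf-8") == content:
        return False  # nothing meaningful changed, so do not churn the vault
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content, encoding="utf-8")
    return True


# Usage against a throwaway directory standing in for ~/vault.
vault = Path(tempfile.mkdtemp()) / "vault"
note = vault / "projects" / "release.md"
first = write_note_if_changed(note, "Blocker: none\n")
second = write_note_if_changed(note, "Blocker: none\n")
```

The first call creates the note; the second finds identical content and leaves the file, and its modification time, alone.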

Do not copy a particular vault structure blindly. What matters more is teaching the agent where long-term context should live, which information is worth preserving, and when it should avoid repeatedly changing files.

The repository stores code. The vault stores rolling context: relevant people, what happened, where things are blocked, who owns what, what comes next, and the details that would otherwise disappear between conversations if they were not written down.

Codex also has first-party memory capabilities, configurable in Settings > Personalization > Memories. They are good for recording preferences, repeated workflows, and common pain points, but they work better as a supplement to explicit written context than as a replacement. Chronicle is moving in the same direction as well: helping Codex build memory from recent screen context.

Expanding outward from code

Codex still starts with code. But more of the work around code can now be reached by the same system: MCP servers, browser interfaces, desktop control, thread automations, and reviewable artifacts.

This changes how Codex is controlled. Steering interrupts work in progress. Queuing schedules the next step. Thread automations keep a thread active after the user leaves. Goals add clear endpoints and verification signals to long-running tasks.

When these capabilities connect, Codex can move a workflow from instruction to execution and then onward to artifact review. Even after a task has left the code repository, it can still be completed inside the same system.

Original link: Getting the most out of Codex