Jason published a long post on X about Codex’s computer-use capabilities. The practical question is simple: Codex can now “use a computer” through Computer Use, the Chrome extension, and the in-app browser, but their boundaries are easy to blur.
In short:
- Prefer a plugin or MCP when a structured tool can solve the task.
- Use @Computer when the task requires a desktop app, system settings, or a GUI with no API.
- Use @Chrome when the task depends on your signed-in Chrome state, account, cookies, or multiple tabs.
- Use @Browser when you are building a website, debugging a local page, checking responsive layout, or leaving visual feedback.
Visual control is most useful at the point where structured tools stop. A Slack plugin is usually more precise than having Codex click around Slack. A GitHub plugin also produces actions that are easier to inspect than driving the website. Let Codex look at the screen and click only when an API, plugin, or MCP cannot cover the last step.
Original post: https://x.com/jxnlco/status/2066970432855581052
1. Use @Computer for desktop and native apps
Computer Use is the broadest of the three surfaces. It lets Codex observe graphical interfaces in approved macOS or Windows apps and operate them through windows, menus, keyboard input, and the clipboard.
It is usually also the slowest. A structured plugin can call an API directly. Computer Use has to inspect the UI, decide where to click, wait for the app to respond, and check the next state. That visual loop costs time, but it lets Codex work with software that exposes no useful API.
On macOS, “slow” does not necessarily mean disruptive. Computer Use can work in approved apps in the background while you keep using the rest of your computer. What it can handle depends on what you have installed and approved: Spotify, Xcode, System Settings, an iOS simulator, or even iPhone Mirroring.
Use @Computer for tasks that involve:
- native desktop apps such as Spotify, finance apps, or design tools;
- an iOS simulator, iPhone Mirroring, or another GUI-only flow;
- system settings or app settings;
- a data source with no plugin, MCP, or API;
- a workflow that moves across several local apps;
- the final UI action missing from an otherwise structured integration.
How to install it:
- Open Settings in Codex.
- Go to Computer Use.
- Click Install and follow the authorization prompts.
How to trigger it: mention @Computer in your prompt and describe the task.
One example from the original post captures the boundary well: Jason once used Codex to handle an Amazon support wait after a stolen package. Amazon said it would take about 25 minutes to connect to an agent. He asked Codex to check the chat every five minutes, switch to every minute once an agent appeared, and do its best to complete the refund. When he came back, the refund was done.
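The polling cadence in that request can be written down as a small rule. This is a minimal sketch in Python, not a Codex API: the function names are hypothetical, and `agent_present` stands in for whatever Codex observes in the chat window.

```python
from datetime import timedelta

# Intervals taken from the example: every five minutes while waiting,
# every minute once an agent has joined the chat.
WAITING_INTERVAL = timedelta(minutes=5)
ACTIVE_INTERVAL = timedelta(minutes=1)


def next_check_delay(agent_present: bool) -> timedelta:
    """Return how long to wait before looking at the chat again.

    `agent_present` is a hypothetical flag; in the story it would come
    from Codex reading the support chat window.
    """
    return ACTIVE_INTERVAL if agent_present else WAITING_INTERVAL


def checks_to_cover(wait_minutes: int) -> int:
    """Count the five-minute checks needed to cover an estimated wait."""
    interval = int(WAITING_INTERVAL.total_seconds() // 60)
    # Ceiling division, so a partial interval still gets a final check.
    return -(-wait_minutes // interval)
```

For the estimated 25-minute wait, `checks_to_cover(25)` gives five checks before the cadence would tighten to one per minute.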
Another common pattern is the “last mile.” Codex might read feedback from Slack, change code, and render a video, but the Slack integration available in that thread might not be able to upload the file. Computer Use only has to click Add file and finish the one action the structured tool cannot do.
The safety boundary is wider here. Give Computer Use one clear app or flow at a time. For financial, account, payment, credential, privacy, and system-security actions, stay present and review permission prompts.
2. Use @Chrome for signed-in state, multiple tabs, and account workflows
The Codex Chrome extension lets Codex use your existing signed-in Chrome state. If a task depends on your account, cookies, browser profile, existing tabs, or extensions, @Chrome is usually the right surface.
Use @Chrome for:
- Gmail, LinkedIn, or other signed-in websites;
- Salesforce, support consoles, or internal tools;
- company dashboards;
- research across several authenticated sites;
- forms that depend on your account, cookies, or browser extensions.
How to install it:
- Open Plugins in Codex.
- Add Chrome.
- Follow the prompts to install the Codex Chrome extension and approve permissions.
- When the extension says Connected, start a new thread.
How to trigger it: mention @Chrome in your prompt and describe the task.
Chrome tasks run in tab groups, which keeps the pages for one Codex thread together. Unlike the in-app browser, this surface carries your real browser identity, so it is both more capable and more sensitive.
Chrome’s other major advantage is multi-tab control. It can read context in one tab, compare it with another, and continue the workflow in a third. Computer Use can visually drive a browser too, but Chrome understands the task as a browser workflow instead of a sequence of screen coordinates.
The original post mentions a Strudel Composer example: Jason handed Codex an already-open music composition tab and asked it to make the music more interesting. Chrome gave Codex the selected tab and the page’s WebMCP tools. Codex inspected the composition, rewrote the harmony and four-minute form, changed the tempo, saved the track, and left it playing. It did not need to visually hunt for every button because Chrome could combine tab context with structured capabilities exposed by the page.
Long-running work also fits Chrome well. For example, you could ask Codex to check X DMs, relevant news, and feedback every day, add durable items to a local vault, and explicitly avoid posting or sending messages.
The important part is not that Codex can open X. It is that the same thread can return to the same signed-in work over time, connect what it finds to local files, and leave a reviewable result.
Chrome’s trust boundary also needs to be explicit. Websites may treat Codex’s clicks, submissions, and messages as actions taken by you. Page content can also be untrusted input. A safer pattern is to let it research, navigate, and draft automatically, while requiring your confirmation before sending, publishing, purchasing, or submitting.
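One way to picture that pattern is as a gate that sorts actions into "run automatically" and "ask first." The sketch below is illustrative only: the action labels and the `gate` function are assumptions made for this example, not identifiers from Codex or the Chrome extension.

```python
from enum import Enum


class Decision(Enum):
    AUTO = "auto"        # safe to run without asking
    CONFIRM = "confirm"  # pause and ask the human first


# Illustrative labels mirroring the paragraph above.
AUTOMATIC = {"research", "navigate", "draft"}
NEEDS_CONFIRMATION = {"send", "publish", "purchase", "submit"}


def gate(action: str) -> Decision:
    """Classify an action, failing closed for anything unrecognized."""
    name = action.strip().lower()
    if name in NEEDS_CONFIRMATION:
        return Decision.CONFIRM
    if name in AUTOMATIC:
        return Decision.AUTO
    # Unknown actions are treated like irreversible ones.
    return Decision.CONFIRM
```

Defaulting unknown actions to confirmation reflects the point that page content is untrusted: anything the rules did not anticipate should reach a human before it runs.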
If the entire task lives in the browser and needs signed-in state, choose Chrome before Computer Use.
3. Use @Browser to build and debug websites
The in-app browser is a browser inside a Codex thread. You and Codex share the same rendered page, which makes it especially useful for web development, visual debugging, and design feedback.
Use @Browser for:
- local development servers;
- local HTML or file previews;
- public pages that do not require sign-in;
- reproducing visual bugs;
- checking responsive layouts;
- leaving design feedback on page elements.
Its most important constraint is isolation: the in-app browser does not use your normal browser profile, cookies, extensions, signed-in sessions, or existing tabs. That is a limitation when a task needs an account, but a useful safety boundary when it does not.
How to install it:
- Open Plugins in Codex.
- Add the Browser plugin.
- Enable it.
How to trigger it: mention @Browser in your prompt and describe the task.
This surface is ideal for a tight feedback loop: Codex edits the code, operates the page, checks the rendered state, takes a screenshot, and keeps iterating.
The original post especially emphasizes annotation. When reviewing a local app, you can click an element or select an area and leave a comment. You can also give specific feedback on text, fonts, spacing, and color. Codex receives the comment with the relevant screenshot and element context, then changes the file and reopens the same page.
For design work, you can give Codex an idea, research packet, or project status, ask it to generate a single-file index.html, and open it in the in-app browser. Instead of describing the whole design again in another prompt, you can annotate the actual page.
The page itself becomes the specification.
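If the generated page should be previewed over HTTP rather than opened as a file, a throwaway local server is enough. The following is a sketch using only Python's standard library; `serve_directory` is a hypothetical helper, and how the in-app browser is pointed at the resulting URL is left as an assumption.

```python
import http.server
import threading
from functools import partial


def serve_directory(directory: str, port: int = 0) -> http.server.ThreadingHTTPServer:
    """Serve `directory` on 127.0.0.1 from a background thread.

    Port 0 asks the OS for any free port; read the chosen port from
    `server.server_address` on the returned server.
    """
    handler = partial(http.server.SimpleHTTPRequestHandler, directory=directory)
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    # Daemon thread: the server stops when the main program exits.
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Calling `serve_directory(".")` in the folder that holds index.html and reading `server.server_address` yields a host and port for a local preview URL. Call `server.shutdown()` when the review is done.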
The in-app browser is also a good starting point for mixed workflows. The original post gives an X example: first open the post in the in-app browser so Codex knows which post you mean; then switch to a Twitter CLI to fetch 38 replies, including nested replies hidden in the browser UI. That is the “narrowest surface” rule in practice: use the browser to establish on-screen context, then use a structured tool for deeper retrieval.
But if the task gets stuck on Google login, a passkey, a browser extension, or account state, move from the in-app browser to Chrome.
Appshots: point to context, not execution
Appshots are not a fourth way for Codex to control a computer. They are a way to give Codex the context in front of you.
On macOS, pressing Command twice captures the frontmost window. Codex attaches the image and any available text to the thread. You can Appshot an error, an email, a design, a settings panel, or an unfamiliar form, then tell Codex what you want it to do.
A useful mental model is:
Appshots point to something on your computer; Browser, Chrome, and Computer Use act on it.
Appshots are currently created from the Codex app on macOS. They capture the frontmost window, not the whole desktop. That makes them useful for focused context without granting control of the app.
How to choose
For everyday workflows, use this order:
- First check whether a plugin, MCP, CLI, or API can do the job. If it can be done structurally, do not make Codex click through the UI.
- If the task needs a desktop app, system settings, a simulator, or several local apps, use @Computer.
- If the task is in the browser and depends on your signed-in state, cookies, extensions, or multiple tabs, use @Chrome.
- If the task is web development, public-page inspection, local preview, or design annotation, use @Browser.
- If you only need to tell Codex “look at this window,” start with Appshots for context, then decide which surface should act.
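The ordering above amounts to a first-match-wins check. A minimal sketch in Python follows, assuming a hypothetical `Task` record whose field names are invented for illustration. It models the decision only and calls nothing in Codex.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical description of a task; field names are illustrative."""
    has_structured_tool: bool = False        # plugin, MCP, CLI, or API covers it
    needs_desktop_app: bool = False          # native app, settings, simulator
    needs_signed_in_browser: bool = False    # cookies, extensions, multiple tabs
    is_web_dev_or_public_page: bool = False  # local preview, design annotation


def choose_surface(task: Task) -> str:
    """Apply the checks in the listed order; the first match wins."""
    if task.has_structured_tool:
        return "structured tool"
    if task.needs_desktop_app:
        return "@Computer"
    if task.needs_signed_in_browser:
        return "@Chrome"
    if task.is_web_dev_or_public_page:
        return "@Browser"
    # Nothing matched: gather context first, then decide which surface acts.
    return "Appshot"
```

Because the structured-tool check comes first, a task that could be done through either a plugin or the signed-in browser resolves to the plugin, which is the preference the list states.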
The difference among these surfaces is not just capability. It is also the trust boundary. The closer a surface gets to your real accounts and desktop environment, the more clearly you should state the goal, define actions it must not take, and keep human confirmation before submitting, sending, paying, or publishing.