Computer Use on KnightLi Blog

What is cc-haha? A project that turns Claude Code into a desktop workbench

Thu, 14 May 2026 22:38:04 +0800

cc-haha is a project built around a modified Claude Code workflow. Its full repository name is NanmiCoder/cc-haha. The project page says plainly that it is based on Claude Code source code leaked from the Anthropic npm registry on 2026-03-31, and that its current main form is a desktop Claude Code workbench.

Project URL: https://github.com/NanmiCoder/cc-haha

There are two important points in that description.

First, it is not Anthropic’s official Claude Code. The README also states that the original source code copyright belongs to Anthropic and that the project is only for learning and research.

Second, its focus is no longer just “run a Claude Code CLI locally.” Judging from the README and the latest release, cc-haha is more like a desktop app that brings Claude Code sessions, projects, permissions, diffs, Computer Use, remote access, and model provider configuration into one place.

What problem is it trying to solve?

Claude Code is originally terminal-oriented. Sessions, command execution, permission prompts, file edits, and context switching all happen in the terminal. That works for people who are comfortable with CLI tools, but long-term use exposes a few rough edges:

Multiple projects and sessions are hard to manage side by side.
To see what files the AI changed, you often need to switch to Git or an editor.
Permission approvals, command execution, and file diffs are spread across different surfaces.
Remote viewing from a phone or another device requires extra setup.
Connecting non-Anthropic models requires dealing with protocol compatibility.

cc-haha tries to package these pieces into a graphical workbench. It is not just a skin for Claude Code; it moves session management and local development flow control into the desktop app.

Desktop workbench: from terminal to control center

According to the README, the cc-haha desktop app brings these capabilities into a macOS / Windows app:

Multi-session workbench: manage tasks with tabs, project switching, terminal entry points, and session history.
Branch / Worktree launch: choose a repository branch for a new session and decide whether to use the current worktree or an isolated Worktree.
Right-side code changes panel: view modified files, added and removed lines, and workspace status while chatting.
Visualized code edits: inspect AI edits, diffs, and execution steps.
Permission and approval flow: review dangerous commands, tool calls, and AI questions in the desktop app.
Multiple model providers: supports Anthropic-compatible APIs, third-party models, WebSearch fallback, and local configuration.
H5 remote access: use a one-time token to connect to the current desktop session from a phone or another device.
IM integration: use Telegram, Feishu, WeChat, or DingTalk to chat remotely, switch projects, and approve permissions.
Scheduled tasks and token usage: create scheduled tasks and view local token usage trends.

These features make it closer to an “AI coding workbench” than a simple command-line replacement. It tries to put the common surfaces of AI coding into one place: chat, file changes, permissions, projects, remote access, and model configuration.

Installation and startup

Most users should download the desktop installer from Releases.

The README describes the desktop install flow as:

Go to GitHub Releases and download the macOS or Windows installer.
On first launch, configure the model provider, API key, and default model in the desktop settings.
If macOS says the app cannot be opened, follow the installation guide to handle Gatekeeper permissions.

The latest release page shows that v0.2.6 was published on 2026-05-13. That version mainly focuses on restoring secure H5 mobile access, desktop session management, file mention search, and desktop UX polish.

If you want to start the CLI from source, the README provides:

1
2
3

bun install
cp .env.example .env
./bin/claude-haha

That path is better for people who want to debug the lower-level CLI, server, or build their own changes. For normal use, the desktop app is more direct.

What changed in v0.2.6

The main point of v0.2.6 is that H5/LAN access was pulled back from a temporary open state into an explicit enablement and token pairing model.

Notable changes include:

H5/LAN access must be explicitly enabled locally.
QR links carry a one-time visible token.
Remote APIs, proxies, and WebSockets are no longer exposed without protection.
Settings now has a separate H5 Access page.
The desktop sidebar gained batch management for selecting and deleting sessions.
Desktop file mention search became git-first, respects ignore rules, and reduces noise from node_modules and build output.
A pure white theme was added, and bugs such as long URLs breaking chat layout and draft leakage across tabs were fixed.

This shows the project has moved beyond “it runs” and is now filling in the safety boundaries and daily UX details that a desktop product needs.

The H5 access part deserves special care. The author explicitly notes in the release that H5 is a browser access entry for individuals or trusted teams, not a public multi-tenant login system. In practice, it should not be treated as an internet-facing SaaS admin console.

Computer Use: letting the Agent operate the desktop

Another important selling point of cc-haha is Computer Use.

The project docs say this feature is a heavily modified version of the Computer Use implementation in the leaked Claude Code source. The official implementation depends on Anthropic’s private native modules, such as @ant/computer-use-swift and @ant/computer-use-input, which are not publicly available. cc-haha replaces the low-level operation layer with a Python bridge using public libraries such as pyautogui, mss, and pyobjc.

Computer Use supports operations such as:

Screenshot: screenshot, zoom
Mouse: click, drag, move, scroll, and read cursor position
Keyboard: type text, press keys, hold keys
Applications: open applications, switch displays
Permissions: request app access, list granted applications
Clipboard: read and write clipboard content
Other: wait, batch operations

Its workflow is a “screenshot - analyze - act” loop:

The model receives a user request.
It calls screenshot to capture the screen.
The model uses vision to identify buttons, input fields, and coordinates.
It calls click, typing, or application tools.
It screenshots again to confirm the result, then continues.

From the docs, the fully supported platform is mainly macOS, including Apple Silicon and Intel. Windows / Linux are theoretically possible, but the pyobjc app-management parts need platform-specific replacements and are not fully adapted yet.

Runtime requirements include:

Bun >= 1.1.0
Python >= 3.8
macOS Accessibility permission
macOS Screen Recording permission

This kind of feature is powerful, but it also raises permission risk. When letting AI operate desktop apps, it is better to authorize only the applications that are clearly needed and avoid leaving sensitive content open in unrelated windows.

Multi-model access through an Anthropic-compatible layer

cc-haha still communicates using the Anthropic Messages API protocol. The project docs recommend using LiteLLM as a protocol conversion proxy.

The basic structure is:

`1`	`claude-code-haha ──Anthropic协议──▶ LiteLLM Proxy ──OpenAI协议──▶ 目标模型 API`

In other words, cc-haha sends Anthropic Messages API requests, LiteLLM converts them to formats such as OpenAI Chat Completions, and then forwards them to OpenAI, DeepSeek, Ollama, or other model services.

The LiteLLM install command in the docs is:

`1`	`pip install 'litellm[proxy]'`

Then you can configure OpenAI, DeepSeek, Ollama, and other models in litellm_config.yaml. After the proxy starts, set these values in .env or ~/.claude/settings.json:

ANTHROPIC_AUTH_TOKEN=sk-anything
ANTHROPIC_BASE_URL=http://localhost:4000
ANTHROPIC_MODEL=gpt-4o
ANTHROPIC_DEFAULT_SONNET_MODEL=gpt-4o
ANTHROPIC_DEFAULT_HAIKU_MODEL=gpt-4o
ANTHROPIC_DEFAULT_OPUS_MODEL=gpt-4o
API_TIMEOUT_MS=3000000
DISABLE_TELEMETRY=1
CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1

There are a few practical caveats:

drop_params: true is important, because Anthropic parameters such as thinking and cache_control do not exist in the OpenAI API.
Extended Thinking is an Anthropic-specific feature and is unavailable with third-party models.
Prompt Caching will not work in the Anthropic-native way.
Tool calls must be converted from Anthropic tool_use to OpenAI function calling, so complex tool use may have compatibility issues.
Small local Ollama models may not handle this tool-heavy workflow reliably.

So multi-model access can work, but that does not mean every model will feel the same. cc-haha still demands strong tool use, code understanding, and long-context ability from the model.

Who is it for?

cc-haha is better suited for:

People already familiar with Claude Code who want desktop session management.
Users who often work across multiple repositories, branches, and AI sessions.
People who want to inspect AI file changes, diffs, and workspace status in a side panel.
Users who want to experiment with Computer Use and let an Agent operate desktop apps.
People who want to connect OpenAI, DeepSeek, Ollama, or other models through an Anthropic-compatible protocol.
Users who need phone or IM-based remote viewing and permission approval.

It is less suitable for:

Users who only want the stable official Claude Code experience.
People who cannot accept the leaked-source background and copyright uncertainty.
Users who do not want to grant high system permissions to local tools.
Teams that need enterprise compliance, auditability, and official support.
Users unfamiliar with API keys, proxies, model compatibility, and local service configuration.

Risks and boundaries

This article cannot only talk about features. It also has to talk about risk.

The origin of cc-haha means it is not an ordinary community reimplementation. The README clearly states that it is based on leaked Claude Code source code and that the original source belongs to Anthropic. This creates uncertainty around copyright, compliance, and long-term maintenance.

Computer Use, H5 remote access, IM integration, and local permission approval are also high-permission capabilities. The more convenient they are, the more clearly boundaries need to be defined:

Do not expose H5 access on untrusted networks.
Do not treat the token as a long-term public login credential.
Do not grant the Agent access to unrelated sensitive applications.
Do not casually use it in production or company compliance environments.
Do not expose third-party model proxy settings or API keys in public repositories.

If your goal is to study AI coding tool architecture, desktop workflows, and Computer Use implementation, it is a useful reference. If you want to put it into a long-term production workflow, evaluate legal, permission, security, and maintenance risks first.

Summary

The most interesting thing about cc-haha is not whether it can replicate Claude Code. It is that it pushes Claude Code-style AI coding tools toward a desktop workbench form.

Sessions, projects, Worktree, diffs, permissions, remote access, Computer Use, model providers, scheduled tasks, and token usage are all brought into one desktop experience. That suggests the next step for AI coding tools is not only stronger models, but also a more complete workflow interface.

But its boundaries are also clear: it is not an official Anthropic product, it has a sensitive source-code background, and its high-permission features require caution. A better way to view it is as a project for observing where AI coding tools may evolve, not as a careless replacement for official Claude Code.

References

GitHub repository: https://github.com/NanmiCoder/cc-haha
Latest release: https://github.com/NanmiCoder/cc-haha/releases/tag/v0.2.6
Computer Use documentation: https://github.com/NanmiCoder/cc-haha/blob/main/docs/computer-use.md
Third-party model documentation: https://github.com/NanmiCoder/cc-haha/blob/main/docs/guide/third-party-models.md

Codex Is Starting to Control the Computer. What Does That Mean for the Future?

Wed, 29 Apr 2026 11:28:25 +0800

The most important part of this Codex update is not that it added another ordinary button. It is that Codex is starting to move toward “controlling the computer.”

In the past, using AI usually meant asking questions in a chat box, copying, pasting, and then manually operating software.
Now that boundary is expanding: AI does not just answer you. It can operate desktop applications according to your goal.

In the short term, this is a new feature. In the long term, it may change how many people use computers.

What This Feature Is

Simply put, Codex’s computer use capability lets it access and operate the desktop environment.

It can do things such as:

select and control an application
receive tasks in natural language
open browsers, AI tools, local files, or other software
enter text, click buttons, and wait for results
connect multiple steps into one task
keep running in the background without requiring the user to follow every step manually

Its role is not just to write a piece of text for you, but to complete an operation flow for you.

That is the key difference between an Agent and an ordinary chatbot:
a chatbot mainly gives answers; an Agent is closer to “receiving a goal and then executing it.”

Why This Matters

In the past, much automation required you to know how to write scripts.

For example, suppose you want to complete a cross-software workflow:

open a web page
find information
copy content
pass it to another AI tool
save a file
open the local directory and check the result

To automate this traditionally, you might need browser scripts, APIs, local programs, and even window automation.

But many ordinary users do not know how to write these things.
Even if they do, it may not be worth writing a script for a temporary task.

This is where computer use matters: it pushes “script-like capability” toward natural language.

You do not necessarily need to tell it exactly where to click.
You can tell it what result you want and let it try to complete the task.

Workflows It May Change

I think the first workflows to change will not be extremely serious or high-risk work, but the tasks that are annoying, fragmented, repetitive, and not worth writing a dedicated program for.

1. Moving Information Across Software

The most typical case is moving information between applications.

Previously, you might switch back and forth between a browser, a document, a chat window, and a local folder.
In the future, you can hand this kind of task to an Agent:

find a certain kind of information
summarize it into a document
save it to a specified directory
open the result for you to review

This work is not hard, but it consumes attention.
The value of an Agent is that it absorbs these small operations.

2. Coordination Between Multiple AI Tools

Many people’s real workflow is no longer based on a single AI tool.

It may look like this:

one tool writes code
one tool researches information
one tool generates images
one tool organizes documents

Previously, these tools were connected by manual copy and paste.
In the future, an Agent can become the middle layer: it opens tools, passes context, waits for output, and organizes results.

This can turn “multiple AI tools working together” from a manual process into a semi-automated process.

3. Office Software Automation

Spreadsheets, presentations, documents, and email share one trait: they are powerful, but many operations are fragmented.

If Agents can reliably control this software, the barrier to office automation will drop noticeably.

You do not need to remember where a menu is or learn complicated shortcuts.
You only need to describe the goal, such as:

turn this spreadsheet into a monthly report
make a one-page summary from this document
combine these materials into a clearly structured explanation

The tedious button operations will gradually be hidden behind natural language.

What It Means for Ordinary Users

For ordinary users, this kind of feature may have a more direct impact than “the model got a bit smarter.”

Because it lowers the operation barrier, not just the knowledge barrier.

Many people can describe what they want, but they do not know where to click or how to combine features inside software.
If Agents can take over this part, using a computer may become:

1
2
3

I describe the goal
Agent operates the software
I check the result

That is closer to real productivity than simple chat.

Its Impact on Software

If this kind of Agent capability continues to mature, software itself will also be affected.

In the past, software design mainly served human clicking.
In the future, software may also need to serve Agent operation.

This means:

interface elements need to be clearer
operation feedback needs to be more stable
local permissions need to be more granular
software may provide interfaces better suited for Agent calls
users may care more about whether software can be operated smoothly by AI

In the long run, the boundaries between applications may become thinner.
Users may care less about “which app should I open” and more about “what task do I want to complete.”

Do Not Overhype It Yet

Of course, it is not time to fully let go yet.

This kind of capability still has several clear limitations:

stability still needs observation
complex tasks may fail in the middle
permission boundaries must be handled carefully
account, payment, and file deletion operations should not be delegated casually
quota consumption is not something you can completely ignore

So at this stage, the best use case is not letting it take over the whole computer, but letting it handle low-risk, reviewable, step-heavy tasks.

For example:

organizing materials
generating drafts
moving content across tools
opening and checking files
running semi-automated workflows that can be reviewed by a human

One Last Line

The real importance of this Codex update is that it pushes AI from “answering questions” toward “operating the environment.”

In the short term, it is a computer use feature.
In the long term, it may mark a shift in how personal computers are used.

In the future, we may spend less time remembering buttons, finding menus, and switching windows.
More often, we will describe the goal, let an Agent execute it, and then let humans make the final judgment.