OpenRouter on KnightLi Blog

Two Ways to Use DeepSeek Models with Codex: Local Gateway and OpenRouter BYOK

Sun, 24 May 2026 09:52:55 +0800

If you want Codex to use DeepSeek, the first instinct is usually to edit ~/.codex/config.toml:

1
2

model = "deepseek-chat"
base_url = "https://api.deepseek.com"

That idea can work in some older versions or in regular OpenAI SDK scenarios. But with the current Codex CLI, it can easily run into a lower-level mismatch: custom model providers in Codex use the OpenAI Responses protocol, while DeepSeek’s official API is mainly exposed through an OpenAI-compatible Chat Completions interface.

My local version is currently codex-cli 0.111.0. codex --help shows support for configuration entry points such as --config, --model, and --profile. The official OpenAI Codex configuration reference is also explicit: model_providers.<id>.wire_api currently supports only responses, and defaults to responses when omitted.

DeepSeek’s official docs, meanwhile, show the call path as https://api.deepseek.com/chat/completions, with examples such as client.chat.completions.create(...). So the issue is not that DeepSeek cannot be called through OpenAI-style tooling. The issue is that the request semantics Codex sends are not exactly the same as what DeepSeek’s native API understands.

That is why changing base_url directly to https://api.deepseek.com may produce symptoms such as:

The request path does not match, resulting in a 404 or an unexpected response format.
Multi-turn conversations, tool calls, or patch generation fail during parsing.
tool_calls order, message structure, or streaming event format does not line up.
The model seems able to answer a plain prompt, but starts failing once Codex does real work.

The steadier approach is to put a translation layer between Codex and DeepSeek. There are two common routes.

Method 1: Bridge DeepSeek Through a Local Gateway

A local gateway should do more than simple forwarding. Its job is to convert Responses-style requests from Codex into Chat Completions-style requests that DeepSeek can handle, then convert DeepSeek’s result back into a format Codex can consume.

If you use a local gateway such as ccx, the configuration idea looks roughly like this:

[profiles.deepseek-ccx]
model = "deepseek-v4-flash"
model_provider = "ccx-bridge"

[model_providers.ccx-bridge]
name = "Local CCX Gateway"
base_url = "http://localhost:3000/v1"
env_key = "DEEPSEEK_API_KEY"

Then set the DeepSeek key in your terminal and start Codex with that profile:

1
2

export DEEPSEEK_API_KEY="your-deepseek-key"
codex --profile deepseek-ccx

In PowerShell:

1
2

$env:DEEPSEEK_API_KEY="your-deepseek-key"
codex --profile deepseek-ccx

There are two details to watch.

First, base_url should point to the gateway endpoint exposed to Codex, not the official DeepSeek address. The gateway calls DeepSeek behind the scenes.

Second, the correct value for env_key depends on how the gateway handles authentication. Some gateways read the official DeepSeek key directly. Others ask you to provide a local proxy key, while storing the DeepSeek key in the gateway backend. In that case, env_key should be changed to whatever environment variable the gateway expects.

This route is local and controllable, and it is easier to reason about latency and cost. The tradeoff is that you must confirm the gateway really supports the current Responses semantics used by Codex, rather than only acting as a basic Chat Completions proxy.

Method 2: Use OpenRouter BYOK as an Online Bridge

If you do not want to run a local gateway, OpenRouter BYOK is another option. BYOK means binding your own upstream provider key to OpenRouter, which then handles routing and forwarding.

The most common mistake here is the environment variable. Codex is calling OpenRouter, so env_key should usually be OPENROUTER_API_KEY, not DEEPSEEK_API_KEY. The DeepSeek key should be added in OpenRouter’s BYOK or provider key settings.

Example configuration:

[profiles.deepseek-openrouter]
model = "deepseek/deepseek-chat"
model_provider = "openrouter"

[model_providers.openrouter]
name = "OpenRouter"
base_url = "https://openrouter.ai/api/v1"
env_key = "OPENROUTER_API_KEY"

Start it like this:

1
2

export OPENROUTER_API_KEY="your-openrouter-key"
codex --profile deepseek-openrouter

PowerShell:

1
2

$env:OPENROUTER_API_KEY="your-openrouter-key"
codex --profile deepseek-openrouter

Then add your DeepSeek provider key in the OpenRouter dashboard. OpenRouter’s BYOK documentation says provider keys are stored encrypted and used for routing to the corresponding provider.

This route saves you from maintaining a local gateway and feels more like using a regular third-party API proxy. The downside is that an online service sits in the middle, so troubleshooting may require checking Codex, OpenRouter, and DeepSeek error messages together.

Should You Keep Using the deepseek-chat Model Name?

In DeepSeek’s documentation as of May 2026, the recommended model names include deepseek-v4-flash and deepseek-v4-pro, with a note that compatibility aliases such as deepseek-chat and deepseek-reasoner will be deprecated after 2026-07-24.

For new configurations, it is better to test:

`1`	`model = "deepseek-v4-flash"`

If you are using OpenRouter, follow OpenRouter’s model naming format, for example:

`1`	`model = "deepseek/deepseek-chat"`

The actual available names depend on your gateway or OpenRouter’s model page. When the model name is wrong, errors usually look like model not found, 404, or the provider failing to find the matching endpoint.

Why Directly Setting DeepSeek’s Official base_url Is Not Recommended

You can certainly try this as an experiment:

[profiles.deepseek-direct]
model = "deepseek-v4-flash"
model_provider = "deepseek"

[model_providers.deepseek]
name = "DeepSeek"
base_url = "https://api.deepseek.com"
env_key = "DEEPSEEK_API_KEY"

But this is more of a debugging experiment than a stable setup. Codex talks to custom providers through the Responses protocol, while DeepSeek’s official examples use /chat/completions. If DeepSeek or Codex adds a full compatibility layer later, direct connection may become simple. Until then, a bridge layer is more reliable.

What If Codex Still Uses OpenAI After Editing the Config?

First, confirm the config file location. The global config should be:

`1`	`~/.codex/config.toml`

The project-level .codex/config.toml is not the right place for machine-level provider settings such as model_provider and model_providers. The official OpenAI docs also note that project-level configuration does not override local provider and authentication fields.

If Codex still asks you to log in through the web, or appears to use the default OpenAI model, log out first:

`1`	`codex logout`

Some older tutorials write this as /logout inside the interactive UI. With the current CLI, running codex logout directly in the terminal is the more reliable option.

You can also run a quick check with a temporary profile:

`1`	`codex --profile deepseek-ccx`

Or:

`1`	`codex -c model_provider=ccx-bridge -c model=deepseek-v4-flash`

If that works, the config itself is readable. If it does not, check the profile name, TOML syntax, and whether the environment variable only exists in the current shell session.

Troubleshooting Checklist

401: The key is wrong, or env_key points to the wrong environment variable.
404: base_url or the model name is wrong, or a Responses request is being sent to an endpoint that only supports Chat Completions.
tool_calls, patch, or streaming parse errors: the protocol bridge is likely incomplete.
Still prompted to log in to OpenAI: run codex logout, then confirm you are using the correct profile.
PowerShell environment variable disappears in a new window: $env:... only applies to the current session. Use user environment variables if you need it to persist.
OpenRouter BYOK is not using your own DeepSeek key: check whether the provider key is bound in OpenRouter, whether the current OpenRouter API key is allowed to use it, and whether fallback is enabled.

Conclusion

Using DeepSeek with Codex is not impossible through config.toml. The catch is that changing only base_url is usually not enough.

The two steadier routes today are:

Use a local gateway as a protocol bridge: Codex talks to the local gateway, and the gateway talks to DeepSeek.
Use OpenRouter BYOK as an online proxy: Codex talks to OpenRouter, while the DeepSeek key is bound in the OpenRouter dashboard.

If you only want a quick test, OpenRouter is easier. If you want tighter control over keys, cost, and logs, a local gateway is better for long-term tinkering.

References:

free-claude-code: Connecting Claude Code to OpenRouter, DeepSeek, and Local Models Through a Proxy

Fri, 01 May 2026 03:41:49 +0800

free-claude-code is an Anthropic-compatible proxy for Claude Code.

Its idea is not to crack Claude Code, nor to provide an official free Claude service. Instead, it starts a local proxy service that looks like an Anthropic API, then forwards requests from Claude Code to other model backends. The README mentions backends such as NVIDIA NIM, OpenRouter, DeepSeek, LM Studio, llama.cpp, and Ollama.

In simple terms, it solves this problem: you like the terminal experience of Claude Code, but want to send model requests to another provider or a local model.

What Problem It Solves

Claude Code has an interaction model that works well for development tasks.

It can read code, edit files, run commands, and move tasks forward based on project context inside the terminal. But many users may not always want to use the same model backend:

They want to try different models on OpenRouter
They want to use models such as DeepSeek to reduce cost
They want to route requests to local Ollama
They want to run local models through LM Studio or llama.cpp
They want one proxy entry point in the development environment
They want to compare different models inside the Claude Code workflow

free-claude-code is positioned as a compatibility layer between Claude Code and these model services.

Claude Code still sends requests in an Anthropic-like style, while the proxy adapts those requests to different backends.

How It Works

You can think of it as three layers:

The frontend is Claude Code
The middle layer is the free-claude-code proxy
The backend is OpenRouter, DeepSeek, a local model, or another model service

Claude Code believes it is accessing an Anthropic-compatible API.

After the proxy receives a request, it selects a target provider according to configuration, transforms the necessary fields, and returns the response to Claude Code.

The benefit of this structure is that you do not need to modify Claude Code itself, and you do not need every model service to natively support Claude Code. As long as the proxy can align the interfaces, more models can be connected to the same workflow.

Supported Backends

The README lists these directions:

NVIDIA NIM
OpenRouter
DeepSeek
LM Studio
llama.cpp
Ollama

These backends represent different usage styles.

OpenRouter is more like a model aggregation entry point, useful for testing different commercial and open-source models.

DeepSeek is suitable for people who care about Chinese ability, coding ability, and cost.

LM Studio, llama.cpp, and Ollama are more local-model oriented. They are suitable for running models on your own machine or inside an intranet, reducing dependence on external APIs and making offline experiments easier.

NVIDIA NIM is more oriented toward enterprise and GPU inference deployment scenarios.

Why an Anthropic-Compatible Proxy

Claude Code was originally designed around Anthropic interfaces and model conventions.

If you want to connect it to other models, the most direct problem is interface mismatch:

Request fields differ
Model names differ
Streaming formats differ
Tool use is represented differently
Error response formats differ
Token and context limits differ

This is where the proxy layer is useful.

It keeps the interface seen by Claude Code close to the Anthropic shape, then adapts to the backend. For users, after configuring the proxy once, they can test different models inside the same Claude Code workflow.

Suitable Scenarios

free-claude-code is suitable for:

Using the Claude Code terminal workflow
Testing non-Anthropic models in Claude Code
Reducing model calling costs
Connecting Claude Code to OpenRouter
Connecting to compatible model services such as DeepSeek
Running local models through Ollama, LM Studio, or llama.cpp
Giving a team one unified model proxy entry point

If you only use official Claude Code normally and have no special needs around providers, cost, or local deployment, you may not need this type of proxy.

But if you often compare models, or want Claude Code to connect to local and third-party models, this type of tool is useful.

Difference from Directly Using OpenRouter or Ollama

Using OpenRouter, Ollama, or LM Studio directly usually means chatting with a model or calling it through an API.

The point of free-claude-code is not to replace those services, but to connect them to the Claude Code development workflow.

The difference is:

You still use the Claude Code terminal experience
AI can execute tasks around a code repository
The model backend can be changed to another provider
Local models can enter the Claude Code workflow
Configuration is centralized in the proxy layer instead of changed in each tool

So it is more like a bridge than a new chat client.

Notes About Local Models

Connecting Claude Code to local models is attractive, but there are real limitations.

First, model capability differs.

Claude Code tasks are usually not just chat. They include understanding code, planning modifications, editing files, and handling command output. Smaller local models may not complete these tasks reliably.

Second, context window matters.

Code tasks need a lot of context. If the model context is too small, it may fail to read full files, miss constraints, or lose background across multi-turn tasks.

Third, tool use compatibility matters.

Claude Code workflows depend on tool calls and structured behavior. Even if a backend model can chat, it may not follow tool-use protocols well.

Fourth, speed and hardware matter.

Local model speed depends on machine configuration, quantization, and model size. If code tasks respond too slowly, the experience drops noticeably.

So local models are better for experiments, low-risk tasks, and specific scenarios. For truly complex coding tasks, choose carefully according to model capability.

Usage Boundaries

Projects like this are easy to misunderstand from the title, so the boundaries should be clear.

First, it is not an official free Claude Code quota.

It only forwards Claude Code requests to other model backends. When using OpenRouter, DeepSeek, NVIDIA NIM, or other APIs, you still need to follow the pricing, quotas, and terms of the corresponding services.

Second, it is not a tool for bypassing authorization.

When using any proxy tool, you should follow the licenses and terms of Claude Code, model providers, and the project itself. Do not interpret it as a way to avoid official restrictions.

Third, the proxy handles your request content.

Code, command output, and project context may pass through the proxy and backend services. When deploying, consider logs, keys, network boundaries, and privacy. For company code or sensitive projects, use a controlled environment.

Fourth, model performance varies greatly.

The same Claude Code operation may behave very differently after switching models. Do not assume every model can replace Claude.

Relationship with Proxies Such as LiteLLM

Conceptually, free-claude-code belongs to the category of compatible interface proxies.

The shared goal of such tools is to reduce coupling between upper-level applications and lower-level model services. The upper-level application faces a relatively unified interface, while backend providers can be switched by configuration.

Different projects focus on different areas. Some are general model gateways, some focus on OpenAI-compatible APIs, and some specifically adapt tools such as Claude Code.

What makes free-claude-code worth noting is that it puts Claude Code directly at the center, rather than building a generic chat proxy.

Suitable Users

It is better suited to users who are comfortable tinkering:

Familiar with Claude Code
Know how to configure API keys and model providers
Understand proxy service startup and environment variables
Can troubleshoot network, port, model name, and streaming issues
Want to compare different models on coding tasks

If you only want something that works out of the box, the official configuration is usually simpler.

If you are willing to set up a proxy, switch models, tune parameters, and let Claude Code enter more model environments, this project is worth studying.

Reference

Alishahryar1/free-claude-code

Final Thought

The value of free-claude-code is not in the word “free,” but in the bridge it builds between Claude Code and more model backends.

When you want to keep the Claude Code development experience while testing OpenRouter, DeepSeek, local models, or enterprise inference services, an Anthropic-compatible proxy like this becomes useful.