Agent on KnightLi Blog

CLI-Anything: Turning Software into an Agent-Usable Command Line

Mon, 25 May 2026 00:24:36 +0800

CLI-Anything is an open-source Agent tooling project from HKUDS. Its goal is to turn software that was originally designed for human GUI operation into command-line interfaces that AI Agents can call more easily. It does not reimplement a simplified version of the software. Instead, it builds a CLI harness around the existing codebase and real backend, allowing Agents to complete tasks through stable commands, stateful sessions, and structured output.

This direction addresses one of the most common gaps when Agents use software: GUI automation depends on screenshots, clicks, and coordinates, so it is easily affected by interface changes; a single API is also often incomplete, forcing the Agent to stitch together a large amount of context on its own. CLI-Anything chooses to condense software capabilities into a command line because commands are naturally easier for models to read, combine, and verify, while also fitting neatly into scripts and automation workflows.

How it works

The official repository describes CLI-Anything as a pipeline for automatically generating CLIs. After receiving a local software source path or a GitHub repository URL, the process analyzes the code structure, identifies the backend and data models, designs command groups, and then implements the CLI, tests, and documentation.

The generated CLI usually supports two usage modes. One is a REPL for continuous work, which preserves project state. The other is a subcommand mode, which is better suited to scripts and pipelines. Commands also provide JSON output so Agents can parse results directly, while still keeping a human-readable format for debugging.

In the official example, the Claude Code plugin can be used like this:

1
2
3

/plugin marketplace add HKUDS/CLI-Anything
/plugin install cli-anything
/cli-anything <software-path-or-repo>

If a harness has already been generated for a piece of software, later usage is closer to a normal Python CLI:

cd <software>/agent-harness
pip install -e .
cli-anything-<software> --help
cli-anything-<software>
cli-anything-<software> --json <command>

Where it fits

CLI-Anything is especially suitable for scenarios where “the capability exists in real software, but the Agent cannot operate it reliably.” Examples include image, video, audio, office documents, 3D modeling, data analysis, or AI/ML toolchains. As long as the project has an analyzable codebase, a callable backend, or a clear data model, it has a chance to be wrapped as a command set that Agents can use.

Its value is not merely adding another layer of wrapping in the command line. The real value is turning key software operations into discoverable, composable, and testable interfaces. An Agent can first understand capabilities through --help, then receive results through JSON output, and connect multiple commands into a workflow. For tasks that require batch processing, automatic validation, and continuous iteration, this is more controllable than temporarily asking an Agent to click through an interface.

Boundaries to keep in mind

CLI-Anything does not mean that any software can be integrated instantly at no cost. It depends on the target software’s source code, backend capabilities, file formats, and testability. If a piece of software is highly closed and its key logic exists only in the GUI layer, the difficulty of generating a high-quality CLI rises significantly.

The official methodology also emphasizes real backends and test validation. This means generating a harness is not finished after writing a few command wrapper scripts. To use it for serious work, you still need to confirm command coverage, output format, dependency installation, real software invocation, and end-to-end test reliability. A more realistic approach is to first generate a CLI for a clearly defined workflow, then gradually fill in capabilities through commands such as refine, test, and validate.

Summary

CLI-Anything’s idea is direct: instead of making Agents adapt to fragile human interfaces, add a stable, structured, and testable command-line entry point to existing software. It is suitable for people who want to bring professional software into Agent workflows, and also for developers studying the shape of “Agent-native software.” In real adoption, the key question is not how much code one command can generate, but whether the generated CLI can call real capabilities, preserve state, output structured results, and stand up to testing.

DeepSeek V4 Flash for a Godot Game Demo: How Far Can a Few Cents Go?

Wed, 06 May 2026 09:22:18 +0800

Can DeepSeek V4 Flash handle Godot game demo development?

The focus is simple: can it create a small Godot demo that runs, can be observed, and includes physics effects?

The short answer is yes. The quality is not commercial-grade, but it is already enough for gameplay prototyping and physics interaction demos. More importantly, the cost is very low, which makes it suitable for quickly validating ideas.

Demo Performance

The focus of this demo is physics interaction.

Several visible effects include:

The rope can be cut.
The box falls to the ground.
After increasing the mass, box collisions become more forceful.
The rope shows noticeable elasticity.
After adjusting friction and elasticity, the box shows clear sliding and bouncing.

From what it presents, this is no longer just “a few generated Godot scripts”. It is a small prototype that can run and show observable physics behavior.

Usability

The value of this demo is that it can run, be viewed, and be modified. It is not a complete game, nor an engineering project ready for direct commercialization, but it already demonstrates several things:

DeepSeek V4 Flash can understand the basic goal of a Godot demo.
An AI Agent can turn requirements into a runnable project.
Non-web tasks such as Godot physics interaction are entering a low-cost prototyping stage.
For individual developers, it can quickly turn an idea into something visible.

If the goal is to build a formal game, it is obviously not enough. But if the goal is to verify whether a gameplay idea is interesting or whether the rough physics effect can be made, this demo is already usable.

Cost Significance

The most notable part is not how polished the visuals are, but the cost.

If a Godot physics demo can produce a runnable version with model costs at the level of a few cents, its significance is not replacing professional game development. It is sharply reducing the cost of prototype trial and error.

In the past, validating a small game idea usually required knowing Godot, writing scripts, setting up scenes, and adjusting physics parameters. Now an AI Agent can first generate a runnable version, and humans can judge whether the direction makes sense.

For indie developers, this kind of low-cost experimentation is useful:

Quickly validate gameplay concepts.
Generate temporary demos for others to see.
Explore Godot APIs and the physics system.
Turn ideas into an initial runnable project.
Reduce handwritten code cost before the direction is clear.

DeepSeek V4 Flash’s Performance

What is worth noting is that the model used here is DeepSeek V4 Flash, not a more expensive and heavier flagship model.

It performs well in the role of a low-cost prototype model. It is not the strongest, most stable, or most suitable model for delivering production engineering, but it is attractive in budget-sensitive scenarios where the goal is to quickly test a direction.

Suitable Scenarios

DeepSeek V4 Flash + Agent + Godot is better suited to these tasks:

Small gameplay prototypes.
Physics effect demos.
UI or interaction concept validation.
Teaching examples.
Helping understand Godot project structure.
Generating a first runnable project.

It is less suitable for directly taking on these tasks:

Large game architecture.
Complex character controllers.
Network synchronization.
Core code for commercial projects.
High-precision physics simulation.
Automated submission without human testing.

In other words, it is suitable as a first draft and testbed, not as the owner of production engineering.

What This Shows

This shows that AI coding is continuing to expand from websites, scripts, and backend APIs into game development and interactive prototyping.

Game development used to have a high barrier to entry, especially when engines, scripts, asset management, and physics systems were mixed together. Beginners could easily get stuck. Now models plus Agent tools can first set up the project, letting developers focus on gameplay judgment and effect tuning.

This may bring three changes:

First, game prototypes become cheaper. Many ideas no longer need to wait until full development to be validated; they can first become runnable demos.

Second, indie developers may become more willing to experiment. People who do not know Godot can still use AI to touch the project structure and basic workflow.

Third, model stability becomes more important. Game development is not just about code running. The effect also needs to be reasonable, the feel needs to be normal, and parameters need to be controllable. In the future, models that better combine actual visuals and runtime state will be more suitable for this kind of task.

Summary

DeepSeek V4 Flash for a Godot demo can be summarized in one sentence: not perfect, but cheap enough, fast enough, and suitable enough for prototyping.

It is still far from commercial games, but if the goal is to validate a small game idea at extremely low cost, it is already valuable.

For individual developers, the most realistic use is not handing the whole game to AI, but letting AI first produce a runnable project while humans handle judgment, trade-offs, and polishing. Used this way, low-cost models such as DeepSeek V4 Flash become genuinely appealing.

DeepSeek-V4 Preview Released: 1M Context, Two Models, and API Migration Notes

Fri, 24 Apr 2026 22:39:46 +0800

DeepSeek released DeepSeek V4 Preview Release on 2026-04-24. Based on the official announcement page, the update is centered on a few very clear themes: 1M context, a two-model lineup with V4-Pro and V4-Flash, dedicated optimization for agent scenarios, and API-side model migration.

If we reduce the release to one sentence, the main signal is this: DeepSeek is not just trying to make a stronger model. It is pushing ultra-long context and agent capabilities toward something that is ready for practical deployment.

1. What was released this time

According to the official page, DeepSeek-V4 Preview mainly includes two product lines:

DeepSeek-V4-Pro
DeepSeek-V4-Flash

The official descriptions are also very direct:

DeepSeek-V4-Pro: 1.6T total / 49B active params
DeepSeek-V4-Flash: 284B total / 13B active params

The naming already makes the strategy clear. This is not a single-model upgrade. DeepSeek is launching a higher-end model and a more cost-efficient model at the same time.

V4-Pro is positioned around performance ceiling, with DeepSeek saying it can compete with the world’s top closed-source models. V4-Flash, by contrast, is positioned around speed, efficiency, and lower cost, making it more suitable for workloads that care more about latency and API pricing.

2. `1M context` is the most visible headline

One of the most prominent lines on the official page is: “Welcome to the era of cost-effective 1M context length.”

DeepSeek is not merely saying the model supports long context. It is presenting 1M context as a default capability of this generation. The page is explicit that:

1M context is now the default standard across official DeepSeek services
Both V4-Pro and V4-Flash support 1M context

The importance of this is not just that you can fit more tokens. It directly affects tasks like:

Understanding large codebases
Long-document Q&A and information synthesis
Multi-turn agent workflows
Complex tasks spanning multiple files, tools, and stages

When the context window is large enough, the model is less likely to lose context midway and re-read material repeatedly. That matters a lot for agentic coding and complex knowledge work.

3. What `V4-Pro` is mainly emphasizing

From the wording on the official page, DeepSeek-V4-Pro focuses on three things:

Agentic coding capability
World knowledge
Reasoning ability

The page says V4-Pro reaches open-source SOTA on agentic coding benchmarks. It also claims leadership among current open models in world knowledge, trailing only Gemini-3.1-Pro, and states that its math, STEM, and coding performance surpasses current open models while rivaling top closed-source models.

In other words, V4-Pro is not positioned as a simple question-answering model. It is aimed much more at high-difficulty reasoning, complex coding, and long-horizon task execution.

4. `V4-Flash` is not just a cut-down version

Another notable point is that DeepSeek does not present V4-Flash as a low-end model. Instead, it stresses that the model is already strong enough for many practical tasks.

According to the announcement, V4-Flash:

Has reasoning ability that comes close to V4-Pro
Performs on par with V4-Pro on simple agent tasks
Uses fewer parameters, responds faster, and is more economical for API usage

That means the lineup is not a very split “one flagship, one entry-level” structure. It is closer to:

V4-Pro: optimize for higher performance and a stronger ceiling
V4-Flash: optimize for lower latency and better cost efficiency

For developers, that is often a more practical combination, because many production tasks do not need the absolute strongest model in theory. They need something strong enough, fast enough, and affordable enough.

5. The release puts clear emphasis on agent optimization

Another strong signal from the announcement page is that DeepSeek is actively pushing V4 toward agent use cases.

The page says DeepSeek-V4 has been seamlessly integrated with several leading AI agents, including:

Claude Code
OpenClaw
OpenCode

DeepSeek also says that V4 is already being used in its in-house agentic coding workflows.

That means the target is no longer limited to chat or ordinary completion. The model is being positioned for longer workflows: reading code, understanding structure, calling tools, generating outputs, and connecting the whole process together.

If you have been paying attention to coding agents recently, this is worth noticing. Model providers are no longer only competing on benchmarks. They are also competing on whether the model can actually plug into real workflows.

6. Structural innovation is serving long context efficiency

On the technical side, the page summarizes this release’s structural work as:

token-wise compression
DSA (DeepSeek Sparse Attention)

The direction is clear: make long context cheaper and more efficient while reducing compute and memory cost as much as possible.

The announcement page does not go into full technical detail, but it at least suggests that DeepSeek is not relying only on brute-force scaling to support longer windows. It is also making architecture-level optimizations specifically for long-context efficiency.

For actual users, that often matters more than just seeing a bigger context number, because real usability depends on more than whether 1M is technically available. It also depends on:

Whether speed stays acceptable
Whether cost stays acceptable
Whether long-context tasks remain stable in practice

7. The API is already available, but model migration matters

The official page clearly states that the API is available today.

The migration path is also relatively simple:

Keep the same base_url
Switch the model name to deepseek-v4-pro or deepseek-v4-flash

The page also says both models support:

1M context
Dual Thinking / Non-Thinking modes
OpenAI ChatCompletions
Anthropic APIs

That means if you already use the DeepSeek API, the upgrade path is not especially difficult. The main work is updating model names and validating behavior.

8. The retirement schedule for old models is explicit

For developers, one of the most important details on the page is actually the retirement notice for older models.

DeepSeek explicitly says:

deepseek-chat
deepseek-reasoner

will be fully retired and inaccessible after July 24, 2026, 15:59 UTC.

The page also notes that these two models are currently being routed to the non-thinking and thinking modes of deepseek-v4-flash.

That means if your project still directly references deepseek-chat or deepseek-reasoner, now is the time to plan the migration instead of waiting until the formal shutdown date gets close.

9. How this release is worth reading

If we compress the update into a few main takeaways, they look like this:

DeepSeek is turning 1M context from a premium feature into a default standard
The two-model strategy is clearer: one targets performance ceiling, one targets speed and cost efficiency
Agent capability has been moved into a very central role
The API upgrade path is relatively direct, but the old-model retirement timeline needs attention soon

For general users, the most visible change may be that long documents, long code contexts, and long workflows become easier to fit into one session.
For developers, the more important point is that if you are already building agents, coding assistants, knowledge workflows, or complex automation pipelines, this generation is very clearly designed for those scenarios.

This is not just a routine model update from DeepSeek. It reads more like a clearer statement of its next product direction: ultra-long context, agent optimization, and more practical API readiness.

DeepSeek official news page: https://api-docs.deepseek.com/news/news260424
Tech Report: https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro/blob/main/DeepSeek_V4.pdf
Open Weights: https://huggingface.co/collections/deepseek-ai/deepseek-v4

AI Terms Explained: Agent, MCP, RAG, and Token in Plain Language

Thu, 23 Apr 2026 13:13:40 +0800

When people first get into AI, what pushes them away is often not the models themselves, but the long list of terms that keeps showing up in every discussion. Agent, MCP, RAG, AIGC, and Token all look familiar, but without a simple explanation, many people only recognize the words without really understanding them.

This article follows a common beginner-friendly line of explanation and condenses 10 high-frequency AI terms into a set of meanings that is easier to remember. The goal is not to sound academic. It is to help you build a basic mental model that lets you follow everyday AI conversations.

10 common AI terms and what they mean

1. Agent: an AI that does more than chat

Agent can be understood as an AI assistant that actually gets work done.

A normal chatbot usually works in a simple question-and-answer pattern. An Agent goes a step further. It can break a task into steps, arrange a process, call tools, and return a finished result. If you ask it to organize materials, look something up, or generate a document, it may do more than give advice. It may actually chain those actions together and complete them.

That is why the key point of an Agent is not whether it can talk, but whether it can act.

2. OpenClaw: an AI assistant that stays on your computer

Here, OpenClaw is described as a kind of AI assistant that lives on your computer.

You can think of this type of tool as a more desktop-oriented AI helper. It does not only receive text. It may also observe the interface, call local tools, and execute tasks step by step. Compared with a normal web chat interface, this kind of tool emphasizes operational ability much more.

If Agent is the abstract idea of an execution-oriented AI, this kind of desktop assistant is a more concrete personal-computer version of that idea.

3. Skills: capability packs added to an Agent

Skills can be understood as functional modules or operating instructions for an Agent.

The same Agent can behave very differently depending on which Skills it has. Some may focus on copywriting, some on data organization, and some on code-related work. They are a bit like apps on a phone, and a bit like reusable workflows.

So in many cases, it is not that the model suddenly became smarter. It is that a clearer set of rules, tools, and steps was added behind it.

4. MCP: a unified way for AI to connect to tools

MCP stands for Model Context Protocol.

In everyday terms, it is a bit like a Type-C connector for the AI world. In the past, connecting a model to different tools often meant building separate integrations one by one. With a unified protocol, the way those tools connect becomes more standardized and easier to reuse.

For most users, the most important thing to remember is this: MCP is not about whether a model can answer a question. It is about how a model can connect to external tools and resources in a safe and stable way.

5. Gacha: AI output is inherently random

The term “gacha” often appears in AI image generation, video generation, and creative work.

The idea is simple. Even with the same prompt and the same general direction, the result can still be different each time. Sometimes the output is great. Sometimes it falls apart. That is why people compare repeated generation attempts to pulling gacha in a game.

What this really reminds us is that AI generation is not a fixed formula. It is a probabilistic process with variation.

6. API: the connection between an app and a model

API stands for Application Programming Interface.

You can think of it as the standard entry point through which programs communicate. When you call a model service from your own app, script, or editor, you are essentially using an API to send a request and receive a result.

If you compare a model service to a restaurant, then:

the menu is like the API documentation
placing an order is like making an API request
the kitchen sending back the dish is like the model returning a result

That is why many tools may look different on the surface while still calling some form of API underneath.

7. Multimodality: AI handles more than text

Multimodality means AI no longer only reads and writes text. It can process multiple kinds of input and output.

For example, it may be able to read images, understand voice, interpret video, generate pictures, or even support real-time voice and video interaction. Compared with early text-only models, multimodal models are much closer to having the combined abilities to see, hear, speak, and write.

That is also why many AI products are no longer centered around a single text box.

8. RAG: retrieve information first, then generate an answer

RAG stands for Retrieval-Augmented Generation.

It is useful for solving a practical problem: a model’s training data has a time boundary, and it does not automatically know your company’s newest documents, customer-service records, or business rules. The idea behind RAG is to retrieve relevant material from specified sources first, and then generate an answer based on that material.

Its value usually shows up in three ways:

answers are more likely to stay close to real source material
you can trace where the answer came from
new documents can be added and reflected quickly

That is why many enterprise knowledge bases, AI customer-service systems, and internal Q&A tools rely on RAG.

9. AIGC: the general term for AI-generated content

AIGC stands for AI Generated Content.

It is not a single tool. It is a broad label for content produced by AI, including text, images, audio, video, and more. AI writing, AI illustration, AI short-form video generation, and AI voice synthesis all fit under the umbrella of AIGC.

What matters most about this term is that it describes a way of producing content, not one specific model.

10. Token: the unit used to measure model processing

Token can be understood as the basic unit a model uses to process text.

It is not exactly the same as one character or one word, but in practice, you can treat it as the common unit used for model computation and billing. Your input consumes Token, the model’s output consumes Token, and the context kept in memory also takes up Token.

That is why model services keep talking about context length, cost control, and prompt compression. At the core, all of those topics are tied to Token.

Claude Code Multi-Agent Collaboration: How to Choose Between Subagents and Agent Teams

Wed, 22 Apr 2026 21:35:52 +0800

When people talk about multi-agent collaboration in Claude Code, the easiest two concepts to mix up are Subagents and Agent Teams. They both sound like “spin up several agents to work together,” but they are meant for different kinds of work. In short, the former is better for splitting off independent tasks, while the latter is better when several agents need to collaborate around the same problem and cross-check each other over time.

If you have used Skills before, this framing also helps:

A Skill defines the workflow and rules
A Subagent or Agent teammate does the actual execution

So the real question is not which one is “more advanced,” but what kind of collaboration problem you are solving.

Subagents: split off side tasks

Subagents are closer to temporary worker copies launched from the current session. Each one gets its own context window, and when it finishes, it returns only a summary of the result. The main conversation stays cleaner because it does not have to absorb all the intermediate logs and output.

That gives Subagents a few very practical strengths:

The main thread stays clean instead of being flooded by test logs, search results, or long output
Independent research or execution tasks can run in parallel
They work well for tasks where “just bring me the result” is enough

The original article notes that Claude Code comes with three built-in kinds of Subagents:

Explore: read-only, useful for quickly searching a codebase
Plan: read-only, useful for gathering information in the background during plan mode
General-purpose: can read and write, suitable for tasks that mix exploration and editing

Custom Subagents

If the built-in options are not enough, you can define your own Subagent. The mechanism is simple: write a Markdown file in one of these locations:

.claude/agents/: only active in the current project
~/.claude/agents/: active across all your projects

The file format looks like this:

---
name: code-reviewer
description: Expert code review specialist. Proactively reviews code for quality, security, and maintainability. Use immediately after writing or modifying code.
tools: Read, Grep, Glob, Bash
model: inherit
---
You are a senior code reviewer ensuring high standards of code quality and security.

When invoked:

1. Run git diff to see recent changes
2. Focus on modified files
3. Begin review immediately

Review checklist:

- Code is clear and readable
- Functions and variables are well-named
- No duplicated code
- Proper error handling
- No exposed secrets or API keys
- Input validation implemented
- Good test coverage
- Performance considerations addressed
Provide feedback organized by priority:

- Critical issues (must fix)
- Warnings (should fix)
- Suggestions (consider improving)

Include specific examples of how to fix issues.

The key field here is description. Claude uses it to decide when this Subagent should be called, so the more precise the description is, the more reliable the trigger tends to be.

A few other common configuration fields are also worth knowing:

tools: limits which tools the Subagent can use
model: chooses between sonnet, opus, haiku, or inherit
permissionMode: controls edit permissions and permission prompt behavior
memory: gives the Subagent a cross-conversation memory directory

If you only need a Subagent temporarily, you can also define it through the CLI:

claude --agents '{
  "code-reviewer": {
    "description": "Expert code reviewer. Use proactively after code changes.",
    "prompt": "You are a senior code reviewer. Focus on code quality, security, and best practices.",
    "tools": ["Read", "Grep", "Glob", "Bash"],
    "model": "sonnet"
  }
}'

When Subagents fit best

Subagents are usually the best fit for tasks like these:

Running tests and returning only the failure summary instead of flooding the main thread with thousands of log lines
Investigating several unrelated modules in parallel
Splitting “find the issue” and “fix the issue” into a simple pipeline

For example:

`1`	`Research the authentication, database, and API modules in parallel using separate subagents`

`1`	`Use the code-reviewer subagent to find performance issues, then use the optimizer subagent to fix them`

But if a task needs constant back-and-forth adjustments, shares a lot of context across stages, or concentrates changes in only one or two files, handling it directly in the main conversation is often simpler than spinning up a Subagent.

Agent Teams: multiple independent sessions working together

Agent Teams operate at a different level. Instead of launching worker copies inside one session, they start multiple fully independent Claude Code instances that collaborate around a shared task list and can also message one another directly.

That makes an Agent Team feel more like a real small team than a simple side-task worker setup.

The article notes that this is currently an experimental feature and needs to be enabled first:

{
    "env": {
        "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS": "1"
    }
}

Once this is added to settings.json, you can ask Claude to organize a team around a specific goal. For example:

1
2
3

I'm designing a CLI tool that helps developers track TODO comments across
their codebase. Create an agent team to explore this from different angles: one
teammate on UX, one on technical architecture, one playing devil's advocate.

What an Agent Team consists of

An Agent Team mainly includes three parts:

Team lead: the main session you are using, responsible for organizing, assigning, and summarizing
Teammates: multiple independent Claude Code instances
Task list and Mailbox: the shared task list and communication channel

The biggest difference from Subagents is that teammates can communicate directly with one another instead of routing everything through the lead. Tasks usually move through states such as pending, in progress, and completed, and once a teammate finishes one task, it can pick up the next one.

When Agent Teams fit best

When a task needs several perspectives, active discussion, conflicting hypotheses, or parallel work across modules, Agent Teams are a better fit.

The article gives several representative examples:

Several reviewers inspect the same PR in parallel, each focusing on a different dimension
Multiple agents investigate the same bug with competing explanations and challenge each other’s conclusions
Frontend, backend, and testing move forward in parallel on different parts of the project

For example, parallel code review:

Create an agent team to review PR #142. Spawn three reviewers:
- One focused on security implications
- One checking performance impact
- One validating test coverage
Have them each review and report findings.

And for debate-style debugging:

Users report the app exits after one message instead of staying connected.
Spawn 5 agent teammates to investigate different hypotheses. Have them talk to
each other to try to disprove each other's theories, like a scientific
debate. Update the findings doc with whatever consensus emerges.

The common pattern here is that you do not just want one answer. You want several agents to exchange judgments, challenge assumptions, and gradually converge on a stronger conclusion.

How to choose between them

If you want a quick rule of thumb, use this:

If you just need the result, use Subagents
If the work requires discussion and cross-validation, use Agent Teams

Expanded a bit further, the main differences are:

Communication style: Subagents mainly report results back to the main session, while Agent Teams members can talk directly to one another
Coordination model: Subagents depend more on the main conversation to orchestrate them, while Agent Teams work from a shared task list that members can claim themselves
Token cost: Subagents are cheaper, while Agent Teams cost more because each teammate is an independent instance
Best-fit tasks: Subagents are better for independent, result-oriented work, while Agent Teams are better for discussion-heavy and cross-check-heavy work

Practical cautions

Agent Teams are more powerful, but that does not mean every task deserves a full team. The article specifically calls out a few practical concerns:

token usage is noticeably higher
if multiple teammates edit the same file at once, overwrite conflicts become very likely
adding too many teammates increases coordination cost without guaranteeing better results

A safer default is usually:

start with 3 to 5 teammates
split tasks by module or file to avoid edit conflicts
if the lead starts doing teammate work too early, explicitly tell it to wait for the others first

The current experimental version also has a few limitations, such as:

no support for /resume and /rewind for in-process teammates
task status can lag and sometimes needs manual correction
one lead can manage only one team at a time
teammates cannot spawn child teams of their own

Short conclusion

These two features are not substitutes for one another. They solve two different collaboration problems.

If your goal is “parallelize side tasks and keep the main context clean,” start with Subagents. If your goal is “let several agents work like a small team, discuss, and cross-check each other,” then Agent Teams are the better tool.

Trying both in a real task usually makes the distinction obvious very quickly: one is optimized for context isolation and result collection, and the other is optimized for multi-perspective collaboration and ongoing interaction.

Original article: https://cloud.tencent.com/developer/article/2652960

Agent on KnightLi Blog

CLI-Anything: Turning Software into an Agent-Usable Command Line

How it works

Where it fits

Boundaries to keep in mind

Summary

DeepSeek V4 Flash for a Godot Game Demo: How Far Can a Few Cents Go?

Demo Performance

Usability

Cost Significance

DeepSeek V4 Flash’s Performance

Suitable Scenarios

What This Shows

Summary

DeepSeek-V4 Preview Released: 1M Context, Two Models, and API Migration Notes

1. What was released this time

2. 1M context is the most visible headline

3. What V4-Pro is mainly emphasizing

4. V4-Flash is not just a cut-down version

5. The release puts clear emphasis on agent optimization

6. Structural innovation is serving long context efficiency

7. The API is already available, but model migration matters

8. The retirement schedule for old models is explicit

9. How this release is worth reading

Related links

AI Terms Explained: Agent, MCP, RAG, and Token in Plain Language

10 common AI terms and what they mean

1. Agent: an AI that does more than chat

2. OpenClaw: an AI assistant that stays on your computer

3. Skills: capability packs added to an Agent

4. MCP: a unified way for AI to connect to tools

5. Gacha: AI output is inherently random

6. API: the connection between an app and a model

7. Multimodality: AI handles more than text

8. RAG: retrieve information first, then generate an answer

9. AIGC: the general term for AI-generated content

10. Token: the unit used to measure model processing

Claude Code Multi-Agent Collaboration: How to Choose Between Subagents and Agent Teams

Subagents: split off side tasks

Custom Subagents

When Subagents fit best

Agent Teams: multiple independent sessions working together

What an Agent Team consists of

When Agent Teams fit best

How to choose between them

Practical cautions

Short conclusion

Related links

2. `1M context` is the most visible headline

3. What `V4-Pro` is mainly emphasizing

4. `V4-Flash` is not just a cut-down version