AI Security on KnightLi Blog

The AI Vulnerability Discovery Era Has Arrived: Why Copy Fail, Dirty Frag, Fragnesia, and ssh-keysign-pwn Clustered Together

Thu, 21 May 2026 09:30:01 +0800

In recent weeks, Linux kernel vulnerabilities have appeared in rapid succession: Copy Fail, Dirty Frag, Fragnesia, and ssh-keysign-pwn have all entered security discussions. Some allow local privilege escalation, some can expose highly sensitive files, and some affect container hosts and multi-tenant environments. Many people’s first reaction is: why did Linux suddenly become so insecure?

A more accurate answer is: Linux did not suddenly get worse. Long-hidden problems are being found faster and more systematically.

What really matters in this wave is not just the fixes for a few CVEs, but the change in how vulnerabilities are discovered. In the past, only a small number of top researchers could spend a long time connecting cross-subsystem logic bugs. Now AI-assisted auditing, automated static analysis, fuzzing, and security research platforms are amplifying that work. The vulnerabilities did not appear overnight, but the speed at which they are found, reproduced, and spread has increased.

What These Vulnerabilities Have in Common

First, put the recent incidents side by side.

Vulnerability	Main Impact	Key Characteristics	Risk Focus
Copy Fail / CVE-2026-31431	Local privilege escalation	Linux crypto / AF_ALG related path, involving a page cache write issue	Ordinary user to root; especially sensitive in container environments
Dirty Frag / CVE-2026-43284, CVE-2026-43500	Local privilege escalation	Page cache write primitive in XFRM/ESP, RxRPC, and related paths	Chainable exploitation; affects host and container boundaries
Fragnesia / CVE-2026-46300	Local privilege escalation	Logic issue in the XFRM ESP-in-TCP subsystem	Related attack surface to Dirty Frag
ssh-keysign-pwn / CVE-2026-46333	Local sensitive information disclosure and privilege escalation risk	Linux kernel `__ptrace_may_access()` logic flaw	Risk to SSH host keys, `/etc/shadow`, and other sensitive files

They are not the same vulnerability, but several patterns repeat:

They are not traditional remote RCE issues, but local privilege escalation or local sensitive information disclosure.
They require attackers to first obtain some kind of local execution, such as a normal shell, command execution inside a container, CI task permissions, or a low-privilege account.
Most of the risk sits at kernel boundaries: page cache, crypto/network subsystems, ptrace permission checks, and container-shared kernels.
Modern cloud-native environments amplify the blast radius, because containers are not a strong security boundary and the host kernel remains the shared base.

So the question is not only “is there a patch?” The deeper question is: why did these low-level, hidden, long-dormant issues appear in such a short period?

First Reason: Many Bugs Are Historical Debt, Not Newly Written Code

When people see a vulnerability disclosure date, they often assume the bug was introduced recently. That is often not true.

The key point of issues like Copy Fail is that a vulnerability can remain dormant for years until someone connects the right call path, permission boundary, and memory semantics. Public information links Copy Fail to kernel optimization work from around 2017. Dirty Frag and Fragnesia likewise sit on deep paths that cross networking, crypto, and page cache.

The scary part is not that one line of code obviously looks dangerous. It is that several assumptions happen to overlap:

A subsystem performs in-place processing for performance.
An interface allows unprivileged users to reach kernel functionality.
A path connects read-only file pages, page cache, network fragments, and crypto buffers.
An implicit invariant was never written into types, assertions, or documentation.
Together, they form a path through which an ordinary user can affect kernel state that should be out of reach.

This is not the kind of issue ordinary code review is best at finding. One reviewer may understand the crypto subsystem, another the network subsystem, and a third memory management. The bug lives at their intersection.

Second Reason: Linux Kernel Complexity Has Outgrown Manual Review

Linux’s strengths are openness, generality, broad hardware support, and a powerful ecosystem. Those strengths also carry costs.

The modern Linux kernel is not a small kernel. It includes scheduling, memory management, filesystems, network protocols, crypto frameworks, drivers, virtualization, container-related mechanisms, eBPF, Linux Security Modules (LSM), and hardware platform support. Each subsystem has its own history, maintainers, performance goals, and compatibility baggage.

The problem is that vulnerabilities often do not live inside one module. They live where modules intersect:

splice() connects file pages and pipes.
AF_ALG connects user space to the kernel crypto API.
XFRM/ESP connects network packets, crypto, and memory pages.
RxRPC and ESP-in-TCP make the network stack more complex.
Containers make low-privilege local execution a much more common real-world precondition.

From an engineering perspective, the Linux kernel is no longer small enough for “many eyes” to reliably inspect every corner. Open source makes problems easier to fix and review, but it does not mean every corner receives continuous security review. Very few people can understand cross-subsystem vulnerabilities, and those are exactly the bugs that can have large impact.

Third Reason: Performance Optimizations Often Thin the Security Boundary

One theme appears repeatedly in this wave: reducing copies, reusing buffers, and processing data in place for performance.

These optimizations are reasonable. The kernel is infrastructure. A small performance loss can affect cloud providers, databases, networking, storage, and container platforms. One fewer copy, faster encryption/decryption, or one fewer memory allocation can matter in production.

But the security cost is also clear. When the boundaries between read-only data, shared pages, user-controlled input, kernel buffers, and crypto output become thin, one subsystem’s misunderstanding of input/output contracts can cause unauthorized writes or reads.

In other words, performance optimization itself is not wrong, but it creates more fragile combinations:

In-place encryption/decryption reduces copying, but relies more on correct isolation of input and output buffers.
Page cache improves file access performance, but can also become attack surface.
Zero-copy improves throughput, but lets different subsystems share the same memory objects.
Containers improve deployment efficiency, but shared kernels make the blast radius of local privilege escalation larger.

Security boundaries cannot rely on everyone remembering not to make a mistake. They have to be enforced through types, permission checks, immutability constraints, tests, fuzzing, and continuous auditing. Otherwise, the more performance optimizations and implicit assumptions accumulate, the more bugs of this kind there are waiting to be found.

Fourth Reason: Containers Raise the Value of Local Vulnerabilities

In the past, “local privilege escalation” often sounded less urgent than remote vulnerabilities because the attacker already needed local access. Cloud-native environments changed that judgment.

Today, local execution can come from many places:

A web application vulnerability gives the attacker a normal shell.
A CI/CD task executes untrusted code.
A container runs user-uploaded workloads.
A multi-tenant platform lets users run notebooks, plugins, scripts, or build jobs.
AI code execution environments, sandboxes, and online judges are becoming more common.

Once an attacker has execution inside a container, kernel LPE is no longer just a “local machine” issue. Containers share the host kernel, so a kernel vulnerability can cross the container boundary and affect the host and other tenants.

This is why Copy Fail and Dirty Frag attract so much attention from cloud, security, and container teams. They can escalate “low-privilege local code execution” into “host-level risk.”

AI’s Impact: The Cost of Finding Vulnerabilities Has Dropped

The most distinctive part of this wave is AI-assisted vulnerability discovery.

Public information about Copy Fail mentions Theori’s Xint Code participating in the discovery process. Whatever the exact tool capabilities, this represents a trend: AI does not necessarily invent vulnerabilities out of thin air, but it is very good at helping researchers shorten the search path.

AI affects vulnerability research in several ways:

Faster reading of unfamiliar code
Kernel subsystem codebases are large. Researchers cannot manually read every path. AI can help summarize functions, call chains, input/output relationships, and suspicious patterns.
Easier discovery of cross-module connections
Many vulnerabilities hide in chains such as “user-space entry -> network stack -> crypto framework -> memory pages -> file cache.” AI can help map these cross-file, cross-directory, cross-subsystem paths.
Easier generation of audit hypotheses
Questions like “which paths write user-controlled data into page cache,” “which APIs let unprivileged users reach crypto subsystems,” and “which functions assume input and output buffers never overlap” can now be enumerated more systematically.
Easier conversion into reproducible examples
AI cannot replace the judgment of kernel researchers, but it can help write validation code, organize PoC ideas, explain faulty paths, and generate tests.

The result is a lower unit cost for vulnerability discovery.

In the past, a high-quality kernel vulnerability might take elite researchers a long time to find. Now, someone who understands systems and has AI tools can triage suspicious paths faster. The ceiling on vulnerability supply has risen, making clustered disclosures more likely.

But AI Is Not the Only Reason

There is another extreme to avoid: blaming everything on AI.

AI is an accelerator, not the root cause. The real roots are still:

Long accumulation of historical code.
Implicit contracts in performance optimizations that were never enforced.
Excessive cross-subsystem complexity.
Too much kernel functionality exposed by default.
Security tests that do not cover all combined paths.
Containers, multi-tenancy, and automated execution environments raising the value of local vulnerabilities.

Without those conditions, even powerful AI would not find so many high-impact bugs. Conversely, as long as these conditions exist, more mature AI will make vulnerabilities easier to find systematically.

What This Means for Defenders

For operations, security, and platform teams, this wave has several direct lessons.

First, stop treating local privilege escalation as low priority.
If your environment has containers, CI, online execution, plugins, notebooks, or multi-tenant workloads, local privilege escalation can become host risk.

Second, kernel patch cadence must get faster.
Critical hosts, Kubernetes nodes, CI runners, AI sandboxes, and virtualization hosts should not stay on old kernels for long periods. Kernel updates, reboot windows, live patching, and rollback plans need explicit processes.

Third, reduce unnecessary kernel attack surface.
Protocols, modules, user namespaces, special sockets, and debugging interfaces that are not needed should be tightened according to business needs. A feature that ships enabled by default does not have to stay exposed to every workload.

Fourth, container security should assume the kernel may be broken.
Running containers as non-root, minimizing capabilities, using seccomp, AppArmor/SELinux, read-only filesystems, and isolating sensitive mounts remain important. They may not stop every kernel bug, but they reduce preconditions and follow-on damage.

Fifth, monitoring should focus on privilege escalation chains.
Do not only watch remote entry points. Also watch abnormal processes, sensitive file reads, kernel module loading, container escape signals, CI runner anomalies, and access to high-value files such as /etc/shadow and SSH host keys.
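
As a minimal sketch of the fourth lesson above, the function below checks a container's security settings for the hardening measures listed there. The field names follow the Kubernetes `securityContext` schema (`privileged`, `runAsNonRoot`, `readOnlyRootFilesystem`, `allowPrivilegeEscalation`, `capabilities.drop`, `seccompProfile`). The function name `audit_security_context`, the wording of the findings, and the choice to treat a missing seccomp profile as unconfined are assumptions made for illustration.

```python
from typing import Any


def audit_security_context(ctx: dict[str, Any]) -> list[str]:
    """Return hardening gaps found in a Kubernetes-style securityContext dict."""
    findings: list[str] = []

    if ctx.get("privileged", False):
        findings.append("container is privileged")
    if not ctx.get("runAsNonRoot", False):
        findings.append("runAsNonRoot is not enforced")
    if not ctx.get("readOnlyRootFilesystem", False):
        findings.append("root filesystem is writable")
    # Assumption: an unset allowPrivilegeEscalation is treated as allowed.
    if ctx.get("allowPrivilegeEscalation", True):
        findings.append("privilege escalation is allowed")

    # Dropping ALL capabilities is the strictest baseline; add back only what is needed.
    dropped = {c.upper() for c in ctx.get("capabilities", {}).get("drop", [])}
    if "ALL" not in dropped:
        findings.append("capabilities are not dropped with drop: [ALL]")

    # Assumption: a missing seccompProfile is treated as Unconfined.
    seccomp = ctx.get("seccompProfile", {}).get("type", "Unconfined")
    if seccomp == "Unconfined":
        findings.append("no seccomp profile is applied")

    return findings
```

A check like this does not fix a kernel bug. It only confirms that the preconditions an exploit can rely on have been narrowed before a patch lands.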

What This Means for Open Source Communities

For Linux and large open source projects, AI-assisted vulnerability discovery creates a two-sided pressure.

On one hand, AI can help defenders find old problems faster. More latent vulnerabilities being publicly fixed is good in the long run.

On the other hand, AI also creates noise. Low-quality automated reports, false positives, duplicates, and “AI found a bug” claims without context consume maintainer time. The real challenge is not whether to use AI, but how to bring AI output into responsible security processes:

Reports need minimal reproduction.
Reports must define impact scope and threat model.
Reports must distinguish theoretical issues, triggerable bugs, and exploitable vulnerabilities.
Embargoes, distribution coordination, and fix windows must be respected.
Maintainers need better automated tests, fuzzing, static analysis, and regression validation.
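
The reporting requirements above could be enforced at intake with a small completeness check. This is only a sketch: the `Report` fields, the `Severity` names, and the `missing_fields` helper are hypothetical and do not describe any project's actual triage process.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Severity(Enum):
    THEORETICAL = "theoretical issue"
    TRIGGERABLE = "triggerable bug"
    EXPLOITABLE = "exploitable vulnerability"


@dataclass
class Report:
    title: str
    reproduction: str = ""        # minimal steps or code to reproduce
    impact_scope: str = ""        # affected versions, configs, components
    threat_model: str = ""        # who the attacker is and what access they need
    severity: Optional[Severity] = None


def missing_fields(report: Report) -> list[str]:
    """List the required parts a report still lacks before it reaches a maintainer."""
    missing = []
    if not report.reproduction.strip():
        missing.append("reproduction")
    if not report.impact_scope.strip():
        missing.append("impact_scope")
    if not report.threat_model.strip():
        missing.append("threat_model")
    if report.severity is None:
        missing.append("severity")
    return missing
```

Rejecting incomplete reports automatically keeps low-context “AI found a bug” submissions from consuming maintainer time.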

AI makes vulnerability discovery faster, and that requires more mature repair and coordination mechanisms. Otherwise, higher security research productivity becomes maintainer pressure and user panic.

Conclusion: Not a Collapsed Myth, but a Changed Security Reality

Linux remains one of the most transparent, controllable, fixable, and hardenable foundations in mainstream operating system infrastructure. The issue is not that Linux suddenly became insecure. It is that modern kernel complexity, cloud-native usage patterns, and AI-assisted vulnerability discovery are changing the speed of vulnerability exposure.

Similar events are likely to appear again. Not because each one is a new disaster, but because complex paths accumulated in the past are now being searched more efficiently.

The security assumptions need to change:

Do not treat “no disclosed vulnerability” as “no vulnerability.”
Do not treat local privilege escalation as low risk.
Do not treat container isolation as a strong security boundary.
Do not rely only on manual review for systems with tens of millions of lines of code.
Do not wait only for patches; shrink attack surface, speed up patching, and build defense in depth.

AI has pushed vulnerability discovery into a higher-output era. That is true for attackers and defenders. The difference is whether defenders can turn that capability into faster auditing, earlier fixes, and more stable infrastructure governance.

APT45 Uses AI to Validate Vulnerabilities at Scale: The Barrier to Zero-Day Attacks Is Falling

Sun, 17 May 2026 19:52:39 +0800

Google Threat Intelligence Group published a new AI Threat Tracker on May 11, 2026. The important point is not simply that attackers are using AI. The more important shift is how they are using it: moving from writing, translation, and reconnaissance into vulnerability research, PoC validation, malware obfuscation, and automated attack orchestration.

Two points are easy to conflate, so it helps to separate them first.

First, Google said it identified what it believes is the first zero-day exploit developed with AI assistance. That case involved an unnamed cybercrime group. The target was a popular open-source web-based system administration tool, and the vulnerability could bypass 2FA when the attacker already had valid credentials. Google said it worked with the affected vendor on responsible disclosure and may have prevented a mass exploitation event.

Second, APT45 was not attributed as the actor behind that zero-day case. GTIG separately noted that APT45, a North Korea-linked threat group, was observed sending large volumes of repetitive prompts to AI models to recursively analyze different CVEs and validate PoC exploits. In other words, APT45 is using AI as a vulnerability research and exploit arsenal management tool, not merely as a phishing-email assistant.

What the AI zero-day case shows

This zero-day was not a typical memory corruption bug, input filtering error, or simple misconfiguration. GTIG described it as a high-level semantic logic flaw: the developer hardcoded a trust assumption inside an authentication flow, creating a contradiction between 2FA enforcement logic and its exceptions.
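
To make “semantic logic flaw” concrete, here is a purely hypothetical sketch. It is not the code from the GTIG case: the `LoginRequest` fields, the `trusted_client` flag, and both functions are invented to show how a hardcoded trust assumption can contradict a 2FA guarantee.

```python
from dataclasses import dataclass


@dataclass
class LoginRequest:
    password_ok: bool     # primary credential already verified
    totp_ok: bool         # second factor verified
    trusted_client: bool  # client-supplied hint, for example a header or cookie


def login_flawed(req: LoginRequest, require_2fa: bool = True) -> bool:
    """Hypothetical flaw: an exception quietly overrides the 2FA guarantee."""
    if not req.password_ok:
        return False
    if req.trusted_client:
        # Hardcoded trust assumption: "trusted" clients skip the second factor,
        # but the flag comes from the client and is attacker-controlled.
        return True
    return req.totp_ok or not require_2fa


def login_fixed(req: LoginRequest, require_2fa: bool = True) -> bool:
    """The 2FA policy is enforced on every path, with no client-side exception."""
    if not req.password_ok:
        return False
    return req.totp_ok or not require_2fa


# An attacker with valid credentials but no second factor sets the trusted flag.
attacker = LoginRequest(password_ok=True, totp_ok=False, trusted_client=True)
```

Neither function crashes or touches a dangerous sink. The flaw only shows up when the intended guarantee (“2FA is always required”) is compared against the exception path.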

These bugs are hard for traditional scanners. Static analysis and fuzzing are better at finding crashes, dangerous sinks, input-output paths, and known patterns. They are not always good at understanding what the developer intended to guarantee and where an exception quietly breaks that guarantee.

That is where large language models become risky. They may not be stronger than expert security researchers, but they are good at reading context, explaining intent, comparing similar code paths, and pointing out inconsistent business logic. Once attackers connect that ability to automation, logic flaws that used to require long manual review may become easier to screen at scale.

GTIG also noted several AI-generation traces in the exploit code, including educational docstrings, a hallucinated CVSS score, and a textbook Python style. Google also said it does not believe Gemini was used, while expressing high confidence that the actor used some AI model to support discovery and weaponization.

Why APT45 deserves long-term attention

APT45 has long been tracked as a North Korea-linked threat group with activity spanning espionage, financial gain, and strategic intelligence. What GTIG emphasized this time was its AI workflow: large, repetitive, recursive CVE analysis, PoC validation, and the accumulation of more reliable exploit capabilities.

That is different from asking AI to write a short script.

If an organization can connect AI to vulnerability triage, PoC validation, payload adjustment, and test environments, its human bottleneck changes. In the past, the number of vulnerabilities a team could study at the same time depended on researcher count, experience, and time. Now AI can absorb part of the repetitive reading, summarization, variant testing, and first-pass judgment, leaving humans to focus on target selection, exploitability verification, and delivery.

For defenders, this means the window for known vulnerabilities gets shorter.

After a CVE is disclosed, attackers no longer need to manually read the advisory, inspect patch diffs, build test environments, and fix PoCs from scratch. AI can help them understand impact, generate test ideas, troubleshoot failures, and summarize version differences. Even if human correction is still required, the overall throughput improves.

This does not mean AI can hack everything by itself

This should not be read as proof that AI can independently complete full intrusions.

GTIG’s report is more precise: multiple parts of the attack chain are being accelerated by AI. Vulnerability research, malware obfuscation, reconnaissance, social engineering, information operations, mobile UI automation, and supply-chain abuse all show signs of AI involvement.

But AI still fails. It can hallucinate vulnerabilities, misjudge exploitability, generate broken code, or get lost in complex enterprise authorization logic. The real danger is not that AI is perfect. The danger is that attackers can now try cheaply. When large-scale trial and error becomes cheap enough, bad outputs can be filtered away and usable outputs can move into operations.

That is why cases like APT45 matter. State or state-adjacent groups have targets and patience. If AI reduces repetitive labor, they can spend more resources on high-value targets.

Defenders should focus on shrinking the exposure window

Many organizations used to divide risk into two buckets: known vulnerabilities are handled by patch management, while zero-days are handled by defense in depth. As AI enters vulnerability research, that boundary becomes less clean.

The more practical questions are:

After a new CVE is disclosed, how long does it take external attackers to produce a usable exploit?
Can your asset inventory tell you the same day which systems are affected?
Can WAF, EDR, logs, and identity systems detect abnormal attempts?
Do high-risk systems use MFA, least privilege, and network isolation by default?
Are open-source components, AI agent plugins, and third-party connectors included in supply-chain review?
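
The asset inventory question above can be sketched as a same-day query. The `Asset` record, the `affected_assets` helper, and the product name in the usage example are hypothetical. A real inventory would come from a CMDB or scanner export, and version comparison is simplified here to integer tuples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    host: str
    product: str
    version: tuple[int, ...]  # simplified: (major, minor, patch)
    internet_facing: bool


def affected_assets(
    inventory: list[Asset], product: str, fixed_version: tuple[int, ...]
) -> list[Asset]:
    """Assets running `product` below `fixed_version`, internet-facing ones first."""
    hits = [
        a for a in inventory
        if a.product == product and a.version < fixed_version
    ]
    # False sorts before True, so exposed hosts come first, then by hostname.
    return sorted(hits, key=lambda a: (not a.internet_facing, a.host))


inventory = [
    Asset("build-01", "admin-panel", (2, 1, 0), internet_facing=False),
    Asset("edge-01", "admin-panel", (2, 0, 5), internet_facing=True),
    Asset("edge-02", "admin-panel", (2, 2, 0), internet_facing=True),
    Asset("db-01", "other-tool", (1, 0, 0), internet_facing=False),
]
todo = affected_assets(inventory, "admin-panel", fixed_version=(2, 2, 0))
```

Sorting exposed hosts first reflects the point that the exposure window, not the total count, is what an attacker exploits.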

AI zero-days do not make basic security obsolete. They punish environments where basic security has been neglected for too long.

If patch cycles are slow, asset inventories are unclear, internet exposure has no owner, logs are hard to search, and account privileges are excessive, AI only changes attacker efficiency. The underlying problem was already there.

The AI supply chain is also an attack surface

GTIG also highlighted attacker interest in the AI software ecosystem itself, including agent skills, third-party data connectors, open-source wrapper libraries, and automation frameworks. The risk does not necessarily come from the model being compromised. It can come from poisoned tools around the model.

This matters for anyone using AI coding tools, AI agents, and automation plugins.

A malicious skill, backdoored dependency, or over-permissioned connector can turn an AI system from a helper into an attacker-controlled execution path. When an agent can access files, browsers, terminals, cloud accounts, or enterprise data, supply-chain review has to extend beyond traditional applications.

At minimum:

Do not install agent skills and plugins from unclear sources.
Isolate tools that can execute commands, read files, or access secrets.
Do not run unreviewed AI-generated scripts directly in production.
Scan dependencies, GitHub Actions, PyPI / npm packages, and AI project components.
Apply least privilege and leakage monitoring to model API keys, cloud secrets, and GitHub tokens.
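
One way to act on the first item above is to pin each approved skill or plugin to a known SHA-256 digest and refuse anything else. This is a minimal sketch using only Python's standard `hashlib` and `hmac` modules. The `is_approved` helper and the shape of the `pinned` mapping are assumptions, not the interface of any agent framework.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the plugin content."""
    return hashlib.sha256(data).hexdigest()


def is_approved(name: str, content: bytes, pinned: dict[str, str]) -> bool:
    """Allow a plugin only if its name is pinned and its digest matches the pin."""
    expected = pinned.get(name)
    if expected is None:
        return False  # unknown source: reject by default
    # Constant-time comparison of the two hex digests.
    return hmac.compare_digest(sha256_hex(content), expected)
```

A pin only proves that the content is the same content that was reviewed. It says nothing about whether the reviewed version was safe, so it complements review and isolation rather than replacing them.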

Practical advice for security teams

First, move vulnerability response earlier. High-risk CVEs should not wait for a monthly patch window, especially for VPNs, gateways, system administration panels, identity systems, CI/CD, and remote management tools.

Second, build a queryable asset inventory. If AI helps attackers locate targets faster, defenders must be able to answer quickly: do we run this system, which version, and where is it exposed?

Third, use behavior detection to supplement signature detection. AI-generated exploits and malware may change surface features, but authentication bypass, abnormal logins, bulk probing, failed request patterns, and privilege escalation still leave behavioral traces.

Fourth, bring AI tools into security governance. Internal coding agents, browser agents, document agents, automation scripts, and plugin marketplaces need approval, review, logging, and rollback paths.

Fifth, do not reduce AI defense to buying a security model. The useful work is putting AI into vulnerability prioritization, log analysis, patch impact assessment, code review, and configuration baseline checks so defensive speed can rise too.
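
The third point above, behavior detection, can be sketched against a simple authentication log. The event names (`login_success`, `mfa_success`, `login_failure`), the record fields, and the threshold are hypothetical. Real logs will need mapping from the identity provider's own schema.

```python
from collections import Counter


def sessions_without_mfa(events: list[dict]) -> list[str]:
    """Session IDs that reached login_success with no earlier mfa_success."""
    mfa_seen: set[str] = set()
    flagged: list[str] = []
    for e in events:  # events are assumed to be in chronological order
        if e["event"] == "mfa_success":
            mfa_seen.add(e["session"])
        elif e["event"] == "login_success" and e["session"] not in mfa_seen:
            flagged.append(e["session"])
    return flagged


def noisy_sources(events: list[dict], threshold: int = 20) -> list[str]:
    """Source IPs with at least `threshold` failed logins (bulk probing)."""
    failures = Counter(
        e["source_ip"] for e in events if e["event"] == "login_failure"
    )
    return sorted(ip for ip, n in failures.items() if n >= threshold)
```

Both checks look at what happened rather than what the exploit looked like, so they still apply when AI-generated tooling changes its surface features.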

Summary

GTIG’s report sends a clear signal: AI is accelerating the pace of offense and defense.

The AI-assisted zero-day case shows that logic bugs and authentication bypasses may become easier for models to surface. APT45 shows that mature threat groups are already using AI to analyze CVEs and validate PoCs at scale. PROMPTSPY, AI-generated obfuscation, and agent supply-chain abuse show that AI is becoming part of the attack toolchain.

This is not doomsday, but it is not ordinary news either.

For organizations, the practical response is not panic. It is faster, clearer, and more verifiable work on patching, assets, logging, identity, supply chain, and AI tool permissions. AI speeds up attacker trial and error. Defenders must improve discovery, judgment, and remediation speed as well.