AI Chips on KnightLi Blog

Behind Cerebras' IPO Surge: Can Wafer-Scale AI Chips Challenge Nvidia?

Mon, 18 May 2026 00:19:51 +0800

Cerebras Systems has finally entered the public market.

The company, known for its “wafer-scale AI chips”, began trading on Nasdaq on May 14, 2026 under the ticker CBRS. According to Cerebras’ official announcement, the IPO price was $185 per share, with 34.5 million shares of Class A common stock offered, including the underwriters’ full exercise of a 4.5 million share over-allotment option.

On its first trading day, Cerebras opened sharply higher and briefly approached $386. Based on the IPO price, the offering raised more than $5.5 billion even before the over-allotment shares are counted, making it one of the most closely watched AI hardware IPOs in the U.S. market in 2026.
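The offering figures above can be checked with simple arithmetic. The sketch below uses only the numbers quoted in this article (IPO price, share counts, and the approximate intraday high). The variable names are illustrative, and the results are gross amounts before underwriting fees, assuming every offered share was sold at the IPO price.

```python
# Figures quoted in the article; names and structure are illustrative only.
IPO_PRICE_USD = 185
TOTAL_SHARES = 34_500_000          # includes the over-allotment option
OVER_ALLOTMENT_SHARES = 4_500_000
INTRADAY_HIGH_USD = 386            # approximate first-day peak


def gross_proceeds(shares: int, price_usd: int) -> int:
    """Gross proceeds in USD, before fees (assumes all shares sold at this price)."""
    return shares * price_usd


base_shares = TOTAL_SHARES - OVER_ALLOTMENT_SHARES
base_proceeds = gross_proceeds(base_shares, IPO_PRICE_USD)
total_proceeds = gross_proceeds(TOTAL_SHARES, IPO_PRICE_USD)
peak_gain_pct = (INTRADAY_HIGH_USD / IPO_PRICE_USD - 1) * 100

print(f"Base offering:        ${base_proceeds / 1e9:.2f}B")
print(f"With over-allotment:  ${total_proceeds / 1e9:.2f}B")
print(f"Peak vs. IPO price:   +{peak_gain_pct:.1f}%")
```

Under these assumptions, the 30 million base shares account for about $5.55 billion, the full 34.5 million shares for about $6.38 billion, and the intraday high sits roughly 109% above the IPO price.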

That is why many media outlets call it an “Nvidia challenger”. But it is not accurate to simply describe Cerebras as “the next Nvidia”. What makes it unusual is that it has chosen a technical path very different from traditional GPUs.

Cerebras Is Not Building a Normal GPU

Cerebras’ core product is WSE, short for Wafer-Scale Engine.

Traditional chip manufacturing cuts a whole wafer into many small chips, then packages, tests, and ships them. Cerebras takes the opposite approach: it tries to turn an entire wafer directly into one giant chip.

The advantages of this route are straightforward:

Larger chip area.
More on-chip compute units.
On-chip SRAM closer to compute cores.
Shorter data movement inside the chip.
Better fit for certain AI inference and training workloads.

In AI computing, moving data is often harder to optimize than raw computation. Cerebras’ idea is to keep compute and storage on the same piece of silicon as much as possible, reducing the latency and energy cost caused by data repeatedly leaving the chip.

That is the most attractive part of the WSE approach. Instead of scaling along the same GPU path, it uses a much larger single chip to pursue higher on-chip bandwidth and lower data movement cost.

Why the Market Got Excited

The AI chip market is currently highly dependent on Nvidia. Whether companies are training large models, deploying inference services, or building AI data centers, Nvidia GPUs remain the mainstream choice.

That makes the market naturally interested in two kinds of companies:

Companies that can reduce dependence on Nvidia’s supply chain.
Companies that can offer higher performance or lower cost for certain AI workloads.

Cerebras fits both narratives.

It is not building a general-purpose CPU or an ordinary accelerator card. It designs systems directly around AI training and inference. The company has also repeatedly emphasized that its wafer-scale chips and cloud inference platform can deliver very high throughput in certain model inference scenarios.

This kind of story is easy for the market to amplify in 2026. AI infrastructure is still expanding, and enterprises, cloud providers, and model companies are all looking for more compute sources. If a chip company can prove that it is not just “another small GPU” in some scenarios, the market will pay attention.

The OpenAI Partnership Expands the Upside Story

Another reason Cerebras is closely watched is its relationship with OpenAI.

According to media reports, Cerebras signed a cooperation agreement with OpenAI worth more than $20 billion. A Sohu article noted that, as of the end of 2025, the remaining performance obligations from that agreement stood at $24.6 billion.

For a newly listed AI hardware company, such long-term agreements are important. They suggest that the company has not only a technical story, but also demand from major customers.

Still, long-term orders are not the same as realized revenue. AI data center deployment depends on manufacturing capacity, packaging, power supply, delivery schedules, customer budgets, and changes in model strategy. For chip companies, winning orders is only the first step. Delivering on time, scaling reliably, and building margins are harder.

Customer Concentration Remains a Major Risk

Cerebras also has an obvious risk: high customer concentration.

The Sohu article noted that G42 contributed 85% of Cerebras’ revenue in 2024, falling to 24% in 2025, while Mohamed bin Zayed University of Artificial Intelligence contributed 62% of revenue in 2025. This means that even after G42’s share declined, Cerebras’ revenue still depended heavily on a small number of large customers.
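To make the concentration concrete, the quoted percentages can be summed per year. This is a minimal sketch that assumes G42 and Mohamed bin Zayed University of Artificial Intelligence (abbreviated here as MBZUAI) are counted as separate customers with non-overlapping revenue; the dictionary layout and function name are illustrative.

```python
# Revenue shares in percent, as quoted from the Sohu article.
# Other customers are not itemized in the source.
REVENUE_SHARE_PCT = {
    2024: {"G42": 85},
    2025: {"G42": 24, "MBZUAI": 62},
}


def named_customer_share(year: int) -> int:
    """Combined revenue share (percent) of the customers named for that year."""
    return sum(REVENUE_SHARE_PCT[year].values())


for year in sorted(REVENUE_SHARE_PCT):
    named = named_customer_share(year)
    print(f"{year}: named customers {named}%, all others at most {100 - named}%")
```

On these numbers, the named customers account for 85% of revenue in 2024 and 86% in 2025, so the mix changed while the overall concentration stayed about the same.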

For AI infrastructure companies, customer concentration has two sides.

The benefit is that large customers can bring rapid growth, long-term contracts, and order visibility.

The risk is that if customers cut budgets, change technical direction, delay data center construction, or face regulatory changes, revenue volatility can be significant.

That is why Cerebras should not be judged only by its IPO pop. The first-day stock price reflects enthusiasm and expectations. Long-term valuation will still depend on revenue structure, delivery capability, margins, and customer diversification.

The Technical Limitation: Memory Capacity

WSE has clear strengths, but its limitations are also clear.

The Sohu article noted that the WSE-3 chip has 44GB of on-chip SRAM, while Nvidia’s B200 has 192GB of HBM memory. Cerebras places a large amount of compute and SRAM on the same wafer, which reduces data movement, but also limits available memory capacity.

For large models, memory capacity directly affects context length, batch size, and deployment architecture. Context windows are getting longer, and flagship models are increasingly moving toward million-token context windows. In that trend, on-chip SRAM capacity becomes a real constraint.
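A rough way to see how capacity limits context length is to estimate the key-value (KV) cache a transformer keeps per token. The sketch below is a back-of-the-envelope estimate, not a description of how Cerebras or Nvidia systems are actually deployed. The model shape (80 layers, 8 KV heads, head dimension 128, 16-bit values) is a hypothetical example that does not come from the article, and the calculation ignores model weights and activations entirely.

```python
from dataclasses import dataclass

GB = 10**9  # decimal gigabytes, as used on spec sheets


@dataclass(frozen=True)
class ModelShape:
    """Hypothetical decoder-only transformer; NOT a model named in the article."""
    layers: int = 80
    kv_heads: int = 8
    head_dim: int = 128
    bytes_per_value: int = 2  # 16-bit (FP16/BF16) cache entries


def kv_bytes_per_token(m: ModelShape) -> int:
    # Each layer stores one key and one value vector per KV head per token.
    return 2 * m.layers * m.kv_heads * m.head_dim * m.bytes_per_value


def max_cached_tokens(memory_gb: int, m: ModelShape) -> int:
    # Upper bound: pretends all memory is KV cache (no weights, no activations).
    return (memory_gb * GB) // kv_bytes_per_token(m)


shape = ModelShape()
for name, mem_gb in [("WSE-3 SRAM (44GB)", 44), ("B200 memory (192GB)", 192)]:
    print(f"{name}: at most {max_cached_tokens(mem_gb, shape):,} cached tokens")

million_token_gb = kv_bytes_per_token(shape) * 1_000_000 / GB
print(f"KV cache for one 1M-token sequence: {million_token_gb:.2f} GB")
```

With this assumed shape, a single million-token sequence needs about 328GB of KV cache, more than either device holds on its own. In practice such workloads are spread across multiple devices or use techniques like cache compression, but the smaller the per-device memory, the sooner that point is reached.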

Traditional GPUs can continue expanding memory through HBM stacking, packaging expansion, and multi-GPU interconnects. Cerebras’ wafer-scale approach is harder to scale in the same way because the wafer area is already occupied by compute units and SRAM. Adding more SRAM may mean sacrificing compute area.

This does not mean the Cerebras architecture has failed. It means it is an architectural choice optimized for specific workloads. It may be very strong in certain inference scenarios, but it does not necessarily cover every AI training and inference need.

Can It Replace Nvidia?

In the short term, Cerebras is unlikely to replace Nvidia.

Nvidia’s advantage is not only GPU performance. It also includes the CUDA ecosystem, developer tools, system integration, networking, full-stack server solutions, cloud provider support, and customer migration costs. AI companies often choose Nvidia not because one chip wins on one metric, but because the entire ecosystem is the most stable.

Cerebras’ more realistic opportunity is to become a complementary option for specific AI workloads:

High-throughput inference.
Specific large-model services.
Tasks sensitive to latency and on-chip bandwidth.
Customers that want to reduce dependence on a single GPU supply chain.
Model companies willing to test new architectures for performance.

In other words, it is not an “Nvidia killer”. It is more like an aggressive alternative path in the AI compute market.

Summary

Cerebras’ IPO surge shows that capital markets are still willing to pay a high premium for AI infrastructure stories.

Its wafer-scale chip architecture is genuinely distinctive, separating it from ordinary AI accelerator companies. Together with major customer relationships such as the one with OpenAI, this gives Cerebras a strong market narrative.

But the risks are just as real: customer concentration, delivery pressure, memory capacity limits, ecosystem barriers, and the system-level gap with Nvidia will all determine how far it can go.

For ordinary readers, the most interesting part of Cerebras is not how much the stock rose. It is that the company shows competition in AI compute will not be confined to a single GPU path. Future large-model infrastructure may include GPUs, wafer-scale chips, in-house accelerators, and cloud-based specialized inference platforms at the same time.

The U.S. Clears Nvidia H200 Sales: 10 Chinese Companies Approved, but Delivery Is Still Uncertain

Sat, 16 May 2026 17:12:09 +0800

The U.S. export license process for Nvidia H200 sales to China has finally made concrete progress.

According to reports attributed to Reuters, the U.S. Commerce Department has approved about 10 Chinese companies to buy Nvidia H200 AI chips. The approved list includes major internet companies and supply-chain firms, such as Alibaba, Tencent, ByteDance, JD.com, Lenovo, and Foxconn. However, as of May 14, 2026, H200 chips had still not been delivered to the Chinese market.

This needs to be read carefully: the U.S. side has granted some licenses, but that does not mean the chips have arrived, nor does it mean Chinese companies can immediately deploy them at scale.

What Was Approved

There are three key points in this approval.

First, the U.S. Commerce Department approved about 10 Chinese companies to purchase H200 chips. According to reports, approved customers may buy directly from Nvidia or through authorized intermediaries and distributors.

Second, each approved customer may buy up to about 75,000 H200 chips. If fully delivered, this volume would significantly improve high-end GPU supply for major cloud providers and large-model companies.

Third, Lenovo has confirmed that it is one of the companies that received Nvidia export licenses and is allowed to sell H200 in China. Companies like Lenovo and Foxconn are not only buyers; they may also handle server systems, rack integration, and distribution.

The most important caveat is that a license is not the same as delivery. Public reports emphasize that no H200 shipments to China have been completed yet.

Why H200 Matters

H200 belongs to Nvidia’s Hopper-generation accelerator lineup and is positioned above the H20, which was previously designed for the Chinese market. H20 was a reduced-spec product built to fit earlier export restrictions, while H200 offers stronger compute and memory capabilities.

Public information shows that H200 comes with 141GB of HBM3e memory, making it valuable for large-model training, inference, long-context services, and enterprise AI deployments. It is not Nvidia’s latest Blackwell-generation product, but for Chinese cloud providers and AI companies, it is still a high-end compute resource.
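For a sense of scale, the reported license terms and the 141GB figure can be combined into a theoretical ceiling. The sketch below treats "about 10" companies and "about 75,000" chips as exact values, which is an assumption, and the result is an upper bound on what the licenses would permit, not a forecast of orders or deliveries.

```python
# Reported figures, treated as exact for illustration (the reports say "about").
APPROVED_COMPANIES = 10
MAX_CHIPS_PER_COMPANY = 75_000
HBM_PER_CHIP_GB = 141

max_chips = APPROVED_COMPANIES * MAX_CHIPS_PER_COMPANY
max_hbm_gb = max_chips * HBM_PER_CHIP_GB
max_hbm_pb = max_hbm_gb / 1_000_000  # decimal petabytes

print(f"License ceiling: {max_chips:,} H200 chips")
print(f"Aggregate HBM3e at that ceiling: {max_hbm_pb:.2f} PB")
```

At the ceiling this works out to 750,000 chips and about 105.75PB of aggregate HBM3e. Actual volumes depend on whether orders are placed and shipments are completed.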

That is why H200 has remained sensitive in U.S.-China AI chip controls. The U.S. wants to limit China’s access to the most advanced AI compute while avoiding a complete loss of Nvidia’s China business. China, meanwhile, wants to reduce reliance on U.S. GPUs and direct more compute investment toward domestic chips and local ecosystems.

It Has Not Really Landed Yet

The easiest mistake is to read “approved to buy” as “supply has reopened.”

Based on current public information, there are still several variables:

U.S. approval is only the first step; orders, review, shipment, and compliance workflows still need to continue.
Whether China will allow actual import and deployment still requires clearer policy guidance.
Whether approved companies place orders immediately depends on price, delivery time, domestic alternatives, and long-term policy risk.
Nvidia may need to re-coordinate H200 capacity because its focus had already shifted to Blackwell and later products.

In other words, H200 sales to China now look more like an opened license window than a supply chain that is already moving chips into Chinese data centers at scale.

What It Means for Nvidia

For Nvidia, the China market remains too important to ignore.

After export restrictions tightened, Nvidia’s share in China’s high-end AI accelerator market was clearly affected. Jensen Huang has repeatedly argued that the U.S. should not casually give up the Chinese market, because doing so would hurt Nvidia’s revenue and weaken the influence of the U.S. technology ecosystem among global AI developers.

If H200 can eventually be delivered, Nvidia can partially recover Chinese customer orders and keep CUDA in Chinese large-model and cloud-computing workflows.

But this business will not return to the old frictionless state. Licenses, quotas, revenue-sharing arrangements, third-party verification, re-export restrictions, and customer identity review may all become long-term costs. For Nvidia, H200 is not just a product sale; it is a way to maintain market presence in a narrow policy corridor.

What It Means for Chinese Companies

For Chinese companies, H200 is short-term compute supply, not long-term certainty.

If approved companies can actually receive H200 chips, large-model training, inference services, AI cloud, agent platforms, and enterprise private deployments will all benefit. Teams already deeply tied to the CUDA toolchain face far lower migration costs with H200 than with a completely new hardware ecosystem.

But policy uncertainty will make companies cautious. Being able to buy H200 today does not mean stable procurement next year. Buying one batch does not mean a long-term expansion path exists. Even if major companies buy, they will likely continue pushing domestic GPUs, heterogeneous compute, inference optimization, and model compression to avoid being trapped again by a single supply chain.

So H200 is more of a buffer for Chinese AI companies than a final solution.

Pressure on Domestic Chips Will Not Disappear

U.S. approval of H200 does not reduce pressure on domestic AI chips. In some ways, it may make competition more direct.

If H200 really enters the Chinese market, domestic chip vendors will face a stronger benchmark in both performance and ecosystem. Customers will compare training stability, inference throughput, memory capacity, software toolchains, cluster communication, and operations cost.

Domestic chips still have room, however. As long as high-end GPU imports remain policy-sensitive, companies will not put their entire long-term compute base on Nvidia. Domestic solutions still have opportunities if they can provide controllable cost, stable supply, and usable software in specific scenarios.

A more realistic pattern may be: high-end training and critical inference continue to seek Nvidia resources such as H200, while large-scale inference, government and enterprise projects, and controllable supply-chain scenarios shift more toward domestic or mixed compute.

How to Read This

The most accurate reading is that U.S.-China AI chip friction has loosened temporarily, but has not returned to full openness.

The U.S. granted licenses to rebalance controls and commercial interests. Nvidia wants to use H200 to return to China’s high-end AI chip market. Chinese companies want stronger compute, but they also need to evaluate import uncertainty and domestic substitution strategy.

The key questions are not only whether the U.S. “allows” the sale, but what happens next:

Whether the first H200 batch is actually delivered to Chinese customers.
Whether approved companies disclose purchase scale and deployment scenarios.
Whether China provides clearer guidance on import, procurement, and usage.

Until those questions are answered, H200 remains an opened window for the Chinese market, not a fully restored supply chain.