OpenAI builds its first inference chip, Jalapeño, in nine months with Broadcom

A faster chip-development sprint and a full-stack ambition mean decision-makers should watch who controls AI costs and compute.

By Hessa Al-Faleh, Business Desk, The Executives Brief

about 5 hours ago·4 min read

Executive summary

OpenAI and Broadcom announced Jalapeño, OpenAI's first inference chip, with OpenAI designing the architecture and Broadcom handling implementation and integration. OpenAI says the chip went from initial design to tape-out in just nine months and that early testing shows it will deliver performance per watt substantially better than the current state of the art.

OpenAI and Broadcom just claimed a nine-month path from initial chip design to manufacturing tape-out for Jalapeño, OpenAI’s first inference chip. The companies also said they are running engineering samples “at target frequency and power,” but they are not sharing technical details yet, promising a performance report in the coming months. In the same breath, OpenAI insists early testing shows Jalapeño will deliver performance per watt “substantially better” than current state-of-the-art.

That combination matters because inference chips, not just frontier model training, are increasingly where AI economics get real. Training is expensive but episodic; inference is constant and budget-breaking. If OpenAI’s performance-per-watt claim holds up when more details ship, it could become a lever for speed, reliability, and cost. That is the wedge here: Jalapeño is positioned as the first piece of a broader stack that OpenAI wants to control end-to-end.
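To make the performance-per-watt argument concrete, here is a minimal back-of-the-envelope sketch in Python. Every number in it (throughput, power draw, electricity price) is a hypothetical placeholder chosen for illustration, not a figure from OpenAI or Broadcom, and electricity is only one component of inference cost.

```python
def energy_cost_per_million_tokens(tokens_per_second, watts, usd_per_kwh):
    """Electricity cost in USD to generate one million tokens on one accelerator."""
    seconds = 1_000_000 / tokens_per_second
    kwh = watts * seconds / 3_600_000  # watt-seconds (joules) to kilowatt-hours
    return kwh * usd_per_kwh


def tokens_per_joule(tokens_per_second, watts):
    """Performance per watt, expressed as tokens generated per joule of energy."""
    return tokens_per_second / watts


# Hypothetical baseline accelerator: 1,000 tokens/s at 700 W, $0.08 per kWh.
baseline = energy_cost_per_million_tokens(1000.0, 700.0, 0.08)

# Hypothetical chip with 1.5x the throughput at the same power draw.
improved = energy_cost_per_million_tokens(1500.0, 700.0, 0.08)

print(f"baseline: ${baseline:.4f} per million tokens")
print(f"improved: ${improved:.4f} per million tokens")
print(f"energy cost ratio: {baseline / improved:.2f}x")
```

Under these assumptions, energy cost per token falls in direct proportion to the performance-per-watt gain, which is why even a modest efficiency edge compounds across a workload that runs around the clock.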

According to the press release, Jalapeño was “co-developed” from initial design to tape-out in nine months, and OpenAI attributes the pace to deep software-hardware co-development with OpenAI’s engineering teams, Broadcom’s silicon implementation expertise, and the use of OpenAI models to accelerate parts of the design and optimization process. In other words, the same ecosystem building the models is also being used to help shape the chips that will run them. The company also said Jalapeño is built using an AI-assisted ASIC architecture for inference.

The details, though, are deliberately thin. The announcement does not provide specifications beyond the engineering sample status, and it punts the rest to “a report on its performance” to be released in the coming months. That’s not unusual for new silicon, but it does put the burden of proof on OpenAI’s testing claims. When chip teams talk early, they often mean internally validated benchmarks under specific conditions. So decision-makers should treat “substantially better” as a promise made under conditions that have not been publicly stress-tested.

OpenAI also frames Jalapeño as more than a one-off chip. The company says it is “just the first of its AI accelerators,” designed to define its “vision for the future of LLM inference.” The vision is full-stack control: OpenAI says it wants the infrastructure underneath its models and products, including “chip architecture, kernels, memory systems, networking, scheduling, deployment systems, and product experience.” That list is doing a lot of work. It signals that OpenAI is trying to reduce dependency on external hardware roadmaps and integration ecosystems by owning more of the path from model to deployed service.

This is why the story reads as “chippy” in the first place. OpenAI, as an organization, is simultaneously trying to stay credible as a frontier model lab and build the industrial-grade machinery that frontier labs usually leave to semiconductor partners. Other big players have already been walking that path for several generations. The release points out that Amazon, Google, Meta, and Microsoft have been building and using their own silicon for AI. It also notes that OpenAI arch-rival Anthropic is reportedly considering a similar move.

For boards and finance teams, the chip question is inseparable from capital intensity. The source flags that OpenAI had an operating loss of over $20 billion last year, citing leaked financials reported by Ed Zitron, and it also mentions apparent commitments to massive infrastructure spending over the next few years, with figures floated as $600 billion or $1.4 trillion. Those numbers are not confirmed in the source, but the underlying point stands: custom silicon is expensive, and the payback depends on volumes, utilization, and sustained uptime.

Regulatory pressure is not front and center in the release, but it becomes relevant indirectly because infrastructure strategy affects supply chains, data centers, energy use, and compute governance. When a model provider pushes toward owning more of the compute stack, it can tighten control over deployment and performance monitoring. That can help reliability and potentially improve transparency in internal operations, but it can also increase the importance of oversight around how those systems are built and used. In practice, regulators typically follow where the scale goes, and inference at scale is where the scale lives.

Strategically, Jalapeño is a message to the rest of the AI infrastructure ecosystem: OpenAI wants to compete not only in model capability, but in the cost structure and performance envelope that determine which services can scale profitably. If performance-per-watt gains materialize and the nine-month development cycle becomes repeatable, OpenAI could compress the time from architecture ideas to deployed inference capacity. And for anyone building in the same lane, from hardware partners to cloud providers to model deployers, the stakes are straightforward: whoever controls inference efficiency can set the terms for speed, reliability, and affordability. That is the battlefield where budgets get decided, and where “race-to-the-bottom” accusations either stick or get answered with receipts.
