Pinecone’s Jeff Zhu says Nexus cuts token burn by building agent context upfront

Agent adoption is surging, and database vendors want to tame usage-based AI costs with context engines and agent workspaces.

By Lama Al-Rashid, Technology Correspondent, The Executives Brief

about 7 hours ago·4 min read


Executive summary

Pinecone’s Jeff Zhu is rolling out Nexus, a “knowledge engine, not a retrieval system,” embedded in Microsoft OneLake to pre-build task-specific context for agents. Tiger Data’s Ajay Kulkarni also pitches Ghost, disposable PostgreSQL workspaces with compute-hour billing to contain experimentation costs and risk.

Tech leaders are staring at a familiar AI problem, but with a sharper edge now: the more agentic systems you deploy, the more your spend can turn into a variable you cannot ignore. That is because major players are shifting parts of their offerings from flat-rate subscriptions toward usage-based billing, and agents multiply calls, reads, and reasoning loops. Database vendors are stepping in with a pitch that sounds almost like a cheat code: stop runaway token consumption by fixing the “data plumbing” around models.

Pinecone is leading with Nexus, and Pinecone product vice president Jeff Zhu frames it as a direct way to stop agents from rediscovering your business context every time they need to answer. In his view, coding agents are great at exploratory work, but that exploration burns tokens. Zhu’s example is blunt: an agent may fetch a table schema, inspect the top rows, and eventually land on the right response, but it rebuilds that understanding from scratch on every call because it keeps re-deriving context from raw data. Nexus is designed to structure, contextualize, and compose specialized contexts (derived artifacts) in advance of agent demand, so the agent can reuse that pre-built context instead of paying the token bill for re-learning your data on every run.

This is not just a product claim. It sits inside a bigger market shift that analysts say is already underway. IDC research director Devin Pratt says demand for the underlying capability is strong because agentic adoption is already broad: around 79 percent of organizations are either investing significantly in agentic AI with a set budget or running agentic applications in production. Pratt also points to an open question that matters for enterprise architecture: whether specialists win or the capability gets absorbed into the platforms enterprises already use, much as vector database functionality moved into broader database and data platforms over time. Pinecone’s answer is to move from “vector retrieval” as the core story to a “knowledge layer” that changes how often agents need to go back to raw sources.

Under the hood, Pinecone positions Nexus as a semantic layer of business data for a given use case or outcome. A finance analyst agent and an HR agent might pull from the same underlying data, but they want different outcomes and therefore need different contexts. Nexus uses the organization’s data sources, including SQL databases, unstructured documents, and PDFs, to build a task-specific context for an individual agent’s job or role. For agents, the practical implication is fewer repeated cycles: less back-and-forth between the agent and raw data, and a pattern of reasoning once, upstream, and storing reusable context rather than rediscovering it on every inference call.
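The trade-off described above can be made concrete with a small sketch. This is not Nexus code or its API; the `ContextStore` class, the role and source names, and both token figures are hypothetical values chosen only to show why paying the exploration cost once, upstream, can beat paying it on every call.

```python
from dataclasses import dataclass, field

# Hypothetical token figures for illustration only, not Pinecone measurements.
EXPLORE_TOKENS = 4_000  # assumed cost of re-deriving context from raw data
ANSWER_TOKENS = 500     # assumed cost of answering once context is available


@dataclass
class ContextStore:
    """Holds one pre-built context per agent role (hypothetical structure)."""
    contexts: dict[str, str] = field(default_factory=dict)
    build_tokens: int = 0

    def build(self, role: str, sources: list[str]) -> None:
        # Pay the exploration cost once, before any agent call happens.
        self.contexts[role] = f"context for {role} from {', '.join(sources)}"
        self.build_tokens += EXPLORE_TOKENS


def tokens_without_store(calls: int) -> int:
    # Every call rediscovers schema and sample rows before answering.
    return calls * (EXPLORE_TOKENS + ANSWER_TOKENS)


def tokens_with_store(store: ContextStore, role: str, calls: int) -> int:
    if role not in store.contexts:
        raise KeyError(f"no pre-built context for role {role!r}")
    # The build cost is paid once; each call pays only for the answer.
    return store.build_tokens + calls * ANSWER_TOKENS


store = ContextStore()
store.build("finance_analyst", ["sql:ledger", "pdf:policies"])
print(tokens_without_store(10))                         # 45000
print(tokens_with_store(store, "finance_analyst", 10))  # 9000
```

Under these assumed numbers the gap widens with every additional call, which is the core of the cost argument; real savings would depend on how much exploration an agent actually repeats.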

Pinecone also leans into the cost narrative using features that would make a CFO or an IT leader feel less blind. Nexus includes a query language called KnowQL, which incorporates a budget primitive. The platform also provides a single dashboard for token usage and spend. IDC’s Pratt calls out the design logic: Nexus treats cost as a design constraint, not an afterthought, with budget control and token accounting built into the query layer. In other words, instead of hoping engineers can guess spend patterns after launch, the system tries to help shape and measure it while queries and contexts are constructed.

Tiger Data is tackling a different part of the agentic cost and risk equation. If Pinecone focuses on reducing repeated context work, Tiger Data focuses on isolating experimentation. Its Ghost platform targets developers working with AI agents who experiment constantly, where failure should not spill into shared environments other agents and humans depend on. Ghost offers instant PostgreSQL databases with fast forking, accessible through the Ghost CLI or an MCP server, plus a terabyte of free storage. The key for budgeting is Tiger Data’s shift in charging model: instead of traditional database-level packaging, it provides usage-based pricing at the compute-hour level, so cost scales with how much compute the agents consume, not with how many databases exist.

Tiger Data’s CEO Ajay Kulkarni describes the design goal: “no matter how many databases you have - it could be one, it could be 50 - it'll cost the same,” because metering happens by compute hours. Tiger Data offers a free tier with 100 compute-hours per month, and users can pay for additional usage in 15-minute active windows. Pratt argues Ghost solves a concrete deployment problem by giving each agent or task its own disposable PostgreSQL database, enabling an agent to branch a dataset in seconds and discard it afterward. That matters for enterprises that want to keep PostgreSQL while still enabling agent workflows that act more like software development sandboxes than like steady analytics jobs.
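A rough sketch of how compute-hour metering could play out follows. The 100 free compute-hours and the 15-minute window come from the figures above; the hourly rate is a placeholder because no price is quoted here, and rounding each active session up to whole 15-minute windows is an assumption about how the windows work, not confirmed Tiger Data behavior.

```python
import math

FREE_HOURS_PER_MONTH = 100  # free tier figure cited in the article
WINDOW_MINUTES = 15         # active window size cited in the article
RATE_PER_HOUR = 0.10        # hypothetical price; the article quotes no rate


def billable_hours(session_minutes: list[float]) -> float:
    """Round each active session up to whole 15-minute windows (assumed)."""
    windows = sum(math.ceil(m / WINDOW_MINUTES) for m in session_minutes if m > 0)
    return windows * WINDOW_MINUTES / 60


def monthly_cost(session_minutes: list[float]) -> float:
    """Charge only for compute-hours beyond the free tier."""
    overage = max(0.0, billable_hours(session_minutes) - FREE_HOURS_PER_MONTH)
    return round(overage * RATE_PER_HOUR, 2)


# 50 databases, each active for one 20-minute session: 25 billable hours.
print(monthly_cost([20] * 50))   # 0.0

# One database with 500 sessions of 20 minutes each: 250 billable hours.
print(monthly_cost([20] * 500))  # 15.0
```

The database count never enters the calculation; only active compute time does, which matches the point Kulkarni makes about one database versus 50.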

Together, these pitches underline why the “agent problem” is increasingly a “data problem,” especially around security, compliance, and cost. IDC’s Data Management survey found that IT leaders name security and compliance constraints and cost as the two biggest data roadblocks for scaling generative and agentic AI. Fragmentation compounds it, with nearly two-thirds of organizations running 11 or more distinct database technologies. Platform vendors and embedded database services are likely to keep expanding, but these specialists are betting that agent-ready infrastructure becomes its own category, with context engines on one side and agent workspace isolation on the other.

For executives deciding where to place bets, the strategic stake is straightforward: agentic AI costs will not stay confined to model pricing once agents go into production, because agents need continuous access to data and they access it in loops. If context and workspace costs are not engineered, they become budget surprises. Pinecone and Tiger Data are trying to move those costs from “runtime shock” into “upfront design choices,” and they are doing it in ways that fit existing enterprise tools and billing habits. The question for everyone else is whether this capability gets absorbed into platforms like Microsoft OneLake, or whether the specialist layer becomes the default way enterprises control token burn and experimentation blast radius.

