UBS: about 60% of enterprises throttle AI token spend with “guardrails”

UBS analysts see less carefree AI spending, not a full pause, and CFOs are driving new token discipline.

By Mohammed Al-Shehri, Business Desk, The Executives Brief

about 2 hours ago·4 min read


Executive summary

UBS analysts Karl Keirstead, Timothy Arcuri, and Taylor McGinnis report that, based on a dozen or more conversations with enterprise IT executives, roughly 60% of enterprises are throttling AI spend by adding guardrails. For decision-makers, this signals an emerging headwind for AI model makers and a shift from experimentation to ongoing token efficiency engineering.

UBS is flagging a quiet but consequential shift in enterprise AI budgets: in conversations with enterprise IT execs, UBS analysts found that about 60% of enterprises were throttling AI spend by putting “some degree of guardrails in place.” The analysts, Karl Keirstead, Timothy Arcuri, and Taylor McGinnis, stressed that this does not mean companies are slamming on the brakes. In their framing, there are “no alarm bells” because no one is fully pausing AI deployment.

That distinction matters, because “throttling” sounds like panic budgeting, while UBS is describing something closer to disciplined usage controls. Their report characterizes the situation as an “emerging headwind” after discussions beginning earlier in June and reaffirmed in more recent conversations. The practical takeaway for any executive funding AI is that token cost pressure is now crossing from theoretical risk to day-to-day enforcement. It is showing up as guardrails on how users can access AI, how much they can consume, and which AI workflows get prioritized when budgets tighten.
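To make "guardrails on how much users can consume" concrete, here is a minimal sketch of a consumption cap. It is purely illustrative: the `TokenBudget` class, the `try_spend` method, and the one-million-token monthly limit are hypothetical and do not come from the UBS report or from any vendor's API.

```python
from dataclasses import dataclass


@dataclass
class TokenBudget:
    """Hypothetical per-team monthly token cap (illustrative only)."""
    monthly_limit: int
    used: int = 0

    def remaining(self) -> int:
        # Tokens still available under the cap, never negative.
        return max(self.monthly_limit - self.used, 0)

    def try_spend(self, tokens: int) -> bool:
        """Record usage and return True only if the request fits the cap."""
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        if tokens > self.remaining():
            return False  # guardrail: reject the request rather than overspend
        self.used += tokens
        return True


# Assumed figures for illustration.
budget = TokenBudget(monthly_limit=1_000_000)
first = budget.try_spend(900_000)   # fits within the cap
second = budget.try_spend(200_000)  # would exceed the cap, so it is rejected
```

In practice a rejected request might be queued, routed to a cheaper model, or escalated for approval instead of refused outright; which of those a company chooses is a policy decision the source does not detail.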

So why now? The source points to a familiar pressure point for CFOs and CTOs: rising AI bills alongside questions about ROI. UBS notes that token spending has become a major concern for larger enterprises, where finance leaders increasingly want clearer accounting for AI costs. The issue is not only that models can be expensive, but that AI usage tends to expand quickly once teams start experimenting. That is where the “tokenmaxxing” era comes in. UBS says the days of carefree spending are over, replaced by a more conscious approach.

Importantly, UBS’s analysts also note that the “extent of the impact varies” by organization. In most organizations, they argue, token spend optimization has become “a key issue,” producing what some teams experience as a “big spending speed bump.” For others, the speed bump is smaller because of different incentives and maturity levels. UBS cites enterprises that are either too early in their AI deployments, or further along but unwilling to throttle users because the offsetting ROI is strong or because AI use supports a higher-priority innovation goal.

For boards and exec teams, the next-order implication is that AI budgeting is fragmenting. Not everyone is responding the same way. The source includes an excerpt from a company UBS analysts spoke with where the CTO originally “went all in on AI early on,” only to later shift course. They reported having “5 AI tools internally and all of the LLM products,” and they ran into a constraint where they had already “used most of our token budget for the entire year.” The new approach: “Now we're only using 2 AI tools and being careful around usage.” That is guardrails in plain English, and it illustrates a governance problem many AI programs face: experimentation is not free, and budgets do not scale linearly with team enthusiasm.

UBS also makes a market call that should matter to AI companies selling models and infrastructure. They write that AI model makers, including OpenAI and Anthropic, are likely to be “most exposed to cost-cutting in the near term.” The logic is straightforward: if enterprises throttle token consumption, demand for tokens and usage-driven compute can slow. The source further points to a twist in who benefits. UBS says open-sourced and Chinese models like DeepSeek could be the “biggest potential beneficiaries,” especially for enterprises looking for models for “non-coding-related tasks.” Translation: when buyers get serious about cost per output, they may diversify away from the most expensive options, particularly for use cases where engineering substitutes can deliver adequate performance.

Even with that cost-cutting pressure, UBS repeatedly emphasizes it is not an outright retreat from AI. The analysts call optimization “a healthy problem,” describing some spend optimization as normal and arguing that “no one is hitting the brakes on AI deployment.” They suggest the environment might even be setting up for new models trained on “next-gen chips” that could drive token costs down further. And they point to real product narratives already in circulation. The source mentions Google’s Gemini 3.5 Flash model. It also notes that Anthropic rolled out Claude Sonnet 5 on Tuesday, saying it “runs autonomously at a level that just a few months ago required larger and more expensive models.” These moves matter because they reinforce the idea that enterprises are not rejecting AI. They are demanding better cost efficiency, faster time-to-value, and tighter controls.

There is also a strategic framing shift that matters for operators. One company UBS analysts spoke with said the industry is moving away from the phase of AI experimentation. UBS summarizes the mindset as: “The question isn't whether to use tokens, it's how to use them efficiently.” In that framing, optimization becomes “an ongoing engineering discipline rather than a reaction to a budget crisis.” For executives, that is the difference between temporary cost tightening and permanent operational redesign. The enterprises adding guardrails are likely also building internal measurement, routing, and governance mechanisms that will outlast the current budgeting cycle.

For peers, the stakes are simple: if you are still funding AI like a prototype, you may be underestimating how quickly token spend becomes an enforceable constraint. And if you are an AI company watching enterprise pipelines, UBS’s “guardrails” datapoint implies that usage demand will not fall to zero, but it will become more targeted, more conditional, and harder to monetize at broad, unconstrained volumes. Either way, the era of “just let the experiment run” is ending. The new winning posture is efficiency with intent.
