Clifton AI CTO Joe Bertolami: faster agent coding reveals a new software bottleneck

If teams ship more code but products do not improve, Bertolami says, the bottleneck has moved from typing to ambiguity and operations.

By Omar Al-Balawi, Technology Correspondent, The Executives Brief

about 2 months ago·5 min read

Executive summary

Joe Bertolami, CTO and co-founder of Clifton AI, argues agentic AI is now accelerating code generation, but the real limits are requirements, system integration, and software operations. For executives, the consequence is clear: governance, security, and measurement must evolve or AI will speed up failure, not product improvement.

Agentic AI has gone from “cool demo” to core engineering practice. Companies are using agents to generate more code, faster, and with more reach than before. But Joe Bertolami, CTO and co-founder of Clifton AI, keeps hearing a sharper question from business leaders: if we are shipping code faster than ever, why aren’t our products improving at the same rate?

Bertolami’s answer is blunt. Writing code was never the rate limiter. Agents compress execution time, but they do not compress ambiguity, accountability, or operational complexity. In other words, agents can sprint through the mechanical work, but the hard parts of software engineering are still there, and now they get stress-tested at higher speed.

That is the first second-order trap: when agents flood an organization with new code, the hard parts get harder. Defining the right requirements still requires judgment, alignment, and clarity about what success actually means. Integrating with complex systems still demands understanding of dependencies, failure modes, and the ugly realities of production data. And maintaining software under real-world conditions still means dealing with incidents, regressions, and the slow accumulation of technical debt. Agents compress time spent drafting code. They do not automatically compress the uncertainty of what the code should do, or who is responsible when it breaks.

The next bottleneck is organizational, not technical. As AI-generated code scales, human review is becoming a massive new choke point, and engineers are losing the context they need to catch agent mistakes. If reviewers no longer understand the surrounding system well enough to verify intent, then “more PRs” becomes “more risk,” and “faster merging” becomes “faster scrambling.” The companies that handle this well will move forward deliberately. The ones that do not will reach for an even simpler conclusion: reduce headcount and increase AI spend.

Bertolami argues that this response is directionally wrong. Enterprise leaders are making irreversible structural decisions while the technology is still moving quickly. The right approach, he says, is governance rather than vibes. He outlines a deliberate playbook in three phases.

Phase 1 is financial and risk governance. The goal is to protect the downside and prevent runaway costs and fragmented processes. Bertolami emphasizes treating AI governance as a tier-one risk concern, not an afterthought. If teams are allowed to experiment without centralized structure, the result is duplicated work and uncontrolled spending. He also argues that agent configuration should be treated like production infrastructure: prompts and skills should be versioned, reviewed, and tested, then rolled out gradually.

He also calls for least privilege for non-human actors. The reason is simple: engineers usually have broad access because they carry contextual judgment and ultimate accountability. Agents do not. If an agent inherits full permissions of its human operator, you introduce an accountability gap into your systems. The playbook here is strict separation between read and write or execute access, plus human-in-the-loop approval gates for destructive or production-altering actions. And as agents shift from suggesting code to autonomously executing tasks, they must be incorporated into the security model as first-class citizens.
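The separation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Bertolami's or Clifton AI's implementation: the names `Access`, `Action`, `AgentPolicy`, and `authorize` are hypothetical, and a real system would enforce these checks in its identity and access layer rather than in application code.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Access(Enum):
    READ = "read"
    WRITE = "write"
    EXECUTE = "execute"


@dataclass(frozen=True)
class Action:
    name: str
    access: Access
    destructive: bool = False  # e.g. deletes data or alters production


@dataclass
class AgentPolicy:
    """Hypothetical policy: the agent holds its own grants and never
    inherits the permissions of the human who launched it."""
    agent_id: str
    granted: frozenset = frozenset({Access.READ})  # read-only by default

    def authorize(self, action: Action, approved_by: Optional[str] = None) -> bool:
        # Least privilege: deny any access type not explicitly granted.
        if action.access not in self.granted:
            return False
        # Human-in-the-loop gate: destructive actions need a named approver.
        if action.destructive and approved_by is None:
            return False
        return True
```

With this sketch, an agent granted only read and write access is refused any execute action, and a destructive write is refused until a named human approves it.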

Bertolami’s cost caution is not theoretical. He points to Uber capping its AI spend after burning its 2026 budget by April. He also cites an Axios report about an unnamed company incurring a $500 million Anthropic bill in a single month due to runaway agentic loops. That combination, accelerated execution plus insufficient control, is exactly how budgets and reliability can both disappear fast.
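One common guard against runaway agentic loops is a hard cap on both spend and step count. The sketch below is an illustrative assumption, not a description of how any company named above manages its budget: `SpendGuard`, `BudgetExceeded`, and `run_agent` are hypothetical names, and the per-step costs are supplied by the caller rather than read from a real billing API.

```python
class BudgetExceeded(RuntimeError):
    """Raised when an agent run crosses its spend or step cap."""


class SpendGuard:
    def __init__(self, max_usd: float, max_steps: int):
        self.max_usd = max_usd
        self.max_steps = max_steps
        self.spent_usd = 0.0
        self.steps = 0

    def charge(self, cost_usd: float) -> None:
        # Record the step, then stop the run if either cap is crossed.
        self.steps += 1
        self.spent_usd += cost_usd
        if self.steps > self.max_steps:
            raise BudgetExceeded(f"step cap of {self.max_steps} exceeded")
        if self.spent_usd > self.max_usd:
            raise BudgetExceeded(f"spend cap of ${self.max_usd:.2f} exceeded")


def run_agent(step_costs, guard: SpendGuard) -> int:
    """Stand-in for an agent loop: each item is the cost of one step."""
    completed = 0
    for cost in step_costs:
        guard.charge(cost)  # raises before the loop can continue unchecked
        completed += 1
    return completed
```

The step cap matters as much as the dollar cap: a loop of very cheap calls can run for a long time before a spend limit alone would stop it.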

Phase 2 is the technical strategy. The first move is to “build the engine,” which means choosing the right models and measuring success. Bertolami argues for multi-model and multi-vendor approaches because no single model excels at every task. You need to characterize behavior and performance boundaries across models so you can route tasks to the systems best equipped to handle them. Standardizing on a single vendor or model can sacrifice capabilities and creates a single point of failure.
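A simple way to picture task routing is a lookup from task type to the best-scoring available model, with a fallback when a vendor is down. Everything in this sketch is an assumption for illustration: the model names, the task types, and the scores in `SCORES` are made-up placeholders, and in practice the scores would come from an organization's own evaluations.

```python
# Hypothetical evaluation scores per task type (higher is better).
# These numbers are placeholders, not measurements of any real model.
SCORES = {
    "code_review": {"vendor_a/large": 0.82, "vendor_b/large": 0.77},
    "summarization": {"vendor_a/large": 0.70, "vendor_b/small": 0.74},
}


def route(task_type: str, unavailable=frozenset()) -> str:
    """Return the best-scoring model for a task, skipping unavailable ones."""
    candidates = {
        model: score
        for model, score in SCORES.get(task_type, {}).items()
        if model not in unavailable
    }
    if not candidates:
        raise LookupError(f"no available model for task type {task_type!r}")
    return max(candidates, key=candidates.get)
```

Because each task type lists models from more than one vendor, an outage at one vendor degrades quality rather than halting work, which is the single-point-of-failure concern raised above.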

He also says not to optimize only for token price. Pay for premium frontier models that deliver higher quality output and reduce costly rework. The cheapest model is not automatically the one with the lowest token cost; it is the one that maximizes efficiency while minimizing downstream risk. For measurement, Bertolami warns that deployments, lines of code, and pull requests were never good productivity metrics, and with AI they can become actively misleading. Instead, he says to target business outcomes like feature adoption and retention, plus engineering durability like change failure rate, escaped defects, and code survival over time. For AI efficiency, measure task success per dollar and rework time, not token counts.
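Two of the metrics named above reduce to simple ratios. The sketch below assumes a hypothetical `TaskRun` record and made-up field names; how an organization defines a "successful" task or a "failed" change is a policy decision the article does not specify.

```python
from dataclasses import dataclass


@dataclass
class TaskRun:
    succeeded: bool        # did the agent's output pass acceptance?
    cost_usd: float        # model spend attributed to this task
    rework_minutes: float  # human time spent fixing the output


def task_success_per_dollar(runs) -> float:
    """Successful tasks divided by total spend across all runs."""
    total_cost = sum(r.cost_usd for r in runs)
    if total_cost == 0:
        return 0.0
    return sum(1 for r in runs if r.succeeded) / total_cost


def change_failure_rate(deployments: int, failed: int) -> float:
    """Share of deployments that caused a failure in production."""
    if deployments == 0:
        return 0.0
    return failed / deployments


runs = [
    TaskRun(succeeded=True, cost_usd=2.0, rework_minutes=0),
    TaskRun(succeeded=False, cost_usd=1.0, rework_minutes=30),
    TaskRun(succeeded=True, cost_usd=1.0, rework_minutes=10),
]
print(task_success_per_dollar(runs))  # 2 successes / $4.00 = 0.5
print(change_failure_rate(20, 3))     # 3 / 20 = 0.15
```

Note that failed runs still count toward total cost, so a cheap model that fails often can score worse on this measure than a more expensive model that succeeds.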

Phase 3 is talent and organization. When agents handle much of code generation, the human bottleneck shifts to review, architectural alignment, and cross-system integration. Bertolami argues engineers need to move from syntax to systems. That means upskilling for guiding agentic processes, managing complex integrations, and maintaining architectural vision that agents can struggle to preserve. He also says performance and incentives need a redesign. Traditional metrics can be ineffective overhead when one engineer can generate the output of a former squad. Teams should reward expanded business impact, cross-system reliability, and effective orchestration, not sheer volume.

Finally, he warns against cutting headcount too early. If you have not integrated agentic workflows, measured augmented output in production, and reworked your roadmap around faster execution, cutting headcount is not discipline. It is blindness. The goal is not only smaller teams. It is teams capable of covering more strategic surface area.

The bottom line from Bertolami is that AI is not a replacement for engineering judgment; it is a force multiplier. In well-structured systems it can safely accelerate delivery. In poorly understood systems it accelerates failure. The fallout he points to includes outages, rising technical debt, and unexpected cost spikes from poorly governed adoption. For C-suites and boards, understanding this dynamic is no longer optional, because execution velocity is outpacing the industry’s ability to manage consequences. What is at stake is whether your organization turns AI into durable advantage or just buys faster ways to break things.
