Thomson Reuters and May Mobility push “fiduciary grade” AI verification to satisfy regulators
Executives at Fortune Brainstorm Tech say you must be able to retrace agentic AI steps, validate outputs, and avoid self-grading systems.

Fortune Brainstorm Tech featured Edwin Olson of May Mobility and Caitlin Halferty of Thomson Reuters discussing how businesses can verify agentic AI work. The push is for accountability mechanisms that let companies explain decisions to regulators and clients.
Agentic AI systems are doing more and more work. The problem is not whether they can produce outputs. It is whether humans can verify the path those outputs took, and then show regulators how what went wrong was fixed.
That accountability theme ran through Fortune Brainstorm Tech in Aspen, Colorado, where executives from several companies zeroed in on the same operational need: being able to follow, and if necessary retrace, every step an AI or agentic AI system took when performing a task. Edwin Olson, founder and CEO of May Mobility, put it bluntly: you want the system “as right as often as you can possibly make it,” but because it will “eventually make mistakes,” you also need “transparency and introspectability” so you can understand why it erred. The second half matters just as much as the accuracy goal, because it is what enables conversations with regulators about how the issue was corrected.
Caitlin Halferty, chief data officer at Thomson Reuters, made a similar point about validation: she stressed the need to create a way to validate the outputs of any model being used. Thomson Reuters, with AI-enabled services aimed at professionals in areas like legal and tax compliance, has had to treat AI accountability as a near-term business requirement rather than a distant governance project. Transparency is one of four pillars of what the company calls “fiduciary grade” products. The other pillars Halferty cited were data privacy and security, subject-matter experts, and reliable content. In other words, verification is not just an engineering feature. It is part of an end-to-end compliance and trust stack.
If accountability is the destination, the hard part is the route. When AI systems begin to operate like agents, the “audit trail” becomes longer, more complex, and often less legible to the people who are ultimately responsible for sign-off. One approach raised by panelists is designing systems that can effectively check each other. At May Mobility, Olson described a technique in which systems installed in autonomous cars simulate and assess various scenarios simultaneously, then choose the best option. While that example is automotive, Olson’s framing also points to a broader corporate idea: use specialized systems to test, compare, and pressure-test decisions before they become actions in the real world.
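As a rough illustration of the "simulate several options, then pick the best" idea, here is a minimal Python sketch. It is not May Mobility's system: the candidate actions, the risk and delay numbers, and the cost formula are all invented for the example. The point it tries to show is that every simulated option and its score is kept, so the final choice can be inspected afterward.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical candidate actions with made-up risk and delay estimates.
CANDIDATES = {
    "brake": {"risk": 0.05, "delay_s": 4.0},
    "change_lane": {"risk": 0.20, "delay_s": 1.0},
    "continue": {"risk": 0.60, "delay_s": 0.0},
}


def simulate(name):
    """Stand-in for a real simulation: returns (name, cost), lower is better."""
    estimate = CANDIDATES[name]
    # Assumed cost formula: risk is weighted far more heavily than delay.
    cost = 10.0 * estimate["risk"] + 0.1 * estimate["delay_s"]
    return name, cost


def choose_best(names):
    """Score all candidates concurrently and return the winner plus all scores."""
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(simulate, names))
    best_name, _ = min(results, key=lambda item: item[1])
    return best_name, results


if __name__ == "__main__":
    best, scored = choose_best(list(CANDIDATES))
    print("chosen:", best)
    for name, cost in scored:
        print(f"  {name}: cost={cost:.2f}")
```

Returning the full list of scored options alongside the winner is the design choice that matters here: it is what lets a reviewer see which alternatives were rejected and why.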
Elena Kvochko, founder and CEO of Trustguard AI, offered a more explicit analogy for what “verification-as-a-process” can look like. She called it the “LLM as a judge” technique. Her newsroom metaphor is simple: one agent acts as the writer, and another acts as the editor whose sole purpose is to find mistakes or inaccuracies the writer might have missed. The system is designed to be self-improving in the sense that the editor agent can detect issues and feed that learning back into the writing agent’s workflow.
But Kvochko also drew a line that will matter to boards and risk committees: the verification has to be structured in separate AI systems, because “you don’t want AI to grade its own work.” That is a governance point disguised as an implementation detail. If one model both generates and evaluates, you can lose the very independence you need for accountability. The takeaway for operators is that auditability may require extra models, extra steps, and extra compute. The payoff is that you can point to separation of duties inside the AI workflow, not just after-the-fact documentation.
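The writer-and-editor arrangement can be sketched in a few lines of Python. This is a toy under stated assumptions, not Trustguard AI's implementation: the `Agent` and `AuditedPipeline` names are hypothetical, and the two "models" are plain functions standing in for separate AI systems. It shows the two ideas from the discussion: the judge must be a different system from the writer, and every round of review is recorded.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Agent:
    name: str                   # identifies the underlying system
    run: Callable[[str], str]   # stand-in for a call to a real model


@dataclass
class AuditedPipeline:
    writer: Agent
    judge: Agent
    trail: List[dict] = field(default_factory=list)

    def __post_init__(self):
        # Separation of duties: the writer may not grade its own work.
        if self.writer.name == self.judge.name:
            raise ValueError("judge must be a separate system from the writer")

    def produce(self, task: str, max_rounds: int = 3) -> str:
        draft = self.writer.run(task)
        for round_no in range(1, max_rounds + 1):
            verdict = self.judge.run(draft)
            # Record every step so the path to the output can be retraced.
            self.trail.append({"round": round_no, "draft": draft, "verdict": verdict})
            if verdict == "OK":
                return draft
            draft = self.writer.run(f"{task}\nFix: {verdict}")
        raise RuntimeError("no draft passed review; escalate to a human")


# Toy stand-ins: the writer errs first, then corrects itself when given feedback.
def toy_writer(prompt: str) -> str:
    return "2 + 2 = 4" if "Fix:" in prompt else "2 + 2 = 5"


def toy_judge(draft: str) -> str:
    return "OK" if draft.endswith("= 4") else "arithmetic error"


if __name__ == "__main__":
    pipeline = AuditedPipeline(Agent("writer-model", toy_writer),
                               Agent("judge-model", toy_judge))
    print(pipeline.produce("Add 2 and 2"))
    for entry in pipeline.trail:
        print(entry)
```

In this sketch the extra cost the panelists mentioned is visible: each accepted answer takes at least one additional call to a second system, and the trail grows with every round.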
As these systems take on more tasks, the verification problem scales faster than headcount. SentinelOne chief AI officer Gregor Stewart described the emerging bottleneck bluntly: you end up in a space where there is so much work done, and so much work to audit, that you “can’t truly be accountable.” He pointed to computer coding as an example where the industry is roughly a year ahead of other domains, and where teams are already exploring techniques that emulate processes developed decades ago for safety-critical technologies. The goal is not to hand a human the entire burden of reviewing output. It is to borrow proven safety practices and bring them into “average practice.” Stewart suggested this is likely to mean a resurgence of safety-focused techniques inside routine workflows.
For decision-makers, the strategic stake is straightforward: regulators and clients will increasingly expect demonstrable control, not just claims of improvement. If your AI system can be audited step-by-step, and if you can show how you validate outputs and correct mistakes, you are better positioned to deploy agentic automation without turning accountability into a permanent backlog. The companies discussed at the event are essentially building trust infrastructure now, because the alternative is operating at speed today and defending credibility later.