Managers make more mistakes when they call AI agents “employees,” study finds

A Boston University experiment shows naming an agent “Alex” as staff shifts responsibility and escalation behavior in real workflows.

By Lama Al-Rashid, Technology Correspondent, The Executives Brief

about 22 hours ago·4 min read

Executive summary

Emma Wiles, a Boston University business professor, studied how managers behave when an AI tool is framed as an “AI employee” versus a chatbot, using scenarios with an agent called “Alex.” The findings suggest the branding choice changes error rates and who ends up owning risky outputs, which matters as companies roll out agent teams in health care, warfare, education, and government.

The new AI agents are coming to work with employee badges, and a Boston University study suggests that is not a harmless metaphor. When managers were told an agentic “AI employee” produced the work, they made 18% fewer errors than when the same work was framed as coming from a chatbot. But the story gets more uncomfortable fast: how people assign roles to AI also changes how much responsibility they feel they personally carry, and that can lead to a different kind of failure.

In the research, the AI tool was described like a new subordinate and even given a name. Participants were presented with a scenario in which an agent called “Alex” was treated as an “employee” with a title and defined responsibilities. People made 18% fewer errors when the work was said to have come from an agentic “AI employee” rather than a chatbot, but that improvement came with a tradeoff. Participants also saw themselves as less responsible for the agent’s output and were 44% more likely to escalate questionable work to a manager for further review rather than trusting their own corrections.

Those numbers are interesting because they point to something that goes beyond UI. This is not just about whether an AI agent is technically good at a task. It is about organizational psychology: who thinks they are accountable, when they second-guess, and how they decide whether to fix problems themselves or hand the issue upward. In the study, framing an agent like staff effectively shifted blame and judgment. And if you are designing workflows, training programs, or even org charts for agentic systems, you are not only adopting a new tool. You are rewriting behavior.

This timing is not accidental. Agentic AI has been getting better at more complicated tasks. Agents are AI tools programmed to work in a loop until they achieve a goal. That matters because it makes the tools feel less like calculators and more like workers. So Silicon Valley is leaning in. The story notes that last year Nvidia’s CEO Jensen Huang talked about workplaces of “digital humans.” Since April, Microsoft, OpenAI, Anthropic, and Google have released new tools oriented toward managing teams of AI agents, many explicitly advertised as digital colleagues with the flexibility and cognitive power of actual humans.

And the study suggests companies are not just dreaming about this future. Nearly a third of the 1,261 managers in Wiles’s study said their companies already frame AI agents as employees, with 23% listing them on org charts. That is the kind of detail boards should care about, because org charts are where accountability becomes real. Once you visually place “Alex” alongside humans, you can easily end up with an organization where humans feel less responsible for output quality, and managers become a default escalation sink.

There is also a bigger second-order implication: if blame becomes easier to offload, the incentive structure for oversight changes. The story flags a growing risk that, as AI agents are embedded into health care, warfare, education, and government, they become a convenient place to dump blame for failures that may actually stem from human decisions, incentives, and oversight. The article points to an example: the bomb strike on a girls’ school in Iran was popularly blamed on Claude, while “all signs point to a cascade of human errors.” The lesson is not that AI is never responsible. The lesson is that naming AI as the culprit can obscure the real chain of human and organizational choices that led to the outcome.

This is where the “agents replace humans” narrative starts to break down. The story quotes Daron Acemoglu, an MIT economist and 2024 Nobel Prize winner, saying AI agents are being marketed as things that can replace humans and calling that “just a losing proposition.” He argues agents should be optimized to improve human capabilities, something he says they are not at the moment. For decision-makers, that is a framing challenge. If you treat agents like coworkers, you may be unintentionally training your team to reduce their own agency, even as you ask them to manage more complicated systems.

The Stanford research effort referenced in the story adds another layer. Researchers presented 1,500 workers in 104 jobs with information about tasks AI could potentially do, then asked what would actually be most helpful and productive. Workers did want automation in certain areas; law clerks, for example, believed AI could help ensure adequate progress across cases. But often the tasks tech experts deemed most suitable for AI were exactly the tasks workers said they definitely did not want or need an agent to do, such as verifying customer credit ratings for sales reps. So even when agent performance is strong, acceptance depends on perceived usefulness, autonomy, and trust.

That brings us back to Alex, the central branding move. Calling an agent an employee is convenient, especially when something goes wrong. It can feel like progress, like modernizing your operation. But according to Wiles’s research, it can make humans around the tool worse at their jobs by changing responsibility and escalation patterns. And in a world where AI is being woven into high-stakes sectors, that is not a marketing footnote. It is operational risk.

If you are a CEO, CTO, product leader, or board member, the strategic stake is simple: the way you label and deploy agentic systems will shape human behavior, quality control, and oversight. The future you are building is not only made of models. It is made of incentives, accountability, and workflow design. Treating agents like coworkers might improve some metrics, like error rates, but it could also quietly erode the human sense of ownership that safety and reliability depend on. Humans still have the agency that AI tries to replicate. They deserve better than a neat org-chart story.
