Nace’s 90/10 agent autonomy works because it generates task adapters, not giant brains
Hypernetworks build per-task model weights from a firm’s policies at inference time, aiming to reduce forgetting and context rot.

Nace.AI, a Palo Alto company that raised a $21.5 million seed round in May, is betting on hypernetworks to generate task-specific model adapters from a company’s policies at inference time. The goal is 90/10 split workflows where agents handle most work while human experts validate, making autonomy less brittle when knowledge changes.
Enterprise AI agents keep hitting the same ugly wall: they demo fine, ship to production, run for a bit, then stall because someone has to top up context and sanity-check the output. That turns “autonomy” into supervision. And for leaders, supervision is where cost and risk go to multiply.
Nace.AI says its agents can run with a 90/10 split, where the system handles the bulk of a workflow and human experts validate the result. The company is using a generator it calls a MetaModel that produces parameter adaptations at inference time from a company’s policies, specifically for regulated work like audit, compliance, and risk assessment. The pitch is simple: if you can keep the agent current on policy without retraining every time and without stuffing ever-longer prompts, you buy more reliable autonomy, not just faster demos.
To see why this matters, zoom out to the underlying failure modes that usually cap agent runs. When teams rely on frontier models as-is, they often run into two patterns. First is fine-tuning, which bakes knowledge into the model weights but stays vulnerable to catastrophic forgetting, a problem identified in the 1980s and described as still unresolved in 2026: teaching the model something new can erode what it already knew. Teams respond by isolating tasks into their own fine-tuned models or adapters, which creates a sprawling “model zoo” that raises cost and governance overhead. And if a policy changes, a fine-tuned snapshot is stale the day it is trained, so updating involves expensive, slow retraining.
Second is in-context learning, which skips retraining by placing relevant policies in the prompt at run time. This is where context rot bites: as prompts grow longer, the model uses the material in them less reliably. Retrieval helps narrow what goes into the prompt, but a retrieval miss can look identical to a confident answer. Meanwhile, cost and latency climb with every token added. The end result is the same for both approaches: the model can appear sure while it is wrong or out of date, so you cannot reliably tell which outputs are wrong without checking all of them. That is why the human never gets to leave.
Nace’s bet is a third approach that shifts the locus of “updating” from retraining or prompt stuffing to generating small, task-specific model weights on demand. The mechanism is a hypernetwork. In plain terms, a hypernetwork is a network whose output is the weights of another network. The idea of hypernetworks dates to 2016, but using them to generate specialist language model adapters from text or documents is a recent and active line of research. Sakana AI’s Text-to-LoRA, presented at ICML 2025, generates a model adapter (a LoRA, short for low-rank adaptation) from a plain-language description in a single pass. Work on a 2026 system called SHINE describes hypernetwork adaptation as a promising new frontier because it aims to sidestep both fine-tuning’s retraining cost and prompting’s context limits.
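To make "a network whose output is the weights of another network" concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not Nace's MetaModel or Sakana AI's Text-to-LoRA: the class `TinyHypernetwork`, the helper `adapted_forward`, the toy dimensions, and the random untrained weights are all hypothetical. It assumes the task or policy text has already been turned into a numeric embedding, and it emits a low-rank adapter (two small matrices, A and B) that is added to a frozen base layer.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def make_matrix(rows, cols, rng, scale=0.1):
    """Random matrix standing in for trained hypernetwork parameters."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


class TinyHypernetwork:
    """Hypothetical generator: maps a task embedding to a low-rank adapter (A, B)."""

    def __init__(self, embed_dim, model_dim, rank, seed=0):
        rng = random.Random(seed)
        self.model_dim = model_dim
        self.rank = rank
        # One linear head per adapter factor; each head emits a flattened matrix.
        self.head_a = make_matrix(rank * model_dim, embed_dim, rng)
        self.head_b = make_matrix(model_dim * rank, embed_dim, rng)

    def generate(self, task_embedding):
        """Produce adapter weights for one task in a single forward pass."""
        flat_a = matvec(self.head_a, task_embedding)
        flat_b = matvec(self.head_b, task_embedding)
        d, r = self.model_dim, self.rank
        a = [flat_a[i * d:(i + 1) * d] for i in range(r)]  # shape: rank x model_dim
        b = [flat_b[i * r:(i + 1) * r] for i in range(d)]  # shape: model_dim x rank
        return a, b


def adapted_forward(base_weight, adapter, x):
    """Frozen base layer plus the generated low-rank update: W x + B (A x)."""
    a, b = adapter
    base_out = matvec(base_weight, x)
    delta = matvec(b, matvec(a, x))
    return [u + v for u, v in zip(base_out, delta)]


# The same frozen base layer behaves differently per task embedding.
hypernet = TinyHypernetwork(embed_dim=4, model_dim=3, rank=2, seed=1)
base_weight = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
x = [1.0, 2.0, 3.0]
audit_task = [1.0, 0.0, 0.0, 0.0]  # made-up embedding of one policy
risk_task = [0.0, 1.0, 0.0, 0.0]   # made-up embedding of another policy
print(adapted_forward(base_weight, hypernet.generate(audit_task), x))
print(adapted_forward(base_weight, hypernet.generate(risk_task), x))
```

The point of the design is that only the generator is stored. Changing the policy changes the embedding, which changes the adapter, without retraining the base model or lengthening the prompt.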
The strategic reason this could outperform the “prompt and pray” era is governance. Instead of training and storing a sprawling library of per-task LoRAs, the approach is designed to collapse them into one network that can produce adapters on demand, including for tasks it has not seen. That maps directly to the failure loop executives hate: catastrophic forgetting, the problem fine-tuning struggles with, is avoided when the system does not lock knowledge into a static snapshot. Meanwhile, the approach targets context limits by keeping the relevant knowledge in generated weights rather than stretching prompts.
A capability nuance also matters for board conversations about autonomy claims. Nace, like other agent vendors, is describing autonomy as an outcome, not a magic dial. The source argues that a narrow, current, small model has a smaller surface where it can be wrong. Fewer errors in a known domain means fewer outputs that have to be escalated to a person. That is the operational basis for a 90/10 narrative: not “we set autonomy to 90,” but “the architecture causes fewer bad calls that require humans to intervene.”
But this is still early, and two questions decide whether the autonomy is trustworthy or merely fast. The first is grounding, meaning tying outputs to their source so reviewers can verify rather than redo. Research systems like HalluGuard label each claim as supported or not and cite the passage it relied on. Nace ships grounding models and reasoning traces for the same reason. A 10% review only pays off if a human can confirm provenance in seconds.
The second is the feedback loop, which forces a board-level question: when experts validate outputs, whose model improves, and where does it live? The compounding asset could belong to the vendor or the customer depending on the setup. The source says Nace uses an external network of certified experts for some engagements, and for direct enterprise deployments it can use the customer’s own staff, with the resulting model kept inside the customer’s cloud. Each choice changes where learning and ownership go.
Finally, calibration is the linchpin. The value depends on the model knowing when it is unsure. The source flags that research into generated adapters has found they do not automatically improve calibration over ordinary fine-tuning, with gains appearing only under specific constraints. It also notes that generator quality depends heavily on policy data curation, and that scale is an open frontier because published hypernetworks have been small. In the interview, Nace says it has scaled its generator beyond published sizes and derived a scaling law for performance, though the excerpt cuts off before detailing the result.
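One common way to put a number on "knowing when it is unsure" is expected calibration error (ECE), which compares a model's stated confidence with how often it is actually right. The sketch below is a generic illustration with made-up inputs. The source does not say which calibration metric, if any, Nace uses, and the function name and bin count are assumptions.

```python
def expected_calibration_error(confidences, correct, n_bins=5):
    """Weighted average gap between stated confidence and observed accuracy.

    confidences: floats in [0, 1], one per model output.
    correct: booleans, True where the expert review confirmed the output.
    """
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Place each output in an equal-width confidence bin; 1.0 goes in the last bin.
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(conf for conf, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        ece += (len(bucket) / total) * abs(avg_conf - accuracy)
    return ece


# Hypothetical review log: the model claims 90% confidence on ten outputs,
# but reviewers confirm only five of them.
claimed = [0.9] * 10
confirmed = [True] * 5 + [False] * 5
print(round(expected_calibration_error(claimed, confirmed), 3))
```

A score near zero means stated confidence can be used to decide which outputs go to the human reviewers. A large gap, as in this made-up log, means a low review rate would let confident mistakes through.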
If you are an operator or investor watching the agent market, the takeaway is clear: autonomy fails most often not because models are too dumb, but because knowledge placement breaks in production. Hypernetworks are trying to fix that placement. For executives, the stakes are the difference between pilots that stall and systems that can run long jobs with humans validating only the tail.