What Human-Led AI Actually Means (And Why Most Companies Are Getting It Wrong)

May 20

"Human-led AI" is becoming one of those phrases that sounds good in a deck and means almost nothing in practice.

Every CFO, every finance platform, every vendor pitching automation right now is using some version of it. And almost none of them can tell you what it actually looks like on a Tuesday afternoon when the numbers don't reconcile and someone has to make a call.

So let me be specific. Because the difference between AI that creates leverage and AI that creates liability is almost entirely about where humans are positioned in the workflow, and most companies are positioning them in exactly the wrong place.

The Two Wrong Ways to Do It

The first wrong way is what I call "AI with a human rubber stamp." The AI runs the process, produces the output, and a human reviews it for three minutes before signing off. This is the most common version of "human-in-the-loop" right now, and it is not governance. It is the illusion of governance. When something goes wrong, and it will, nobody will catch it because nobody is actually looking.

The second wrong way is "AI as a helper with a human doing the real work." The human is still running every step of the process, but now they have a chatbot to ask questions. This is not AI transformation. This is an expensive calculator.

Neither of these is human-led AI. They are both reactions to AI rather than intentional designs around it.

What Jim Collins Got Right Before AI Existed

In Good to Great, Jim Collins identified something that the best companies understood about technology that most others missed: technology is an accelerator, not a creator, of greatness. The companies that transformed their industries with technology did not start with the technology. They started with disciplined thinking about what they were trying to accomplish, then used technology to accelerate it.

The companies that failed treated technology as the strategy. They chased tools because the tools were impressive, not because they had a clear operational vision the tools could serve.

Finance AI is playing out exactly the same way right now. The companies building human-led AI finance operations that will actually scale are starting with the operational design. The companies that will write the cautionary case studies are starting with the vendor demo.

Where Humans Actually Belong

Human-led AI is not about how often a human reviews AI output. It is about identifying where in the workflow human judgment is required and designing the system around that requirement from the start.

There are three categories of work in any finance function. The first is high-volume, rule-based, low-ambiguity work: transaction categorization, reconciliation matching, data extraction, report population. This is where AI creates the most immediate leverage and where human oversight means setting the rules and monitoring exceptions, not reviewing every line.

The second category is judgment-intensive work: revenue recognition decisions, materiality calls, variance explanations, forecasting assumptions. This is where AI can surface information and flag anomalies, but a human needs to own the conclusion. Not review it. Own it.

The third category is governance and accountability: investor reporting, audit sign-offs, board presentations, accounting policy decisions. AI has no business operating independently here. Not because the technology is not capable of producing an output, but because accountability cannot be delegated to a system. Someone's professional judgment and reputation have to be on the line. That is what governance means.
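To make the three categories concrete, here is a minimal sketch of how they could be written down as an explicit policy before any tooling is chosen. Everything in it is an assumption for illustration: the names Category, POLICY, TASK_CATEGORIES and plan_for are hypothetical, the role descriptions paraphrase the text above, and the choice to treat unmapped tasks as governance work is my own conservative default, not a rule from any standard.

```python
from enum import Enum


class Category(Enum):
    RULE_BASED = "rule_based"    # high-volume, low-ambiguity work
    JUDGMENT = "judgment"        # AI surfaces, a human owns the conclusion
    GOVERNANCE = "governance"    # AI never operates independently


# Hypothetical policy table: what the AI does and what the human does.
POLICY = {
    Category.RULE_BASED: ("execute", "set rules and monitor exceptions"),
    Category.JUDGMENT: ("surface information and flag anomalies", "own the conclusion"),
    Category.GOVERNANCE: ("assist only", "decide and sign off"),
}

# Illustrative mapping using the task examples from the text.
TASK_CATEGORIES = {
    "transaction_categorization": Category.RULE_BASED,
    "reconciliation_matching": Category.RULE_BASED,
    "revenue_recognition": Category.JUDGMENT,
    "variance_explanation": Category.JUDGMENT,
    "audit_sign_off": Category.GOVERNANCE,
    "investor_reporting": Category.GOVERNANCE,
}


def plan_for(task: str, owner: str) -> dict:
    """Return the AI role, human role and named owner for a task."""
    if not owner:
        # Accountability needs a named person, so refuse to plan without one.
        raise ValueError("every task needs a named human owner")
    # Assumption: anything not explicitly mapped gets the strictest treatment.
    category = TASK_CATEGORIES.get(task, Category.GOVERNANCE)
    ai_role, human_role = POLICY[category]
    return {
        "task": task,
        "category": category.value,
        "ai_role": ai_role,
        "human_role": human_role,
        "owner": owner,
    }
```

Calling plan_for("revenue_recognition", "controller") returns a plan in which the AI only surfaces and flags while the controller owns the conclusion. The point of the sketch is the order of operations: the human role is decided per task before the AI role is.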

Anthropic, whose models power many of the AI tools in the finance space, has been clear about this in its alignment research: the goal of responsible AI deployment is not to remove humans from the loop but to make human oversight more effective. The humans in the loop need to have real authority and real information, not just a checkbox to click.

What It Looks Like in Practice

A well-designed human-led AI finance operation for a mid-sized company might work something like this. Reconciliations run automatically every night. Exceptions are surfaced to a human the next morning with context attached, not raw data. The human reviews exceptions, not matches. Close checklists update in real time. Variance commentary is drafted by AI and edited by a human who understands the business context. The controller reviews a summary, not a spreadsheet.
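As a rough illustration of "the human reviews exceptions, not matches," the nightly reconciliation step might be sketched like this. It is a simplified example under stated assumptions, not a production design: the Entry and ReconException types and the reconcile function are hypothetical names, entries are matched on a unique reference, and amounts must agree exactly.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Entry:
    ref: str          # assumed unique per source (ledger or bank)
    amount: Decimal


@dataclass(frozen=True)
class ReconException:
    ref: str
    reason: str                       # the context shown to the reviewer
    ledger_amount: Optional[Decimal]
    bank_amount: Optional[Decimal]


def reconcile(ledger: List[Entry], bank: List[Entry]) -> Tuple[int, List[ReconException]]:
    """Count clean matches and collect only the exceptions for human review."""
    bank_by_ref = {e.ref: e for e in bank}
    matched = 0
    exceptions: List[ReconException] = []

    for entry in ledger:
        other = bank_by_ref.pop(entry.ref, None)
        if other is None:
            exceptions.append(ReconException(entry.ref, "missing from bank", entry.amount, None))
        elif other.amount != entry.amount:
            exceptions.append(ReconException(entry.ref, "amount mismatch", entry.amount, other.amount))
        else:
            matched += 1  # clean match: never shown to the human

    # Whatever is left on the bank side has no ledger counterpart.
    for other in bank_by_ref.values():
        exceptions.append(ReconException(other.ref, "missing from ledger", None, other.amount))

    return matched, exceptions
```

The reviewer's morning queue is the exceptions list, each item carrying a reason and both amounts rather than raw rows. The matched count goes into the summary the controller sees.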

The humans are doing less volume and more judgment. The AI is doing more volume and no judgment. That is the design principle.

When something goes wrong, and AI accounting automation will produce errors, there is a clear owner. The human who was responsible for the workflow is responsible for the error. Not because we are looking for someone to blame, but because accountability is what makes governance real.

The Honest Version of Human-Led

The honest version of human-led AI is not a selling point. It is a constraint. It means accepting that some things are slower because a human needs to be meaningfully involved, not ceremonially involved. It means designing workflows where the AI handles the work and the human handles the judgment, with a clear handoff between them. It means building systems where AI failures are caught early rather than discovered during the audit.

That is harder to build than full automation. It is also the only version that actually works.

If your AI finance implementation is faster because humans are less involved, you have not built a human-led system. You have built a liability with good marketing.

What We Do Differently at SISU

Every engagement we take starts with the same question: where does judgment live in this workflow? We map it before we touch any tooling. We design the human roles before we design the AI roles. And we are explicit with every client about what the AI will and will not own.

That is what human-led actually means.

Ready to design an AI finance operation where governance is built in, not bolted on?

Book a SISU Finance Operations Assessment →

We will map your workflows, identify where AI creates leverage and where humans need to stay in control, and build the operational design before we recommend a single tool.

References: Good to Great, Jim Collins | Anthropic AI Alignment & Safety Research | High Output Management, Andy Grove

Leticia Esteve