Lead Phoenix AI

Your AI Mandate Stalls at the Data Layer, Not the Model

You bought the Copilot licenses. The agent reads your inbox but can't touch the GL. Most finance AI mandates fail at the data layer, not the model. Here's the fix you can ship this quarter without waiting for the platform team.

Source response to "CFO tips: Confronting data challenges as AI scales" by CFO Dive, on Gartner CFO research, published 2026.

You buy the Copilot licenses. The board tells you to ship AI wins across finance this year. You sit down at your desk and try the obvious test.

"Why did gross margin drop in EMEA last quarter?"

Copilot reads your inbox. It reads the close-week slides. It reads the email thread where your controller flagged a freight variance. It cannot read the GL. It cannot touch the lake. It cannot see the Spotfire extract your team rebuilds every Monday. The agent is technically running, and it cannot answer the question you actually asked.

This is the AI data layer problem, and it's the single reason most finance AI mandates stall before they start.

Every pilot hits the same wall

The MIT NANDA report that's been quoted everywhere says 95% of GenAI pilots deliver zero measurable return. You can argue with the number. You can't argue with the texture. Gartner says 38% of leaders running stalled AI projects blame poor data quality or limited data availability. Cloudera and Harvard Business Review Analytic Services found only 7% of enterprises say their data is completely ready for AI.

For Microsoft 365 Copilot specifically, 40% of organisations delayed their rollout by three or more months over data exposure concerns. The Copilot diagnosis from one practitioner post-mortem is blunt: what you put in dictates what you get out. What goes into Copilot is your SharePoint, your Outlook, your Teams. What does not go in is the lake, the GL, the data marts, the Spotfire-trapped extracts. Every Copilot license you bought is a chatbot reading the wrong files.

CFO Dive, summarising Gartner's CFO research, names what finance leaders actually live with: incomplete data, misaligned definitions, forgotten fields and competing sources of truth. The dashboards your team built were designed for a human to look at. From an agent's perspective, a Spotfire view is opaque. No API. No SQL endpoint. No semantic layer the model can query.

So you have an AI mandate, a tool that works, and no path between them.

The orthodox answer is a re-platform you don't have time for

Ask any analyst and the answer is the same. Fix the data foundation first. Build the catalog. Stand up the semantic layer. Re-platform the lake. BCG's 10-20-70 rule assigns 10% of the effort to algorithms, 20% to technology and data, and 70% to people and process, so the model is the smallest part of the job. Gartner predicts that through 2026, organisations will abandon 60% of AI projects that aren't supported by AI-ready data.

That destination is correct. The timeline is a fantasy for the person who owns the mandate this quarter.

You don't have 18 months. You have a bonus tied to AI wins, an executive sponsor losing patience, and a team already running Excel against extracts that someone manually exported from the lake. Waiting for the platform team to deliver a governed semantic layer is the same thing as failing the mandate and hoping nobody notices.

The fix isn't waiting. The fix is shipping scoped agents against the slice of data you can already reach, in parallel with the platform work.

What "the slice you can reach" actually means

Walk your finance team's desktop for one day. Notice what's already on it.

The controller pulls a GL extract every month for the close. That extract exists. The FP&A analyst runs a Spotfire view and downloads it to CSV every Monday. That CSV exists. The AR lead exports an ageing report from the ERP. That export exists. Someone built a pivot table that reconciles two systems by hand. That pivot table exists.

This is your reachable surface. It's the data your team has already done the hard work of making accessible, even if the platform team has not. The agent doesn't need the lake. It needs the same files your team already uses, pointed at a model that knows how to read them and a workflow that knows what to do with the answer.

The ex-Microsoft team behind Maximor raised on exactly this thesis: build the agents on the Excel and extract surface area finance already maintains, not on an idealised lake. The CFA Institute's research on agentic finance describes the same shape. Agents act as preparers, humans act as reviewers, the way junior staff prepare and managers approve.

This isn't a workaround. It's the right architecture for the reality of your stack.

The human-approval gate is the trust layer

Every CFO who's been burned by a hallucinated number asks the same question. How do you ship an agent on a CSV extract without losing audit and compliance the moment the auditor walks in?

You build the gate in from day one. The agent prepares. The human approves. Every output cites the row, file, or extract it pulled from. Every action above a threshold sits behind a sign-off. The agent can draft a reconciliation, flag a variance, or summarise an exception. It's not allowed to write back to the GL until a human has reviewed and approved the entry.
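A minimal sketch of that gate, to make the rules concrete. The names (DraftAction, can_proceed), the 10,000 threshold and the "file#row" citation format are all assumptions for illustration, not a description of any particular product.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical threshold: anything above this amount needs a named approver.
APPROVAL_THRESHOLD = 10_000.00


@dataclass
class DraftAction:
    description: str
    amount: float
    sources: List[str]               # assumed citation format: "file#row=N"
    writes_to_gl: bool = False
    approved_by: Optional[str] = None


def needs_approval(action: DraftAction) -> bool:
    """GL write-backs always need sign-off; other actions only above the threshold."""
    return action.writes_to_gl or abs(action.amount) > APPROVAL_THRESHOLD


def can_proceed(action: DraftAction) -> bool:
    """Block uncited output, and block gated actions until a human signs off."""
    if not action.sources:
        return False
    if needs_approval(action):
        return action.approved_by is not None
    return True


flag = DraftAction("Flag EMEA freight variance", 2_500.00, ["gl_extract.csv#row=412"])
posting = DraftAction("Post accrual", 2_500.00, ["gl_extract.csv#row=412"], writes_to_gl=True)
print(can_proceed(flag))     # small, cited, read-only: allowed
print(can_proceed(posting))  # GL write-back with no approver: blocked
```

The design choice worth noticing is that the citation check comes first. An output with no source never reaches the approver, so reviewers only spend time on drafts they can trace.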

This is the same shape we use for any sensitive finance workflow. We've written about it for board-pack drift and for hallucinations in close numbers. The pattern is identical. The agent does the 85%. The human owns the last 15% and the sign-off. Audit gets a log of who ran what, when, with what inputs.
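The audit log can be as simple as one record per run. The sketch below is one possible shape, using only the Python standard library; the field names and the example user and agent names are assumptions.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Optional


def audit_record(run_by: str, agent: str, input_name: str,
                 input_bytes: bytes, approved_by: Optional[str] = None) -> dict:
    """Build one audit entry: who ran what, when, with what inputs."""
    return {
        "run_at": datetime.now(timezone.utc).isoformat(),
        "run_by": run_by,
        "agent": agent,
        "input_name": input_name,
        # Hashing the extract lets a reviewer confirm which version was read.
        "input_sha256": hashlib.sha256(input_bytes).hexdigest(),
        "approved_by": approved_by,
    }


record = audit_record("fpa.analyst", "margin-drift-agent",
                      "gl_extract.csv", b"project,amount\nAlpha,100\n")
log_line = json.dumps(record, sort_keys=True)  # one JSON line per run
```

Appending each line to a write-once store is what turns this from a debug log into evidence. Where that store lives is a decision for your own controls.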

That gate isn't friction. It's the reason a CFO can actually deploy this without burning their career on a model error.

One move you can make this week

Pick one question your team already asks every month, and that the existing tools cannot answer cleanly.

Not the question Copilot answers. The one Copilot can't.

Where are we under-billing this quarter? Which customers are slipping on payment behaviour before the ageing report flags them? Which projects had margin drift between forecast and actual, and what changed?

Find the extract or CSV your team is already pulling that has the inputs for that question. Build one scoped agent on that file, with a human-approval gate, and let it run for four weeks. Measure the time it saves and the catches it surfaces.
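To show how small the first build can be, here is a sketch of the deterministic core of a margin-drift check on a CSV extract. The column names, the sample figures, the five-point tolerance and the file name are invented for illustration; your extract will differ. The model's job sits on top of this, explaining what changed, while the arithmetic and the row citations stay in plain code.

```python
import csv
import io

# Invented sample standing in for a project extract your team already pulls.
SAMPLE = """project,forecast_revenue,forecast_cost,actual_revenue,actual_cost
Alpha,100000,60000,100000,62000
Beta,80000,48000,78000,58500
"""


def margin(revenue: float, cost: float):
    """Gross margin as a fraction of revenue, or None when revenue is zero."""
    if revenue == 0:
        return None
    return (revenue - cost) / revenue


def margin_drift(csv_text: str, tolerance: float = 0.05, source: str = "extract.csv"):
    """Return rows whose actual margin moved more than `tolerance` from forecast."""
    findings = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        forecast = margin(float(row["forecast_revenue"]), float(row["forecast_cost"]))
        actual = margin(float(row["actual_revenue"]), float(row["actual_cost"]))
        if forecast is None or actual is None:
            continue  # nothing to compare; a fuller version would flag this row
        drift = actual - forecast
        if abs(drift) > tolerance:
            findings.append({
                "project": row["project"],
                "drift": round(drift, 4),
                "source": f"{source}#row={row_number}",  # citation for the reviewer
            })
    return findings


findings = margin_drift(SAMPLE)
```

With the sample above, Alpha drifts two points and stays under the tolerance, while Beta falls from a 40% to a 25% margin and is flagged with its row citation.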

That's your first AI win. It doesn't require the lake. It doesn't require the catalog. It doesn't require anyone to wait.

Most finance teams don't have an AI tooling problem. They have an AI ownership problem. Someone has to look at the reachable surface, pick the first question, and ship the agent against it while the platform work happens in parallel. That's the role we play.

If you want a two-week scoped read of what your team's reachable surface actually looks like and where the first agent should sit, that's what our AI Readiness Audit is built to do.

Frequently Asked Questions

Why can't Copilot just query our data lake?

Copilot reads what you've already authorised in Microsoft 365: SharePoint, OneDrive, Outlook, Teams. Your data lake, ERP, GL and BI dashboards live outside that boundary unless someone explicitly connects them through Graph or an MCP server. For most finance teams that connection hasn't been built, which is why Copilot can answer questions about emails and decks but not about transactions or margins.

Isn't a scoped agent on a CSV just a fragile workaround?

Only if you treat it as permanent. The point is to ship a real outcome on the slice you can reach now, while the platform team builds the catalog and semantic layer. When the lake becomes addressable, the agent gets re-pointed at the cleaner source. The workflow stays, the data input upgrades.
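One way to keep that re-pointing cheap is to have the workflow depend on a small interface rather than on the file itself. The sketch below assumes hypothetical names (RowSource, CsvExtractSource, GovernedSourceStub) and invented columns; the stub stands in for whatever governed source the platform team eventually exposes.

```python
import csv
import io
from typing import Dict, Iterable, List, Protocol


class RowSource(Protocol):
    """Anything that can hand the workflow rows as dictionaries."""
    def rows(self) -> Iterable[Dict[str, str]]: ...


class CsvExtractSource:
    """Today's input: the extract the team already exports."""
    def __init__(self, csv_text: str) -> None:
        self.csv_text = csv_text

    def rows(self) -> Iterable[Dict[str, str]]:
        return csv.DictReader(io.StringIO(self.csv_text))


class GovernedSourceStub:
    """Placeholder for a future governed source; here it just holds rows in memory."""
    def __init__(self, data: List[Dict[str, str]]) -> None:
        self.data = data

    def rows(self) -> Iterable[Dict[str, str]]:
        return iter(self.data)


def total_overdue(source: RowSource) -> float:
    """The workflow only sees RowSource, so swapping the input leaves it untouched."""
    return sum(float(r["amount"]) for r in source.rows() if r["status"] == "overdue")


extract = CsvExtractSource("customer,amount,status\nAcme,1200,overdue\nBolt,300,paid\n")
governed = GovernedSourceStub([
    {"customer": "Acme", "amount": "1200", "status": "overdue"},
    {"customer": "Bolt", "amount": "300", "status": "paid"},
])
```

Both sources give total_overdue the same answer, which is the point: the workflow and its approval gate stay in place while the input behind them changes.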

How do auditors react to an agent built on an extract instead of the system of record?

They react well when the audit trail is clean. Every output cites its source row or file. Every action above a threshold is human-approved. The system of record is never written to without sign-off. Auditors aren't afraid of automation; they're afraid of automation without a log.

What's the realistic shelf life of one of these scoped agents before the underlying extract changes shape?

Three to nine months for most finance extracts, in our experience. Schema drift is real, which is why the agent should validate the input's shape on every run and abstain when the shape doesn't match what it expects. Maintenance is part of the cost of running this, and it's still dramatically cheaper than waiting for the lake to be ready.
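A shape check of this kind can be a few lines. The sketch below assumes a hypothetical set of expected columns for an ageing extract and treats a missing column as a reason to abstain; extra columns are reported but tolerated.

```python
import csv
import io

# Hypothetical columns the agent was built against.
EXPECTED_COLUMNS = {"customer_id", "invoice_date", "due_date", "amount"}


def check_shape(csv_text: str):
    """Compare the extract's header with the columns the agent expects."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = set(reader.fieldnames or [])
    missing = sorted(EXPECTED_COLUMNS - header)
    unexpected = sorted(header - EXPECTED_COLUMNS)
    return missing, unexpected


def run_agent(csv_text: str) -> dict:
    """Abstain, rather than guess, when a required column has disappeared."""
    missing, unexpected = check_shape(csv_text)
    if missing:
        return {"status": "abstained", "missing": missing, "unexpected": unexpected}
    row_count = sum(1 for _ in csv.DictReader(io.StringIO(csv_text)))
    return {"status": "ok", "rows": row_count, "unexpected": unexpected}


good = run_agent("customer_id,invoice_date,due_date,amount\nC1,2026-01-05,2026-02-04,1200\n")
drifted = run_agent("customer_id,invoice_date,due_dt,amount\nC1,2026-01-05,2026-02-04,1200\n")
```

In the second call the due_date column has been renamed, so the run abstains and names the missing column. That gives whoever maintains the agent a specific fix rather than a silently wrong number.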

Where does this leave the bigger data platform programme?

Exactly where it was. The 12-to-24-month work on catalog, semantic layer and governed access is still the right destination. Scoped agents don't replace it; they buy you time and credibility while it gets built, and they teach your team what questions are worth answering at scale once the platform lands.

Sources

Cited inline above:

  • Fortune — MIT report: 95% of generative AI pilots at companies are failing
  • Cloudera / Harvard Business Review Analytic Services — Only 7% of enterprises say their data is completely ready for AI
  • Computerworld — Microsoft 365 Copilot rollouts slowed by data security, ROI concerns
  • Avantiico — Why Microsoft 365 Copilot adoption fails and what fixes it
  • CFO Dive — CFO tips: confronting data challenges as AI scales

Additional sources consulted for this piece:

  • Gartner — AI projects in infrastructure and operations stall ahead of meaningful ROI returns
  • Gartner — Lack of AI-ready data puts AI projects at risk
  • Teradata — AI agent financial analysis
  • CFA Institute — The Automation Ahead: Agentic AI for Finance
  • BCG — The CFO's AI agenda: from automation to advantage
  • Atlan — Data catalog for AI
  • TechCrunch — Former Microsoft executives launch AI agents to end Excel-driven finance
  • Concentric — Too much access: Microsoft Copilot data risks explained
  • Marketing AI Institute — That viral MIT study claiming 95% of AI pilots fail? Don't believe the hype
  • Search Engine Land — Gartner: 40% of agentic AI projects will fail, making humans indispensable