Seventy-eight percent of AI agent pilots never reach production.[1] That is not a technology problem. It is an architecture problem, and at the centre of most failed pilots sits the same overlooked question: when a system has multiple AI agents, who decides which agent handles which input, and what happens when it gets that decision wrong?

The answer is routing. It sounds like plumbing. In a regulated firm, it is the difference between an AI deployment that can be audited, governed, and trusted, and one that cannot.

What is AI agent routing, and why does it matter for your firm

Routing is the mechanism that decides which agent in a multi-agent system handles a given input at each step in a workflow. In a single-agent setup this question does not arise. But as soon as a system involves more than one AI component (a client query classifier, a document drafter, a compliance checker, a data retrieval agent), something has to coordinate them.

That something is the router. In legacy automation systems, routing worked on rigid category matching: if the input contained the word “complaint”, it went to the complaints handler. Modern LLM-based routers analyse intent, tone, and context to make that assignment, which is more flexible and more powerful.[2]

It is also more complex to govern, and that matters directly to a financial advice firm operating under Consumer Duty and SMCR.
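
To make the contrast concrete, the rigid category matching described above might look like the following minimal sketch. The handler names, the keywords, and the `route_by_keyword` function are hypothetical, chosen for illustration and not taken from any particular product.

```python
# Hypothetical keyword-to-handler table for a legacy rule-based router.
KEYWORD_ROUTES = {
    "complaint": "complaints_handler",
    "transfer": "pension_transfer_handler",
}
DEFAULT_HANDLER = "general_enquiries"


def route_by_keyword(message: str) -> str:
    """Return the first handler whose keyword appears in the message."""
    text = message.lower()
    for keyword, handler in KEYWORD_ROUTES.items():
        if keyword in text:
            return handler
    return DEFAULT_HANDLER


# Matches because the literal word is present.
print(route_by_keyword("I want to make a complaint about my adviser"))
# A dissatisfied client who never uses the word falls through to the default.
print(route_by_keyword("I am very unhappy with the advice I was given"))
```

The second message is the kind of input an intent-based router is meant to catch. It is also the kind of decision that then has to be logged and explained.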

What goes wrong when routing fails

Misrouted inputs in multi-agent systems can cause compounded errors, not just single mistakes.[2] A query that should go to a compliance-checking agent but instead reaches a response-drafting agent produces output that has never been reviewed against your obligations. That output may reach a client, or land in a file, before anyone notices.

The risk is structural, not just operational. If a firm cannot demonstrate that its AI workflows correctly route inputs to appropriate review steps, it cannot demonstrate oversight. And oversight is what the FCA and the Financial Policy Committee are now looking for explicitly.[3]

The joint statement from the FCA, the Bank of England, and HM Treasury in early 2026 confirmed that frontier AI models now exceed baseline cyber resilience capabilities, moving AI governance from internal best practice to a formal regulatory obligation for UK financial services firms.[3] That statement was not primarily about chatbots. It was about firms that have deployed AI into operational workflows without being able to account for how those systems make decisions.

Routing is one of the places where accountability either exists or does not.

A firm that cannot explain which agent handled a given input, and why, cannot demonstrate the oversight that regulators are now looking for.
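
One way to make that explanation possible is to record every routing decision at the moment it is made. The sketch below is a minimal illustration, assuming a hypothetical `RoutingDecision` record and an append-only JSON Lines log. The field names are assumptions for the example, not a regulatory schema.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from io import StringIO
from typing import TextIO


@dataclass(frozen=True)
class RoutingDecision:
    input_id: str        # identifier of the input being routed
    agent: str           # agent that was chosen
    reason: str          # why the router chose it, in plain language
    router_version: str  # which version of the routing logic decided
    decided_at: str      # UTC timestamp in ISO 8601 format


def log_decision(log: TextIO, input_id: str, agent: str, reason: str,
                 router_version: str) -> RoutingDecision:
    """Append one routing decision to the log as a single JSON line."""
    decision = RoutingDecision(
        input_id=input_id,
        agent=agent,
        reason=reason,
        router_version=router_version,
        decided_at=datetime.now(timezone.utc).isoformat(),
    )
    log.write(json.dumps(asdict(decision)) + "\n")
    return decision


def decisions_for(log_text: str, input_id: str) -> list[dict]:
    """Answer 'which agent handled this input, and why?' from the log."""
    records = (json.loads(line) for line in log_text.splitlines() if line)
    return [r for r in records if r["input_id"] == input_id]


# In-memory stand-in for a log file, so the example runs without disk access.
audit_log = StringIO()
log_decision(audit_log, "q-001", "compliance_checker",
             "mentions pension transfer", "v1")
log_decision(audit_log, "q-002", "document_drafter",
             "routine letter request", "v1")
history = decisions_for(audit_log.getvalue(), "q-001")
```

Recording the router version alongside the reason means a later change to the routing logic does not make earlier decisions unexplainable.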

The three routing patterns you will encounter

The Patronus AI guide identifies three main patterns.[2] Understanding them is useful because they carry different governance implications.

Single-agent routing directs every input to one agent. Simple, auditable, but limited. Appropriate for narrow, well-defined tasks where the scope is unlikely to change.

Multi-agent parallel routing sends an input to multiple agents simultaneously, then aggregates the outputs. Faster, but harder to audit: if two agents produce conflicting outputs, either the system applies a defined resolution rule or a human has to resolve the conflict.
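
A defined resolution rule can be very small. The following sketch assumes two hypothetical agents that each return a simple verdict, and one possible rule: unanimous verdicts stand, and any disagreement is escalated to a human. A real system might well use a different rule.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

Agent = Callable[[str], str]


def rules_agent(text: str) -> str:
    # Hypothetical agent: flags anything mentioning a transfer.
    return "flag" if "transfer" in text.lower() else "pass"


def tone_agent(text: str) -> str:
    # Hypothetical agent: flags anything that sounds urgent.
    return "flag" if "urgent" in text.lower() else "pass"


def route_parallel(text: str, agents: dict[str, Agent]) -> dict:
    """Send one input to every agent at once, then apply the resolution rule."""
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(agent, text) for name, agent in agents.items()}
        verdicts = {name: future.result() for name, future in futures.items()}

    # Resolution rule (an assumption): unanimous verdicts stand,
    # any conflict goes to a human reviewer.
    if len(set(verdicts.values())) == 1:
        outcome = next(iter(verdicts.values()))
    else:
        outcome = "escalate_to_human"
    return {"verdicts": verdicts, "outcome": outcome}


AGENTS = {"rules": rules_agent, "tone": tone_agent}
agreed = route_parallel("Please update my address", AGENTS)
conflict = route_parallel("Question about a pension transfer", AGENTS)
```

Returning the individual verdicts alongside the outcome keeps the disagreement itself visible to whoever audits the decision later.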

Hierarchical routing uses a coordinating agent (sometimes called an orchestrator) to assign inputs to specialist sub-agents in sequence, based on what each step produces. This is the most capable pattern and the most complex to govern. The orchestrator’s decision logic needs to be documented, testable, and reviewable, because it effectively determines the entire downstream workflow.

For regulated tasks (suitability checking, client communication, AML screening), hierarchical routing requires that the human review step is explicitly built into the routing logic, not added as an afterthought at the end. Any routing decision that bypasses human review of regulated output is a governance gap, not an efficiency gain.
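
As a rough illustration of review built into the routing logic, the sketch below has a hypothetical orchestrator that plans the sequence of steps for each task type. The task names, the agent names, and the `REGULATED_TASKS` set are assumptions for the example. The point is that the review step is part of the plan the orchestrator produces, so no regulated path skips it.

```python
REGULATED_TASKS = {"suitability_check", "client_communication", "aml_screening"}

# Hypothetical specialist sub-agents for each task type, in order.
SPECIALISTS = {
    "suitability_check": ["data_retrieval", "suitability_analyst"],
    "client_communication": ["document_drafter", "compliance_checker"],
    "aml_screening": ["data_retrieval", "aml_screener"],
    "meeting_summary": ["document_drafter"],
}


def plan_route(task_type: str) -> list[str]:
    """Return the ordered steps the orchestrator will run for a task."""
    if task_type not in SPECIALISTS:
        # Fail visibly: unknown task types are never routed by guesswork.
        raise ValueError(f"No route defined for task type: {task_type!r}")
    steps = list(SPECIALISTS[task_type])
    if task_type in REGULATED_TASKS:
        # Review is inserted by the router itself, before delivery.
        steps.append("human_review")
    steps.append("deliver")
    return steps


regulated_plan = plan_route("client_communication")
unregulated_plan = plan_route("meeting_summary")
```

Because the plan is an ordinary list, it can be logged, tested, and shown to a reviewer without reading the orchestrator's code.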

What the EU AI Act adds to the picture

The EU AI Act classifies AI systems used in financial services, including those involved in creditworthiness assessment and client suitability, as high-risk.[4] High-risk classification carries specific obligations: documented risk management systems, data governance controls, transparency to affected individuals, and human oversight mechanisms that are effective in practice, not just described in a policy document.

For a multi-agent system, that means the routing logic itself is subject to scrutiny. If the system routes a suitability-related query through an automated decision path with no human checkpoint, that is not compliant architecture under the Act’s high-risk requirements. The orchestrator’s logic is part of the system’s risk profile, not separate from it.

Firms within scope of the Act (including those serving EU clients) should treat the routing layer as a regulated component, document it accordingly, and test it as part of ongoing risk management.

Why most pilots stall before any of this is addressed

Only 17% of firms have deployed AI agents in production. Over 60% plan to within two years.[5] The gap between those two numbers is explained partly by the statistic at the start of this article: 78% of pilots stall, and 89% of failures trace back to five causes.[1]

Gartner’s identification of “agent-washing” is relevant here: the majority of vendor products marketed as AI agents in 2026 are rebranded legacy automation, not genuine agent architectures.[6] A firm that buys an “AI agent” platform and discovers it has no meaningful routing capability, no audit trail of which agent handled what, and no configurable human oversight steps has not deployed a multi-agent system. It has deployed a slightly more expensive workflow tool with a better marketing budget.

Due diligence on vendor routing capability, audit logging, and human oversight configuration should be part of any procurement conversation, not a technical afterthought.

What to do if your firm is building or evaluating a multi-agent system

First, map the inputs. List every type of input your proposed system will receive, and for each one, ask which agent should handle it and why. If you cannot answer that question clearly, you do not yet have a routing design. You have a system where the routing will be decided implicitly, which means you cannot audit it.
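
The mapping exercise can be captured in something as simple as a table. The sketch below assumes hypothetical input types and agent names, and adds a check that reports any input type with no assigned agent or no recorded reason.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InputMapping:
    input_type: str
    agent: str    # which agent should handle it
    reason: str   # why that agent, in one sentence


# Hypothetical input map for an advice firm; entries are illustrative.
INPUT_MAP = [
    InputMapping("client_email_query", "query_classifier",
                 "Needs intent classification before anything else."),
    InputMapping("fact_find_document", "data_retrieval",
                 "Structured extraction, no client-facing output."),
    InputMapping("suitability_letter_request", "", ""),  # not yet decided
]


def unmapped_inputs(input_map: list[InputMapping]) -> list[str]:
    """List input types that lack an agent or a documented reason."""
    return [m.input_type for m in input_map
            if not m.agent.strip() or not m.reason.strip()]


gaps = unmapped_inputs(INPUT_MAP)
print("Routing design incomplete for:", gaps)
```

Any entry that appears in `gaps` is an input whose routing would otherwise be decided implicitly.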

Second, document the routing logic explicitly. For hierarchical systems, write out what the orchestrator decides and on what basis. This does not need to be formal engineering documentation at the start. It needs to be honest enough that you could explain it to a regulator, or to the accountable senior manager under SMCR.

Third, build the human review step into the routing, not onto the end. Every regulated output (a draft suitability letter, a compliance flag, a client communication) should route through a defined human review node before it leaves the system. If the current architecture treats that review as optional or sequential (done after everything else), move it inside the routing logic.
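
A simple automated check can confirm that this holds for every route. The following sketch assumes routes are expressed as ordered lists of node names, with hypothetical `human_review` and `deliver` nodes. It reports any route where review is missing or comes after delivery.

```python
# Hypothetical route definitions: ordered node names per regulated output.
ROUTES = {
    "draft_suitability_letter": ["document_drafter", "human_review", "deliver"],
    # Review missing entirely.
    "compliance_flag": ["compliance_checker", "deliver"],
    # Review present, but only after the output has left the system.
    "client_communication": ["document_drafter", "deliver", "human_review"],
}


def review_gaps(routes: dict[str, list[str]],
                review_node: str = "human_review",
                exit_node: str = "deliver") -> dict[str, str]:
    """Return a problem description for each route that can bypass review."""
    problems = {}
    for name, nodes in routes.items():
        if exit_node not in nodes:
            continue  # nothing leaves the system on this route
        if review_node not in nodes:
            problems[name] = "no human review node"
        elif nodes.index(review_node) > nodes.index(exit_node):
            problems[name] = "review happens after delivery"
    return problems


problems = review_gaps(ROUTES)
```

Run against the real route definitions whenever they change, a check like this turns "review is built in" from a policy statement into something that can be demonstrated.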

Fourth, test with adversarial inputs. Send inputs that are ambiguous, misleading, or edge-case. Does the system route them correctly? Does it fail quietly or visibly? A routing failure you cannot detect is more dangerous than one that surfaces clearly.
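
A small harness makes this kind of test repeatable. The sketch below is illustrative only: `toy_router` is a hypothetical stand-in for the system under test, and the test cases and expected routes are invented for the example. It separates visible failures (the router admits it is unsure) from quiet ones (the router confidently picks the wrong agent).

```python
from typing import Callable, Optional

Router = Callable[[str], Optional[str]]  # None means "unsure, escalate"


def toy_router(text: str) -> Optional[str]:
    # Hypothetical stand-in for the real router under test.
    lowered = text.lower()
    if "complaint" in lowered:
        return "complaints_handler"
    if "transfer" in lowered:
        return "compliance_checker"
    if len(lowered.split()) < 3:
        return None  # too little to go on: fail visibly
    return "document_drafter"


# Invented adversarial cases: (input, agent that should handle it).
CASES = [
    ("No complaint at all, just move my pension", "compliance_checker"),
    ("Hi", "query_classifier"),
    ("I am deeply unhappy with your advice", "complaints_handler"),
    ("Please confirm my transfer value", "compliance_checker"),
]


def run_adversarial(router: Router, cases: list[tuple[str, str]]) -> dict:
    """Classify each case as correct, a visible failure, or a quiet failure."""
    report = {"correct": [], "visible_failure": [], "quiet_failure": []}
    for text, expected in cases:
        actual = router(text)
        if actual == expected:
            report["correct"].append(text)
        elif actual is None:
            report["visible_failure"].append(text)
        else:
            report["quiet_failure"].append(text)
    return report


report = run_adversarial(toy_router, CASES)
```

In this toy run, the two quiet failures are the ones to worry about: the router returned a confident answer, so nothing downstream would flag them.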

Fifth, assess your vendor’s routing transparency. Can they show you the routing logic? Can they demonstrate an audit trail showing which agent handled which input? If the answer is no, that is a governance gap your firm takes on, not the vendor.

How this maps to your AI integration decision

The question I am asked most often is: “Do we need to build this ourselves, or can we buy it?”

For most advice firms, the honest answer is Level 2 (integration). You do not need custom engineering to get a governed multi-agent workflow. You need to configure existing orchestration tools (n8n, Make, or similar) with explicit routing rules, human review nodes, and audit logging. That is configuration work, not development work. It typically takes days to weeks, not months.

Level 3 (custom build) becomes relevant when the volume of inputs is high enough that manual routing review is impractical, or when the regulated tasks are complex enough that a generic orchestration tool cannot express the logic your compliance team needs. That threshold is higher than most firms assume.

Most firms that tell me they need a custom multi-agent build actually need someone to sit with their compliance team, map their workflow inputs, and configure a Level 2 integration that routes correctly and logs everything. That is a much smaller project than the vendor proposals they have usually received.

If your firm is at the point of evaluating multi-agent systems or trying to understand why a pilot has stalled, a discovery call with Cordrey Consulting is a good place to start.


This article is for informational purposes only and does not constitute regulated financial advice or a compliance opinion. Consult a qualified compliance professional for advice specific to your firm.

This article reflects the EU AI Act as understood at the date of publication. Implementation timelines have been subject to amendment. Verify current requirements against primary EU sources and take qualified legal advice for your specific circumstances.

This article does not constitute legal advice. Data protection obligations vary by circumstance and jurisdiction. Consult a qualified solicitor or data protection adviser for advice specific to your firm.


Sources

  • [1] Relevance AI, Why 78% of AI Agent Pilots Never Reach Production (2026). Cited for 78% pilot stall rate and 89% of failures traced to five causes.
  • [2] Patronus AI, Ultimate Guide to AI Agent Routing (2026).
  • [3] FCA, Bank of England, and HM Treasury joint statement on frontier AI and systemic cyber risk (2026). https://www.fca.org.uk (primary regulatory source; check FCA website for the current version of this statement).
  • [4] European Parliament and Council, Regulation (EU) 2024/1689 laying down harmonised rules on artificial intelligence (AI Act), Articles 6 and 9 on high-risk AI systems and risk management obligations. https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:32024R1689
  • [5] Relevance AI, Enterprise AI Agents in Production 2026: Governance, Workflows and Deployment Strategy (2026). Cited for 17% deployed / 60%+ planning within two years.
  • [6] Gartner, Hype Cycle for Agentic AI 2026 (2026). Cited for identification of “agent-washing” in vendor marketing.