KYC for AI Agents: Bringing Financial-Services Identity Discipline to Autonomous Systems

A bank does not open an account because a customer remembers a password. A password proves the person at the counter is the same person who was at the counter last week. It does not prove that person should have been at the counter at all. That second question (who are you, where did you come from, are you on a list anywhere, and who is going to be accountable when something goes wrong) is the work of KYC. Banks have been refining the answer for more than three decades, since the FATF Recommendations first turned the discipline into a global rulebook in 1990.

AI agents are roughly where retail banking sat in the late 1980s. We have authentication and authorisation; the principal-of-record chain handles those well enough. We have almost nothing else. There is no industry standard for proving who deployed an agent, what it is made of, whether it has been banned for misuse elsewhere, who is accountable when it acts, or how its behaviour is monitored over its lifetime. Most regulated estates are running agents that would fail a customer-due-diligence check at the bank that owns them. This piece is the case for closing the gap, before the gap becomes the next enforcement headline.

Key takeaways

IAM for AI agents handles authentication and scope. KYC for AI agents handles identity provenance, beneficial ownership, ban-list screening, ongoing monitoring, and suspicious-behaviour reporting — the controls that decide whether an agent should be operating in the estate at all.
The five FATF KYC pillars map cleanly: customer identification becomes signed agent manifests at deployment; due diligence becomes scope and tool-call attestation; sanctions screening becomes a ban-list service for deprecated models and red-team-failed adapters; ongoing monitoring becomes continuous re-attestation against eval suites; SARs become suspicious-agent reports with an automatic kill-switch.
Beneficial ownership for agents is a graph — end-user, orchestrator service owner, model provider, deploying organisation — and FATF Recommendation 24 is the right structural model.
Politically Exposed Agents are agents with elevated authority in treasury, HR, procurement, or legal. They warrant enhanced due diligence on the same logic as FATF Recommendation 12 PEPs.
CBN's Risk-Based Cybersecurity Framework, FATF Rec 10, MAS Notice 626, NDPA Section 65, and EU AI Act Articles 12 and 14 already converge on the same shape. A KYC-for-agents programme today is anticipation, not invention.

Figure 1: Five pillars over a beneficial-ownership graph, with the principal-of-record chain feeding due diligence. The five KYC pillars mapped onto AI agents are identification at deployment via signed manifest, due diligence against declared scope, ban-list screening against a distributed signed list, continuous monitoring through re-attestation against an eval suite, and suspicious-agent reporting with automatic kill-switch. The beneficial-ownership graph underneath connects the end-user, the orchestrator service owner, the model provider, and the deploying organisation.

Knowing the name is not knowing the customer

The companion piece argued for the principal-of-record chain. Every credential carries the human as data, every hop preserves the chain, every leaf system authorises against the human rather than the agent. That solves a real problem. It does not solve every problem.

The chain answers the question "who caused this action?" It does not answer "should this agent have been doing anything at all?" The first is authentication and authorisation. The second is identity provenance and ongoing trustworthiness — the question banking learned to ask the hard way.

A bank with perfect authentication and no KYC lets verified criminals move money. A regulated firm with perfect agent IAM and no agent KYC is a firm where every action is correctly attributed — including the actions of an agent that should never have been deployed, was running a deprecated model with a known data-poisoning susceptibility, was owned by a vendor that lost its safety certification six months ago, and started behaving anomalously last Tuesday with nobody looking. The attribution is clean. The exposure is not.

A passport at the door

FATF Recommendation 10 starts with identification. At the moment a relationship is established, the institution captures who the customer is, verifies against reliable independent sources, and records the basis. An institution that does not know who its customer is cannot screen them, monitor them, or report them.

The agent equivalent is identification at deployment. Every agent that enters the estate should arrive with an artefact that says, verifiably, what it is. Think of it as a passport at the border crossing — a document that lists the bearer, names the issuer, and carries the signature that ties the two together. The minimum content reads like a bank's onboarding pack rewritten for software: the base model identifier and version with checkpoint hash, the fine-tune adapters loaded and their provenance attestations, the prompt set in effect with version and approver, the tool list and MCP server set the agent can call, the declared purpose in language a regulator would recognise, the deploying organisation and named responsible human, and a signature that binds the whole thing together.

This is an AI Bill of Materials with an identity wrapper. The AIBOM tells you what the system is composed of. The KYC manifest tells you who is operating it. The same content serves both.

Verification matters as much as capture. The signature should chain to a recognised root — the deploying organisation's key registered with the regulator's directory, or an industry signing authority where one exists. SLSA-for-ML and Sigstore are the obvious infrastructure. The agent's identity is not a name in a config file. It is an attestation that survives an auditor reading it.

The trap that catches most teams: identification is not a one-time event. The manifest changes every time the base model updates, the adapter set rotates, the prompt is revised, or the tool list expands. In KYC terms, each change is a new customer relationship. The manifest must be re-issued and re-signed, not edited in place. A passport stamped over in pencil is not a passport.
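A minimal sketch of the manifest as a signed, re-issuable artefact may make the point concrete. Everything here is an assumption for illustration: the field names follow the onboarding-pack list above, the `AgentManifest`, `sign`, `verify`, and `reissue` names are hypothetical, and HMAC-SHA256 with a shared demo key stands in for a real asymmetric signature chained to a recognised root.

```python
import hashlib
import hmac
import json
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class AgentManifest:
    # Illustrative field names, taken from the onboarding-pack list in the text.
    base_model: str
    checkpoint_sha256: str
    adapters: tuple
    prompt_version: str
    prompt_approver: str
    tools: tuple
    declared_purpose: str
    deploying_org: str
    responsible_human: str
    issue: int = 1  # bumped on every re-issue


def canonical_bytes(manifest: AgentManifest) -> bytes:
    # Sorted keys and fixed separators: identical content gives identical bytes.
    return json.dumps(asdict(manifest), sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(manifest: AgentManifest, key: bytes) -> str:
    # ASSUMPTION: HMAC-SHA256 stands in for an asymmetric signature
    # that chains to a recognised root.
    return hmac.new(key, canonical_bytes(manifest), hashlib.sha256).hexdigest()


def verify(manifest: AgentManifest, signature: str, key: bytes) -> bool:
    return hmac.compare_digest(sign(manifest, key), signature)


def reissue(manifest: AgentManifest, **changes) -> AgentManifest:
    # Never edit in place: build a new manifest, which then needs a new signature.
    return replace(manifest, issue=manifest.issue + 1, **changes)


DEMO_KEY = b"demo-key-not-for-production"
original = AgentManifest(
    base_model="example-model-1.0",
    checkpoint_sha256="0" * 64,
    adapters=("example-adapter-a",),
    prompt_version="v3",
    prompt_approver="approver@example.org",
    tools=("ledger.read",),
    declared_purpose="Reconcile daily ledger entries",
    deploying_org="Example Bank",
    responsible_human="owner@example.org",
)
original_sig = sign(original, DEMO_KEY)

# The tool list expands, so a new manifest is issued and must be signed again.
expanded = reissue(original, tools=original.tools + ("ledger.write",))
expanded_sig = sign(expanded, DEMO_KEY)
```

Because the manifest is frozen and the signature covers its canonical bytes, the old signature cannot be carried over to the expanded manifest. That is the pencil-stamped passport failing inspection.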

Watching the relationship, not just the signature

FATF customer due diligence asks "what should this customer be doing, and is the relationship consistent with that?" The bank establishes expected purpose, calibrates monitoring against the expectation, and treats deviations as signals worth a second look. Agent due diligence asks the same question. What data is this agent permitted to touch, what tools is it permitted to call, and is its behaviour consistent with the declaration?

The principal-of-record chain is load-bearing here. The chain tells you which human caused each action. Due diligence asks whether the action sat within the declared scope of the agent that carried it out. Three pieces of machinery make this practical. First, the declared scope must be machine-readable: a structured policy listing data classes, tool calls, and action types, not a prose paragraph in a Confluence page. Second, every tool call should be evaluated against the declared scope at runtime; the same policy engine that enforces RBAC on the leaf system can enforce scope against the agent. Third, scope changes are events that themselves require due diligence. An agent whose tool list expanded from read-only to read-and-write last month is a different agent than it was last month. The change should generate the same review a corporate customer expanding its line of business would generate at a bank.
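The three pieces of machinery can be sketched in a few lines. This is an illustration under stated assumptions, not a policy engine: `ScopeDeclaration`, `ToolCall`, `violations`, and `scope_expanded` are hypothetical names, and the tool and data-class strings are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScopeDeclaration:
    data_classes: frozenset
    tools: frozenset
    action_types: frozenset


@dataclass(frozen=True)
class ToolCall:
    tool: str
    action_type: str
    data_class: str


def violations(scope: ScopeDeclaration, call: ToolCall) -> list:
    """Return the reasons a call falls outside the declared scope; empty means in scope."""
    reasons = []
    if call.tool not in scope.tools:
        reasons.append(f"undeclared tool: {call.tool}")
    if call.action_type not in scope.action_types:
        reasons.append(f"undeclared action type: {call.action_type}")
    if call.data_class not in scope.data_classes:
        reasons.append(f"undeclared data class: {call.data_class}")
    return reasons


def scope_expanded(old: ScopeDeclaration, new: ScopeDeclaration) -> bool:
    """True when the new declaration grants anything the old one did not."""
    return not (
        new.tools <= old.tools
        and new.action_types <= old.action_types
        and new.data_classes <= old.data_classes
    )


read_only = ScopeDeclaration(
    data_classes=frozenset({"ledger"}),
    tools=frozenset({"ledger.query"}),
    action_types=frozenset({"read"}),
)
read_write = ScopeDeclaration(
    data_classes=frozenset({"ledger"}),
    tools=frozenset({"ledger.query", "ledger.post"}),
    action_types=frozenset({"read", "write"}),
)
attempted = ToolCall(tool="ledger.post", action_type="write", data_class="ledger")
```

Here `violations(read_only, attempted)` names both the undeclared tool and the undeclared action type, and `scope_expanded(read_only, read_write)` is the signal that a fresh due-diligence review is owed.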

The maturity gradient runs from "the agent's permissions are whatever the developer wrote into the IAM role" to "the agent's permissions derive from a signed scope declaration reviewed at the same cadence as the policy that authorised it." Most production estates sit at the first end. Regulated estates need to be at the second.

The list nobody is keeping yet

The OFAC list, the UN consolidated list, the EU restrictive measures register, and national PEP lists are the screening substrate of modern banking. Every transaction and every onboarding runs against them. The lists are centrally maintained, signed, widely distributed, frequently updated. The institution does not get to decide who is sanctioned. The institution is obliged to know.

There is no equivalent for AI agents. There should be, and the absence is one of the more striking gaps in the current ecosystem.

What would the list contain? Deprecated model versions with unpatched safety failures. Fine-tune adapters that failed industry red-teaming. Model providers whose certifications were withdrawn. Specific agent identifiers, keyed off the manifest signature, associated with incidents at participating organisations. Model families susceptible to specific prompt-injection or data-poisoning attacks not yet mitigated.

The list does not need to be operated by a regulator to be useful. The OFAC-equivalent shape is the right one — distributed, cryptographically signed, mirrored by participating organisations, queried on every deployment. ENISA, NIST, MITRE, and the major model providers are all positioned to operate components. CISA's evolving AI-SBOM work is converging on the same artefact set. Someone is going to operate this. The question is whether regulated enterprises participate in defining its shape or inherit whatever shape arrives.

In the interim, build the local version. Maintain an internal banned-models list, banned-adapters list, and banned-providers list. Make every deployment query the list. Re-query nightly. Fail the deployment on a hit and quarantine any agent already running on a hit. This is the institution-level equivalent of subscribing to a sanctions feed before the regulator demands it.
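The local version described above can start very small. The sketch below assumes agents are described by plain dictionaries with `id`, `model`, `adapters`, and `provider` keys; `BanList`, `screen_deployment`, and `nightly_rescan` are hypothetical names, and a production list would be signed and keyed on hashes rather than readable strings.

```python
from dataclasses import dataclass, field


@dataclass
class BanList:
    models: set = field(default_factory=set)
    adapters: set = field(default_factory=set)
    providers: set = field(default_factory=set)

    def hits(self, agent: dict) -> list:
        """Return (kind, value) pairs for every banned component the agent uses."""
        found = []
        if agent["model"] in self.models:
            found.append(("model", agent["model"]))
        found.extend(("adapter", a) for a in agent["adapters"] if a in self.adapters)
        if agent["provider"] in self.providers:
            found.append(("provider", agent["provider"]))
        return found


class DeploymentBlocked(Exception):
    """Raised when a deployment candidate matches the ban list."""


def screen_deployment(ban_list: BanList, agent: dict) -> None:
    # Fail the deployment on any hit.
    hits = ban_list.hits(agent)
    if hits:
        raise DeploymentBlocked(f"{agent['id']} blocked: {hits}")


def nightly_rescan(ban_list: BanList, running_agents: list) -> list:
    """Return the ids of running agents that now match the list and need quarantine."""
    return [agent["id"] for agent in running_agents if ban_list.hits(agent)]


ban_list = BanList(models={"model-x@1.2"}, adapters={"adapter-redteam-fail"})
clean = {"id": "agent-1", "model": "model-y@2.0", "adapters": ["adapter-ok"], "provider": "provider-a"}
tainted = {"id": "agent-2", "model": "model-x@1.2", "adapters": [], "provider": "provider-a"}

screen_deployment(ban_list, clean)  # passes silently at deployment time

# A provider is banned after agent-1 is already running; the nightly pass catches it.
ban_list.providers.add("provider-a")
to_quarantine = nightly_rescan(ban_list, [clean])
```

The nightly pass matters because the list moves after deployment. An agent that was clean at onboarding is not guaranteed to stay clean.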

The customer who looked fine on Monday

KYC is not a single event. The customer is monitored for the lifetime of the relationship. Transactions are screened against expected patterns. Changes in behaviour generate review. The bank does not finish KYC at onboarding any more than a doctor finishes a patient's care at the first appointment.

For agents, ongoing monitoring catches the agent that was fine at deployment and is no longer fine at runtime. The failure modes are well-understood. Model drift — the same prompt, the same context, increasingly inconsistent outputs as the foundation model updates underneath. Adapter staleness — a fine-tune current against the eval suite three months ago, silently below threshold now. Prompt erosion — small revisions accumulating into a prompt that no longer matches the manifest. Tool-surface expansion — the MCP server set has grown without the manifest being re-issued.

The discipline that catches these is periodic re-attestation. The agent is re-evaluated against its eval suite on a regular cadence. The results are recorded against its identifier and signed. Deviations generate review. The eval suite is the equivalent of the bank's transaction-monitoring rules — a structured statement of what good behaviour looks like.

The honest question, which nobody in the industry has yet answered, is what cadence. Banking transaction monitoring is continuous; periodic customer review typically runs on a much slower cycle, from every few years for low-risk customers to annually for high-risk ones. The cmdev working position is that high-risk agents (anything touching financial data, personal data, irreversible actions, or external communication) should be re-attested weekly with continuous output sampling against an automated eval. Low-risk agents can be re-attested monthly. The regulator has not specified a cadence. The institution should specify its own and document the reasoning.

What every cadence has in common: the re-attestation result is data, joined to the agent's identifier, stored in the same audit store as the manifest and the principal-of-record chain. The auditor's question — show me the trustworthiness history of this agent — is answered by a single query.
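As a rough sketch of re-attestation as data, the snippet below records eval results against an agent identifier, checks whether the next run is due, and flags a deviation. The assumptions are loud ones: `Attestation`, `is_due`, `deviates`, and `history` are hypothetical names, "monthly" is taken as 30 days, the 0.05 tolerance is a placeholder, and signing of results is left out for brevity.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Weekly and monthly follow the working position above; 30 days for "monthly" is an assumption.
CADENCE = {"high": timedelta(days=7), "low": timedelta(days=30)}


@dataclass(frozen=True)
class Attestation:
    agent_id: str
    taken_at: datetime
    pass_rate: float  # fraction of eval-suite cases passed, 0.0 to 1.0


def is_due(last: Attestation, risk_tier: str, now: datetime) -> bool:
    return now - last.taken_at >= CADENCE[risk_tier]


def deviates(baseline: Attestation, current: Attestation, tolerance: float = 0.05) -> bool:
    # ASSUMPTION: a drop in pass rate beyond `tolerance` is what triggers review.
    return baseline.pass_rate - current.pass_rate > tolerance


def history(store: list, agent_id: str) -> list:
    """The single query behind the trustworthiness history of one agent."""
    return sorted((a for a in store if a.agent_id == agent_id), key=lambda a: a.taken_at)


store = [
    Attestation("agent-1", datetime(2025, 1, 8), 0.90),
    Attestation("agent-2", datetime(2025, 1, 2), 0.99),
    Attestation("agent-1", datetime(2025, 1, 1), 0.97),
]
agent_1 = history(store, "agent-1")
```

In this example the second attestation for `agent-1` drops seven points below the first, which exceeds the placeholder tolerance and would generate a review.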

The SAR for software

The SAR regime is the institution's obligation to surface, internally and to the regulator, behaviour that crosses a threshold of concern. The threshold is deliberately vague. The obligation to look is not.

The agent equivalent is the suspicious-agent report. When behaviour crosses a threshold — anomalous tool-call patterns, output drift beyond a tolerance band, repeated failure of safety classifiers, scope violations, principal-of-record chain breaks — the system fires three things at once. An automatic kill-switch on the agent. An internal report to the responsible human. And, where the threshold is sufficient, a regulatory disclosure.

The threshold question will land where banks landed under FATF Rec 20: institution-defined, defensible, documented, audited. The report itself should contain the agent's identifier, the manifest in effect at the time, the behaviour observed, the principal-of-record chain for the actions in question, the eval-suite results from the last re-attestation, and the action taken.
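One way the trigger, kill-switch, and report could fit together is sketched below. The signal names mirror the behaviours listed above; the threshold defaults, the `disclosure_min_triggers` rule for regulatory disclosure, and every class and function name are assumptions made for illustration, not recommended values.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Signals:
    scope_violations: int = 0
    safety_classifier_failures: int = 0
    output_drift: float = 0.0
    chain_breaks: int = 0


@dataclass
class Thresholds:
    # Institution-defined; these defaults are placeholders, not recommendations.
    scope_violations: int = 1
    safety_classifier_failures: int = 3
    output_drift: float = 0.10
    chain_breaks: int = 1
    disclosure_min_triggers: int = 2  # hypothetical rule for regulatory disclosure


@dataclass
class SuspiciousAgentReport:
    agent_id: str
    manifest_signature: str
    behaviour_observed: list
    principal_chain: list
    last_attestation: dict
    action_taken: str
    regulatory_disclosure: bool
    raised_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


def triggers(signals: Signals, limits: Thresholds) -> list:
    """Names of the signals that have reached their threshold."""
    names = ("scope_violations", "safety_classifier_failures", "output_drift", "chain_breaks")
    return [n for n in names if getattr(signals, n) >= getattr(limits, n)]


def evaluate(agent_id, manifest_signature, signals, limits,
             principal_chain, last_attestation, kill_switch, notify):
    fired = triggers(signals, limits)
    if not fired:
        return None
    kill_switch(agent_id)  # stop the agent first, then write it up
    report = SuspiciousAgentReport(
        agent_id=agent_id,
        manifest_signature=manifest_signature,
        behaviour_observed=fired,
        principal_chain=principal_chain,
        last_attestation=last_attestation,
        action_taken="kill-switch invoked",
        regulatory_disclosure=len(fired) >= limits.disclosure_min_triggers,
    )
    notify(report)  # internal report to the responsible human
    return report


killed, inbox = [], []
report = evaluate(
    "agent-7", "sig-abc", Signals(scope_violations=2, chain_breaks=1), Thresholds(),
    principal_chain=["user:alice", "orchestrator:payments"],
    last_attestation={"pass_rate": 0.96},
    kill_switch=killed.append, notify=inbox.append,
)
```

Passing the kill-switch and the notifier in as callables keeps the threshold logic testable without touching a live agent.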

The structural piece that does not yet exist in most estates is the named human accountable for the report. Under NDPA Section 65, the designated DPO is the operational equivalent of an AML MLRO for personal-data incidents. Under EU AI Act Article 14, "human oversight" implies an identifiable human capable of intervening. The regulated estate should designate the AI equivalent of an MLRO — a named human, structurally independent of agent operations, with authority to invoke the kill-switch. The role does not need a new title. It can sit with the existing CISO, DPO, or chief risk officer. It does need to be designated.

Reading the cap table for software

FATF Recommendation 24, read together with the customer due diligence duties in Recommendation 10, requires that the natural persons who ultimately own or control a corporate customer be identified. The requirement exists because shell companies and nominee structures hide accountability, and accountability is the predicate for every other AML control.

Agents have an equivalent problem. Four parties are always involved. The human end-user whose request triggered the action. The orchestrator service owner inside the deploying organisation. The model provider whose foundation model is in the loop. And the deploying organisation itself, including the named responsible human.

For most actions the principal-of-record chain resolves to the end-user, and that is correct for attribution. For accountability the picture is wider. An agent that acted within scope, using a model whose provider had quietly withdrawn a safety attestation, deployed by an organisation whose responsible human had left six months earlier and not been replaced — that is an accountability gap the principal chain alone does not surface.

The shape of the answer is a graph. For every agent, a queryable graph names the four parties, the relationships between them, and the timestamps at which each was current. The graph is queryable forwards — for this agent, who is accountable — and backwards — for this responsible human, what agents are they accountable for. When a responsible human leaves, every agent they are accountable for surfaces immediately, and the absence of a designated replacement becomes a finding before it becomes an incident.
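The graph described above can be sketched as a list of time-bounded edges. The `Edge` and `OwnershipGraph` names, the role strings, and the in-memory storage are all assumptions for illustration; a real estate would more likely back this with a graph or relational store.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Edge:
    agent_id: str
    role: str   # e.g. "end_user", "orchestrator_owner", "model_provider", "responsible_human"
    party: str
    valid_from: date
    valid_to: Optional[date] = None  # None means still current

    def current_on(self, day: date) -> bool:
        return self.valid_from <= day and (self.valid_to is None or day < self.valid_to)


class OwnershipGraph:
    def __init__(self):
        self.edges = []

    def add(self, edge: Edge) -> None:
        self.edges.append(edge)

    def accountable_for(self, agent_id: str, day: date) -> dict:
        """Forwards: for this agent, who held each role on the given day."""
        return {e.role: e.party for e in self.edges if e.agent_id == agent_id and e.current_on(day)}

    def agents_of(self, party: str, day: date) -> list:
        """Backwards: for this party, which agents were they tied to on the given day."""
        return sorted({e.agent_id for e in self.edges if e.party == party and e.current_on(day)})

    def gaps(self, day: date, required_roles=("responsible_human",)) -> dict:
        """Agents with no current party in a required role."""
        missing = {}
        for agent_id in sorted({e.agent_id for e in self.edges}):
            held = self.accountable_for(agent_id, day)
            absent = [r for r in required_roles if r not in held]
            if absent:
                missing[agent_id] = absent
        return missing


graph = OwnershipGraph()
graph.add(Edge("agent-1", "model_provider", "Provider A", date(2024, 1, 1)))
graph.add(Edge("agent-1", "responsible_human", "dana", date(2024, 1, 1), valid_to=date(2024, 6, 1)))
graph.add(Edge("agent-2", "responsible_human", "dana", date(2024, 2, 1), valid_to=date(2024, 6, 1)))
graph.add(Edge("agent-2", "responsible_human", "eli", date(2024, 6, 1)))
```

After 1 June 2024 in this example, `gaps` reports `agent-1` as lacking a responsible human, while `agent-2` is covered by the replacement. That is the finding surfacing before the incident.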

This is the same shape FATF Rec 24 imposes on corporate customers, and the same shape post-Panama-Papers transparency registers have institutionalised. Importing the discipline is not novel work. It is recognising that agents are the new corporate entity from an accountability standpoint.

The treasury agent is a head of state

FATF Recommendation 12 imposes enhanced due diligence on Politically Exposed Persons — individuals whose position carries elevated authority and therefore elevated risk of abuse. PEPs are not banned. They are scrutinised more closely, monitored more actively, and subject to senior-management approval at onboarding.

The framing transfers cleanly. Politically Exposed Agents are agents with elevated authority in domains where misuse is consequential — treasury, payments, HR, procurement, legal, customer data at scale, regulatory filings, anything that produces irreversible external commitments. These agents warrant enhanced due diligence on the same logic. The base manifest is not enough. Senior-management approval is required at deployment. The eval suite is more demanding. The re-attestation cadence is shorter. The threshold for a suspicious-agent report is lower.
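Designation can be expressed as a small policy lookup. In the sketch below the domain identifiers are taken from the list above, while `Controls`, `controls_for`, and the `report_threshold` numbers are hypothetical; the 7-day and 30-day values simply reuse the weekly and monthly working position stated earlier.

```python
from dataclasses import dataclass

# Domains named in the text; the identifiers themselves are illustrative.
ELEVATED_DOMAINS = frozenset(
    {"treasury", "payments", "hr", "procurement", "legal", "customer_data_at_scale", "regulatory_filings"}
)


@dataclass(frozen=True)
class Controls:
    senior_approval_required: bool
    reattest_days: int
    report_threshold: int  # hypothetical: scope violations tolerated before a report is raised


STANDARD = Controls(senior_approval_required=False, reattest_days=30, report_threshold=3)
ENHANCED = Controls(senior_approval_required=True, reattest_days=7, report_threshold=1)


def controls_for(domains, irreversible_external_commitments: bool = False) -> Controls:
    # An agent is treated as politically exposed if it works in any elevated
    # domain or can make irreversible external commitments.
    exposed = bool(ELEVATED_DOMAINS & set(domains)) or irreversible_external_commitments
    return ENHANCED if exposed else STANDARD
```

For example, `controls_for(["treasury"])` returns the enhanced set, while an internal document-search agent with no external commitments stays on the standard set.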

The framing is unfamiliar — most teams have not thought of their treasury automation agent as the AI equivalent of a foreign minister — but the discipline is identical to a regime that already works.

Regulators who already wrote half the rules

The framing is anticipatory but not speculative. Multiple existing regimes already imply most of the components.

The CBN Risk-Based Cybersecurity Framework for the Nigerian banking sector requires identification of digital actors, ongoing risk assessment, and incident reporting. Applied to an AI estate, those obligations require the artefacts the KYC programme produces.

FATF Rec 10 and Rec 12 are the source material the framing borrows from — reasoned discipline rather than prescription, which is what makes them transferable.

MAS Notice 626 codifies customer identification, ongoing monitoring, and suspicious-transaction reporting obligations in unusually concrete language. It is a useful pattern document for any institution drafting an internal agent KYC policy.

NDPA Section 65 designates the data controller's DPO as the responsible human for personal-data processing. It is the closest existing analogue to the AI MLRO, and it is already in force.

EU AI Act Articles 12 and 14, both enforceable from 2 August 2026 for high-risk systems, start looking like the AML record-keeping and SAR regimes when you read them next to FATF Rec 11 and Rec 20. Article 14 in particular implies an identifiable human with authority to intervene, which is the SAR-officer role in everything but name. Article 11 and Annex IV require technical documentation, which is the artefact that the deployment manifest, the AIBOM, and the principal-of-record chain together compose.

None of these regimes uses the word KYC. All of them require the same machinery.

What the system looks like, written down

The components compose into an architecture that, on our estimate, is at most twelve months from being the default expectation in regulated estates, and at most thirty-six months from being the default everywhere else.

The agent identity registry is the system of record for the manifest. Every agent is registered before deployment, the manifest is signed, the registry is queryable forwards and backwards in time, and an unregistered agent cannot be deployed. Think of it as the corporate register at Companies House — every entity that operates in the jurisdiction is listed, the listing is the predicate for everything else.
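A registry that is queryable in time and gates deployment can be sketched with an append-only version list per agent. `AgentRegistry`, `manifest_at`, and `may_deploy` are hypothetical names, timestamps are plain integers for brevity, and the digests are placeholder strings rather than real manifest hashes.

```python
from bisect import bisect_right
from collections import defaultdict


class AgentRegistry:
    def __init__(self):
        # agent_id -> list of (timestamp, manifest_digest), oldest first
        self._versions = defaultdict(list)

    def register(self, agent_id: str, timestamp: int, manifest_digest: str) -> None:
        versions = self._versions[agent_id]
        if versions and timestamp <= versions[-1][0]:
            raise ValueError("registrations must move forward in time")
        versions.append((timestamp, manifest_digest))  # append-only: history is never rewritten

    def manifest_at(self, agent_id: str, timestamp: int):
        """Digest of the manifest in effect at `timestamp`, or None if none was registered yet."""
        versions = self._versions.get(agent_id, [])
        index = bisect_right([t for t, _ in versions], timestamp)
        return versions[index - 1][1] if index else None

    def may_deploy(self, agent_id: str, manifest_digest: str, timestamp: int) -> bool:
        """Deployment gate: the digest must match the manifest currently on record."""
        current = self.manifest_at(agent_id, timestamp)
        return current is not None and current == manifest_digest


registry = AgentRegistry()
registry.register("agent-1", 100, "digest-v1")
registry.register("agent-1", 200, "digest-v2")
```

With these two registrations, a query at time 150 returns the first digest, a deployment at time 250 is allowed only for the second digest, and an agent that was never registered is refused.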

The ban-list service screens every deployment and every re-attestation against internal banned-models, banned-adapters, and banned-providers lists. The interface is simple — a hash query against a signed list. When industry-level signed lists arrive, the same interface consumes them.

The continuous verification pipeline runs the eval suite against every deployed agent on the declared cadence, signs the results, attaches them to the registry, and fires an alert on deviation beyond tolerance. This is the agent equivalent of an AML transaction-monitoring system.

The suspicious-agent reporter aggregates the alerts, applies the institution's threshold, fires the kill-switch, notifies the named responsible human, and produces the report artefact. It joins data from the eval pipeline, the principal-of-record chain, the runtime tool-call logs, and the registry. The output is a single document the auditor can read.

The beneficial-ownership graph is the connective tissue. Every agent in the registry is linked to its end-user population, its orchestrator service owner, its model provider, and the responsible human. The graph is queryable in both directions and updated as ownership changes.

Underneath all of it sits the principal-of-record chain from the companion IAM piece. Identity flows through the credentials, every action is attributed, and the KYC layer reads the chain to produce the artefacts the regulator will ask for.

What the rulebook hasn't pinned down yet

Several pieces of the framing are not yet settled, and it is worth being honest about which.

The re-attestation cadence is not specified anywhere. Banking transaction monitoring is continuous. KYC review is periodic. Neither maps cleanly. The cmdev working position is weekly for high-risk and monthly for low-risk. We do not expect this to be the final answer.

The threshold for a suspicious-agent report is not specified, and will likely be left to institutions to define. The same was true for SARs for two decades before guidance hardened. Regulated estates should be defining their thresholds now, defensibly and in writing.

The criteria for Politically Exposed Agent designation are not specified. The treasury and HR domains are obvious. The long tail is not. The institution should make the call itself and revisit it quarterly.

The standard for ban-list distribution does not exist. The OFAC-equivalent shape is the obvious one. The operator could be ENISA, CISA, the FSB, an industry consortium, or a regulator. Participating in defining its shape is more valuable than waiting for it to arrive.

What is not uncertain is the direction. Authentication is solved at the design level. KYC is the next problem. If your authentication is perfect and you cannot answer whether your agent should have been there in the first place, what exactly are you authenticating?

FAQs

How is KYC for AI agents different from IAM for AI agents?

IAM handles authentication and authorisation — who the agent is acting for, and what it is permitted to do once it gets there. KYC handles identity provenance and ongoing trustworthiness — who deployed the agent, what it is composed of, has it been banned anywhere, is its behaviour drifting, and who is accountable when it acts. The principal-of-record chain from IAM feeds the due-diligence pillar of KYC, and the KYC artefacts answer questions the IAM chain cannot.

Is there actually a ban-list for AI agents today?

Not at industry level. ENISA, CISA, NIST, MITRE, and the major model providers hold pieces of the substrate, but no signed, distributed, frequently updated list exists in the OFAC sense. Regulated estates should build the local version now, queryable on every deployment and re-queryable nightly, and participate in defining the shape of the industry-level list when it arrives.

What re-attestation cadence does cmdev recommend?

Weekly for high-risk agents — anything touching financial data, personal data, irreversible actions, or external communication — with continuous output sampling against an automated eval suite. Monthly for low-risk agents. The regulator has not specified a cadence; the institution should specify its own, document the reasoning, and revisit it as the threat model evolves.

Who should be the named responsible human for an agent?

The role does not need a new title. It can sit with the existing CISO, the DPO designated under NDPA Section 65, or the chief risk officer, depending on which function has the operational authority to invoke a kill-switch and the structural independence to file a report. What matters is that the human is named in the manifest, the authority is real, and the chain of succession is in place when the human moves on.

How does this map onto the Nigerian regulatory landscape?

The CBN Risk-Based Cybersecurity Framework requires identification of digital actors, ongoing risk assessment, and incident reporting; applied to an AI estate, these obligations require the agent identity registry, the continuous verification pipeline, and the suspicious-agent reporter the KYC framing describes. NDPA Section 65 designates the data controller's DPO as the responsible human for personal-data processing, which is the closest existing analogue to the AI MLRO role. Neither regime uses the word KYC; both require the machinery.

How to Engage

If you are running an agent estate that needs KYC discipline before the regulator arrives — or building one and want to wire the controls in structurally rather than retrofit them — talk to us at creativeminds.dev/contact.