IAM for AI Agents: Delegation, Principal Scoping, and the Audit Chain That Survives Contact With an Auditor

The first time a regulator asks the question, the architecture has already failed or already won. The question is simple. Show me, for this specific action on 12 March, which human caused it, what their authorisation was at that moment, what scope the agent was operating under, and who approved the agent's permission set in the first place. If the answer requires correlating four log sources, two ticketing systems, and a Slack thread, the architecture has failed. If a single query against a single audit store returns the chain end-to-end, the architecture has won. Most enterprise IAM today is in the first state. AI agents make it worse before they make it better.

Key takeaways

Classical IAM assumes a one-to-one mapping between actor and principal. AI agents break it at three points: the agent is not the human, subagents are not the human, and leaf systems authorise against whichever identity reaches them.
The four hard problems are delegation chains, principal scoping, action attribution, and revocation propagation — and the failure mode of each only surfaces in an audit, where it is a finding rather than an incident.
The pattern that works carries the principal of record as data in the credential itself — AWS transitive session tags, RFC 8693 token exchange with the `act` claim in Okta, the OAuth On-Behalf-Of flow with `oid` preservation in Entra.
Revocation is the harder half. Short-lived credentials, session-state polling on the hot path, and a kill-switch broadcast compose into a defence; long-lived credentials make the problem unsolvable.
NIS2's 24-hour attribution requirement and EU AI Act Article 14's meaningful-oversight obligation make principal-of-record discipline a regulatory forcing function from 2 August 2026.

Figure 1. Identity carried forward; the audit chain resolves the auditor's question in one query. The principal-of-record chain runs across six hops (user, frontend, orchestrator agent, subagent, tool call, leaf system), with the PrincipalOfRecord tag preserved at every hop, the trust boundary at the user-to-frontend handoff, audit events emitted per hop and joined on trace ID, and the auditor's first question resolved by a single query against the audit store.

IAM Was Built for a World That No Longer Exists

The dominant IAM model was designed for two principal types. Humans, who authenticate interactively and act on systems. And services, which authenticate non-interactively and act on themselves — a payment processor calling a billing database, an ingestion pipeline writing to a data lake. Both of these models assume a clean one-to-one mapping between the actor on the wire and the principal in the policy.

AI agents break that assumption at three points. The agent is not the human, but it acts as the human. The agent may call subagents, each of which is also not the human but acts as the human. The agent and its subagents call tools — APIs, databases, internal services — that need to authorise the request against the original human's permissions, not against the agent's permissions, and not against some shared service account the agent happens to hold credentials for. By the time the request reaches the database, four hops have happened, and the database does not know whose data it is reading.

This is not theoretical. Agent identity is the single fastest-growing identity category in enterprise estates, and the most poorly governed. The regulator-aware framing is sharper. NIS2 incident reporting requires that an organisation can attribute every action on critical infrastructure to a natural person within 24 hours of detecting an incident. The EU AI Act Article 14 requires meaningful human oversight of high-risk systems with audit trails sufficient to demonstrate that oversight was actually exercised, not merely designed. An agent that calls a tool against a regulated data set without a clean principal-of-record chain is, under both regimes, a reporting failure waiting to happen.

The Four Hard Problems

Delegation Chains

User authenticates to a frontend. Frontend hands the request to an orchestrator agent. Orchestrator dispatches to a research subagent. Research subagent calls a tool that hits an internal vector store, which is fronted by Lake Formation, which gates access by IAM principal. Who is the principal at each hop? If the answer is "the orchestrator's service role" you have lost the human in hop one. Every subsequent hop is fiction.

Principal Scoping

The agent's permissions must remain a strict subset of the user's, under every circumstance, in every code path. There is no acceptable scenario in which the agent can do something the user could not have done directly. This sounds obvious. In practice, agents are deployed with broad service roles because broad service roles are easier to provision than dynamic per-user scoping, and the failure mode — the agent doing something the user was not authorised to do — only surfaces in an audit. By then it is a finding, not an incident.
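One way to make the subset rule mechanical is to compute the agent's effective permissions as the intersection of what the user holds and what the agent role allows, so the result can never exceed the user's authority. The following is a minimal sketch under that assumption; the permission strings and function names are hypothetical and do not come from any particular product.

```python
from __future__ import annotations


def effective_permissions(user_perms: frozenset[str], agent_perms: frozenset[str]) -> frozenset[str]:
    """Return only the permissions held by BOTH the user and the agent role."""
    return user_perms & agent_perms


def authorise(action: str, user_perms: frozenset[str], agent_perms: frozenset[str]) -> bool:
    """Allow an action only if it is inside the intersection."""
    return action in effective_permissions(user_perms, agent_perms)


# Hypothetical example: the agent role is broader than the user.
user = frozenset({"reports:read", "tickets:write"})
agent = frozenset({"reports:read", "reports:delete", "tickets:write"})

scoped = effective_permissions(user, agent)
assert scoped <= user  # the strict-subset invariant from the text

print(sorted(scoped))                             # ['reports:read', 'tickets:write']
print(authorise("reports:delete", user, agent))   # False
```

The intersection means a broad agent role is harmless on its own: anything the user lacks is dropped before the authorisation check runs.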

Action Attribution

A tool is called. The tool writes to a database. The database logs the principal. If the principal is the agent's role, the attribution is broken — the audit trail tells you which agent did the write, not which human caused it. Attribution must survive every hop, must be carried as data rather than reconstructed from correlation, and must reach the eventual leaf system in a form that system's audit log can record natively.

Revocation Propagation

A user is disabled at 10:14:33. At 10:14:34, the agent acting on their behalf is mid-task, holding a credential, with three tool calls queued. Every one of those tool calls must fail. Long-lived credentials make this impossible. Short-lived credentials make it tractable but not free — the revocation has to propagate faster than the credential's natural expiry, which means either a session-state lookup on every tool call or a kill-switch broadcast to every agent runtime. Both are engineering work that does not show up until the first time someone is disabled and their agent keeps running.

The Wrong Patterns Already in Production

We have seen four of these in the last six months alone. Each looks reasonable in isolation and becomes a finding in aggregate.

A single service account shared across every agent in the organisation, with broad write access, justified on the grounds that "the agents are trusted internal code." The audit trail cannot distinguish which agent invocation, on behalf of which user, did anything. CloudTrail shows one principal: the service account. The compliance posture is unrecoverable without re-architecting.

Hardcoded credentials in agent prompts or environment variables, often production credentials, often in repositories. The agent is trusted to "use them responsibly." This requires no further commentary.

OAuth tokens with excessive scopes, where the agent requests read:everything because the developer did not know which scopes the user-task would need. The agent now has more authority than any human user would be granted interactively, because no human user would have clicked through that consent screen without escalation.

No audit-chain link between the human and the eventual action. The frontend logs who logged in. The agent logs what it decided. The tool logs what it did. Nothing connects the three except wall-clock time. When the regulator asks, you produce three log files and an apology.

The Right Pattern

The pattern that works has four moving parts, and the parts must be wired in this order.

One. The user authenticates against the identity provider — IAM Identity Center, Okta, or Entra ID — and receives a token that identifies them as the principal. This is unchanged from existing practice.

Two. The agent does not authenticate as itself. The agent assumes a role as the user, using a credential exchange that carries the user's identity forward. In AWS, this is sts:AssumeRole with the user's identity packed into session tags. In Azure, it is the On-Behalf-Of (OBO) flow exchanging the user's token for a downstream token. In Okta, it is a token-exchange against the user's session with the agent declared as the actor. In every case, the new credential carries two pieces of data: the agent is the immediate actor, the user is the principal of record.

Three. Every subsequent hop preserves the principal of record. Session tags marked transitive in AWS. The act claim chained in OAuth token exchange (RFC 8693). The oid claim preserved through OBO in Entra. The downstream system never sees the agent's identity in isolation; it sees the chain.

Four. The leaf systems — the database, the vector store, the SaaS API — authorise against the principal of record, not against the agent. Lake Formation does this natively via trusted identity propagation. Snowflake does it via OAuth claims. Custom services do it by reading the principal-of-record tag out of the credential and applying it to their authorisation logic. The audit log on the leaf system records the principal of record, the agent that carried the request, and the trace ID that ties them to the originating session.
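Step four can be sketched for a custom leaf service as follows. This is an illustration only: it assumes the token's signature has already been verified upstream, that the user arrives in `sub` and the agent in `act.sub`, and that the service keeps its own per-user grants. `USER_GRANTS`, `authorise_and_audit`, and the identifiers in the example are all hypothetical.

```python
import json
from datetime import datetime, timezone

# Hypothetical per-user grants held by the leaf service.
USER_GRANTS = {"alice@example.org": {"orders:read"}}


def authorise_and_audit(claims: dict, action: str, trace_id: str) -> dict:
    """Decide against the principal of record and return the audit event.

    Assumes `claims` comes from a token whose signature was already verified.
    """
    principal = claims["sub"]                   # the human (principal of record)
    actor = claims.get("act", {}).get("sub")    # the agent carrying the request
    allowed = action in USER_GRANTS.get(principal, set())
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "trace_id": trace_id,
        "principal_of_record": principal,
        "actor": actor,
        "action": action,
        "decision": "allow" if allowed else "deny",
    }


claims = {"sub": "alice@example.org", "act": {"sub": "agent:research-01"}}
event = authorise_and_audit(claims, "orders:read", "trace-123")
print(json.dumps(event, indent=2))
```

The decision is keyed on the user's grants, never the agent's, and the audit event records the principal, the actor, and the trace ID together so that no later correlation is needed.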

This is not novel cryptography. Every piece exists in current identity-provider documentation. The novelty is wiring them together with discipline.

Mapping to Identity Providers

AWS IAM Identity Center

The pattern in AWS is built on three primitives: STS AssumeRole with session tags, transitive tags through role chaining, and trusted identity propagation for AWS-managed targets.

The user authenticates to IAM Identity Center. The agent runtime calls sts:AssumeRole against an agent role, passing the user's identity as session tags (PrincipalOfRecord=user@org, SessionId=..., AgentId=...) and marking PrincipalOfRecord as transitive. The trust policy on the agent role must allow sts:TagSession and should constrain which session tags may be passed. If the agent dispatches a subagent, the subagent assumes its own role through the same mechanism; because PrincipalOfRecord is transitive, it persists through the chain. CloudTrail records the session tags and the transitive tag keys in the request parameters of each AssumeRole event, and policies on downstream resources can authorise against the tag through the aws:PrincipalTag/PrincipalOfRecord condition key. For AWS-managed services that support trusted identity propagation (Lake Formation, Redshift, S3 Access Grants, Athena, Q Business, SageMaker Studio, OpenSearch), the identity context is carried natively rather than via session tags, and authorisation runs against the real user.

The one configuration detail that traps people: the target role's trust policy must explicitly permit sts:TagSession, or the AssumeRole call fails when tags are passed. Most reference templates omit this. Add it.
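The two pieces described above, the tagged AssumeRole request and a trust policy that permits sts:TagSession, might look roughly like this. It is a sketch, not a reference template: the account ID, role names, and the `*@example.org` constraint are placeholders, and the helper functions are hypothetical. The parameter names (`Tags`, `TransitiveTagKeys`, `DurationSeconds`) are those accepted by the STS AssumeRole API. Nothing here calls AWS, so it runs without credentials.

```python
import json


def build_assume_role_params(role_arn: str, user: str, session_id: str, agent_id: str) -> dict:
    """Keyword arguments for an STS AssumeRole call that carries the user as a tag."""
    return {
        "RoleArn": role_arn,
        "RoleSessionName": f"agent-{agent_id}"[:64],   # STS limits session names to 64 chars
        "DurationSeconds": 900,                        # 15 minutes, the STS minimum
        "Tags": [
            {"Key": "PrincipalOfRecord", "Value": user},
            {"Key": "SessionId", "Value": session_id},
            {"Key": "AgentId", "Value": agent_id},
        ],
        # Transitive tags persist when a subagent chains into another role.
        "TransitiveTagKeys": ["PrincipalOfRecord"],
    }


def build_trust_policy(caller_arn: str) -> dict:
    """Trust policy for the agent role; sts:TagSession is the part templates often omit."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": caller_arn},
            "Action": ["sts:AssumeRole", "sts:TagSession"],
            "Condition": {
                # Placeholder constraint: only identities from one domain may be passed.
                "StringLike": {"aws:RequestTag/PrincipalOfRecord": "*@example.org"},
                "ForAllValues:StringEquals": {
                    "aws:TagKeys": ["PrincipalOfRecord", "SessionId", "AgentId"]
                },
            },
        }],
    }


params = build_assume_role_params(
    "arn:aws:iam::111122223333:role/AgentRole", "alice@example.org", "sess-42", "research-01"
)
policy = build_trust_policy("arn:aws:iam::111122223333:role/AgentRuntime")
print(json.dumps(policy, indent=2))
# With boto3 installed and credentials configured, the call would be:
#   boto3.client("sts").assume_role(**params)
```

Constraining the tag keys and values in the trust policy matters as much as allowing sts:TagSession: without a condition, any caller permitted to assume the role could assert an arbitrary PrincipalOfRecord.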

Okta

Okta's agent identity stack uses token exchange with the act claim to preserve the principal. The user authenticates and receives a session. The agent requests a token via token exchange, presenting the user's token as the subject_token and declaring itself the actor. The resulting token carries the user in sub and the agent in act. Subagents repeat the exchange, with the chain of actors recorded in nested act claims (RFC 8693, Section 4.1).
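In generic RFC 8693 terms, the exchange and the resulting actor chain can be sketched as below. This is not Okta-specific code: the token values, audience, and scope are placeholders, the helper names are hypothetical, and whether a given authorisation server accepts `actor_token` depends on its configuration. Per RFC 8693, the outermost `act` object is the current actor and nested `act` objects are prior actors.

```python
from urllib.parse import urlencode

TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"


def build_exchange_body(user_token: str, agent_token: str, audience: str, scope: str) -> str:
    """Form-encoded body for an RFC 8693 token-exchange request."""
    return urlencode({
        "grant_type": TOKEN_EXCHANGE_GRANT,
        "subject_token": user_token,          # the user: principal of record
        "subject_token_type": ACCESS_TOKEN_TYPE,
        "actor_token": agent_token,           # the agent: immediate actor
        "actor_token_type": ACCESS_TOKEN_TYPE,
        "audience": audience,
        "scope": scope,
    })


def actor_chain(claims: dict) -> list:
    """Flatten nested `act` claims, most recent actor first."""
    chain = []
    act = claims.get("act")
    while isinstance(act, dict):
        chain.append(act.get("sub"))
        act = act.get("act")
    return chain


body = build_exchange_body("user-token", "agent-token", "tool-api", "orders.read")

# Hypothetical decoded claims after a subagent has repeated the exchange.
claims = {
    "sub": "alice@example.org",
    "act": {"sub": "agent:research-sub", "act": {"sub": "agent:orchestrator"}},
}
print(claims["sub"], actor_chain(claims))
```

Reading the chain this way gives an audit record both halves of the delegation: `sub` stays fixed on the human while the actor list grows with each hop.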

Inline Hooks intercept token issuance to enforce policy at the moment of delegation — useful for blocking agent token issuance when the user's session shows anomalous risk, or for downscoping the issued token below the user's full authority. Okta's Agent Gateway sits in front of tool calls and enforces the agent's scope-of-action against policy, and emits audit events that carry the full delegation chain.

The pattern that survives audit: every agent token is short-lived (15 minutes maximum), Inline Hooks enforce principal preservation at issuance, the Agent Gateway logs every tool call with the actor chain, and revocation of the user's session invalidates every downstream token via the Okta session graph.

Microsoft Entra ID (formerly Azure AD)

Entra's pattern is the OAuth 2.0 On-Behalf-Of flow, now formally supported for AI agents via Microsoft Entra Agent ID and the Microsoft Agent ID SDK. The agent receives the user's access token from the frontend. The agent calls Entra's token endpoint with grant_type=urn:ietf:params:oauth:grant-type:jwt-bearer and requested_token_use=on_behalf_of, presenting the user's token as the assertion and requesting a downstream token for the tool's resource scope. Entra returns a new token where the oid claim still identifies the user and the agent's identity is recorded in the audit log as the requesting client.
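The On-Behalf-Of request can be sketched as a plain form-encoded POST. The parameter names follow the Microsoft identity platform v2.0 token endpoint, where the incoming user token is sent as `assertion` and `requested_token_use` is set to `on_behalf_of`. The tenant, client, and scope values are placeholders, the function name is hypothetical, and a production service would more likely use a client certificate or a library such as MSAL than a raw secret.

```python
from urllib.parse import urlencode


def obo_token_request(tenant_id: str, client_id: str, client_secret: str,
                      user_token: str, scope: str) -> tuple:
    """Return (url, form_body) for an On-Behalf-Of token request."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode({
        "grant_type": "urn:ietf:params:oauth:grant-type:jwt-bearer",
        "client_id": client_id,
        "client_secret": client_secret,
        "assertion": user_token,               # the user's incoming access token
        "scope": scope,                        # the downstream tool's resource scope
        "requested_token_use": "on_behalf_of",
    })
    return url, body


url, body = obo_token_request(
    "contoso-tenant-id", "agent-client-id", "placeholder-secret",
    "incoming-user-token", "api://tool-api/.default",
)
print(url)
# The body would be POSTed with Content-Type: application/x-www-form-urlencoded.
```

Each hop in a multi-agent chain repeats this call with the token it received, which is why the flow has to be designed in explicitly.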

OBO chains across multiple hops, but each hop requires an explicit OBO call rather than passive propagation — design accordingly. For Azure Logic Apps and Azure AI Foundry, the OBO pattern is wired natively into the agent framework, and the audit log in Entra records the full delegation chain with both the user and the agent principal at each hop.

The trap in Entra: developers reach for client credentials flow because it is simpler, and lose the user identity in the first hop. If the tool needs to authorise against the user, the OBO call is mandatory. There is no shortcut.

Revocation Is the Hard Half

The architecture above gets you delegation and attribution. Revocation is harder and gets less attention because it only matters in the worst moments — terminations, compromise, regulator-mandated suspensions.

Three patterns work, in order of operational difficulty.

Short-lived credentials with aggressive rotation. Every agent credential expires in 15 minutes. The agent re-exchanges on each cycle. When the user is disabled, the next exchange fails. The worst case is 15 minutes of action after revocation. For most regulated workloads this is acceptable; for high-risk actions it is not.

Session-state polling on the hot path. Every tool call checks the user's session state against the identity provider before executing. This adds latency but eliminates the 15-minute window. It is required for actions that touch financial transactions, personal data subject to GDPR or NDPA erasure requests, or any action that cannot be reversed.

Kill-switch broadcast. When the user is disabled, an event fires to every agent runtime instructing it to terminate in-flight work for that user. Requires the runtime to subscribe to identity events and to be capable of cooperative termination. This is the most operationally complete pattern and the most engineering-intensive to build.

In practice, regulated workloads should combine pattern one with pattern two for high-risk tools. Pattern three is the standard for production agent platforms at scale.
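How the three patterns compose can be sketched with an in-memory guard that a runtime consults before each tool call. Everything here is illustrative: `AgentRuntime`, `session_is_active`, and `on_kill_event` are hypothetical names, the identity-provider lookup is stubbed with a local set, and a real kill switch would arrive over an event bus and not as a direct method call.

```python
import time


class AgentRuntime:
    """Gate for tool calls combining expiry, session polling, and a kill switch."""

    def __init__(self, session_is_active, ttl_seconds: int = 900):
        self._session_is_active = session_is_active  # callable(user) -> bool, stands in for the IdP
        self._ttl = ttl_seconds
        self._killed = set()

    def on_kill_event(self, user: str) -> None:
        """Pattern three: handler for a broadcast 'user disabled' event."""
        self._killed.add(user)

    def may_execute(self, user: str, issued_at: float, high_risk: bool, now=None) -> bool:
        now = time.time() if now is None else now
        if user in self._killed:                       # kill switch wins immediately
            return False
        if now - issued_at >= self._ttl:               # pattern one: credential expired
            return False
        if high_risk and not self._session_is_active(user):  # pattern two: hot-path check
            return False
        return True


disabled = set()
runtime = AgentRuntime(lambda user: user not in disabled)
t0 = 1_000.0

before = runtime.may_execute("alice", t0, high_risk=False, now=t0 + 60)
disabled.add("alice")  # the user is disabled at the identity provider
low_risk_after = runtime.may_execute("alice", t0, high_risk=False, now=t0 + 60)
high_risk_after = runtime.may_execute("alice", t0, high_risk=True, now=t0 + 60)
expired = runtime.may_execute("alice", t0, high_risk=False, now=t0 + 900)
runtime.on_kill_event("alice")
after_kill = runtime.may_execute("alice", t0, high_risk=False, now=t0 + 60)

print(before, low_risk_after, high_risk_after, expired, after_kill)
# True True False False False
```

The second `True` is the residual window the text describes: with expiry alone, a low-risk call still succeeds after the user is disabled, until the credential expires or the kill event arrives.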

The Audit Chain

Return to the regulator's question. Show me, for this specific action on 12 March, which human caused it, what their authorisation was at that moment, what scope the agent was operating under, and who approved the agent's permission set in the first place.

The architecture above answers it in one query against a single audit store. The query joins on a trace ID that originates at the user's authentication event and is carried — as a session tag in AWS, as a custom claim in Okta and Entra — through every subsequent token, credential, and API call. The trace ID resolves to the user's identity at the authentication event, the role and scope assumed by the agent, the policy version in force at the moment of assumption, and the change-management record that approved the policy.

Three implementation details make this work in practice.

The trace ID must be generated at the first hop and must be cryptographically bound to the user's authentication — not a UUID picked by the agent runtime. Otherwise an agent can fabricate a trace and orphan its actions from the user.
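One way to bind a trace ID to the authentication event is to derive it as an HMAC over the event's fields, using a key held by the component that witnesses the login and not by the agent runtime. This is a sketch of that idea, not a prescribed scheme; the field layout, key handling, and function names are assumptions.

```python
import hashlib
import hmac


def mint_trace_id(signing_key: bytes, user: str, auth_event_id: str, auth_time: int) -> str:
    """Derive a trace ID from the authentication event using HMAC-SHA256."""
    message = f"{user}|{auth_event_id}|{auth_time}".encode()
    return hmac.new(signing_key, message, hashlib.sha256).hexdigest()


def verify_trace_id(signing_key: bytes, user: str, auth_event_id: str,
                    auth_time: int, trace_id: str) -> bool:
    """Recompute the trace ID and compare in constant time."""
    expected = mint_trace_id(signing_key, user, auth_event_id, auth_time)
    return hmac.compare_digest(expected, trace_id)


# Placeholder key; in practice it would live in a KMS or HSM the agent cannot read.
key = b"frontend-only-secret"
trace_id = mint_trace_id(key, "alice@example.org", "auth-evt-001", 1_700_000_000)

print(verify_trace_id(key, "alice@example.org", "auth-evt-001", 1_700_000_000, trace_id))    # True
print(verify_trace_id(key, "mallory@example.org", "auth-evt-001", 1_700_000_000, trace_id))  # False
```

Because the agent never holds the key, it cannot mint a trace ID that verifies against a different user or a login that never happened.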

The audit store must capture both the identity-provider events and the leaf-system events, joined on the trace ID. CloudTrail alone is insufficient; you need the application logs from the tool calls as well. SIEM ingestion with a normalised schema is the standard answer.

The policy in force at the time of assumption must be queryable historically. IAM policy versioning, Okta policy snapshots, Entra Conditional Access policy history — all three providers offer this; few organisations actually use it. If you cannot reproduce the policy that was in effect on 12 March, you cannot answer the question fully.
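The single-query claim can be illustrated with an in-memory SQLite database standing in for the audit store. The schema, table names, and every row are hypothetical sample data, including the year chosen for 12 March; a real store would be a SIEM or data warehouse with a normalised schema, but the join on trace ID has the same shape.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE idp_events  (trace_id TEXT, user TEXT, auth_time TEXT);
    CREATE TABLE assumptions (trace_id TEXT, agent TEXT, role TEXT,
                              policy_version TEXT, change_record TEXT);
    CREATE TABLE leaf_events (trace_id TEXT, system TEXT, action TEXT, event_time TEXT);

    INSERT INTO idp_events  VALUES ('t-1', 'alice@example.org', '2025-03-12T10:02:11Z');
    INSERT INTO assumptions VALUES ('t-1', 'agent:research-01', 'AgentResearchRole',
                                    'v7', 'CHG-1042');
    INSERT INTO leaf_events VALUES ('t-1', 'orders-db', 'UPDATE orders', '2025-03-12T10:14:33Z');
    INSERT INTO leaf_events VALUES ('t-2', 'orders-db', 'SELECT orders', '2025-03-13T09:00:00Z');
    """
)

# Who caused it, under which role and policy version, approved by which change record.
QUERY = """
SELECT l.event_time, l.system, l.action,
       i.user, i.auth_time,
       a.agent, a.role, a.policy_version, a.change_record
FROM leaf_events AS l
JOIN idp_events  AS i ON i.trace_id = l.trace_id
JOIN assumptions AS a ON a.trace_id = l.trace_id
WHERE l.event_time >= ? AND l.event_time < ?
"""

rows = conn.execute(QUERY, ("2025-03-12T00:00:00Z", "2025-03-13T00:00:00Z")).fetchall()
for row in rows:
    print(row)
```

The query only works if all three kinds of event land in the same store with the same trace ID, which is the point of the three implementation details above.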

Compliance Framing

NIS2's 24-hour attribution requirement is the practical forcing function for European critical infrastructure operators. If you cannot attribute an action on a regulated system to a natural person within 24 hours, you cannot meet the reporting obligation. Agent platforms without a principal-of-record chain fail this test on day one.

EU AI Act Article 14, which becomes enforceable on 2 August 2026 for high-risk systems, requires that human oversight be meaningful and demonstrable. The audit trail must show that a human caused, reviewed, or could have intervened in each action — not in aggregate, but per action. An audit chain that resolves to a principal of record satisfies this; one that resolves to a shared service account does not.

SOC 2 Trust Services Criteria CC6.1 and CC7.2 require logical access controls and system monitoring sufficient to detect and respond to anomalies. Auditors are now explicitly asking how agent actions are attributed and how the principal of record is preserved through delegation chains. The maturity gap between organisations that have wired this correctly and organisations that have not is now visible in audit reports.

What This Teaches Us About Enterprise Scaling

The pattern of preserving the principal of record through every hop, expressed in data carried by the credential itself rather than reconstructed from correlation, is the structural answer to a class of problems that goes beyond AI agents. It is the same pattern that fixes service-mesh attribution, that fixes cross-account access in multi-tenant SaaS, that fixes data-residency enforcement in multi-region architectures.

AI agents are the forcing function because they make the attribution gap impossible to ignore. Organisations that build the principal-of-record discipline now will find that it pays dividends across every other identity-sensitive system they own. Organisations that defer it will retrofit it under regulatory pressure, which is more expensive and more disruptive, and which never produces as clean a result.

The architectural advantage compounds in one specific direction. Once the principal of record is structurally preserved, every new agent, every new tool, every new integration inherits the property without needing to be designed for it. The audit chain extends itself. The regulator's question gets easier to answer with each system added, not harder. That is the test of a load-bearing architecture, and it is the test most agent platforms in production today would fail.

FAQs

Why can't we just give the agent its own service role and call it done?

Because every leaf system — database, vector store, SaaS API — then authorises against the agent's role rather than the human's. CloudTrail records the service account, not the person who caused the action, and the regulator's first attribution question cannot be answered from the audit log. The compliance posture is unrecoverable without re-architecting.

What is the principal-of-record pattern in one sentence?

The agent never authenticates as itself; it assumes a role *as the user*, and every subsequent hop carries the user's identity forward as data in the credential — transitive session tags in AWS, the `act` claim chained per RFC 8693 in Okta, the `oid` claim preserved through On-Behalf-Of in Entra.

How short is short enough for short-lived credentials?

Fifteen minutes is the working ceiling for regulated workloads. That bounds the post-revocation action window at fifteen minutes, which is acceptable for most controls. For high-risk actions — financial transactions, irreversible writes, personal-data operations subject to GDPR or NDPA erasure requests — session-state polling on the hot path is required to close the window further.

What is the most common configuration mistake in AWS?

Forgetting that the target role's trust policy must explicitly permit `sts:TagSession`. Most reference templates omit it, and the `AssumeRole` call then fails when the user's identity is passed as a session tag. The second most common mistake is not marking `PrincipalOfRecord` as transitive, which breaks the chain at the first subagent.

Does this pattern only matter for AI agents?

No. Carrying the principal of record as data in the credential, rather than reconstructing it from log correlation, is the structural answer for service-mesh attribution, multi-tenant SaaS cross-account access, and multi-region data-residency enforcement. AI agents are the forcing function because they make the attribution gap impossible to ignore — but the discipline pays dividends across every identity-sensitive system.

How to Engage

If you are building an agent platform that needs to survive an auditor's first hard question — or retrofitting one that will not — talk to us at creativeminds.dev/contact.