Operating the AWS Architecture for a Nigerian Bank: The 60-Day Implementation

cmdev · 18 min read

Series · Securing Nigerian banks on AWS · Part 3 of 3 ← Part 1: The threat picture (DSI ↗) · ← Part 2: The architecture · The operational implementation

The Friday the audit becomes real

A CBN examiner is sitting in a conference room in Victoria Island. The bank's CISO has just finished walking through the architecture diagrams from Part 2: Frankfurt primary, Cape Town DR, Direct Connect dual-carrier, the ledger still on-prem. The examiner nods, then asks the question that ends most of these conversations: "Show me the role that provisioned this. Who created it, when, from where, with what session?" If the answer is "I will have my team get back to you", the architecture is a slide deck. If the answer is a SQL query that returns in four seconds, the architecture is a platform.

A diagram is not an architecture. An architecture is not a control. A control is not evidence. The work this piece documents is the work that turns the topology from Part 2 into something the examiner can countersign — the Terraform that lays the foundation, the IAM that constrains the humans, the detection that catches what they miss, and the queries that produce the evidence on demand.

The pattern below is what we run for a Nigerian bank engaging us on a standard 60-day platform delivery. It takes the architectural decisions from Part 2 as given — Frankfurt primary, Cape Town DR, Direct Connect from Lagos via dual carriers, the core ledger still in the data centre — and names the modules, shows the configurations that decide whether the platform survives audit, and walks the Friday-evening incident scenario from Part 1 step by step to show how the architecture catches it.

Key takeaways

  • The 60-day delivery splits into four phases: Days 0-14 land the Organisation, OUs, SCPs, and Transit Gateway; Days 14-28 wire IAM Identity Center and per-role permission sets; Days 28-42 deploy GuardDuty, VPC Lattice, and the S3 Object Lock vault; Days 42-60 produce runbooks and the CSAT evidence pack.
  • Service Control Policies on the Production OU are the structural defence — even a fully-compromised principal in a production account cannot disable GuardDuty, delete CloudTrail, or modify the KMS key policy, because the SCP denial happens above the principal's authority.
  • The SOC analyst permission set is built defence-in-depth: a read-only scope, then an explicit Deny on destructive actions, an MFA condition on every action, and an eight-hour session boundary. Three independent constraints on top of least privilege, not one.
  • The Friday-evening runbook contains in 18 seconds: GuardDuty surfaces, EventBridge routes, Lambda freezes the role and applies an emergency VPC Lattice deny — all before the SOC analyst's phone stops ringing. CloudTrail Lake produces the containment-validation query inside 3 minutes.
  • The handover at day 60 includes IaC, the control catalogue (every CSAT control mapped to its Security Hub standard and evidence query), runbooks for the top eight sector scenarios, version-controlled detection rules, the CSAT evidence pack, and a cost-monitoring dashboard. The platform runs and the bank's team is trained on it.

Two weeks to lay the floor

Everything begins with the AWS Organisation, the way every habitable building begins with a foundation. Not a single AWS account that swells into a mess of mixed workloads — that path ends in the same place every time, with an auditor staring at a permissions sprawl no one can explain. An Organisation, with Organisational Units that map the bank's operational boundaries, and accounts scoped strictly to one workload each. Core banking in one. Payments in another. Analytics over there. The compliance vault behind its own door with its own key.

The Terraform that builds this:

module "landing_zone" {
  source = "[email protected]:cmdev/aws-landing-zone-banks.git//modules/baseline?ref=v1.4.2"

  organization_id = var.organization_id
  primary_region  = "eu-central-1"
  dr_region       = "af-south-1"

  ous = {
    production  = { name = "Production",  scp = "scp-production-baseline" }
    compliance  = { name = "Compliance",  scp = "scp-compliance-vault" }
    security    = { name = "Security",    scp = "scp-security-services" }
    shared      = { name = "SharedSvc",   scp = "scp-shared-services" }
  }

  accounts = {
    "core-banking-prod"  = { ou = "production", workload = "core-banking" }
    "payments-prod"      = { ou = "production", workload = "payments" }
    "analytics-prod"     = { ou = "production", workload = "analytics" }
    "compliance-vault"   = { ou = "compliance", workload = "vault" }
    "log-archive"        = { ou = "security",   workload = "logs" }
    "audit"              = { ou = "security",   workload = "audit" }
    "shared-services"    = { ou = "shared",     workload = "shared" }
  }

  baseline_controls = {
    enable_guardduty           = true
    enable_security_hub        = true
    enable_cloudtrail_lake     = true
    enable_macie               = true
    enable_iam_access_analyzer = true
    enforce_kms_cmk            = true
    enforce_s3_block_public    = true
    enforce_mfa_for_console    = true
  }
}

The Service Control Policies on the Organisational Units are the structural defence — a steel beam that runs above the workload, not a fence around it. The scp-production-baseline denies the actions that would weaken posture from the inside: disabling GuardDuty, deleting CloudTrail, modifying the KMS key policy that protects customer data, granting IAM permissions to a principal outside the organisation. Even a fully compromised principal in a production account cannot turn these off, because the SCP denial fires above the principal's authority — the way a sealed envelope can be carried but not opened by the courier who carries it.
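As a rough illustration of what scp-production-baseline could contain, the sketch below denies three of the categories named above. It is an example under assumptions, not the module's actual policy: var.production_ou_id, var.customer_data_cmk_arn, and the break-glass role name are hypothetical, and the action list should be checked against the bank's own threat model. The fourth category (grants to principals outside the organisation) is harder to express in an SCP and is left out here.

```hcl
# Sketch only: variable names and the break-glass role are hypothetical.
resource "aws_organizations_policy" "production_baseline" {
  name = "scp-production-baseline"
  type = "SERVICE_CONTROL_POLICY"

  content = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        # Detection and audit services cannot be switched off from inside.
        Sid    = "ProtectDetectionAndAudit"
        Effect = "Deny"
        Action = [
          "guardduty:DeleteDetector",
          "guardduty:DisassociateFromAdministratorAccount",
          "cloudtrail:DeleteTrail",
          "cloudtrail:StopLogging",
          "cloudtrail:UpdateTrail",
        ]
        Resource = "*"
      },
      {
        # Only a designated break-glass role may touch the customer-data key.
        Sid      = "ProtectCustomerDataKeyPolicy"
        Effect   = "Deny"
        Action   = ["kms:PutKeyPolicy", "kms:ScheduleKeyDeletion", "kms:DisableKey"]
        Resource = var.customer_data_cmk_arn
        Condition = {
          ArnNotLike = {
            "aws:PrincipalArn" = "arn:aws:iam::*:role/break-glass-kms-admin"
          }
        }
      },
    ]
  })
}

resource "aws_organizations_policy_attachment" "production" {
  policy_id = aws_organizations_policy.production_baseline.id
  target_id = var.production_ou_id
}
```

SCPs do not apply to the organisation's management account, which is one more reason to keep workloads out of it.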

That is the first two weeks. Nothing the end user sees. The platform foundation, laid in code, version-controlled, reviewed by both teams across a shared screen.

The other piece that lands in this window is the Transit Gateway — the highway interchange that becomes the routing fabric between Direct Connect and every workload VPC. The DX circuit itself should have been ordered weeks earlier. AWS publishes the full procurement walkthrough at AWS Direct Connect: Getting Started ↗, covering location selection, partner engagement, LOA-CFA issuance, and BGP bring-up. Lead times run four to twelve weeks. The platform Terraform is ready to apply on day one of week three, but only useful if the cable is already lit. The AWS Direct Connect + Transit Gateway whitepaper ↗ documents the canonical pattern: a transit VIF on the DX Gateway terminates on the TGW, and each workload VPC attaches to it. The route tables then enforce network-layer segmentation that complements the L7 VPC Lattice policies that arrive later.

resource "aws_ec2_transit_gateway" "core" {
  description                     = "core-banking-mesh"
  amazon_side_asn                 = 64512
  default_route_table_association = "disable"
  default_route_table_propagation = "disable"
  dns_support                     = "enable"
  vpn_ecmp_support                = "enable"

  tags = { Name = "tgw-core", workload = "platform" }
}

resource "aws_dx_gateway_association" "core" {
  dx_gateway_id         = aws_dx_gateway.primary.id
  associated_gateway_id = aws_ec2_transit_gateway.core.id

  allowed_prefixes = ["10.0.0.0/8"]
}

# One route table per workload group — no shared default.
resource "aws_ec2_transit_gateway_route_table" "by_workload" {
  for_each           = toset(["core-banking", "payments", "analytics", "vault", "shared"])
  transit_gateway_id = aws_ec2_transit_gateway.core.id
  tags               = { Name = "rtb-${each.key}", workload = each.key }
}

# Analytics attachment: associate to its own table, and propagate its routes
# into the shared table only. Propagation is one-way; the return path is a separate rule.
resource "aws_ec2_transit_gateway_route_table_association" "analytics" {
  transit_gateway_attachment_id  = aws_ec2_transit_gateway_vpc_attachment.analytics.id
  transit_gateway_route_table_id = aws_ec2_transit_gateway_route_table.by_workload["analytics"].id
}

resource "aws_ec2_transit_gateway_route_table_propagation" "analytics_to_shared" {
  transit_gateway_attachment_id  = aws_ec2_transit_gateway_vpc_attachment.analytics.id
  transit_gateway_route_table_id = aws_ec2_transit_gateway_route_table.by_workload["shared"].id
}

The deliberate move is default_route_table_association = "disable" and default_route_table_propagation = "disable". New attachments do not automatically inherit a route to everything else. Think of it as the difference between a hotel where every key opens every room and one where the front desk hands out exactly the keys you asked for. The platform team writes the association and propagation rules explicitly, per workload, per direction. Analytics propagates to shared services. It does not propagate to ledger. The Layer 3 path simply does not exist.

This configuration is what makes the eventual VPC Lattice policy a second line of defence rather than the only one. Misconfigure VPC Lattice and network routing still blocks the cross-workload path. Misconfigure TGW routes and the L7 service policies still deny. Two layers, two failure modes — both must fail before the architecture fails open.
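The per-direction rules could be extended along the following lines. This is a minimal sketch under assumptions: aws_ec2_transit_gateway_vpc_attachment.shared and var.core_banking_cidr are hypothetical names that do not appear in the configuration above. It adds the return path from shared services to analytics, plus an explicit blackhole route from analytics toward the core-banking range.

```hcl
# Sketch only: the shared attachment and the CIDR variable are hypothetical.

# Return path: shared-services routes become visible in the analytics table.
resource "aws_ec2_transit_gateway_route_table_propagation" "shared_to_analytics" {
  transit_gateway_attachment_id  = aws_ec2_transit_gateway_vpc_attachment.shared.id
  transit_gateway_route_table_id = aws_ec2_transit_gateway_route_table.by_workload["analytics"].id
}

# Explicit drop: traffic from analytics toward the core-banking range is discarded.
resource "aws_ec2_transit_gateway_route" "analytics_blackhole_core" {
  destination_cidr_block         = var.core_banking_cidr
  blackhole                      = true
  transit_gateway_route_table_id = aws_ec2_transit_gateway_route_table.by_workload["analytics"].id
}
```

A static route takes precedence over a propagated route for the same prefix, so the blackhole would still hold if someone later added a matching propagation by mistake. A more specific propagated prefix would still win, which is why the propagation rules themselves stay under review.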

Two more weeks for the human boundaries

Phase two is where the posture starts taking its operational shape. Identity Center is provisioned. Permission sets are written for each operational role the bank will assign. The Active Directory connector is wired in. Every human who will ever touch the AWS console arrives through SAML, never through a static credential left on a sticky note.

The permission set for a SOC analyst:

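resource "aws_ssoadmin_permission_set" "soc_analyst" {
  name             = "SOCAnalyst"
  instance_arn     = local.sso_instance_arn
  session_duration = "PT8H"
  description      = "Read-only access to security telemetry across all bank accounts"
}

# The inline policy is attached through its own resource;
# aws_ssoadmin_permission_set has no inline_policy argument.
resource "aws_ssoadmin_permission_set_inline_policy" "soc_analyst" {
  instance_arn       = local.sso_instance_arn
  permission_set_arn = aws_ssoadmin_permission_set.soc_analyst.arn
  inline_policy      = data.aws_iam_policy_document.soc_analyst.json
}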

data "aws_iam_policy_document" "soc_analyst" {
  statement {
    sid    = "ReadSecurityTelemetry"
    actions = [
      "guardduty:Get*", "guardduty:List*",
      "securityhub:Get*", "securityhub:List*", "securityhub:Describe*",
      "cloudtrail:LookupEvents",
      # CloudTrail Lake queries are authorised under the cloudtrail: prefix.
      "cloudtrail:StartQuery", "cloudtrail:GetQueryResults",
      "macie2:Get*", "macie2:List*",
      "config:Get*", "config:List*", "config:Describe*",
      "logs:Describe*", "logs:Get*", "logs:FilterLogEvents",
    ]
    resources = ["*"]
  }

  statement {
    sid     = "DenyDestructiveActions"
    effect  = "Deny"
    actions = ["iam:*", "guardduty:Delete*", "securityhub:Disable*", "cloudtrail:Delete*"]
    resources = ["*"]
  }

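  # Caution: test this against a real Identity Center session. Federated
  # sessions may not carry MFA context, in which case this Deny blocks every
  # call. MFA should also be enforced in Identity Center itself.
  statement {
    sid    = "RequireMFA"
    effect = "Deny"
    not_actions = ["sts:AssumeRoleWithSAML", "sts:GetSessionToken"]
    resources = ["*"]
    condition {
      test     = "BoolIfExists"
      variable = "aws:MultiFactorAuthPresent"
      values   = ["false"]
    }
  }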
}

Three things hold this configuration together, and they hold independently. The first is the explicit Deny on destructive actions. An analyst who is socially engineered into running a destructive API call has the call denied at the policy layer, no matter what the rest of the permission set permits. The second is the MFA condition. Any session without active MFA is rejected for everything except the SAML assumption itself — a closed gate with no key. The third is the eight-hour session boundary. Credentials are short-lived by design; the locker re-locks itself at the end of every shift.

One permission set per role: SOC analyst, on-call engineer, incident responder, platform operator, auditor, compliance officer, executive read-only. Each one is reviewed against the principle of least privilege for the role's actual operational needs, not what the title suggests.
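A permission set does nothing until it is assigned to a principal in a specific account. The sketch below shows one way that assignment might look for the SOC analyst set. The group display name "bank-soc-analysts" is a hypothetical example, and the lookup assumes the group has already been synced from the directory into Identity Center.

```hcl
# Sketch only: the group display name is hypothetical.
data "aws_ssoadmin_instances" "this" {}

# Look up the directory-synced group by its display name.
data "aws_identitystore_group" "soc_analysts" {
  identity_store_id = tolist(data.aws_ssoadmin_instances.this.identity_store_ids)[0]

  alternate_identifier {
    unique_attribute {
      attribute_path  = "DisplayName"
      attribute_value = "bank-soc-analysts"
    }
  }
}

# Grant the SOCAnalyst permission set to that group in the payments account.
resource "aws_ssoadmin_account_assignment" "soc_analyst_payments" {
  instance_arn       = local.sso_instance_arn
  permission_set_arn = aws_ssoadmin_permission_set.soc_analyst.arn

  principal_id   = data.aws_identitystore_group.soc_analysts.group_id
  principal_type = "GROUP"

  target_id   = var.payments_account
  target_type = "AWS_ACCOUNT"
}
```

Assigning to groups rather than individual users keeps joiner and leaver changes in the directory, where HR processes already reach them.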

Two weeks for the senses

The third phase is where the platform starts hearing things. GuardDuty turns on across every account, with Runtime Monitoring for the EKS and EC2 workloads in payments and core banking. Security Hub becomes the aggregation point, with custom Insights mapped to the CBN CSAT control framework. CloudTrail Lake stands up as the long-retention audit store, with a separate Trail in the log-archive account that the security OU cannot modify — the log book locked in a different room, behind a different key.
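For orientation, the per-account resources behind this phase might resemble the sketch below. It is an assumption about what the landing-zone flags expand to, not the module's actual contents: var.audit_retention_days and the resource names are hypothetical, and an organisation-wide event data store has to be created from the management or delegated administrator account.

```hcl
# Sketch only: var.audit_retention_days and the resource names are hypothetical.
resource "aws_guardduty_detector" "this" {
  enable = true
}

resource "aws_guardduty_detector_feature" "runtime" {
  detector_id = aws_guardduty_detector.this.id
  name        = "RUNTIME_MONITORING"
  status      = "ENABLED"

  # Let GuardDuty manage its agent on EKS clusters.
  additional_configuration {
    name   = "EKS_ADDON_MANAGEMENT"
    status = "ENABLED"
  }
}

# Long-retention audit store that the evidence queries run against.
resource "aws_cloudtrail_event_data_store" "bank_audit" {
  name                           = "bank-audit-data-store"
  multi_region_enabled           = true
  organization_enabled           = true
  retention_period               = var.audit_retention_days
  termination_protection_enabled = true
}
```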

VPC Lattice is where lateral-movement defence becomes operational. Every service-to-service edge in the platform is declared explicitly:

resource "aws_vpclattice_service" "ledger_writer" {
  name               = "ledger-writer"
  auth_type          = "AWS_IAM"
  custom_domain_name = "ledger-writer.internal.bank.local"
}

resource "aws_vpclattice_auth_policy" "ledger_writer" {
  resource_identifier = aws_vpclattice_service.ledger_writer.arn

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Sid       = "AllowPaymentsAndCoreOnly"
        Effect    = "Allow"
        Principal = {
          AWS = [
            "arn:aws:iam::${var.core_banking_account}:role/core-bank-svc",
            "arn:aws:iam::${var.payments_account}:role/payments-svc"
          ]
        }
        Action   = ["vpc-lattice-svcs:Invoke"]
        Resource = "*"
        Condition = {
          StringEquals = {
            "aws:PrincipalTag/workload" = ["core-banking", "payments"]
          }
          DateGreaterThan = {
            "aws:CurrentTime" = "2026-01-01T00:00:00Z"
          }
        }
      },
      {
        Sid       = "DenyAnalyticsExplicitly"
        Effect    = "Deny"
        Principal = "*"
        Action    = "*"
        Resource  = "*"
        Condition = {
          StringEquals = {
            "aws:PrincipalTag/workload" = "analytics"
          }
        }
      }
    ]
  })
}

The explicit Deny on analytics-tagged principals carries the weight. The architectural goal is that compromised credentials from the analytics workload cannot reach the ledger. The Allow statement limits invocation to core-banking and payments, but the explicit Deny holds the line if a future configuration change accidentally widens that Allow. A second lock on the same door, with a different mechanism.

The S3 vault holding the immutable compliance archive is provisioned with Object Lock in compliance mode, a separate KMS key owned by the compliance-vault account, and a bucket policy restricting writes to a single role assumed only by the backup process. Seven-year retention is baked into the object itself. No role in the bank has the permission to override it. The records are notarised in concrete.
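A minimal sketch of that vault follows, assuming a hypothetical bucket name and a KMS key resource named aws_kms_key.vault defined elsewhere in the compliance-vault account. The bucket policy that restricts writes to the backup role is not shown.

```hcl
# Sketch only: the bucket name and the KMS key reference are hypothetical.
resource "aws_s3_bucket" "vault" {
  bucket              = "bank-compliance-vault"
  object_lock_enabled = true # changing this later forces Terraform to recreate the bucket
}

# Object Lock requires versioning.
resource "aws_s3_bucket_versioning" "vault" {
  bucket = aws_s3_bucket.vault.id

  versioning_configuration {
    status = "Enabled"
  }
}

# Default retention: every new object version is locked for seven years.
resource "aws_s3_bucket_object_lock_configuration" "vault" {
  bucket = aws_s3_bucket.vault.id

  rule {
    default_retention {
      mode  = "COMPLIANCE"
      years = 7
    }
  }

  depends_on = [aws_s3_bucket_versioning.vault]
}

resource "aws_s3_bucket_server_side_encryption_configuration" "vault" {
  bucket = aws_s3_bucket.vault.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm     = "aws:kms"
      kms_master_key_id = aws_kms_key.vault.arn
    }
  }
}
```

Compliance mode is the deliberate choice here. Governance mode can be bypassed by a principal holding s3:BypassGovernanceRetention, whereas compliance-mode retention cannot be shortened or removed by any user, including the account root.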

The final fortnight — runbooks and evidence

The last phase is what separates a provisioned platform from an operational one. Detection rules without runbooks are noise — alarms that ring in an empty corridor. Audit data without queries is silence. The final two weeks of the engagement are spent writing the procedures the SOC will actually execute, and the queries the examiner will actually ask.

CSAT evidence comes from a documented set of queries against CloudTrail Lake. Three representative examples:

-- Who provisioned the IAM role used in the incident, when, and from where?
-- requestParameters is a map in CloudTrail Lake, so keys are read with element_at.
SELECT
  eventTime,
  userIdentity.principalId AS who,
  userIdentity.sessionContext.sessionIssuer.userName AS assumed_by,
  element_at(requestParameters, 'roleName') AS role_created,
  sourceIPAddress,
  userAgent,
  awsRegion
FROM
  $bank_audit_data_store
WHERE
  eventName = 'CreateRole'
  AND eventTime BETWEEN @from_date AND @to_date
  AND element_at(requestParameters, 'roleName') LIKE @role_pattern
ORDER BY
  eventTime DESC;

-- Every privileged action in the last 24 hours, with attribution and MFA status.
-- MFA status is a column, not a filter, so calls made without MFA stay visible.
SELECT
  eventTime,
  eventName,
  userIdentity.principalId,
  userIdentity.sessionContext.sessionIssuer.userName,
  userIdentity.sessionContext.attributes.mfaAuthenticated AS mfa_authenticated,
  requestParameters,
  sourceIPAddress
FROM
  $bank_audit_data_store
WHERE
  eventTime > current_timestamp - interval '24' hour
  AND eventName IN (
    'CreateUser', 'AttachUserPolicy', 'PutUserPolicy',
    'CreateAccessKey', 'CreateRole', 'AttachRolePolicy',
    'PutRolePolicy', 'AssumeRole'
  )
ORDER BY
  eventTime DESC;

-- KMS key usage for the production CMK: every Encrypt, Decrypt, and GenerateDataKey call
SELECT
  eventTime,
  eventName,
  userIdentity.principalId,
  element_at(requestParameters, 'encryptionContext') AS encryption_context,
  errorCode,
  errorMessage
FROM
  $bank_audit_data_store
WHERE
  eventName IN ('Encrypt', 'Decrypt', 'GenerateDataKey')
  AND element_at(requestParameters, 'keyId') = @customer_data_cmk
  AND eventTime BETWEEN @from_date AND @to_date
ORDER BY
  eventTime DESC;

The bank's compliance officer gets these as a parameterised runbook. When the examiner asks the question, the answer is a SQL execution against the audit store, returned in seconds. The examiner did not ask the bank to attest that the controls worked. They asked for evidence. The evidence is the query result.

A Friday evening, contained in eighteen seconds

The most useful test of any security architecture is whether it would catch the incident it was designed against. The Friday-evening pattern from Part 1 — anomalous lateral movement against a payments-handling role, with quiet exfiltration over a weekend — is the scenario this architecture is built to catch. Here is what happens, in real time, when it fires.

The clock starts at zero. GuardDuty surfaces a finding. The payments-svc role in the payments account is making an unusual number of sts:AssumeRole calls toward roles outside its normal pattern. GuardDuty's IAM behaviour model has been baselined for fourteen days, and the call rate is more than three standard deviations above the role's normal envelope. Severity: HIGH. The smoke detector has gone off.

Twelve seconds later, Security Hub aggregates the finding. The EventBridge rule matches on severity >= 7.0 AND product = "GuardDuty" AND resource.tag:workload = "payments". Two destinations fire in parallel — the way a fire panel rings both the bell and dispatches the engine in the same breath.

Three seconds after that, Path A. EventBridge sends to the SNS topic sec-pager-critical, which sends to PagerDuty. The on-call SOC analyst's phone lights up at home. Push notification, SMS, ringing handset.

The eighteenth second is where the architecture earns its keep. Path B has triggered the Lambda auto-contain. The function snapshots the relevant RDS cluster at a clean point in time, freezes the BankOperator permission set in IAM Identity Center so the role can still log in but cannot assume anything in the payments account, and applies an emergency VPC Lattice deny rule that explicitly blocks payments-svc from invoking the ledger-writer service. Automatic containment, no human in the loop. The blast radius stops here, before the analyst's phone has finished ringing.
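The routing described above could be declared roughly as follows. Treat it as a sketch under assumptions: aws_sns_topic.sec_pager_critical and aws_lambda_function.auto_contain are hypothetical resource names, the severity threshold is approximated with Security Hub's HIGH and CRITICAL labels, and the tag match only works if the finding's resource entry actually carries the workload tag.

```hcl
# Sketch only: the SNS topic and Lambda function resource names are hypothetical.
resource "aws_cloudwatch_event_rule" "payments_high_severity" {
  name = "guardduty-high-payments"

  event_pattern = jsonencode({
    source        = ["aws.securityhub"]
    "detail-type" = ["Security Hub Findings - Imported"]
    detail = {
      findings = {
        ProductName = ["GuardDuty"]
        Severity    = { Label = ["HIGH", "CRITICAL"] }
        Resources   = { Tags = { workload = ["payments"] } }
      }
    }
  })
}

# Path A: page the on-call analyst.
resource "aws_cloudwatch_event_target" "pager" {
  rule = aws_cloudwatch_event_rule.payments_high_severity.name
  arn  = aws_sns_topic.sec_pager_critical.arn
}

# Path B: invoke the containment function.
resource "aws_cloudwatch_event_target" "auto_contain" {
  rule = aws_cloudwatch_event_rule.payments_high_severity.name
  arn  = aws_lambda_function.auto_contain.arn
}

# EventBridge needs explicit permission to invoke the function.
resource "aws_lambda_permission" "allow_eventbridge" {
  statement_id  = "AllowEventBridgeInvoke"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.auto_contain.function_name
  principal     = "events.amazonaws.com"
  source_arn    = aws_cloudwatch_event_rule.payments_high_severity.arn
}
```

The SNS topic needs a matching topic policy that lets events.amazonaws.com publish to it. Without one, Path A fails silently while Path B still fires.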

At T+1:00, the SOC analyst acknowledges the page. Opens the incident channel. Pulls the GuardDuty finding detail.

At T+3:00, the analyst runs the containment-validation query against CloudTrail Lake:

SELECT eventTime, eventName, requestParameters, errorCode
FROM $bank_audit_data_store
WHERE userIdentity.sessionContext.sessionIssuer.userName = 'payments-svc'
  AND eventTime > current_timestamp - interval '60' minute
  AND (eventName LIKE 'Put%' OR eventName LIKE 'Update%'
       OR eventName LIKE 'Create%' OR eventName LIKE 'Delete%')
ORDER BY eventTime DESC;

The result lists every state-changing call the role has made in the last hour. The analyst can see immediately which calls succeeded before containment, which were denied, and what the attempted scope of the compromise was. A timeline, in order, with the questions already answered.

At T+5:00, a decision point. The analyst confirms no successful writes to the ledger, one suspicious PutObject to an internal S3 path that warrants investigation, and no impact on DR replication. The contained state is verified.

At T+8:00, the Aurora secondary in af-south-1 is verified as healthy and lag-free. The Route 53 ARC readiness check is green. If the analyst decides the primary region is compromised, failover is one routing-control flip away. They decide it is not. Primary stays primary.

At T+15:00, sector notification. The bank's NigFinCERT liaison pushes the indicators of compromise (the IP addresses GuardDuty surfaced, the AssumeRole pattern, the timing) to the sector intelligence-sharing channel. Part 1 called the silence the failure. The silence does not repeat. The bank reports.

At T+30:00, the forensics window opens. The RDS snapshot taken at T+18 seconds is restored into a forensic account. The analyst can examine the database state immediately before the anomalous behaviour without touching production — the autopsy without disturbing the patient who recovered.

At T+2:00:00, the root cause surfaces. A long-lived secret in a CI/CD pipeline had been exposed via a public repository commit by a contractor three weeks earlier. The credentials had been scraped, deployed against the bank's pipeline, and used to obtain the payments-svc role. The fix has three legs: immediate rotation, credential scanning in CI within 24 hours, and ongoing monitoring of public repositories for the bank's secrets. That last leg needs a dedicated secret scanner, because Macie inspects S3 buckets only.

At T+24:00:00, the post-incident report is filed with NDIC and CBN. The full audit trail — every call, every actor, every artifact — is queryable from CloudTrail Lake. The NDPA breach-notification window is met with hours to spare.

The architecture worked because every piece played its part. GuardDuty caught the anomaly. EventBridge routed it. Lambda contained it before a human could intervene. VPC Lattice meant the compromise could not reach the ledger. Identity Center meant the principal was attributable. CloudTrail Lake meant the evidence was complete. The multi-region failover was available but not needed. The platform's auto-generated indicators of compromise were ready to share. An orchestra in which every musician knew their part and the conductor was a pre-written rule, not a panicking human at midnight.

What the architecture could not do was prevent the credential exposure. That lived in human process — a contractor, a public commit, three weeks of silence — and human process remains the perimeter that fails first, no matter what the technical posture looks like. The architecture's job is not to make the failure impossible. It is to make the failure cheap.

What lands on the desk at day 60

At day 60 the platform is in production with the workloads the engagement scoped. The handover includes:

  • Infrastructure as code — the full Terraform module set, in the bank's own repository, with documented variables and a CI pipeline that plans against staging on every PR
  • The control catalogue — every CSAT control mapped to its Security Hub standard, the AWS service that implements it, the configuration that enforces it, and the query that produces evidence of it working
  • The runbook library — incident response procedures for the top eight scenarios documented in the sector's published incident pattern, with the queries, the auto-containment actions, and the escalation paths
  • The detection rules — the GuardDuty filters, the custom Security Hub Insights, the CloudWatch alarms, and the EventBridge routing logic, all version-controlled
  • The CSAT evidence pack — pre-built CloudTrail Lake queries the compliance team can execute on demand, plus the standing reports that produce the quarterly examiner-facing summary
  • The cost-monitoring dashboard — every service tagged, every workload attributable, with a CloudWatch dashboard that shows the platform's run rate and an anomaly-detection rule that pages on unexpected spend
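One way to implement the spend alert in the last item is AWS Cost Anomaly Detection, sketched below. This is a swapped-in illustration and not necessarily the mechanism used in the engagement: aws_sns_topic.cost_alerts and the 100 USD impact threshold are hypothetical.

```hcl
# Sketch only: the SNS topic and the 100 USD threshold are hypothetical.
resource "aws_ce_anomaly_monitor" "by_service" {
  name              = "platform-spend-by-service"
  monitor_type      = "DIMENSIONAL"
  monitor_dimension = "SERVICE"
}

# Notify as soon as an anomaly's total impact reaches the threshold.
resource "aws_ce_anomaly_subscription" "page_on_spend" {
  name             = "platform-spend-anomalies"
  frequency        = "IMMEDIATE"
  monitor_arn_list = [aws_ce_anomaly_monitor.by_service.arn]

  subscriber {
    type    = "SNS"
    address = aws_sns_topic.cost_alerts.arn
  }

  threshold_expression {
    dimension {
      key           = "ANOMALY_TOTAL_IMPACT_ABSOLUTE"
      match_options = ["GREATER_THAN_OR_EQUAL"]
      values        = ["100"]
    }
  }
}
```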

The platform runs. The bank's own engineering team is trained on the operational layer. The handover is real, not ceremonial.

What the series does not cover

This is the end of the AWS-architecture series. Three pieces — the threat picture in Part 1 over on DSI, the architecture in Part 2, this operational implementation. Together they document what individual-bank cloud security can credibly be in 2026.

What the series does not document — and what the threat picture in Part 1 implicitly raises — is the sectoral coordination layer. Every individual bank running this architecture is stronger than every bank running the equivalent posture on legacy infrastructure. The sector as a whole is still weaker than it could be, because the telemetry generated by each bank stays trapped inside that bank. Sixty private security operations centres looking at the same attacker, none of them able to compare notes.

The next series, when we publish it, will examine what changes when that telemetry becomes shared infrastructure — the sovereign Nigeria-hosted consortium platform model under discussion at the regulatory and inter-bank level. Until then, the architecture in this series is the right thing to build. It is the pattern we propose for Nigerian and African banks, and the same patterns are in production for our clients in Europe.

FAQs

What does the Service Control Policy on the Production OU actually block?

The actions that would weaken the security posture from inside: disabling GuardDuty, deleting CloudTrail, modifying the KMS key policy protecting customer data, granting IAM permissions to a principal outside the organisation. A fully compromised principal in a production account is stopped by the SCP's explicit Deny no matter what its own IAM policies allow. It is structural defence above the workload layer.

Why does the SOC analyst permission set have three independent constraints?

Defence in depth at the policy layer. The Allow statement is read-only and tightly scoped, and three constraints sit on top of it. The Deny statement explicitly blocks destructive IAM, GuardDuty, Security Hub, and CloudTrail actions in case the Allow ever gets broadened. The MFA condition rejects any session without active MFA. The eight-hour session duration expires credentials at the end of a shift. If any one constraint is misconfigured in a future change, the other two still hold.

How does the Friday-evening scenario actually get contained in seconds?

GuardDuty surfaces the anomaly at T+0:00. Security Hub aggregates at T+0:12. EventBridge fires two parallel paths at T+0:15-18: one to PagerDuty for the on-call analyst, the other to a containment Lambda that snapshots the relevant RDS cluster, freezes the BankOperator permission set, and applies an emergency VPC Lattice deny rule blocking payments-svc from invoking the ledger. The blast radius stops there, no human in the loop.

What does the bank actually receive at the day-60 handover?

The full Terraform module set in the bank's own repository with a CI pipeline that plans on every PR; the control catalogue mapping every CSAT control to its Security Hub standard, AWS service, configuration, and evidence query; runbooks for the top eight sector incident scenarios; version-controlled detection rules and EventBridge routing; the CSAT evidence pack with pre-built CloudTrail Lake queries; and a tagged cost-monitoring dashboard.

What can the architecture not prevent?

The credential exposure that started the Friday-evening scenario — a long-lived secret committed to a public repository by a contractor. That is human process, and even with the best technical architecture, human process remains the perimeter that fails first. The architecture's job is to limit what that failure costs, which is what GuardDuty, EventBridge, Lambda containment, VPC Lattice, and CloudTrail Lake collectively did.



