When the LLM Is the Easy Part: AI for Legacy Banking Stacks

Mayowa A.

A senior architect at a regional Nigerian bank described the bank's core system this way: "We have an IBM mainframe running CICS, and a COBOL codebase that has been continuously deployed since 1989. The original engineer who wrote the savings module died in 2012. We have his binder."

The binder is real.

Substitute the institution and the date as needed. The shape of the conversation is the same in Stockholm as it is in Lagos as it is in Dallas. The bank has a core. The core is old. The core works. The core cannot break, because if it breaks, customers cannot withdraw money, and if customers cannot withdraw money the regulator opens an investigation. Therefore the core has a stability guarantee that the AI strategy has to defer to, completely.

This is the constraint that gets written out of most "enterprise AI" content. The constraint is the actual job.

The architecture nobody describes

The marketing diagram for an enterprise AI deployment has three boxes. User → Agent → Knowledge Base. There may be a fourth box for Tools.

The architecture diagram for an actual enterprise AI deployment in a bank has — at a conservative count — between twenty and thirty boxes. Most of them sit between the user and the model. None of them are the AI.

A working version of the diagram, sketched at the level of layers rather than systems:

  1. The customer touchpoint: usually a web or mobile app the bank already has, with established analytics, security review cadence, and accessibility audits
  2. An identity translation layer: converts the customer's session into an internal principal, with the AI system acting under a subset of that principal's permissions
  3. A policy enforcement point: the bank's existing rule engine, often homegrown, that says which queries are allowed for which principal
  4. An intent classification layer: a Haiku-tier model deciding whether the query is one the AI is allowed to answer, or one that has to go to a human
  5. The retrieval layer: a vector store with strict RBAC, with the index built from sources that are themselves three layers deep
  6. The model invocation: yes, finally, the LLM. Sonnet for the actual reasoning, maybe Opus for the edge cases
  7. The response policy layer: guardrails, PII filters, denied topics, contextual grounding
  8. An audit emission point: a structured event written to an object-locked bucket, replicated to the bank's existing audit ingestion
  9. The escalation surface: where the conversation goes when the agent lacks confidence or permission
  10. The reporting feedback loop: outcomes feeding the bank's existing customer-service metrics and the compliance team's regulatory reporting
  11. A regression-eval ingestion: production traffic flowing back into the evaluation suite with retention controls

That is eleven layers, and the model is only one of them. Below the model is where the real work lives.
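One way to picture the layered request path is as a chain of stages in which any stage can stop the request and send it to the escalation surface, while an audit entry is written on every path. The sketch below is illustrative only: the stage names, the `Request` shape, and the two toy rules are assumptions, not a description of any particular bank's stack.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Request:
    principal: str
    query: str
    audit: list = field(default_factory=list)  # names of stages that ran, in order


# A stage returns None to continue, or a string naming the escalation reason.
Stage = Callable[[Request], Optional[str]]


def policy_check(req: Request) -> Optional[str]:
    # Hypothetical rule: anonymous principals may not query at all.
    return "policy_denied" if req.principal == "anonymous" else None


def intent_check(req: Request) -> Optional[str]:
    # Stand-in for the small-model classifier; the keyword match is a placeholder.
    return "needs_human" if "dispute" in req.query.lower() else None


def run_pipeline(req: Request, stages: list[Stage]) -> str:
    outcome = "answered"
    for stage in stages:
        req.audit.append(stage.__name__)
        reason = stage(req)
        if reason is not None:
            outcome = f"escalated:{reason}"
            break  # later stages, including the model call, never run
    req.audit.append(f"outcome={outcome}")  # written on every path
    return outcome
```

The point of the shape is that the model invocation would be just one more stage in the list, and a denied or escalated request never reaches it.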

Below the model

The retrieval layer (#5 in the diagram above) is where every architecture review actually stops moving.

The vector store needs an embedding source. The embedding source is built from the bank's institutional knowledge: account policies, AML guidelines, fee schedules, regulatory filings, internal SOPs, the customer service knowledge base, last quarter's product update memos. These exist in nine different systems and seventeen different formats.

A non-exhaustive catalogue, drawn from the kinds of stacks we see in this work:

The IBM DB2 core

The transactional record sits in DB2 on z/OS. The schema is a thirty-year-old artefact of someone's RPG report layout. Columns are six characters long because of fixed-width terminal constraints from 1991. The same field stores three different semantic meanings depending on the value of an adjacent flag column. EBCDIC encoding throughout. Date fields are eight-character YYYYMMDD strings, except in the savings module where they are YYDDD Julian dates because the savings team had a specific reason in 1994 nobody now remembers.
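As a rough illustration of what a per-field decode might look like, the sketch below converts an EBCDIC byte field to text and parses the two date layouts described above. The code page (`cp037`) is an assumption and would need to be confirmed against the actual system; dispatching on string length is also a simplification, since in practice the layout would be chosen per module or per column.

```python
from datetime import date, datetime

# Assumption: US/Canada EBCDIC. The real code page must be confirmed per system.
EBCDIC_CODEC = "cp037"


def decode_field(raw: bytes) -> str:
    """Decode a fixed-width EBCDIC field and strip its space padding."""
    return raw.decode(EBCDIC_CODEC).strip()


def parse_core_date(text: str) -> date:
    """Parse an 8-character YYYYMMDD date or a 5-character YYDDD ordinal date."""
    if len(text) == 8:
        return datetime.strptime(text, "%Y%m%d").date()
    if len(text) == 5:
        # %y applies Python's two-digit-year pivot (69-99 -> 19xx, 00-68 -> 20xx).
        # Whether that pivot matches the savings module's intent is an assumption.
        return datetime.strptime(text, "%y%j").date()
    raise ValueError(f"unrecognised date layout: {text!r}")
```

The two-digit-year pivot is exactly the kind of rule that has to come from the binder rather than from the library default.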

Getting a clean text feed out of this requires a change-data-capture stream — typically IBM CDC or a third-party tool wired to the DB2 logs — translating to UTF-8, denormalising on the way out, and applying the semantic disambiguation that the binder describes for each field. The CDC pipeline is the most sensitive integration in the entire AI project. If it falls behind the source database, the AI starts answering questions with stale data. If it falls too far behind, regulators notice.
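A minimal sketch of how a pipeline might turn replication lag into an explicit serving decision, so that staleness is a state the AI layer checks rather than something discovered later. The two thresholds and the three state names are hypothetical; real values would come from the bank's own freshness requirements.

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(minutes=2)   # hypothetical: answer, but flag the data as stale
HALT_AFTER = timedelta(minutes=15)   # hypothetical: stop answering from this source


def classify_lag(last_applied_commit: datetime, now: datetime) -> str:
    """Classify CDC lag from the source commit time of the last applied change."""
    lag = now - last_applied_commit
    if lag >= HALT_AFTER:
        return "halt"
    if lag >= STALE_AFTER:
        return "stale"
    return "fresh"


if __name__ == "__main__":
    now = datetime(2024, 1, 31, 23, 30, tzinfo=timezone.utc)
    print(classify_lag(now - timedelta(minutes=6), now))
```

Under these assumed thresholds, a six-minute lag lands in the "stale" band: the system can still respond, but the response policy layer knows to say so or to escalate.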

The SharePoint policy library

The bank's policies, SOPs, and internal memos live in SharePoint, where they have accumulated three years of orphaned permissions, two parallel taxonomies (one from the 2023 reorg, one from before it), and a folder called "DO NOT EDIT - 2019 LEGAL ARCHIVE" that nobody is sure whether to index.

The ingestion has to honour the existing SharePoint permissions — meaning the vector store needs metadata reflecting every document's effective ACL at the time of indexing, and the ACL has to be refreshed on every change, propagated through the embedding pipeline. The full architecture for this kind of RBAC mapping is in Designing Strict RBAC for Enterprise Knowledge Bases.
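To make the ACL requirement concrete, here is a small sketch of a retrieval-time filter. It assumes each indexed chunk carries the groups allowed to read it plus a version number for the ACL snapshot taken at indexing time; the `Chunk` fields and the version-map lookup are hypothetical, not SharePoint or vector-store APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    doc_id: str
    text: str
    allowed_groups: frozenset  # effective ACL captured at indexing time
    acl_version: int           # version of the ACL snapshot that was indexed


def visible_chunks(candidates, user_groups, current_acl_versions):
    """Keep chunks the user may read and whose ACL snapshot is still current."""
    groups = frozenset(user_groups)
    visible = []
    for chunk in candidates:
        if current_acl_versions.get(chunk.doc_id) != chunk.acl_version:
            continue  # ACL changed since indexing: fail closed until re-indexed
        if chunk.allowed_groups & groups:
            visible.append(chunk)
    return visible
```

Failing closed on a version mismatch trades some recall for safety: a document whose permissions changed is invisible until the pipeline has caught up, rather than visible under the old ACL.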

The Oracle Forms application

The savings product configuration lives in an Oracle Forms application that has not had a maintainer since 2018. The only documented access is the green-screen Forms UI. There is no public API.

Two options. Option A: screen-scrape the Forms UI from an automated session, parse the layout, extract the configuration. Brittle, slow, and the security team will block it. Option B: have the database team write a daily extract to flat files on an SFTP location the AI ingestion can pull from. Slower to implement, requires a meeting with the database team, but it is the version that survives an audit. Always Option B.
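For Option B, the ingestion side can be as small as a parser that refuses incomplete files. The sketch below assumes a pipe-delimited extract with a header row and a final `TRAILER|<row count>` line; that layout and the column names in the usage example are hypothetical and would be agreed with the database team.

```python
import csv
import io


def parse_extract(text: str) -> list[dict]:
    """Parse a pipe-delimited extract and verify its trailer row count."""
    lines = text.strip().splitlines()
    if len(lines) < 2 or not lines[-1].startswith("TRAILER|"):
        raise ValueError("extract is missing its trailer record")
    expected = int(lines[-1].split("|", 1)[1])
    # Everything before the trailer is a header row followed by data rows.
    reader = csv.DictReader(io.StringIO("\n".join(lines[:-1])), delimiter="|")
    rows = list(reader)
    if len(rows) != expected:
        raise ValueError(f"expected {expected} rows, got {len(rows)}")
    return rows


if __name__ == "__main__":
    sample = "product_code|min_balance\nSAV01|1000\nSAV02|5000\nTRAILER|2\n"
    for row in parse_extract(sample):
        print(row["product_code"], row["min_balance"])
```

A trailer check is a cheap way to detect a file that was picked up mid-transfer, which is one of the more common failure modes for SFTP drops.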

The Excel canonical record

There is always at least one critical reference in an Excel file. In Nigerian banks it is often the fee schedule for the latest product. In European retail banks it is often the interest rate sheet. In credit unions in North America it is often the membership eligibility matrix.

The Excel is canonical because the compliance team treats it as canonical. The Excel has merged cells, three sheets, a tab called "Final FINAL v3", and a formula referencing a defined name that breaks when the file is opened in LibreOffice. The ingestion has to handle this exactly as the compliance team produces it, on the monthly cadence they update it, with the manual upload step they expect.
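One small piece of handling the workbook as delivered is dealing with vertically merged cells, which spreadsheet readers commonly surface as a value in the first row and blanks beneath it. The sketch below forward-fills blanks in named columns only; it assumes the rows have already been read into plain lists by whatever library the project uses, and that the merged columns are known in advance.

```python
def fill_merged(rows, columns):
    """Forward-fill blank cells in the given column indexes.

    Only the listed columns are touched, so a genuinely empty cell elsewhere
    (for example, a rate that has not been set) is preserved as-is.
    """
    last_seen = {}
    filled = []
    for row in rows:
        new_row = list(row)  # copy so the caller's rows are not mutated
        for i in columns:
            if new_row[i] in (None, ""):
                new_row[i] = last_seen.get(i)
            else:
                last_seen[i] = new_row[i]
        filled.append(new_row)
    return filled
```

Restricting the fill to explicit columns is a deliberate choice: a blanket forward-fill would invent values in cells the compliance team left empty on purpose.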

We do not change the compliance team's workflow. We build around it.

The translation layer

Sit with those four sources for a moment. They share no schema. They share no encoding. They share no update cadence. They share no security model.

The translation layer is what connects all of them to the AI system, and the design of the translation layer is most of the engineering work in the project.

The pattern that survives audit:

  • A canonical event format every ingestion source eventually emits to — typically a structured envelope with the source identifier, timestamp, principal-or-system-of-record, payload, and a hash for idempotency
  • Per-source adapters — small, focused programs that handle one source each, written by engineers who understand that source's specific quirks
  • A central enrichment step that applies semantic disambiguation (the binder's rules) and propagates the ACL metadata
  • A vector-store sink that respects the ACL and indexes the enriched payload, with re-embedding triggered on configuration changes

The model never sees the raw DB2 column. It sees the canonical event. The audit trail records the chain of transformations the event passed through, so when a regulator's question comes in six months later, the answer is traceable back to the source row in the source system on the source date.
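A minimal sketch of the canonical envelope described above, with a content hash for idempotency and a lineage list for the audit trail. The field names and the choice of SHA-256 over canonicalised JSON are assumptions made for illustration; the source identifiers in the usage example are hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class CanonicalEvent:
    source: str             # which adapter emitted the event
    source_key: str         # row or document identifier in the source system
    timestamp: str          # source-side timestamp, ISO 8601
    system_of_record: str
    payload: dict
    lineage: list = field(default_factory=list)  # transformations applied so far

    def idempotency_hash(self) -> str:
        """Hash the identifying content; lineage is excluded so replays match."""
        body = json.dumps(
            {
                "source": self.source,
                "source_key": self.source_key,
                "timestamp": self.timestamp,
                "payload": self.payload,
            },
            sort_keys=True,
            separators=(",", ":"),
        )
        return hashlib.sha256(body.encode("utf-8")).hexdigest()

    def record_step(self, step: str) -> None:
        """Append a transformation name so the chain can be replayed in an audit."""
        self.lineage.append(step)


if __name__ == "__main__":
    event = CanonicalEvent(
        "db2-core", "ACCT:123", "2024-01-31T00:00:00Z", "DB2", {"balance": "100.00"}
    )
    event.record_step("decode_ebcdic")
    event.record_step("disambiguate_flags")
    print(event.idempotency_hash(), event.lineage)
```

Keeping the lineage out of the hash means the same source row processed twice yields the same key, so the sink can drop the duplicate while the audit trail still records both passes.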

This is the part of enterprise AI that nobody puts on a slide. It is also the part that, if missing, guarantees the project will fail the security review and slip a quarter.

What about the model

The model is — and this is the unsettling part for anyone whose career identity is wrapped up in model engineering — a procurement decision.

In our practice we default to Claude on Bedrock — Sonnet for the reasoning, Haiku for the routing, Opus only for the edges. We cover the full cascade pattern in Multi-Model AI on Amazon Bedrock. The choice is defensible, reproducible, and supported by an evaluation harness that catches regression on every model-version change.
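The cascade can be sketched as a pure routing decision made after the small model has classified the query. Everything specific here is an assumption for illustration: the intent labels, the confidence threshold, and the use of tier names rather than real model identifiers. No model API is called.

```python
MIN_CONFIDENCE = 0.8  # hypothetical threshold below which a human takes over


def choose_tier(intent: str, confidence: float,
                allowed_intents: set, edge_intents: set) -> str:
    """Pick where a classified query goes: a human, the default tier, or the top tier."""
    if intent not in allowed_intents or confidence < MIN_CONFIDENCE:
        return "human"
    if intent in edge_intents:
        return "opus"
    return "sonnet"


if __name__ == "__main__":
    allowed = {"fee_lookup", "cross_border_fee"}
    edges = {"cross_border_fee"}
    print(choose_tier("fee_lookup", 0.95, allowed, edges))
    print(choose_tier("cross_border_fee", 0.90, allowed, edges))
    print(choose_tier("account_closure", 0.99, allowed, edges))
```

Keeping the routing rule as plain data plus a small function makes it easy to include in the evaluation harness, so a change in classifier behaviour shows up as a change in routing counts.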

But it is one procurement decision in a project with two thousand decisions. The other one thousand nine hundred and ninety-nine are the architecture above, the integration surfaces from Glue Code Is the Job, and the operational discipline from Mitigating Non-Deterministic AI Failures in Production Systems.

The reason most enterprise AI deployments fail is not that the model was wrong. It is that the architecture treated the model as the project, and the project as the model.

What this teaches us about enterprise scaling

If your AI strategy presentation has three boxes in the architecture diagram and one of them is the model, the strategy is unready. The strategy is ready when the diagram has twenty boxes, the model is one of them, and someone in the room can describe what happens to each of the other nineteen when DB2 falls behind by six minutes during a month-end batch run.

The engineers who can hold that whole picture in their head — who can write a CDC adapter for the savings module on Tuesday, debug the SharePoint ACL propagation on Wednesday, and still convince the platform team on Thursday that the AI layer will not touch their batch window — are the rarest profile in the market.

This is also, not coincidentally, the engineer Tier 1 banks are willing to pay senior-staff compensation for in 2026.

How to engage

We do this work — the legacy integration, the CDC pipelines, the SharePoint ACL propagation, the audit-grade translation layer, the AI architecture that sits on top of all of it without breaking the batch window. Talk to us at creativeminds.dev/contact.
