If an examiner asks you to demonstrate BCBS 239 compliance, you need numbers. Not aspirational maturity scores on a slide deck, but specific, derivable metrics grounded in what supervisors actually test. This page is that reference.
For the regulation itself, see What Is BCBS 239?.
The four areas examiners focus on
KPMG's Regulatory Insights identifies four areas of heightened supervisory focus that map directly to what OCC examiners evaluate under the Heightened Standards (12 CFR 30, Appendix D, §II.J) and what the ECB examines under the RDARR Guide:
- Governance: board and senior management involvement, three lines of defense, policies and controls
- Data universe and tiering: scope and classification of data covered by RDARR
- Data lineage: traceability from outputs back to authoritative sources and systems of origin
- Data management and quality: standardized controls, accuracy measurement, issue management

Each area has specific metrics supervisors expect. The rest of this page walks through them.
Critical data elements: where everything starts
Critical data elements are the fields that matter. Monocle's Catherine Horne defines a CDE as any data element where "if you removed it, the accuracy, integrity, and functionality of the calculation or process relying on that data would be compromised." That's a useful definition, but it's abstract until you see what it looks like in practice.
What CDEs actually look like
A CDE isn't "credit risk data." It's a specific field in a specific system that feeds a specific calculation. Consider what your examiner would ask about:
Credit risk CDEs include fields like counterparty legal entity identifier (LEI), internal credit rating, probability of default (PD), loss given default (LGD), exposure at default (EAD), facility limit, drawn amount, collateral value, collateral type, maturity date, and industry classification code. These flow from loan origination and booking systems through credit risk engines into capital calculations and regulatory submissions like FR Y-14Q. If the internal rating in your booking system doesn't match the rating in your capital model's input table, you have a CDE integrity problem, and the reconciliation break will show up in examiner testing.
Market risk CDEs include trade notional amount, mark-to-market value, instrument type, underlying asset identifier, counterparty identifier, trade date, maturity or expiry date, currency, and position quantity. These originate in front-office trading systems and flow through pricing and valuation services into VaR models, stress testing, and regulatory reports. The JPMorgan surveillance finding, where the OCC cited failure to surveil "billions of instances of trading activity on at least 30 global trading venues," was fundamentally a CDE completeness problem. The bank couldn't aggregate trade-level data across venues.
Liquidity risk CDEs include cash flow amounts, cash flow dates, counterparty type, product type, deposit type and tenor, collateral eligibility flags, encumbrance status, and currency. These feed LCR and NSFR calculations. When a bank can't produce same-day liquidity coverage ratios during stress, the root cause is usually that these fields arrive too late, are inconsistently defined across entities, or depend on manual consolidation from subsidiary systems.
Operational risk CDEs include loss event amounts, loss event dates, business line classification, risk category, and recovery amounts. These originate in incident management and loss event databases, but the classifications often get assigned manually, making them particularly vulnerable to inconsistency across legal entities.
The common pattern: CDEs are produced by source systems (booking engines, trading platforms, treasury management systems, incident databases), transformed through data pipelines, and consumed by risk models and regulatory reports. Governance attaches to each field individually. You need to know who owns it, where it originates, what quality controls apply, and how it moves. That's why CDE metrics are the foundation of everything that follows.
CDE inventory completeness
This is the percentage of in-scope risk reports and regulatory submissions for which CDEs have been formally identified. Your numerator is the count of reports with CDEs defined. Your denominator is the total in-scope reports. "In-scope" means whatever your RDARR framework covers, and the BCBS 239 text expects that to include all material risk areas (Principle 8, p. 17). The OCC expects coverage "appropriate for the size, complexity, and risk profile of the covered bank" (12 CFR 30, Appendix D).
The target is 100%. In practice, examiners probe whether your scope definition is adequate. Narrowing scope to avoid hard-to-reach reports is a pattern they've seen before.
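Mechanically, the metric is a straight ratio over the report inventory. A minimal sketch follows; the report names and the `cdes_defined` flag are illustrative, not drawn from any real system:

```python
# Hypothetical in-scope report inventory; names and flags are invented.
in_scope_reports = {
    "FR Y-14Q credit":    {"cdes_defined": True},
    "LCR daily":          {"cdes_defined": True},
    "VaR backtest":       {"cdes_defined": False},
    "Loss event summary": {"cdes_defined": True},
}

def cde_inventory_completeness(reports: dict) -> float:
    """Percentage of in-scope reports with CDEs formally identified."""
    defined = sum(1 for r in reports.values() if r["cdes_defined"])
    return 100.0 * defined / len(reports)

print(f"{cde_inventory_completeness(in_scope_reports):.1f}%")  # → 75.0%
```

The denominator is the lever examiners probe: dropping "VaR backtest" from the dictionary would raise the score to 100% without improving anything, which is exactly the scope-narrowing pattern described above.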
CDE ownership and metadata
Every CDE needs a named business owner accountable for quality, per the ECB RDARR Guide (p. 10). Business owner, not IT owner. Alongside ownership, each CDE should have a business definition in the data glossary, technical metadata (type, format, allowed values), and a documented authoritative source.
That last element is where things get interesting. If your metadata says "authoritative source: `warehouse.credit_exposure.counterparty_id`," you've documented the first warehouse table, not the actual system of origin. The authoritative source for a counterparty ID is the booking system or customer master that created it. Tracing to the system of origin requires lineage that extends past the data platform, into the source applications. This is the difference between knowing where data enters your warehouse and knowing where data enters your institution.
Data quality dimensions
These are the metrics examiners test against Principles 3 through 6. They expect measurement across standard dimensions, with thresholds calibrated to regulatory expectations.
Accuracy
Accuracy measures how closely CDE values at consumption points (risk reports, regulatory submissions) match values at the authoritative source. You derive it by comparing values and expressing the result as a percentage of records within tolerance.
The BCBS 239 text requires that data be "reconciled and validated" (Principle 7, p. 16). The ECB's Data Quality Index (DQI) computes accuracy from actual supervisory submission data, not from internal self-reported scores. OCC examiners test against what the bank delivers under both normal and stress conditions.
Accuracy problems often originate at the boundary between source systems and the data platform. A service renames a field, changes a data type, or starts emitting null values for a previously required attribute. If your quality monitoring only runs on warehouse data, you detect this as a downstream anomaly and investigate manually. If you have lineage into the source application code, you can detect that schema change at deploy time, before it propagates. That's the difference between reactive accuracy management and preventive accuracy management, and for CDEs like PD or EAD that directly feed capital calculations, early detection matters.
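The source-versus-consumption comparison described above can be sketched as a per-record tolerance check. Counterparty IDs, values, and the 0.5% relative tolerance below are all assumptions for illustration; real tolerances are set per CDE:

```python
def accuracy_pct(source: dict, report: dict, tol: float = 0.005) -> float:
    """Share of authoritative-source records whose report value falls
    within relative tolerance. Records missing from the report count
    as failures, not gaps."""
    matched = 0
    for key, src_val in source.items():
        rep_val = report.get(key)
        if rep_val is not None and abs(rep_val - src_val) <= tol * abs(src_val):
            matched += 1
    return 100.0 * matched / len(source)

# Invented EAD values: source system vs. capital-model input table.
source_ead = {"cp-001": 1_000_000.0, "cp-002": 250_000.0, "cp-003": 90_000.0}
report_ead = {"cp-001": 1_000_000.0, "cp-002": 251_000.0, "cp-003": 95_000.0}
print(f"{accuracy_pct(source_ead, report_ead):.1f}%")  # → 66.7%
```

Note the design choice: comparing against the authoritative source, not against another downstream copy. Two warehouse tables can agree perfectly and both be wrong.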
Completeness
Completeness measures the percentage of required CDE fields that are populated with valid values. It applies both to individual fields and to entity coverage: the BCBS 239 text requires banks to "capture and aggregate all material risk data across the banking group" (Principle 4, p. 13), which means all material legal entities and business lines.
The ECB's 2025 newsletter found banks using "unclear criteria for defining material legal entities." Examiners probe both the metric itself and the scope definition underneath it. If your completeness score looks strong because you narrowed the entity scope, that's not completeness.
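At field level, the computation is populated-and-valid cells over required cells. A minimal sketch, with an invented required-field list and invented records:

```python
# Required CDE fields for a hypothetical credit-risk record.
REQUIRED = ("lei", "internal_rating", "pd", "ead")

def completeness_pct(records: list[dict]) -> float:
    """Share of required CDE fields populated (non-null, non-empty)
    across all records."""
    total = len(records) * len(REQUIRED)
    filled = sum(
        1 for rec in records for f in REQUIRED
        if rec.get(f) not in (None, "")
    )
    return 100.0 * filled / total

records = [
    {"lei": "5493001KJTIIGC8Y1R12", "internal_rating": "BB+", "pd": 0.021, "ead": 1.2e6},
    {"lei": "", "internal_rating": "A-", "pd": None, "ead": 4.0e5},  # two gaps
]
print(f"{completeness_pct(records):.1f}%")  # → 75.0%
```

The entity-coverage dimension works the same way but with a denominator of material legal entities, which is exactly where the ECB found banks using unclear criteria.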
Timeliness
Timeliness measures whether data aggregation and delivery meet defined SLAs. For each data feed or aggregation job, compare the actual delivery timestamp to the SLA deadline. Report both the percentage meeting SLA and the distribution of delivery times.
The BCBS 239 text requires banks to "generate aggregate and up-to-date risk data in a timely manner" (Principle 5, p. 14), and the standard explicitly applies this to stress conditions: architecture must support aggregation "not only in normal times but also during times of stress/crisis" (Principle 2, p. 9). BAU timeliness is the easy part. The examiner question is whether you can meet those SLAs when markets are moving fast and the board needs numbers within hours, not days.
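The SLA comparison can be sketched as timestamp pairs per feed. The cutoff time, SLA windows, and delivery times below are illustrative:

```python
from datetime import datetime, timedelta

def timeliness(deliveries: list[tuple[datetime, datetime]]):
    """Return (pct meeting SLA, worst miss) for (actual, deadline) pairs."""
    on_time = sum(1 for actual, deadline in deliveries if actual <= deadline)
    worst = max(actual - deadline for actual, deadline in deliveries)
    return 100.0 * on_time / len(deliveries), worst

# Hypothetical feeds with a one-hour SLA after the 18:00 data cutoff.
cutoff = datetime(2025, 6, 30, 18, 0)
feeds = [
    (cutoff + timedelta(minutes=40), cutoff + timedelta(hours=1)),  # on time
    (cutoff + timedelta(hours=3),    cutoff + timedelta(hours=1)),  # 2h late
    (cutoff + timedelta(minutes=55), cutoff + timedelta(hours=1)),  # on time
]
pct, worst = timeliness(feeds)
print(f"{pct:.1f}% on SLA, worst miss {worst}")
```

Reporting the worst miss alongside the percentage matters: a 95% on-SLA rate hides nothing in BAU, but the tail of the distribution is what fails first under stress.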
Consistency
Consistency measures whether the same CDE carries the same value across systems that should agree. Compare counterparty exposure in the risk data warehouse against the general ledger against the regulatory reporting system. Report the percentage of records that reconcile within tolerance.
Consistency breaks are among the hardest DQ issues to diagnose because the divergence can originate anywhere in the data path. If counterparty exposure differs between two reports, you need to trace each value back to understand where they diverge. Lineage that covers the full path, from source system through transformations to final output, makes root-cause analysis tractable. Without the source-system leg, you can only trace divergence within the data platform, and the root cause may not be there.
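The three-way reconciliation described above reduces to a max-minus-min check per key. A sketch under assumed data; the flat absolute tolerance is for illustration only, since real reconciliations set tolerances per CDE:

```python
def consistency_pct(systems: dict[str, dict], tol: float = 0.01) -> float:
    """Share of keys present in every system whose values agree
    within absolute tolerance `tol`."""
    keys = set.intersection(*(set(v) for v in systems.values()))
    ok = sum(
        1 for k in keys
        if max(s[k] for s in systems.values())
           - min(s[k] for s in systems.values()) <= tol
    )
    return 100.0 * ok / len(keys)

# Invented counterparty exposures (millions) across three systems.
exposure = {
    "risk_warehouse": {"cp-001": 100.00, "cp-002": 55.25},
    "general_ledger": {"cp-001": 100.00, "cp-002": 55.80},
    "reg_reporting":  {"cp-001": 100.00, "cp-002": 55.25},
}
print(f"{consistency_pct(exposure):.1f}%")  # → 50.0%
```

The metric tells you *that* cp-002 breaks; only lineage over both paths tells you *where* the 0.55 divergence was introduced.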
Automation and manual workarounds
The BCBS 239 text sets a clear expectation: "Data should be aggregated on a largely automated basis so as to minimize the probability of errors" (Principle 3, p. 12). KPMG interprets "largely" as end-user and desktop computing "reduced to absolute minimum."
Automation rate
Map the end-to-end data flow for each in-scope risk report. For each step, classify it as automated or manual. Express as automated steps divided by total steps. Be honest about what "automated" means. A scheduled SQL query that requires a manual download, pivot in Excel, and paste into a template is not automated. A pipeline that is 95% automated but has a material manual adjustment in the last mile is effectively manual for supervisory purposes.
True automation of the first mile, from source system to data platform, requires knowing what source systems emit and when their schemas change. Without that knowledge, "automation" is fragile. It works until a source system deploys a change nobody communicated, and then a downstream team compensates with a manual fix, and you've just created another workaround.
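One way to make the "be honest" point concrete is to classify each step and flag material manual ones separately. The flow and step names below are invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class Step:
    name: str
    automated: bool
    touches_cde: bool = False  # materiality flag

# Hypothetical flow for one risk report.
flow = [
    Step("extract from booking system", automated=True),
    Step("load to warehouse",           automated=True),
    Step("aggregate exposures",         automated=True),
    Step("download + Excel pivot",      automated=False, touches_cde=True),
    Step("paste into template",         automated=False),
]

automation_rate = 100.0 * sum(s.automated for s in flow) / len(flow)
material_manual = [s.name for s in flow if not s.automated and s.touches_cde]
print(f"{automation_rate:.0f}% automated; material manual steps: {material_manual}")
```

The headline number here is 60%, but the supervisory takeaway is the non-empty `material_manual` list: a manual step that touches a CDE makes the whole path effectively manual.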
Manual workaround inventory
Inventory all points where humans intervene in data aggregation or reporting: data corrections, manual reconciliations, spreadsheet transformations, email-based data transfers. For each, document the materiality (does it affect a CDE?), frequency, and compensating controls.
The ECB's 2025 newsletter found "banks too reliant on weakly controlled manual workarounds." Examiners want to see this inventory shrinking over time, with a credible automation plan. Many of these workarounds exist because source system changes weren't communicated to downstream consumers. A field gets renamed, a service starts emitting data in a different format, a new entity produces records that don't match the expected schema. Data contracts enforced at the source system, validated at deploy time, reduce the conditions that create workarounds in the first place.
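The inventory fields named above (materiality, frequency, compensating controls) can be captured as a simple record type. Entries below are invented examples:

```python
from dataclasses import dataclass, field

@dataclass
class Workaround:
    """One entry in the manual-workaround inventory."""
    description: str
    affects_cde: bool                 # materiality: does it touch a CDE?
    frequency: str                    # e.g. "daily", "monthly"
    compensating_controls: list[str] = field(default_factory=list)

inventory = [
    Workaround("Manual FX rate override in LCR feed", True, "daily",
               ["four-eyes review", "monthly reconciliation"]),
    Workaround("Email transfer of subsidiary loss events", True, "monthly"),
]

# The number examiners watch: material workarounds with no compensating control.
uncontrolled = sum(
    1 for w in inventory if w.affects_cde and not w.compensating_controls
)
print(uncontrolled)  # → 1
```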
Lineage metrics
The ECB's RDARR Guide leaves no ambiguity:
"A complete and granular lineage needs to be ensured." (p. 16)
"A common data architecture supports the creation of end-to-end data lineage." (p. 17)
KPMG's Heightened Standards analysis defines what US examiners look for: the "ability to trace and report on the relationship between data outputs and business processes, authoritative sources, systems of record, and systems of origin." That phrase, systems of origin, is key. It means lineage doesn't end at the first table in your data warehouse.
Coverage
Lineage coverage is the percentage of in-scope risk reports with documented lineage. "Documented" means it exists in a queryable system, not a Confluence page last updated in 2022. The target is 100% of in-scope reports, but coverage alone is insufficient.
Granularity
This is where most banks overstate their position. Many report high lineage coverage numbers, but at process or table level. When an examiner asks to trace a specific regulatory datapoint back to its source, process-level lineage ("system A feeds system B") can't answer the question. The ECB requires lineage at "data attribute level" (p. 16). A coverage metric of 90% at table level is not equivalent to 90% at attribute level. Be specific about what your number actually means.
End-to-end completeness
For each lineage path, determine whether it covers the full chain: (1) source system to data platform ingestion, (2) transformations within the data platform, (3) data platform to final risk report. A path covering only segments 2 and 3 is incomplete.
The ECB's 2025 newsletter found that "many frameworks do not cover the entire data aggregation process from data capture to final reporting." This is the gap, and it maps directly to the three-tier lineage taxonomy. Database-level column lineage tools (like Atlan) parse SQL, dbt, and Spark transformations within the data platform. They cover segments 2 and 3. Source-code-level lineage tools (like Gable) parse the application code in source systems: Java, Kotlin, Go, Python. They cover segment 1. Together, they produce genuine end-to-end lineage. Separately, each leaves a gap.
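The three-segment test reduces to a superset check per lineage path. The report names and segment coverage below are invented; the segment taxonomy follows the three tiers above:

```python
# The three segments a complete lineage path must cover.
SEGMENTS = ("source_to_platform", "platform_transforms", "platform_to_report")

def end_to_end_pct(paths: dict[str, set[str]]) -> float:
    """Share of lineage paths covering all three segments."""
    complete = sum(1 for segs in paths.values() if segs >= set(SEGMENTS))
    return 100.0 * complete / len(paths)

paths = {
    "FR Y-14Q credit": {"source_to_platform", "platform_transforms",
                        "platform_to_report"},
    "LCR daily":       {"platform_transforms", "platform_to_report"},  # seg 1 missing
    "VaR report":      {"platform_transforms", "platform_to_report"},
}
print(f"{end_to_end_pct(paths):.1f}%")  # → 33.3%
```

Reporting this alongside raw coverage prevents the overstatement described under Granularity: two of the three paths here would count toward a naive coverage metric while still missing the source-system leg.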
Reporting capacity
Report production time
Measure elapsed time from data cutoff to final risk report delivery. Track as a time series per report. Increasing production time often signals growing manual intervention or data quality issues that require investigation during the reporting cycle.
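A minimal trend check on that time series might compare the recent half of cycles to the earlier half. The elapsed hours and the 10% deterioration threshold are illustrative assumptions:

```python
# Hours from data cutoff to final delivery, one value per reporting cycle.
production_hours = [6.0, 6.5, 7.0, 8.5, 9.0, 11.0]

half = len(production_hours) // 2
early = sum(production_hours[:half]) / half
recent = sum(production_hours[half:]) / half
deteriorating = recent > early * 1.1  # flag if >10% slower than before
print(f"early avg {early:.1f}h, recent avg {recent:.1f}h, "
      f"deteriorating={deteriorating}")
```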
Ad hoc reporting capacity
This is the metric that separates banks with genuine aggregation capability from those that have automated their existing reports but can't flex. The BCBS 239 text explicitly requires banks to handle "on-demand, ad hoc risk management reporting requests, including requests during stress/crisis situations" (Principle 6, p. 15).
Measure it with periodic fire drills. Request a non-standard aggregation: "total exposure to counterparties in the energy sector across all entities, broken down by product type, as of yesterday's close." Time how long it takes to produce and validate. If the answer involves a multi-day manual effort, your aggregation infrastructure has gaps.
Answering that kind of question requires knowing what data is available, where it comes from, and how it flows. Which source systems produce sector classifications? How does counterparty data flow into the warehouse? What transformations apply? End-to-end lineage, including the source-system leg, makes these questions answerable without engineers who happen to know the system. That matters most under stress, when those engineers might be working on something else.
Reconciliation break rate
The percentage of reconciliation items with unresolved discrepancies, measured alongside resolution time. When two reports disagree on a value, attribute-level lineage lets you trace both paths and identify where the divergence occurred. Whether the break originated at the source or was introduced during transformation matters for remediation, and lineage that extends to the source system gives you that answer.
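Both halves of the metric, unresolved share and resolution time, fall out of one list of break records. Dates below are invented:

```python
from datetime import date

# (opened, resolved) per reconciliation break; None means still open.
breaks = [
    (date(2025, 5, 1),  date(2025, 5, 3)),
    (date(2025, 5, 10), None),
    (date(2025, 5, 12), date(2025, 5, 20)),
    (date(2025, 5, 15), None),
]

break_rate = 100.0 * sum(1 for _, r in breaks if r is None) / len(breaks)
resolution_days = [(r - o).days for o, r in breaks if r is not None]
avg_days = sum(resolution_days) / len(resolution_days)
print(f"{break_rate:.0f}% unresolved; avg resolution {avg_days:.1f} days")
```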
Governance and program metrics
These metrics evaluate whether your compliance program is credible, distinct from whether your data is good.
Gap analysis currency
When did your institution last complete a formal gap analysis against BCBS 239 principles (or equivalent domestic standards like OCC Heightened Standards §II.J)? The ECB's 2025 newsletter found "many banks do not provide proper recent or regular gap analyses comparing against BCBS 239." If your last gap analysis is more than 12 months old, it's stale. Regulatory expectations evolve. What passed in 2020 may not pass in 2026.
Remediation milestone credibility
Track the percentage of remediation milestones met on original schedule, without rebaselining. Every rebaseline signals something to examiners. The ECB found that "banks repeatedly fail to provide credible target end dates." Schedule credibility is a proxy for program control. If your remediation keeps slipping, examiners infer that either you don't understand the scope or you can't execute.
Data quality issue aging
Average time from data quality issue identification to resolution, segmented by severity. Track mean and P90 resolution time. Flag issues open beyond SLA. Persistent aging signals systemic problems, either in root-cause identification or in the ability to make fixes in source systems. When root-cause identification requires understanding where data originated and what changed, source-system visibility shortens investigation cycles.
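Mean and P90 can be computed from resolution times per severity bucket. A sketch using the nearest-rank P90 convention (a modeling choice, not a regulatory one), with invented data:

```python
import statistics

def aging_stats(resolution_days: list[float]) -> tuple[float, float]:
    """Mean and nearest-rank P90 resolution time in days."""
    ordered = sorted(resolution_days)
    p90_index = -(-len(ordered) * 90 // 100) - 1  # ceil(0.9 * n) - 1
    return statistics.mean(ordered), ordered[p90_index]

# Hypothetical resolution times (days) for one severity tier.
days = [1, 2, 2, 3, 4, 5, 6, 8, 15, 40]
mean, p90 = aging_stats(days)
print(f"mean {mean:.1f}d, P90 {p90}d")  # → mean 8.6d, P90 15d
```

Tracking P90 alongside the mean is the point: the 40-day outlier barely moves the mean but is exactly the kind of persistent issue that signals a root cause stuck in a source system.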
Control testing coverage and pass rate
The percentage of data quality controls tested in the current cycle, and the pass rate of those tests. The Forvis Mazars analysis recommends "robust data quality processes around source data before embarking on automation type projects." Testing controls at the source is where quality is determined, not at the reporting layer.
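Coverage and pass rate come from the same control register. The control names and cycle results below are invented:

```python
# Per control: (tested_this_cycle, passed). Data is illustrative.
controls = {
    "null-check: pd":            (True, True),
    "range-check: lgd in [0,1]": (True, False),
    "recon: ead vs GL":          (True, True),
    "schema-check: lei format":  (False, False),  # not tested this cycle
}

tested = [result for result in controls.values() if result[0]]
coverage = 100.0 * len(tested) / len(controls)
pass_rate = 100.0 * sum(1 for _, passed in tested if passed) / len(tested)
print(f"coverage {coverage:.0f}%, pass rate {pass_rate:.1f}%")
```

Keeping the two numbers separate matters: a high pass rate over a shrinking tested population is the same scope-narrowing pattern examiners probe elsewhere.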
Where source-code lineage genuinely helps
Not every metric benefits from lineage, and not every lineage gap requires source-code-level tooling.
The pattern: source-code lineage concentrates its value in traceability, prevention, and root-cause analysis. It doesn't change your governance structure, your program management, or your report design. Use it where it's real. Don't claim it where it isn't.