User Guide | CRA Analytics Data Tables
Board of Governors of the Federal Reserve System
May 2022
Table of Contents
Purpose
Main Data Sources
  CRA Small Business and Small Farm Data
  HMDA Data
  Performance Evaluation Data
Data Tables
  1. Retail Loan Table
  2. Performance Evaluation (PE) Table
  3. Merged Data Table
  4. Assessment Area Definition Table
Quality Control
Key Concepts and Variables of Note
  Low-and-Moderate-Income (LMI) Tracts and Borrowers
  Summary of Deposits Proxy (SoD_Proxy_AA_Flag)
  County Assessment Area Flag (County_AA_Flag)
  Partial Indicator (Partial_Ind)
Additional Data Sources
  National Information Center (NIC) Data
  Summary of Deposits (SoD)
  Census
  Vendor Data
Use Cases & Analytic Examples
  NPR Analytics Table
  Retail Loan Table Examples
  PE Table Example
  Merged Data Table Example
Purpose
The primary purpose of the data tables is to provide Community Reinvestment Act (CRA) stakeholders a better
understanding of the historical relationship between bank lending activity and the conclusions and ratings that
regulators assigned on CRA performance evaluations (PEs).
For the purposes of this document, (1) “ratings” refer to assessment area (AA) performance conclusions and
ratings; (2) AAs refer to areas that received a conclusion or rating as a part of the exam and are captured through
different AA types (see “assessment_area_type” in Data Dictionary); and (3) “CRA data” refers to data reported
under the CRA.
Main Data Sources
In order to gain a comprehensive view of bank lending activities and connect them to the assigned ratings, Home
Mortgage Disclosure Act (HMDA) and CRA loan data were combined with data collected from banks’ historical
CRA PEs.
The first release of the CRA Analytics Data Tables was in March 2020 and contained data from 2005–17. The data
have since been updated with a second release including data from 2018 and 2019. This guide discusses the
original data release (vintage 1) and the subsequent release (vintage 2) of the data tables.
Please note that there may be discrepancies between the CRA Analytics Data Tables and publicly available data on
the Federal Financial Institutions Examination Council (FFIEC) website because of resubmissions. The bank-
reported HMDA and CRA data included in these tables were frozen on 12/31/2020.
CRA Small Business and Small Farm Data
The census tract level CRA small business and small farm data are aggregated to the year, bank, and county level.
Bank-provided AA delineations are used to aggregate the data to the county level. The tables include the total
loan counts and dollar amounts for originations and purchases of small business and small farm lending by the
bank in the county with breakouts for tract income levels (low, moderate, middle, upper, and unknown) and
borrower income (gross annual revenue less than $1 million).
In some cases, banks claim a portion of a county, rather than an entire county, as being in an AA. In those cases,
the same data are provided as above for the portion of the county that was inside the bank’s AA. In addition to
the bank’s lending for a given year, bank, and county combination, the same loan data breakouts are provided as
above for all lending done by all banks in the county for the same year. Separate aggregates are provided for the
total county and the inside AA portion only, where applicable.
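The aggregation described above can be sketched as follows. This is only a minimal illustration, not the code used to build the tables: the function name, the row layout, and every key name (year, bank_id, county_fips, tract_income, in_aa, loan_count, loan_amount) are hypothetical.

```python
from collections import defaultdict


def aggregate_to_county(tract_rows):
    """Sum tract-level loan records to the (year, bank, county) level.

    Each row is a dict with hypothetical keys: year, bank_id, county_fips,
    tract_income ("low", "moderate", "middle", "upper", or "unknown"),
    in_aa (bool), loan_count, and loan_amount.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for row in tract_rows:
        key = (row["year"], row["bank_id"], row["county_fips"])
        bucket = totals[key]
        # Whole-county totals.
        bucket["count_total"] += row["loan_count"]
        bucket["amount_total"] += row["loan_amount"]
        # Breakout by tract income level.
        bucket["count_" + row["tract_income"]] += row["loan_count"]
        # Separate aggregate for the portion inside the bank's AA.
        if row["in_aa"]:
            bucket["count_inside_aa"] += row["loan_count"]
            bucket["amount_inside_aa"] += row["loan_amount"]
    return {key: dict(values) for key, values in totals.items()}
```

Running the same function over the records of all lenders in a county, with the bank identifier replaced by a constant, would give the all-bank aggregates under the same assumptions.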
The data tables use the post-publication version of the CRA data, which include additional updates to the yearly
publications currently available on the FFIEC website.
HMDA Data
The HMDA loan application register (LAR) data are aggregated to the year, bank, and county level. Where
applicable, these lending categories are further grouped into inside or outside of the banks’ AAs. This grouping
uses the AA delineation files submitted with the CRA data.
The data tables use the post-publication version of the HMDA data, which includes additional updates to the
yearly publications currently available on the FFIEC website. Row level (year, bank, county) loan data only include
depository institutions.
The CRA Analytics Data Tables utilize a subset of loans from the HMDA LARs. The way in which HMDA is reported
has changed over the years, and the filters used to aggregate the HMDA data in the Retail Loan Table over time
can be summarized as follows:
For HMDA LARs from 2005 through 2017, the following filters were used:
o Action Types of either 1 or 6
o Loan Purposes of 1, 2, or 3
o Lien Status not equal to 3
For HMDA LARs from 2018 through 2019, the following filters were used:
o Action Types of either 1 or 6
o Loan Purposes not equal to 4
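The filters above can be expressed as a single predicate. This is a minimal sketch that assumes each LAR record is available as a dict of numeric HMDA codes; the function name and the key names (year, action_type, loan_purpose, lien_status) are hypothetical and are not taken from the data tables.

```python
def keep_hmda_record(record):
    """Return True if a LAR record passes the filters listed above.

    `record` is a dict with hypothetical keys: year, action_type,
    loan_purpose, and (for 2005 through 2017) lien_status.
    """
    # Both periods keep only Action Types 1 and 6.
    if record["action_type"] not in (1, 6):
        return False
    year = record["year"]
    if 2005 <= year <= 2017:
        return record["loan_purpose"] in (1, 2, 3) and record["lien_status"] != 3
    if 2018 <= year <= 2019:
        return record["loan_purpose"] != 4
    # Years outside the coverage of the data tables are dropped.
    return False
```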
The ability to differentiate between open- and closed-end mortgage products was introduced to HMDA in 2018.
At this time, open-end and closed-end loans are combined in the loan counts and amounts in the Retail Loan
Table.
Performance Evaluation Data
To help fill in the data gaps identified by CRA reform efforts, staff collected a wide range of data elements from a
sample of CRA PEs. For each PE, staff collected information related to the product evaluation periods and the
ratings and conclusions the bank received in its assessment areas. For small and intermediate small banks,
additional information related to assessment area lending and the geographic boundaries of those assessment
areas was collected. These data were extracted from the PEs manually by staff over time to supplement the data
needs of CRA reform efforts.
Data Tables
The first vintage contained four primary tables: Retail Loan Table, Performance Evaluation (PE) Table, Merged
Data Table, and AA Definition Table. In the second vintage, the Retail Loan Table was broken out into eight
separate tables, primarily because of the addition of the breakouts for the individual census tract and borrower
income categories and mortgage loan products.
Definitions of variables across the primary tables are included in the table-specific Data Dictionary.
1. Retail Loan Table
The Retail Loan Table contains HMDA LAR and CRA small business and small farm loan data for banks and select
demographic data for the counties in which they lend, aggregated at the year, bank, and county level from 2005
through 2019. The table contains loan counts and amounts for originations and purchases, specified as inside or
outside assessment area values, and grouped by borrower and tract income categories. For HMDA, we also
provide breakouts for the various product types. Because of the expansive number of columns, the Retail Loan
Table is broken out into eight separate tables. For ease of use, a ninth table containing only the variables needed
to replicate the Retail Lending Test as described in the Notice of Proposed Rulemaking (NPR) is also provided.
Retail Loan Table Descriptions
1. “retail_loan_hmda_bank_total”: This table has the bank HMDA lending at the year, bank, county level with
the total loan counts and amounts for originations and purchases as well as breakouts for the income level
of the census tract in which the loan was made (low, moderate, middle, upper, and unknown), income level
of the borrower (low, moderate, middle, upper, and unknown), loan purpose (home purchase, home
refinance, home improvement), and each of the possible combinations of the census tract income level,
borrower income, and product type.
2. “retail_loan_hmda_bank_inside”: This table has the same loan data as the “retail_loan_hmda_bank_total”
table above but limited to include only counties, or partial county areas, that are part of the bank’s
assessment area.
3. “retail_loan_hmda_agg_total”: This table has the total aggregate HMDA lending at the year, bank, and
county level by all HMDA lenders in the county in which the bank reported loans. This table has the same
level of detail for loans as the “retail_loan_hmda_bank_total” table except that it includes all HMDA loans by
all HMDA lenders.
4. “retail_loan_hmda_agg_inside”: This table has the same loan data as the “retail_loan_hmda_agg_total”
table above but limited to include only counties, or partial county areas, that are part of the bank’s
assessment area.
5. “retail_loan_cra_all”: This table has the bank-reported CRA small business and small farm lending
aggregated at the year, bank, and county level for counts and amounts of originations and purchases with
breakouts for the census tract income (low, moderate, middle, upper, unknown) and borrower (gross annual
revenue less than $1 million) income categories. Separate breakouts are also provided for the bank’s total
lending in the county as well as the portion inside the assessment area only, where applicable.
6. “retail_loan_demo_all”: This table has the select demographic data at the year, bank, and county level for
each county in which a bank reported either HMDA or CRA loans. Separate breakouts are also provided for
the entire county as well as the portion inside the assessment area only, where applicable.
7. “retail_loan_bank_attributes”: This table has select bank level attributes for each year a bank submitted
either HMDA or CRA loan data.
8. “retail_loan_county_attributes”: This table has select bank and/or county level attributes for each year a
bank submitted either HMDA or CRA loan data. This table contains information such as the number of
branches and amounts of deposits the bank had in the county, CRA Assessment Area Number, and the
MSA/MD code of the county.
9. “retail_loan_lending_test”: This table has a subset of variables from each of the above eight tables and can
be used “out-of-the-box” to replicate a bank’s performance on the retail lending test as proposed in the
NPR.
A bank must be a HMDA reporter, a CRA reporter, or both to be included in the Retail Loan Table (figure 1).
Figure 1. Reporters included in the Retail Loan Table - Green, solid boxes indicate inclusion in the Retail Loan
Table, i.e., the bank must be either a HMDA reporter or a CRA reporter to be included in the Retail Loan Table. If
neither HMDA nor CRA reporter (patterned white box), then the bank’s data are not included in the Retail Loan
Table.
2. Performance Evaluation (PE) Table
The PE data have been hand-collected over time in separate vintages. The first vintage of PE data was collected
from a sample of over 6,000 CRA PEs over the 2005–17 period and consisted of a diverse sample of bank sizes and
business models. The second vintage consisted of an additional 600 PEs sampled from the 2018–19 period. The
PEs were selected using a stratified random sampling strategy (see Sampling Procedure).
PE Vintage 1
For each PE reviewed in the first vintage (2005–17), the following data were extracted, where available:
1. Evaluation periods for each product the examiner included as a part of the PE and a flag to denote when a
statistical sample of loans was used by the examiner.
2. Lending, community development, investment, and overall test ratings/conclusions for institution, state,
multistate metropolitan statistical area (MSA), MSAs, and non-MSA AAs.
For each rating/conclusion recorded, a set of “seed” geography codes were collected to help map
the rating/conclusion to the appropriate data in the retail loan tables.
3. Community development loan counts and amounts as well as investment amounts made in each assessment
area.
4. Information on loan production offices (LPOs), where available.
For banks that were not required to report CRA and/or HMDA data at the time of the exam, the following data
fields were collected:
1. County-level geographical boundaries of their AAs with a flag to denote when the county was a partial
county.
2. Where applicable, the counts of mortgage, small business, and small farm loans made in each assessment
area were collected.
Loan count data were collected from the Geographic and/or Borrower Distribution tables. In the first
vintage, low- and moderate-income loan counts were combined into one “LMI” value.
PE Vintage 2
There were several changes made to the data collection efforts for the second vintage of PE data (2018–19). In
addition to the data collected in the first vintage, the following data were also extracted:
1. Total inside and outside assessment area counts and amounts for each loan product reviewed as a part of
the performance evaluation.
2. Assessment area loan count data were collected from the Geographic and/or Borrower Distribution tables
using all of the available income breakouts (low, moderate, middle, upper, unknown, <=$1M GAR, >$1M
GAR).
3. Community development loans and investments made outside of an assessment area were recorded
separately from loans assigned to an individual assessment area.
The integration of these Retail Loan and PE data constitutes the Merged Data Table.
Sampling Procedure
For each vintage, the PEs were selected using a stratified random sample method. The first vintage of PEs was
drawn from evaluations conducted from 2005 to 2017, and the second vintage was from 2018 to 2019.
Evaluations cover bank performance over a period of years preceding the evaluation start date. The sample,
therefore, covers the market during a boom period, a severe recession, and then an extended recovery.
Sampling was stratified by the existing three bank asset threshold categories: small, intermediate small, and large;
the agency which performs the performance evaluation (Office of the Comptroller of the Currency (OCC), Federal
Deposit Insurance Corporation (FDIC), or the Federal Reserve); and whether or not the bank received a rating of
“Needs to Improve” or “Substantial Non-Compliance.” Historically, ratings below a “Satisfactory” have been
relatively uncommon. Staff, therefore, oversampled banks with “Needs to Improve” or “Substantial Non-
Compliance” ratings to obtain a sufficient sample size of these performances. PEs were grouped by year,
regulator, bank size, and the performance category described above. PEs were randomly selected from each
group for data collection.
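The grouping and random selection described above can be sketched as follows. The guide does not state the sampling fractions that were used, so `rate` and `low_rating_rate` are purely illustrative parameters, and the function name and key names (year, regulator, bank_size, low_rating) are hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(pes, rate, low_rating_rate, seed=0):
    """Draw a stratified random sample of PEs, oversampling low ratings.

    Each PE is a dict with hypothetical keys: year, regulator, bank_size,
    and low_rating (True for "Needs to Improve" or "Substantial
    Non-Compliance"). The two rates are assumed sampling fractions.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for pe in pes:
        key = (pe["year"], pe["regulator"], pe["bank_size"], pe["low_rating"])
        groups[key].append(pe)

    sample = []
    for key, members in groups.items():
        fraction = low_rating_rate if key[3] else rate
        # Take at least one PE per group and never more than the group holds.
        n = min(len(members), max(1, round(fraction * len(members))))
        sample.extend(rng.sample(members, n))
    return sample
```

Setting `low_rating_rate` above `rate` is what produces the oversampling of the less common ratings in this sketch.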
For the first vintage, approximately 22,000 PEs with an evaluation start date from 2005 through 2017 were
available on the websites of the Federal Reserve Board (the Board), the FDIC, and the OCC. Of these, staff selected
approximately 6,300 PEs from which to collect data. In the second vintage of PEs, staff collected data from 600 of
the approximately 2,900 PEs available from the 2018–19 timeframe.
Additional Notes on Performance Evaluations
The scope of information on performance ratings varies across agencies.
Board staff had access to ratings for each test at the state and multistate MSA level for Federal Reserve
System (FRS) banks, to ratings for each test at the individual institution level for FDIC banks, and to overall
institution level ratings for OCC banks. To address this variation in availability of digitized information, the
following evaluations were oversampled:
FRS state member bank evaluations that received a “Needs to Improve” or worse on a lending test
rating (or overall rating, for small banks) for any state or multistate MSA;
FDIC bank evaluations where the bank level lending test rating (or overall rating, for small banks)
was “Needs to Improve” or worse; and
OCC bank evaluations where the bank level overall rating was “Needs to Improve” or worse.
Examiners assign bank ratings on the various CRA tests at the state, multistate MSA, and institution level, but
only draw conclusions at the AA level.
Currently, evaluation conclusions do not use the ratings terminology. As a result, staff often had to
judge what language in a conclusion corresponded to a particular rating. For example, if an AA was
within a state that received an “Outstanding” rating, and the conclusion language for the AA stated
that the bank’s performance was consistent with its state-level rating, the bank would be recorded
in the database as having received an “Outstanding” on the test in question for that AA.
Alternatively, if the state-level rating was “Satisfactory,” but the AA conclusion was “poor” or “below
state-level performance,” then the bank would be recorded as “Needs to Improve” for that AA, i.e.,
below “Satisfactory.”
Evaluation structure differs by regulatory agency.
FRS and FDIC evaluations generally stated conclusions about banks’ retail and community
development performance, as applicable, at the AA levels. OCC evaluations, instead, recorded
conclusions for components of those tests (i.e., geographic and borrower income distributions of
various retail loan products) at the AA levels, with ratings on the full test recorded at the state or
multistate MSA level. Therefore, for OCC banks, ratings on each test for an AA are often only
available when that AA was the only one the bank had in a particular state or multistate MSA.
However, staff did make efforts to map state ratings to multiple assessment area conclusions when
the conclusions were clearly described as consistent with state ratings.
The Retail Loan and PE Tables are used to produce the Merged Data Table by matching on bank, geography, and
product evaluation period (figure 2).
Figure 2. Merging the Retail Loan Table and PE Table to produce the Merged Data Table - Applicable originations
from the Retail Loan Table are summed and matched with the correct bank, MSA code, and evaluation period from
the PE Table. The aggregated originations are then merged onto the original PE Table data to create the Merged
Data Table.
3. Merged Data Table
One of the primary purposes of collecting the loan and PE data was to link each bank’s lending activity to
individual AAs and their corresponding ratings for a given exam period. The product evaluation period information
was used to independently aggregate the HMDA and CRA lending data within each of the banks’ AAs for each of
the filing years within the product-specific evaluation periods. For banks that did not report HMDA and/or CRA
data during an evaluation period, the lending activity AA information from the PEs was used to fill in the AAs and
lending activity data fields, where applicable. This process led to the creation of the Merged Data Table, which has
one row per bank, exam, and AA, and summarizes all of the bank’s lending activity in addition to the deposits,
demographics, and vendor data (see Additional Data Sources).
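The linking step can be illustrated with a small sketch. It assumes each PE row already carries the set of counties that make up the AA and the first and last filing years of the product evaluation period. The function name and all key names (bank_id, aa_counties, eval_start_year, eval_end_year, county_fips, year, originations) are hypothetical, and the actual merge logic may differ.

```python
def merge_pe_with_loans(pe_rows, loan_rows):
    """Attach summed originations to each PE row (one per bank, exam, and AA).

    PE rows are dicts with hypothetical keys: bank_id, aa_counties (a set of
    county codes), eval_start_year, and eval_end_year. Loan rows are dicts
    with hypothetical keys: bank_id, county_fips, year, and originations.
    """
    merged = []
    for pe in pe_rows:
        total = 0
        for loan in loan_rows:
            same_bank = loan["bank_id"] == pe["bank_id"]
            in_aa = loan["county_fips"] in pe["aa_counties"]
            in_period = pe["eval_start_year"] <= loan["year"] <= pe["eval_end_year"]
            if same_bank and in_aa and in_period:
                total += loan["originations"]
        # Keep every PE field and add the aggregated lending.
        merged.append({**pe, "originations_in_period": total})
    return merged
```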
For each of the individual AAs in the PE table, where all of the necessary data were available, bank lending
activity, deposits, aggregate lending, and select demographic data from the AA were aggregated over the product-
specific evaluation periods. Generally, only MSA, multistate MSA, and non-MSA AAs were mapped over from the
PE Table to the Merged Data Table. However, if the bank only had one assessment area in a given state or only a
single overall assessment area in the PE Table, that overall or state rating would be mapped to the assessment
area. The Merged Data Table contains AAs from exams with evaluation periods starting no earlier than 2005 and
evaluation periods ending no later than 2019. The updated Merged Data Table also includes the more granular
low- and moderate-income breakouts for geographies and borrowers described in the Retail Loan and PE table
sections.
There are a substantial number of AAs in the PE table where a conclusion was not extracted from the PE. These
AAs were previously dropped in the first vintage of the Merged Data Table. However, if all of the other
information was available to accurately map the AA and aggregate the lending activity, then these AAs were
included in this second vintage of the Merged Data Table even if they did not have a lending test rating or
conclusion.
4. Assessment Area Definition Table
This table combines AA geography data collected from PEs with AA information from annually reported CRA data.
Joining these AA data together results in (approximate) county level geographic definitions for each AA included in
the Merged Data Table.
Banks that report their CRA lending data annually provide a list of all census tracts where they originated or
purchased loans and flag them if they were in their AA for the filing year. However, the information needed to
accurately assign these tracts to specific AAs and their respective ratings for a given exam was unavailable. To
solve this problem, MSA, state, and county geography codes for each AA in each exam were collected, where
available, and used to associate the reported lending with a specific AA. It should be noted that neither small
banks nor intermediate small banks (ISBs) are required to report their CRA data, but both have the option to
do so.
“MSA AAs” refer to AAs defined by a metropolitan statistical area (MSA), metropolitan division (MD), or combined
statistical area (CSA). Whenever these geography codes were used to define the geography of an AA in the exam,
they were collected and mapped to all counties in the annually reported CRA data within that same MSA code.
For “non-MSA AAs,” only a single “seed” county was recorded during the PE data collection for each exam.
These seed county codes were a starting point to build the whole, approximate non-MSA AA. First, the seed
county was selected and matched to the same county inside the AA of the corresponding bank in the Retail
Loan Table. Then, the list of non-MSA counties inside the AA was examined to see if any of the remaining
counties were contiguous to the seed county, using the census contiguous counties files. If a candidate
county was (1) contiguous to the seed county, (2) inside the bank’s AA, (3) in the same state as the seed
county, and (4) not a seed county itself, then the county was assigned to the seed county’s AA. Iterations of
this process continued until no new counties were added to the AA.
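The iterative expansion from a seed county can be sketched as follows. This illustration reads the iteration as allowing a candidate to be contiguous to any county already assigned to the AA, which is an assumption; the text states contiguity with the seed county. The function name, argument names, and data layouts are hypothetical.

```python
def build_non_msa_aa(seed, aa_counties, seed_counties, adjacency, state_of):
    """Grow an approximate non-MSA AA outward from one seed county.

    seed          -- the seed county code recorded for this AA
    aa_counties   -- non-MSA counties inside the bank's AA (Retail Loan Table)
    seed_counties -- all seed counties recorded for the exam
    adjacency     -- dict mapping a county to the set of its contiguous counties
    state_of      -- dict mapping a county to its state code
    """
    assigned = {seed}
    added = True
    while added:  # repeat until a full pass adds no new county
        added = False
        for county in aa_counties:
            if county in assigned or county in seed_counties:
                continue  # already assigned, or a seed for some AA
            if state_of[county] != state_of[seed]:
                continue  # must be in the same state as the seed county
            # Assumption: contiguity with any already-assigned county counts.
            if any(county in adjacency.get(member, ()) for member in assigned):
                assigned.add(county)
                added = True
    return assigned
```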
Small banks are not required to report CRA data; for those that do not, data defining their AA geographies were
only available through PEs. For these banks, the entire set of counties that were at least partially included in
each AA was collected. Census-tract-level information for partial counties was not collected; instead, a flag was
created to note when only a portion of a county was included in the AA.
Quality Control
Various quality control (QC) strategies were applied to assess the data tables. The sections below discuss the
approaches taken for each of the primary tables.
Retail Loan Table
As the content for the Retail Loan Table comes from validated sources like CRA and HMDA, the QC process
focused on validating the application of the business requirements and programming procedures used to create
the Retail Loan Table. After the knowledge transfer of the materials, an internal IT team independently and
successfully replicated the Retail Loan Table components and the final table itself.
PE Table
Two approaches were used to provide quality control and validation of the PE data. First, the data collection
included continuous monitoring. The data were processed and integrated into the internal database, which
included procedures to detect and correct errors in the data. Second, a separate QC process for each vintage
reevaluated and re-extracted data from a sample of PEs including 200 from the first vintage and 50 from the
second vintage. The re-extracted data were compared to the previously collected data for each major type of data
point collected. The review found that data from the first vintage were misreported at a rate of 1.0 percent and
missing in error 3.0 percent of the time. Results from the second vintage showed an overall rate of misreporting
of 1.4 percent and missing in error 0.1 percent for the data fields reviewed.
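One way such rates could be computed is sketched below. The guide does not define the denominators, so this example assumes each rate is a share of all re-extracted data points and treats a data point as missing in error when the original collection holds no value for it. The function name and data layout are hypothetical.

```python
def qc_error_rates(original, reextracted):
    """Compare originally collected values with re-extracted ones.

    Both arguments are dicts keyed by a data point identifier. A value of
    None in `original` is treated as missing. Returns percentages of the
    re-extracted data points (an assumed denominator).
    """
    if not reextracted:
        raise ValueError("no re-extracted data points to compare")
    misreported = 0
    missing = 0
    for key, check_value in reextracted.items():
        value = original.get(key)
        if value is None:
            missing += 1      # present in the PE but absent from the data
        elif value != check_value:
            misreported += 1  # collected, but with a different value
    n = len(reextracted)
    return {
        "misreported_pct": 100 * misreported / n,
        "missing_pct": 100 * missing / n,
    }
```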
A full breakdown of the percentages of misreported and underreported data for each of the PE Table data fields
can be found in the “pe_accuracy_check_npr” table included in this release. While the data collection process
included robust QC checks, the data may nevertheless be subject to error. To submit a question or feedback,
please fill out the feedback form. To ensure that your question is properly routed, please select the Community
Reinvestment Act as the “staff group” and select no other options above the field labeled “Type your message.”
Merged Data Table
The Merged Data Table was validated using a combination of code review, user testing, and spot checks of
random samples. Staff reviewed the code that maps the ratings, evaluation periods, and AA code data from the PE
tables to the matching year, bank, and county records in the Retail Loan Table to ensure that the code
aggregated the data in accordance with the business requirements of the project. In addition to the code review,
analysts randomly sampled rows in the Merged Data Table (exam AAs) and manually validated the aggregated
loan data by comparing them to the sum of the matching loan data from the Retail Loan Table (for the set of
counties that constituted the AA over the specific evaluation period from the PE data).
Key Concepts and Variables of Note
This section provides additional information on key concepts across the data tables as well as notable variables
within the data tables.
Low-and-Moderate-Income (LMI) Tracts and Borrowers
Low-and-Moderate-Income (LMI) borrowers and geographies are key concepts in the CRA Analytics Data Tables.
Definitions of LMI geographies are consistent across HMDA and CRA small business and small farm lending, where
census tracts with a median family income (MFI) of less than 50 percent of the area MFI are considered “low” and
those with MFI of at least 50 percent but less than 80 percent are considered “moderate.” However, each primary
dataset provides different definitions for loans to LMI borrowers. For HMDA purposes, “low” income borrowers
refer to borrowers with incomes less than 50 percent of the area MFI and “moderate” income borrowers are those
with incomes of at least 50 percent but less than 80 percent of the area MFI. For CRA small business and small
farm lending, an LMI borrower refers to a small business with gross annual revenue of less than $1 million.
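The thresholds above can be written as two small helper functions. This is a minimal sketch; the function names and the "unknown" and "non-LMI" labels are hypothetical and are not values used in the data tables.

```python
def income_category(income, area_mfi):
    """Classify a tract MFI or a borrower income relative to the area MFI.

    Returns "low", "moderate", "non-LMI", or "unknown" (hypothetical labels).
    """
    if income is None or area_mfi is None or area_mfi <= 0:
        return "unknown"
    ratio = income / area_mfi
    if ratio < 0.5:
        return "low"
    if ratio < 0.8:
        return "moderate"
    return "non-LMI"


def is_lmi_small_business(gross_annual_revenue):
    """Apply the CRA small business test: revenue of less than $1 million."""
    return gross_annual_revenue is not None and gross_annual_revenue < 1_000_000
```

Because the "low" check runs first, the "moderate" branch only covers ratios from 50 percent up to, but not including, 80 percent.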
Summary of Deposits Proxy (SoD_Proxy_AA_Flag)
The Summary of Deposits (SoD) Proxy variable is a proxy for defining AAs in the loan data for banks that did not
report CRA lending data for a given year. If a bank operated a branch within a county for a given year, then that
county was assumed to be included in the bank’s AA for that year. The FDIC’s SoD data were used to create this
flag. They were aggregated into a table with one row per year, bank, and county. Each row contained the number
of branches within the county and a binary flag, indicating that the bank had at least one branch in this county
during the time of reporting. These data were merged to the loan data by year, bank, and geography to create the
SoD table. For the variable “SoD_Proxy_AA_Flag” in the “retail_loan_county_attributes” table, records from the
loan table that matched a row in the SoD table have a value of 1 and otherwise have a value of 0. For cases where
the lender is not in the annual SoD data, the value is NA.
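The flag logic can be sketched as a lookup against the aggregated SoD rows. The function name and input structures are hypothetical, and None stands in for NA.

```python
def sod_proxy_flag(year, bank_id, county, sod_branch_counts, sod_lenders):
    """Return 1, 0, or None (NA) for one year, bank, and county loan record.

    sod_branch_counts -- dict mapping (year, bank_id, county) to the number
                         of branches the bank had in that county
    sod_lenders       -- set of (year, bank_id) pairs present in the SoD data
    """
    if (year, bank_id) not in sod_lenders:
        return None  # lender is not in the annual SoD data
    branches = sod_branch_counts.get((year, bank_id, county), 0)
    return 1 if branches > 0 else 0
```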
County Assessment Area Flag (County_AA_Flag)
County AA Flag in the “retail_loan_county_attributes” table indicates whether at least one census tract within
the county is inside the bank’s AA.
If this flag is positive (value = 1) and the Partial Indicator variable is negative (value = “N”), then the entire
county is within the bank’s AA.
If the flag is positive (value = 1), and the Partial Indicator variable is positive (value = “Y”), then only a portion
of the county is inside the bank’s AA.
If the flag is negative (value = 0), then the county is completely outside the bank’s AA.
If the value is missing or “NA” then the bank does not report annual CRA data, and its assessment area status
is unknown.
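The combinations above can be read off with a small helper. This is a sketch only; the function name and the returned labels are hypothetical, and None is treated the same as “NA.”

```python
def aa_coverage(county_aa_flag, partial_ind):
    """Interpret County_AA_Flag together with Partial_Ind for one record.

    Returns "whole", "partial", "outside", or "unknown" (hypothetical labels).
    """
    if county_aa_flag is None or county_aa_flag == "NA":
        return "unknown"  # bank does not report annual CRA data
    if county_aa_flag == 0:
        return "outside"  # county is completely outside the AA
    if county_aa_flag == 1 and partial_ind == "Y":
        return "partial"  # only a portion of the county is in the AA
    if county_aa_flag == 1 and partial_ind == "N":
        return "whole"    # the entire county is in the AA
    return "unknown"      # any combination the guide does not describe
```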
Partial Indicator (Partial_Ind)
The Partial Indicator variable in the “retail_loan_county_attributes” table is positive (value = “Y”) when a bank
claims part of the county in its AA. Banks that report CRA lending data also provide a list of all tracts included
within their AAs. However, banks are not required to add entire counties to their AAs. This variable comes from
annually reported bank CRA data.
If the flag is positive (value = “Y”), the bank includes only a portion of the county in its AA.
If the flag is negative (value = “N”), the bank includes the whole county in its AA.
Additional Data Sources
National Information Center (NIC) Data
The National Information Center (NIC) provides comprehensive information on banks and other institutions for
which the FRS has a supervisory, regulatory, or research interest, including both domestic and foreign banking
organizations operating in the United States.
Summary of Deposits (SoD)
The FDIC SoD is an annually reported dataset that contains the location of each branch office for all FDIC-
insured banks for the reporting year. The SoD was used to sum each bank’s deposits within a given
county for each HMDA/CRA filing year and to count the number of branches (if any) that the bank has in that
county.
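As a rough illustration of this aggregation, the base R sketch below sums deposits and counts branches by bank, county, and year. The branch-level layout and the column names (id_rssd, year, county_fips, deposits, branch_count) are assumptions for the example and are not the actual SoD field names; the values are made up.

```r
# Toy branch-level SoD extract (hypothetical column names, made-up values)
sod_branches <- data.frame(
  id_rssd     = c(101, 101, 101, 202),
  year        = c(2019, 2019, 2019, 2019),
  county_fips = c("39115", "39115", "39009", "39115"),
  deposits    = c(500, 250, 100, 900)   # branch deposits, arbitrary units
)

# Sum deposits by bank, county, and year
dep <- aggregate(deposits ~ id_rssd + county_fips + year,
                 data = sod_branches, FUN = sum)

# Count branches by bank, county, and year (one row per branch)
n_br <- aggregate(deposits ~ id_rssd + county_fips + year,
                  data = sod_branches, FUN = length)
names(n_br)[names(n_br) == "deposits"] <- "branch_count"

# One row per bank-county-year with both measures
sod_county <- merge(dep, n_br, by = c("id_rssd", "county_fips", "year"))
print(sod_county)
```

A bank-county-year that is absent from the result has no branches in that county, which corresponds to the "if any" case in the text.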
Census
The FFIEC census data, available on the FFIEC Online Census Data System, identify LMI census tracts. They also
provide data for several county-level demographic comparators, such as the count of LMI families and owner-
occupied housing units within a given county.
Vendor Data
Dun & Bradstreet (D&B) data identify the number of small businesses within a county for a given year. The D&B
2010 census file data comprise all D&B records in the United States, Puerto Rico, and the Virgin Islands. These
data include fields for two-digit North American Industry Classification System (NAICS) codes; annual sales
volume; year started; control year ranges; and census state, county, and tract codes. The 2018 data included
updates to correct for changes in census and American Community Survey (ACS) data.
Use Cases & Analytic Examples
NPR Analytics Table
To assess the performance of large banks under the proposed Retail Lending Test, a Retail Loan NPR Analytics
Table was created from the base Retail Loan Table. This analytics table grouped each Large Bank’s lending into
approximate assessment areas over a hypothetical evaluation period of 2017-2019. These assessment areas are
referred to as approximate because they rely on the Assessment Area Number (AAN) field reported by banks in
their annual CRA reporting. These AANs may differ from actual assessment areas defined during a bank’s regular
CRA PE. To replicate this table used in the NPR analyses, the data user should do the following:
1. Utilize the retail loan table data from 2017 to 2019.
2. Remove wholesale/limited purpose/strategic plan banks from the sample. These banks were identified
using the CRA exam results data (from the FFIEC website). Any bank examined during the evaluation
period as wholesale/limited purpose/strategic plan is removed from the sample (regardless of exam year),
as are any banks whose first exam after the evaluation period is wholesale/limited purpose/strategic plan.
3. Remove bank-county observations that are part of non-state territories (Puerto Rico, Guam, etc.). These
observations are often missing the data needed to construct the various community benchmarks, so they
should be removed.
4. Create a consistent assessment area (AA) identification (ID) variable that is constant across time. The
AA ID variable used in the NPR analyses focused on large bank performance on the retail lending test is a
concatenation of the Assessment_Area_Number (AAN) and the MSA_Code (or state code in the case of
non-MSA areas), both of which can be found in the retail_loan_county_attributes table. However, the
AA numbers reported by banks in the CRA data and included in the retail loan tables are not consistent
from year to year. For example, a bank’s AAN “1” in 2017 may not necessarily correspond to the same
geography as the same bank’s AAN “1” in 2019. To obtain a consistent AA identifier across all years of the
sample, the data user may perform the following steps:
a. Create a crosswalk that links a bank’s 2017 and 2018 AA IDs to the correct 2019 AA ID.
b. Reshape the data so that each bank-county has an AA ID 2017, AA ID 2018, etc. in one row of
data. This means that counties that exit the data before 2019 would have a missing value for AA
ID 2019.
c. Group by the AA ID 2017 and take the mode of AA ID 2019 within that group.
d. Backfill the missing AA ID 2019 values with the mode value of the group.
5. Restrict the sample to bank AAs that exist in 2019. This means that any bank-AA that dropped out of the
sample between 2017 and 2019 should be removed. Despite this, individual counties that drop out of an
individual AA are still included as long as the MSA in which that AA is located still has data in 2019.
6. Aggregate (sum) the county-level data within each assessment area and then across years. In the final
dataset, each bank AA should only have one row that represents data from all three years.
The final sample includes approximately 6,880 unique bank AAs and 606 banks. The data user’s sample may differ
somewhat, depending on the exclusion rules applied to wholesale/limited purpose banks.
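Steps 4 and 6 above can be sketched in base R on a small made-up example. This is a simplified illustration and not the code used for the NPR analyses: it links only 2017 to 2019 (2018 would be handled the same way), it skips the sample restrictions in steps 2, 3, and 5, and the column names county, aa_id, and aa_id_consistent, the helper mode_value, and all values are hypothetical.

```r
# Toy bank-county-year rows (made-up values); the AAN changes from "1" to "3"
d <- data.frame(
  id_rssd   = 101,
  county    = c("A", "B", "C", "A", "B"),
  year      = c(2017, 2017, 2017, 2019, 2019),
  AAN       = c("1", "1", "1", "3", "3"),
  MSA_Code  = "10000",
  Loan_Orig = c(10, 20, 5, 12, 18)
)

# Step 4: AA ID = concatenation of the AA number and the MSA code
d$aa_id <- paste(d$AAN, d$MSA_Code, sep = "_")

# Step 4b: one row per bank-county, one AA ID column per year
wide <- reshape(d[, c("id_rssd", "county", "year", "aa_id")],
                idvar = c("id_rssd", "county"), timevar = "year",
                direction = "wide")

# Hypothetical helper: most frequent non-missing value
mode_value <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) == 0) return(NA_character_)
  names(which.max(table(x)))
}

# Step 4c: mode of the 2019 AA ID within each bank's 2017 AA
grp <- paste(wide$id_rssd, wide$aa_id.2017)
mode_2019 <- ave(wide$aa_id.2019, grp,
                 FUN = function(x) rep(mode_value(x), length(x)))

# Step 4d: backfill missing 2019 AA IDs with the group mode
wide$aa_id_consistent <- ifelse(is.na(wide$aa_id.2019), mode_2019, wide$aa_id.2019)

# Step 6: attach the consistent ID, then sum across counties and years
d <- merge(d, wide[, c("id_rssd", "county", "aa_id_consistent")],
           by = c("id_rssd", "county"))
aa_totals <- aggregate(Loan_Orig ~ id_rssd + aa_id_consistent, data = d, FUN = sum)
print(aa_totals)
```

County C leaves the data before 2019, so its 2019 AA ID is backfilled from the other counties that shared its 2017 AA, and its 2017 loans are still counted in the AA total.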
Retail Loan Table Examples
These scripts and tables show the geographic and borrower income distributions for HMDA and CRA loans at the
county and MSA level for 2019.
HMDA Lending in R Code
Use the retail_loan_lending_test table to calculate the borrower income distribution ratios for all counties in
the country for 2019.
#R Code: load dplyr, then filter the retail_loan_lending_test table to year 2019
#and to rows where Lender_in_HMDA and Lender_in_CRA are both "Y"
library(dplyr)
hmda_county_2019 <- retail_loan_lending_test %>%
  filter(ActivityYear == 2019) %>%
  filter(Lender_in_HMDA == "Y") %>%
  filter(Lender_in_CRA == "Y") %>%
  group_by(ActivityYear, State_Code, County_Code) %>%
  #Sum the origination counts within each county
  summarise(across(c(Loan_Orig, Loan_Orig_BILow, Loan_Orig_BIMod, Loan_Orig_TILow, Loan_Orig_TIMod),
                   ~ sum(.x, na.rm = TRUE)), .groups = "drop") %>%
  mutate(hmda_low_borrower_ratio = Loan_Orig_BILow / Loan_Orig,
         hmda_mod_borrower_ratio = Loan_Orig_BIMod / Loan_Orig,
         hmda_low_tract_ratio = Loan_Orig_TILow / Loan_Orig,
         hmda_mod_tract_ratio = Loan_Orig_TIMod / Loan_Orig)
#Look at the distribution of the HMDA low/moderate income borrower ratios for all counties in the US for 2019
quants <- c(0.1, 0.25, 0.5, 0.75, 0.9)
hmda_low_borr_quants <- quantile(hmda_county_2019$hmda_low_borrower_ratio, probs = quants, na.rm = TRUE)
hmda_mod_borr_quants <- quantile(hmda_county_2019$hmda_mod_borrower_ratio, probs = quants, na.rm = TRUE)
#Collect the county-level percentiles into one table (shown in Table 1)
hmda_county_borr_median <- tibble(`Percentiles` = names(hmda_low_borr_quants),
                                  `Low Income Borrower` = hmda_low_borr_quants,
                                  `Moderate Income Borrower` = hmda_mod_borr_quants) %>%
  mutate(across(c(`Low Income Borrower`:`Moderate Income Borrower`), ~ round(.x, 2)))
Table 1. HMDA bank county borrower income distribution ratios, 2019
Percentiles    Low-income borrower    Moderate-income borrower
10             0.00                   0.06
25             0.03                   0.11
50             0.05                   0.16
75             0.09                   0.21
90             0.13                   0.26
Now use the retail_loan_lending_test table to calculate the borrower and tract income distribution ratios for all
MSAs in the country for 2019.
#Do the same, but aggregate by metropolitan statistical area (MSA) instead of county
hmda_msa_2019 <- retail_loan_lending_test %>%
  filter(ActivityYear == 2019) %>%
  filter(Lender_in_HMDA == "Y") %>%
  filter(Lender_in_CRA == "Y") %>%
  group_by(ActivityYear, MSA_Code) %>%
  summarise(across(c(Loan_Orig, Loan_Orig_BILow, Loan_Orig_BIMod, Loan_Orig_TILow, Loan_Orig_TIMod),
                   ~ sum(.x, na.rm = TRUE)), .groups = "drop") %>%
  mutate(hmda_low_borrower_ratio = Loan_Orig_BILow / Loan_Orig,
         hmda_mod_borrower_ratio = Loan_Orig_BIMod / Loan_Orig,
         hmda_low_tract_ratio = Loan_Orig_TILow / Loan_Orig,
         hmda_mod_tract_ratio = Loan_Orig_TIMod / Loan_Orig)
#quants is the vector of percentiles defined in the county example above
hmda_msa_low_borr_quants <- quantile(hmda_msa_2019$hmda_low_borrower_ratio, probs = quants, na.rm = TRUE)
hmda_msa_mod_borr_quants <- quantile(hmda_msa_2019$hmda_mod_borrower_ratio, probs = quants, na.rm = TRUE)
hmda_msa_low_tract_quants <- quantile(hmda_msa_2019$hmda_low_tract_ratio, probs = quants, na.rm = TRUE)
hmda_msa_mod_tract_quants <- quantile(hmda_msa_2019$hmda_mod_tract_ratio, probs = quants, na.rm = TRUE)
#Collect the MSA-level percentiles into one table (shown in Table 2)
hmda_msa_borr_tract_quantiles <- tibble(`Percentiles` = names(hmda_msa_low_borr_quants),
                                        `Low Income Borrower` = hmda_msa_low_borr_quants,
                                        `Moderate Income Borrower` = hmda_msa_mod_borr_quants,
                                        `Low Income Tract` = hmda_msa_low_tract_quants,
                                        `Moderate Income Tract` = hmda_msa_mod_tract_quants) %>%
  mutate(across(c(`Low Income Borrower`:`Moderate Income Tract`), ~ round(.x, 2)))
Table 2. HMDA bank MSA borrower and tract income distribution ratios, 2019
Percentiles    Low-income borrower    Moderate-income borrower    Low-income tract    Moderate-income tract
10%            0.02                   0.09                        0.00                0.06
25%            0.04                   0.13                        0.00                0.08
50%            0.05                   0.16                        0.01                0.12
75%            0.08                   0.19                        0.02                0.14
90%            0.10                   0.22                        0.03                0.16
CRA Lending in R Code
Use the retail_loan_lending_test table to calculate the inside assessment area small business loan borrower and
tract income distribution ratios for MSAs in 2019.
#Keep 2019 rows for CRA reporters in counties flagged as inside the bank's AA, then sum by bank and MSA
cra_borr_tract_msa_2019 <- retail_loan_lending_test %>%
  filter(ActivityYear == 2019 & Lender_in_CRA == "Y" & County_AA_Flag == 1) %>%
  group_by(ActivityYear, id_rssd, MSA_Code) %>%
  summarise(across(c(SB_Loan_Orig_Inside, SB_Loan_Orig_TILow_Inside, SB_Loan_Orig_TIMod_Inside,
                     SB_Loan_Orig_GAR_less_1m_Inside), ~ sum(.x, na.rm = TRUE)), .groups = "drop") %>%
  #Drop bank-MSAs with no small business loans to avoid dividing by zero
  filter(SB_Loan_Orig_Inside > 0) %>%
  mutate(`SB Borrower Ratio` = SB_Loan_Orig_GAR_less_1m_Inside / SB_Loan_Orig_Inside,
         `SB Tract Low Ratio` = SB_Loan_Orig_TILow_Inside / SB_Loan_Orig_Inside,
         `SB Tract Moderate Ratio` = SB_Loan_Orig_TIMod_Inside / SB_Loan_Orig_Inside)
#distributions (quants is the vector of percentiles defined in the HMDA example above)
sb_borr_quants <- quantile(cra_borr_tract_msa_2019$`SB Borrower Ratio`, probs = quants, na.rm = TRUE)
sb_tract_low_quants <- quantile(cra_borr_tract_msa_2019$`SB Tract Low Ratio`, probs = quants, na.rm = TRUE)
sb_tract_mod_quants <- quantile(cra_borr_tract_msa_2019$`SB Tract Moderate Ratio`, probs = quants, na.rm = TRUE)
#The pipe must end the line; a line that starts with %>% is a syntax error in R
sb_borr_tract_percentiles <- tibble(Percentiles = names(sb_borr_quants),
                                    `SB Borrower Ratio` = sb_borr_quants,
                                    `SB Tract Low Ratio` = sb_tract_low_quants,
                                    `SB Tract Moderate Ratio` = sb_tract_mod_quants) %>%
  mutate(across(c(`SB Borrower Ratio`:`SB Tract Moderate Ratio`), ~ round(.x, 2)))
Table 3. CRA small business loan borrower and tract income ratio distributions in MSA assessment areas, 2019
Percentiles    SB borrower ratio    SB tract low ratio    SB tract moderate ratio
10             0.29                 0.00                  0.01
25             0.44                 0.00                  0.11
50             0.58                 0.03                  0.17
75             0.70                 0.08                  0.25
90             0.84                 0.14                  0.36
PE Table Example
The code and chart below show the distribution of state-level lending test ratings in the PE Table for
2019.
#R Code, PE_2019 is assumed to be the PE Table filtered for year 2019
PE_2019_lending_test <- PE_2019 %>%
  filter(assessment_area_type == "state") %>%
  group_by(lending_test_rating) %>%
  summarise(count = n()) %>%
  #Order the ratings from highest to lowest for plotting
  mutate(lending_test_rating = factor(lending_test_rating, levels = c("Outstanding",
    "High Satisfactory", "Satisfactory", "Low Satisfactory", "Needs to Improve",
    "Substantial Non-Compliance")))
Figure 3. Count of State Lending Test Ratings, 2019
The chart was created using the PE table. The x-axis is organized left to right by state lending test rating,
starting with “Outstanding” (the highest rating) and followed by “High Satisfactory,” “Satisfactory,” “Low
Satisfactory,” and “Needs to Improve.” The y-axis is a count of 2019 ratings and ranges from 0 to 30. There are
approximately 10 “Outstanding” ratings, approximately 20 “High Satisfactory” ratings, approximately 30
“Satisfactory” ratings (the highest count), about 10 “Low Satisfactory” ratings, and approximately 1
“Needs to Improve” rating.
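The guide does not include the plotting code for this figure. One possible way to draw a similar bar chart with base R graphics is sketched below; the ratings vector is made-up toy data standing in for the lending_test_rating column, and the object names rating_levels and rating_counts are illustrative.

```r
# Made-up ratings standing in for the lending_test_rating column
ratings <- c("Satisfactory", "Outstanding", "Satisfactory",
             "High Satisfactory", "Low Satisfactory", "Satisfactory")

# Order the categories from highest to lowest rating
rating_levels <- c("Outstanding", "High Satisfactory", "Satisfactory",
                   "Low Satisfactory", "Needs to Improve",
                   "Substantial Non-Compliance")
ratings <- factor(ratings, levels = rating_levels)

# table() keeps the factor order and reports zero counts for unused levels
rating_counts <- table(ratings)

# Bar chart with one bar per rating category, in the order set above
barplot(rating_counts, las = 2, cex.names = 0.7,
        ylab = "Count of ratings",
        main = "State lending test ratings (toy data)")
```

Converting the ratings to a factor with explicit levels is what fixes the left-to-right order of the bars; without it, the categories would be sorted alphabetically.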
Merged Data Table Example
The code and chart below show the count of MSA and non-MSA AAs in the Merged Data Table by lending test
rating for 2019.
#R Code, merged_data_2019 is the Merged Data Table filtered for year 2019
merged_data_2019_lending_test <- merged_data_2019 %>%
  filter(assessment_area_type %in% c("msa", "non_msa")) %>%
  group_by(lending_test_rating) %>%
  summarise(AA_count = n())
Figure 4. Count of MSA and non-MSA AAs by Lending Test Conclusion, 2019
The chart was created using the Merged Data Table. The x-axis range is organized left to right by lending test
ratings, starting with “Outstanding” (the highest rating), “High Satisfactory,” “Satisfactory,” “Low Satisfactory,”
“Needs to Improve,” and lastly, “Substantial Non-Compliance” (the lowest rating). The y-axis is a count of 2019
assessment area conclusions and ranges from 0 to 300. Approximately 50 AAs received “Outstanding” ratings.
Approximately 60 AAs received “High Satisfactory” ratings. Approximately 300 AAs received “Satisfactory” ratings.
Approximately 40 AAs received “Low Satisfactory” ratings. Approximately 30 AAs received “Needs to Improve”
ratings, and one AA received a “Substantial Non-Compliance” rating.