NATIONAL CENTER FOR HEALTH STATISTICS
Vital and Health Statistics
NCHS reports can be downloaded from:
https://www.cdc.gov/nchs/products/index.htm.
Series 2, Number 200 March 2023
National Center for Health Statistics
Data Presentation Standards for
Rates and Counts
Data Evaluation and Methods Research
This report was revised on June 8, 2023, after
errors were found. Figure 2 was revised to indicate
that rates with relative confidence intervals
greater than 160% should be suppressed.
Production and calculation errors were corrected
on formulas 8 and 22 and the unnumbered
formula at the top of column 2, page 13.
Copyright information
All material appearing in this report is in the public domain and may be reproduced or
copied without permission; citation as to source, however, is appreciated.
Suggested citation
Parker JD, Talih M, Irimata KE, Zhang G, Branum AM, Davis D, et al. National Center
for Health Statistics data presentation standards for rates and counts. National Center
for Health Statistics. Vital Health Stat 2(200). 2023. DOI: https://dx.doi.org/10.15620/cdc:124368.
For sale by the U.S. Government Publishing Office
Superintendent of Documents
Mail Stop: SSOP
Washington, DC 20401–0001
Printed on acid-free paper.
U.S. DEPARTMENT OF HEALTH AND HUMAN SERVICES
Centers for Disease Control and Prevention
National Center for Health Statistics
Hyattsville, Maryland
March 2023
National Center for Health Statistics
Brian C. Moyer, Ph.D., Director
Amy M. Branum, Ph.D., Associate Director for Science
Division of Research and Methodology
Jennifer D. Parker, Ph.D., Director
John Pleis, Ph.D., Associate Director for Science
Division of Analysis and Epidemiology
Irma E. Arispe, Ph.D., Director
Julie D. Weeks, Ph.D., Acting Associate Director for Science
Division of Health Care Statistics
Carol J. DeFrances, Ph.D., Director
Alexander Strashny, Ph.D., Associate Director for Science
Division of Vital Statistics
Steven Schwartz, Ph.D., Director
Andrés A. Berruti, Ph.D., M.A., Associate Director for Science
Contents
Acknowledgments
Abstract
Introduction
Key Concepts
Rates and Counts at NCHS
NCHS Presentation Standards for Rates and Counts
Previous Guidelines
New Standards
Discussion
References
Appendix I. Sample Size and Confidence Interval Calculations for Rates and Counts
Appendix II. Design Effects for National Center for Health Statistics Surveys and Selected Census Surveys
Text Figures
1. Presentation standards for rates without sampling variability
2. Presentation standards for rates with sampling variability
Text Tables
A. National Center for Health Statistics standards for rates and counts
B. National Center for Health Statistics standards for rates and counts: Confidence interval calculations, by data system and type of denominator
Acknowledgments
The Data Presentation Standards for Rates and Counts
Workgroup researched and wrote this report: Division of
Analysis and Epidemiology—Barnali Das; Division of Health
Care Statistics—Danielle Davis and Alexander Strashny;
Division of Research and Methodology—Katherine E.
Irimata, Jennifer D. Parker, and Guangyu Zhang; Division
of Vital Statistics—Brady E. Hamilton and Kenneth
D. Kochanek; Office of the Center Director—Amy M.
Branum; and Strategic Innovative Solutions, LLC—Frances
McCarty and Makram Talih. The workgroup would like to
acknowledge the contributions of Donald J. Malec (retired)
and Van L. Parsons, Division of Research and Methodology,
who provided statistical and technical input; and Cynthia
A. Reuben, Division of Analysis and Epidemiology, who
developed initial versions of the figures. NCHS Office of
Information Services, Information Design and Publishing
Staff edited and produced this report: editor Jane Sudol and
typesetter and graphic designer Jiale Feng.
National Center for Health Statistics
Data Presentation Standards for
Rates and Counts
by Jennifer D. Parker, Ph.D., National Center for Health Statistics; Makram Talih, Ph.D., Strategic Innovative
Solutions, LLC; Katherine E. Irimata, Ph.D., Guangyu Zhang, Ph.D., Amy M. Branum, Ph.D., Danielle Davis, M.P.H.,
Barnali Das, Ph.D., Brady E. Hamilton, Ph.D., and Kenneth D. Kochanek, M.A., National Center for Health
Statistics; Frances McCarty, M.Ed., Ph.D., Strategic Innovative Solutions, LLC; and Alexander Strashny, Ph.D.,
National Center for Health Statistics
Abstract
Background
The National Center for Health Statistics (NCHS) shares
information on a broad range of health topics through
various publications. These publications must rely on
clear and transparent presentation standards that
can be broadly and efficiently applied. Standards are
particularly important when indicators of precision
cannot be included alongside the estimates, such as
for some large, cross-cutting reports and for shorter
communications on social media.
Objective
This report describes the NCHS data presentation
standards for rates and counts.
Results
The multistep NCHS data presentation standards for
rates and counts are based on a minimum sample size
and the relative width of a confidence interval (CI).
Specific criteria for rates and counts, including the
CI calculations used, differ between vital statistics and
health surveys and may differ according to the source of
the denominator.
Conclusion
The NCHS data presentation standards for rates and
counts will be applied to all NCHS publications. Using
these standards, some estimates will be identified as
unreliable and suppressed, and some estimates will be
flagged for statistical review. For reports where estimates
are evaluated individually, a particular estimate that
does not meet the NCHS data presentation standards
for rates and counts could be identified as unreliable but
not be suppressed if it can be interpreted appropriately
in the context of subject-specific factors and report
objectives.
Keywords: confidence interval • sample size •
degrees of freedom • health surveys • vital statistics •
population estimates
Introduction
The National Center for Health Statistics (NCHS) collects,
analyzes, and distributes information on a broad range of
health topics through a variety of publications, databases,
and tables. Some data products present information based
on a single NCHS data system, while others summarize
information from many NCHS data systems. These reports
and data products may include estimates on a wide range
of topics or focus on a particular health outcome. Because of
space and format constraints, some of these products can provide
supporting information for an estimate, such as its standard error (SE)
or confidence interval (CI), only separately from the estimate.
Examples include some large reports, some data
visualizations, and social media content (1,2). As a result,
NCHS reports and other products must rely on clear and
transparent presentation criteria that can be broadly and
efficiently applied whenever caution should be given to a
particular estimate because it may be unreliable.
Statistical standards for data presentation vary across
agencies, data systems, and data products (3,4). Differences
among standards can be, in part, attributed to each data
system’s unique features and constraints. Standards also
change over time, due to changes in the purpose and scope
of the data’s use, the capability of a user to carefully review
published estimates, the ability to provide explanatory text
discussing the precision of the published estimates, and
advances in statistical methodology.
In 2017, NCHS released the “National Center for Health
Statistics Data Presentation Standards for Proportions” (4),
which described the criteria, and the reasoning supporting
them, by which NCHS would determine whether to
publish a proportion (or percentage) in its reports and other
products. Standards were developed to identify estimates
with sufficiently high statistical reliability for presentation,
where reliability was determined using sample size and
CI thresholds. Proportions, or percentages (proportions
multiplied by 100%), are the most common statistics
produced at NCHS.
Rates and counts are also widely disseminated by NCHS,
principally in two areas: vital statistics, including rates and
counts for deaths and births (5,6), and health care visits,
including rates and counts for hospitalizations and ambulatory
care visits (7,8). Additionally, counts of the number of people
with specific health outcomes—including health conditions,
risk factors, and access to care measures—can be produced
from household surveys, including the National Health
Interview Survey (NHIS) (9), National Health and Nutrition
Examination Survey (NHANES) (10), and National Survey of
Family Growth (NSFG) (11).
This report describes the NCHS data presentation standards
for rates and counts, including details about rates and counts
produced from the National Vital Statistics System (NVSS)
and the National Health Care Surveys. Because underlying
statistical distributions differ between NVSS- and survey-
based estimates, such as those from health care surveys
or other sample surveys, standards are given separately
for these cases. In addition to statistical reliability, data
confidentiality and disclosure risks affect the ability to
present estimates. For example, subnational and, in some
cases, national counts of deaths and births based on fewer
than 10 events are suppressed to protect confidentiality
(12). Confidentiality and disclosure standards are handled
separately and are outside the scope of these standards.
Rates are more complicated than proportions. Consequently,
this report and the resulting standards require more
technical detail about the statistical methods than
the previous report on standards for proportions. The
Appendixes contain supporting technical material for the
standards: Appendix I provides mathematical details for the
CIs used in the standards, and Appendix II provides links to
the technical documentation for NCHS surveys and selected
U.S. Census Bureau surveys where guidance can be obtained
on calculating survey design effect (DEFF), among other
survey-specific information. An evaluation of the standards
for vital statistics and health care surveys under different
scenarios, including the CI thresholds used in the standards,
is discussed in a separate NCHS report (13). Although the
standards were not evaluated for other sample surveys such
as NHIS, NHANES, or NSFG, they are intended to apply to all
survey-based rate and count estimates produced by NCHS.
Key Concepts
What is a rate?
For the NCHS data presentation standards for rates and
counts, a rate is defined as the number of events for a
population for a given time period (numerator) divided by
a count of the population at risk during that time period
(denominator) and expressed per population size.
For most NCHS dissemination purposes, the term crude
rate refers to an overall rate for all ages or for broad
age categories, such as all adults over age 18, while
age-adjusted rates are mathematically adjusted to a
standard population (14,15).
Age-specific rates are the number of events among
people of a specified age or age group divided by a
count of the population of people in that age group for
a given time period.
In contrast with proportions, which are constrained to
be from 0 to 1 (or from 0% to 100% when expressed as
percentages), rates are not always constrained by an
upper bound.
For death rates, which are usually expressed per 100,000
population, the event (death) can only occur once.
Because the numerator is a subset of the denominator,
the upper bound would be 100,000 per 100,000 population
(100,000/100,000 = 1).
For other rates, including health care visit rates and birth
rates, multiple events can occur for the same person. As
a result, such rates can be larger than 1.
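As a minimal sketch of the definition above, a rate can be computed as the number of events divided by the population at risk and scaled to a chosen population size. The function name `rate_per` and all input figures are hypothetical and serve only as an illustration; they are not NCHS data.

```python
def rate_per(events: float, population: float, per: float = 100_000) -> float:
    """Events divided by the population at risk, expressed per `per` people."""
    if population <= 0:
        raise ValueError("population at risk must be positive")
    return events / population * per


# Hypothetical inputs for illustration only (not NCHS data).
# A death rate: each person contributes at most one event.
death_rate = rate_per(events=250, population=500_000)  # per 100,000 population

# A visit rate: one person can contribute several events, so the
# number of events can exceed the population at risk.
visit_rate = rate_per(events=1_200, population=1_000, per=100)  # per 100 people

print(f"{death_rate:.1f} deaths per 100,000 population")
print(f"{visit_rate:.1f} visits per 100 people")
```

The second call illustrates why rates for repeatable events, such as health care visits, are not bounded above in the way that death rates are.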
What are the specific components of the
definition of a rate?
The population can be the U.S. resident population or
a subset of the U.S. population, such as the U.S. civilian
noninstitutionalized population, defined by factors like
race and ethnicity, age group, and geographic location.
The number of events (numerator) for the population of
interest over the time period is calculated or estimated
from vital records, a survey, or another source.
The time period is typically 1 year for NCHS annual
estimates. However, shorter or longer time periods could
be applied to the definition, such as quarterly rates or
multiyear rates.
The population at risk (denominator) during the time
period typically corresponds to the population of interest.
For most rates produced by NCHS, the population at risk is
based on decennial census population files, including the
decennial estimates and the postcensal and intercensal
estimates that are calculated from decennial estimates
(5–8). In these cases, the denominator is relatively free
of random variation; see "A. Sources of variation."
The population at risk can also be a count obtained from
a population survey, such as the U.S. Census Bureau's
American Community Survey (ACS) (16) or NHIS (17). In
these cases, the sampling variability of the denominator
needs to be considered when calculating SEs and other
measures of uncertainty for rates.
A. Sources of variation
Both the numerator and denominator may be subject
to several sources of variation. When either the
numerator or denominator count is estimated from a
survey, it will be subject to sampling variability. Even
when the actual number of events in the numerator or
the size of the population at risk in the denominator is
recorded and free from sampling variability, natural (or
stochastic) variability exists in the realized values (18).
For a numerator or denominator that is enumerated
from vital statistics and free of sampling variability,
the number of events (as in deaths or births) will be
assumed to arise from a Poisson distribution (5,6,19).
For a denominator that is enumerated from a decennial
census or a postcensal or intercensal population
estimate, some natural variability may exist in the
realized value; however, such random variation will be
negligible and will not be considered in calculations.
Rates and Counts at NCHS
Counts produced by NCHS include numbers of deaths and
births, numbers of visits to hospitals and other health care
settings, and, in some cases, numbers of people with specific
health outcomes, including health conditions, risk factors,
and access to care measures. Counts of deaths and births
from NVSS are obtained from registers of events. Counts
of visits from the National Health Care Surveys, and counts
of people with specific health outcomes from population
health surveys, are estimated using appropriate methods for
sample surveys; examples are available elsewhere (7,8).
Rates regularly calculated from NVSS and the National Health
Care Surveys include birth and death rates and health care
visit rates, respectively. In many cases, the rates are published
alongside corresponding counts. For most NVSS rates
produced at NCHS, the denominator is a decennial census
or postcensal or intercensal population estimate, which is
relatively free of random variation. However, for rates for
some subpopulations, including Hispanic subpopulations,
and for some demographic groupings, such as by education
level, the population estimates are calculated from a survey,
such as ACS or the U.S. Census Bureau's Current Population
Survey (CPS). In these cases, the sampling variability in the
corresponding denominator needs to be considered when
calculating SE and other measures of uncertainty around the
rate (19).
Rates from the National Health Care Surveys typically include
visit rates to hospitals and health care providers (7,8).
The counts in the numerators obtained from the National
Health Care Surveys are estimated using appropriate
statistical methods for complex surveys. Like rates from
NVSS, the denominators for rates from the National Health
Care Surveys are mostly decennial census or postcensal
or intercensal population estimates. However, condition-
specific visit rates can also be calculated for a population at
risk that is estimated from a complex survey. In these cases,
the sampling variability needs to be included in calculations.
NCHS Presentation Standards for
Rates and Counts
Previous Guidelines
NCHS has used previous standards and guidance for
determining whether to present rates and counts (3,4).
These criteria, reviewed for vital statistics and the National
Health Care Surveys in the following sections, relied on
sample sizes and measures of variance, most often relative
standard errors (RSEs) and often adjusted for survey DEFFs
for estimates from surveys. RSE is calculated as SE divided by
the estimate and multiplied by 100%.
Vital statistics
Before the release of the current standards, the presentation
guidance for NVSS was to suppress rates with fewer
than 20 events in the numerator when using population
denominators that were decennial census or postcensal
or intercensal population estimates (5,6). This 20-event
threshold for vital rates corresponds to an RSE of 23% for a
Poisson-distributed count variable. For a Poisson variable, SE
is the square root of the number of events.
For rates using census population denominators estimated
from CPS or ACS where the sampling variability needs to
be considered, RSE of the rate was used to determine its
reliability for presentation. Rates with an RSE of 23% or more
were suppressed or flagged for internal review (20).
Age-adjusted death rates followed the guidelines above
and were presented if the number of events on which
the rate was based was 20 or more, or when populations
were estimated from surveys where RSE was less than
23% (5).
National Health Care Surveys
At NCHS, health care surveys account for nearly all rates
produced and distributed from survey data. Denominators
for most of these rates are decennial census or postcensal
or intercensal population counts. Historically, rates from
the National Health Care Surveys were suppressed when
based on a sample size less than 30, that is, fewer than 30
sample observations in the numerator. Rates were flagged
as unreliable if RSEs were greater than 30%. Combined,
these criteria were known as the “30/30 rule” (7,8). Rates
with survey-based numerators from health care surveys
and survey-based population denominators had been
uncommon, so decisions for these cases were developed on
a case-by-case basis.
Similarly, counts were suppressed when based on fewer
than 30 sample observations and flagged as unreliable if the
RSE was greater than 30%.
New Standards
Table A summarizes the NCHS presentation standards for
rates and counts for each component: sample size, CI, and
degrees of freedom (df). Figures 1 and 2 illustrate the steps
needed to determine whether to present rates with or
without sampling variability.
Specific components of NCHS data presentation standards
for rates and counts are detailed in the following sections.
Sample size standard
Sample size is an important indicator of an estimate’s
precision and, when evaluating the statistical reliability of a
rate, sample size is relevant for both the number of events
(numerator) and the population at risk (denominator).
The sample size is the number of observations, or events,
used in the calculation of a rate or count. For vital
statistics, the sample size is the number of vital events
(births or deaths). For complex surveys, the sample size
is the number of observations used in calculations of
survey-based estimates and is sometimes referred to
as the nominal sample size to distinguish it from other
measures, such as the effective sample size used for
surveys.
For complex surveys, the effective sample size is defined
as the (nominal) sample size divided by the DEFF. DEFF is
the relative change in variance due to the complex survey
design relative to a hypothetical simple random sample
of the same size (21). The effective sample size is more
informative than the nominal sample size for complex
surveys because it incorporates information about the
design, which has important implications for statistical
power and reliability of estimates.
When the number of events or population estimates
are estimated from a complex survey using sample
weights, such as from the National Health Care Surveys,
NHIS, or ACS, the sample-weighted estimates of the
number of events or population are not the same as the
corresponding nominal sample sizes or effective sample
sizes. In these cases, nominal sample sizes and effective
sample sizes are used to determine reliability, not the
sample-weighted estimates of the events or population;
see "B. Design effect."
Table A. National Center for Health Statistics standards for rates and counts
Statistic: Sample size threshold
Standard for rates: Estimated rates should be based on a minimum sample size and effective sample size (when applicable) of 10 in both numerator and denominator.
Standard for counts: Estimated counts should be based on a minimum sample size and effective sample size (when applicable) of 10.
Statistic: Confidence interval (CI)
Standard for rates: If the sample size criteria are met, calculate a 95% two-sided CI using the appropriate method and obtain its relative width. Estimated rates should have a relative CI width of 160% or lower.
Standard for counts: If the sample size criteria are met, calculate a 95% two-sided CI using the appropriate method and obtain its relative width. Estimated counts should have a relative CI width of 160% or lower.
Statistic: Degrees of freedom
Standard for rates: When applicable for complex surveys, if the sample size and CI criteria are met for presentation and degrees of freedom are fewer than 8 for either numerator or denominator, then the rate should be flagged for statistical review by the clearance official. This review may result in presentation or suppression of the rate.
Standard for counts: When applicable for complex surveys, if the sample size and CI criteria are met for presentation and degrees of freedom are fewer than 8, then the count should be flagged for statistical review by the clearance official. This review may result in presentation or suppression of the count.
SOURCE: National Center for Health Statistics.
B. Design effect
The design effect (DEFF) measures the impact of
the complex sample design on variance estimates. If
DEFF = 1, then design-based variance is the same as
the variance under simple random sampling. Most
National Center for Health Statistics surveys have DEFF
greater than 1, which means that the effective sample
size is less than the number of observations or events.
If DEFF is less than 1, then the effective sample size
is greater than the number of observations or events,
and the nominal sample size is used instead of the
effective sample size to assess statistical reliability.
DEFF can vary depending on the health outcome or
condition that is being measured, as in geographic
distribution, as well as by population subgroups, as in
age or race and ethnicity. Some statistical packages by
default calculate DEFF of the row, or ROW DEFF, based
on the row percentage in frequency or cross-tabulation
tables (22,23). However, TOTAL DEFF is preferred for
rates because the numerator is a total estimate.
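The relationship between nominal sample size, DEFF, and effective sample size described in the sample size standard and in "B. Design effect" can be sketched as follows. The function names and the example inputs are hypothetical; the threshold of 10 comes from Table A, and the appropriate DEFF for a given survey should come from that survey's documentation.

```python
def effective_sample_size(nominal_n: float, deff: float) -> float:
    """Nominal sample size divided by the design effect (DEFF)."""
    if deff <= 0:
        raise ValueError("DEFF must be positive")
    return nominal_n / deff


def sample_size_for_standard(nominal_n: float, deff: float = 1.0) -> float:
    """Smaller of the nominal and effective sample sizes.

    When DEFF is less than 1 the effective sample size exceeds the
    nominal sample size, so the nominal sample size is used instead.
    """
    return min(nominal_n, effective_sample_size(nominal_n, deff))


def meets_sample_size_threshold(nominal_n: float, deff: float = 1.0,
                                threshold: float = 10) -> bool:
    """True when the evaluated sample size is at least the threshold."""
    return sample_size_for_standard(nominal_n, deff) >= threshold


# Hypothetical inputs for illustration only.
print(meets_sample_size_threshold(40, deff=2.5))  # effective size 16
print(meets_sample_size_threshold(24, deff=3.0))  # effective size 8
print(meets_sample_size_threshold(12, deff=0.8))  # nominal size 12 is used
```

For a rate, the same check would be applied separately to the numerator and the denominator.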
Figure 1. Presentation standards for rates without sampling variability
[Flowchart]
Step 1. Calculate sample size for numerator and denominator.
Step 2. Is sample size less than 10 in numerator or denominator? If yes, suppress. If no, go to step 3.
Step 3. Calculate relative width of recommended 95% two-sided confidence interval (CI): See Table B in this report.
Step 4. Does estimated rate have a relative CI width greater than 160%? If yes, suppress. If no, present.
NOTE: Rates without sampling variability are rates developed from vital statistics involving populations calculated from decennial census or postcensal or intercensal estimates, and
period- and cohort-linked infant mortality rates where the denominator is live births.
SOURCE: National Center for Health Statistics.
Figure 2. Presentation standards for rates with sampling variability
[Flowchart]
Step 1. Calculate nominal sample size and effective sample size, when applicable, for numerator and denominator.
Step 2. Is nominal or effective sample size less than 10 in numerator or denominator? If yes, suppress. If no, go to step 3.
Step 3. Calculate relative width of recommended 95% two-sided confidence interval (CI): See Table B in this report.
Step 4. Does estimated rate have a relative CI width greater than 160%? If yes, suppress. If no, go to step 5.
Step 5. Are degrees of freedom, when applicable, less than 8 in numerator or denominator? If yes, statistical review. If no, present.
NOTE: Rates with sampling variability include rates developed from vital statistics with population denominators from a sample survey, such as the American Community Survey;
rates from health care surveys; and rates from population health surveys.
SOURCE: National Center for Health Statistics.
For effective sample sizes, documentation for specific
surveys should be consulted when calculating DEFFs because
recommended approaches can differ among surveys and for
specific analytic purposes; see Appendix II. Further, methods
may be updated with methodological developments and
design changes. Because standard software can produce
multiple DEFFs, users should consult the survey and software
documentation to identify the appropriate DEFF. For the
NCHS data presentation standards for rates and counts,
based on an evaluation using two common methods for
calculating DEFF, DEFFs for totals or counts are used.
As with crude or age-specific rates, when either the numerator
or denominator for an age-adjusted rate is estimated using
sample weights from a complex survey, the effective sample
sizes should be calculated and evaluated along with the
nominal sample sizes. As noted previously, age-specific
nominal sample sizes and effective sample sizes should
be used to determine reliability, not the sample-weighted
estimates of the events or population in the age group.
When evaluating counts, as with rates, the estimated count
can be the same as the corresponding sample size, but it
also can be calculated using sample weights when obtained
from a complex survey or adjusted using other analytic
approaches. When the count is estimated from a complex
survey, the effective sample size should be calculated and
evaluated along with the nominal sample size. The sample-
weighted estimate of the count is not used to determine
statistical reliability.
For the standard, rates and counts should have nominal
sample sizes (or effective sample sizes, when relevant) of 10
or more. For surveys, both the nominal and effective sample
sizes should be evaluated, and the threshold of 10 or more
applied to the smaller of the two values.
For crude and age-specific rates and counts from vital
statistics, the threshold of 10 corresponds to an RSE of 33%,
which is close to RSE-based thresholds of 30% historically
used at NCHS. The previous threshold for vital statistics of a
sample size of 20 was equivalent to an RSE of 23%. No similar
equivalents exist between sample size and RSE for sample-
size thresholds for age-adjusted and survey-based rates,
because extra variation may exist due to other factors, such
as the effect of population weights used for age adjustment
or, for surveys, the variability of sample weights or other
survey design features. As a result, additional criteria need
to be met to ensure statistically reliable estimates.
Standard
When presenting estimated rates, the sample size (or
effective sample size, when relevant) for the numerator and
denominator should be 10 or greater. For counts, the sample
size (or effective sample size, when relevant) should be 10
or greater. In instances where the effective sample size is
greater than the sample size, the smaller sample size should
be evaluated.
Confidence interval standard
Once the sample size thresholds are met, NCHS data
presentation standards for rates and counts are based on
the evaluation of relative 95% CI widths. The absolute width
of 95% CIs for rates is not useful for presentation standards
because, as mentioned previously, rates are not necessarily
constrained by an upper bound and, unlike proportions or
percentages, have variable standard population sizes (per
100,000 population, per 1,000 live births, or per 100 people
per year, among others); see "C. Confidence interval."
C. Confidence interval
The width of a confidence interval (CI) provides an
assessment of an estimate’s precision. Technical
definitions of CIs are available from many standard
statistical texts, including Bickel and Doksum (24) and
Casella and Berger (25). Generally, under repeated
sampling, if an estimate such as a rate and its 95%
CI are estimated from each sample, the true value
of the rate is expected to be contained in 95% of the
calculated intervals. Depending on the method used
to calculate CI, the expectation of 95% coverage may
not be attained for some intervals or under some
conditions. A method is said to undercover when the true
rate is contained in fewer than the expected share of
intervals (less than 95%).
The NCHS data presentation standards for proportions use
the Clopper–Pearson CI once the sample size thresholds are
met. In contrast, the data presentation standards for rates and
counts need different approaches for calculating CIs for vital
statistics, complex health surveys, and different types of
denominators to ensure 95% coverage.
These CI calculation methods differ due to the underlying
assumptions about statistical distributions and the sampling
variability from complex surveys.
Table B summarizes CI calculations for different scenarios,
which are detailed in Appendix I.
For all rates and counts, the standard is based on the relative
width of the appropriate 95% two-sided CI. The width of the
interval is the difference between the upper and lower CI
limits. The relative width of the CI is the width of the interval
divided by the estimate, multiplied by 100%.
A relative width of 160% or narrower is needed to present
rates and counts. For vital statistics, a CI threshold of 135.9%
using the approach outlined in the first row of Table B with
the exact gamma interval directly corresponds to the sample
size criterion of 10 or more events for crude and age-specific
rates. Generally, no direct correspondence exists between
the relative CI width criterion and the sample size criterion
for age-adjusted vital rates, vital rates with extra variation
due to use of ACS or CPS, and rates that include sample-
weighted components from complex surveys. Simulations
for age-adjusted all-cause mortality rates (13) suggest that
a CI threshold of 160% for the Fay–Feuer gamma interval
described in Table B corresponds to a numerator of 10 or
more. Further, simulations modeled on National Ambulatory
Medical Care Survey data (13) suggest that a CI threshold of
160% for the logarithmic Student's t CI described in Table B
corresponds to a numerator as high as 30. As a result, both
the relative width and sample size criteria are used to ensure
statistically reliable estimates because of the lack of a one-
to-one correspondence between them.
Appendix I contains the mathematical details of the CIs
described in Table B. The evaluations and simulation studies
mentioned above are documented in a separate NCHS
report (13).
Standard
If the sample size (or effective sample size) criterion is met,
then calculate the appropriate 95% two-sided CI for the data
system and type of denominator. The relative width of CI is
the width of the interval divided by the estimate multiplied
by 100%. If the relative width of CI is 160% or less, then the
rate or count should be presented.
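The two-step check in this standard can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the argument layout, and the example numbers are hypothetical and not part of the NCHS standards, and the CI limits are assumed to have been calculated already with the method appropriate to the data system.

```python
def relative_ci_width(estimate, lower, upper):
    """Relative CI width in percent: (upper - lower) / estimate * 100."""
    return (upper - lower) / estimate * 100.0


def meets_presentation_standard(estimate, lower, upper, sample_sizes,
                                min_n=10, max_rel_width=160.0):
    """Check the sample size criterion first, then the relative CI width.

    sample_sizes holds the (effective) sample sizes to be checked, for
    example the numerator and denominator sizes for a rate, or the single
    sample size for a count.
    """
    if any(n < min_n for n in sample_sizes):
        return False
    return relative_ci_width(estimate, lower, upper) <= max_rel_width


# Hypothetical rate of 12.0 with 95% CI (6.2, 21.0), based on 15 events
# in the numerator and a denominator sample size of 4,000.
print(relative_ci_width(12.0, 6.2, 21.0))
print(meets_presentation_standard(12.0, 6.2, 21.0, [15, 4000]))
```

The degrees of freedom criterion for complex surveys is a separate check and is not included in this sketch.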
Degrees of freedom standard
For complex sample surveys, the precision of the estimated
variance is approximately related to the square root of df
(26,27); see "D. Degrees of freedom."
Table B. National Center for Health Statistics standards for rates and counts: Confidence interval calculations, by data system and type of denominator

| Data system | Denominator | Rates: Confidence interval | Counts: Confidence interval |
|---|---|---|---|
| National Vital Statistics System | Relatively free of random variation and sampling error, when applicable | Calculate gamma interval where the lower limit is the 0.025 quantile of the standard gamma, where x = number of events and with parameters α = x and β = 1. The upper limit is the 0.975 quantile of the standard gamma, with parameters α = x + 1 and β = 1. Apply Fay–Feuer approximation for age-adjusted vital rates. | Calculate gamma interval where the lower limit is the 0.025 quantile of the standard gamma, where x = number of events and with parameters α = x and β = 1. The upper limit is the 0.975 quantile of the standard gamma, with parameters α = x + 1 and β = 1. |
| National Vital Statistics System | Based on American Community Survey (ACS) or Current Population Survey (CPS), U.S. Census Bureau | Calculate a Student's t interval for logarithm of the rate, with variance estimated using method supplied with survey data source. Form confidence intervals (CIs) for age-adjusted rates using weighted combinations of age-specific estimates. Obtain CI for the rate by reverse transformation. | … |
| National Vital Statistics System | Based on population surveys other than ACS or CPS and with sampling error or other source of random variation | Calculate a Student's t interval for logarithm of the rate with estimated variance supplied by survey data source. Form CI for age-adjusted rates using weighted combinations of age-specific estimates. Obtain CI for the rate by reverse transformation. | … |
| National Vital Statistics System | Based on births file and subject to random variation, such as for period- or cohort-linked infant mortality | Calculate a Student's t CI for logarithm of the rate. Obtain CI for the rate by reverse transformation. | … |
| Complex health surveys | Relatively free of random variation and sampling error, when applicable | Calculate a Student's t CI for logarithm of the rate. Obtain CI for the rate by reverse transformation. | Calculate a Student's t CI for logarithm of the count. Obtain CI for the count by reverse transformation. |
| Complex health surveys | Based on population surveys and with sampling error or other source of random variation | Calculate a Student's t CI for logarithm of the rate. Obtain CI for the rate by reverse transformation. | … |

… Category not applicable.
SOURCE: National Center for Health Statistics.
Using resulting SEs with low precision to assess estimated
proportions may lead to poor measures of effective
sample size and CI widths. Under certain conditions, the
variance estimate is approximately proportional to a chi-
squared distributed random variable, and the RSE of the
variance obtained from a complex sample survey can be
approximated as
$$100\sqrt{2 / df} \qquad [1]$$
From this expression, RSE of the estimated variance of a
rate or count based on fewer than eight df will be 50% or
higher. As a rule of thumb, df for a sample survey can be
calculated as the number of primary sampling units (PSUs)
minus the number of strata. This calculation is used in
most NCHS surveys and implemented in survey software,
although specific calculations can vary across software
packages. Default calculations of df from survey software
may not be appropriate for subgroups represented in only a
subset of PSUs (for example, some racial and ethnic groups
and region-specific estimates) and when calculating annual
or survey cycle estimates using a multiyear or multicycle
data file. In these instances, the relevant information should
be extracted and df directly calculated to assess estimate
precision. The calculation of df as a measure of precision for
SE may not be applicable for all surveys (see survey-specific
documentation in Appendix II) and does not apply to vital
statistics. For additional information on df, see Valliant and
Rust (26), Korn and Graubard (27), and the NHANES tutorial
(28).
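As a rough illustration of expression [1] and the rule of thumb for df, the short Python sketch below computes the approximate RSE of the variance, 100 × sqrt(2 / df) percent, for a few df values. The function names and the PSU and strata counts are hypothetical, and the df calculation shown (PSUs minus strata) is only the rule of thumb described above, which may not suit subgroups or multiyear files.

```python
import math


def design_df(n_psu, n_strata):
    """Rule-of-thumb degrees of freedom: number of PSUs minus number of strata."""
    return n_psu - n_strata


def rse_of_variance(df):
    """Approximate RSE (in percent) of the variance estimate: 100 * sqrt(2 / df)."""
    return 100.0 * math.sqrt(2.0 / df)


# Hypothetical design with 30 PSUs in 15 strata.
df = design_df(30, 15)
print(df, round(rse_of_variance(df), 1))

# The RSE of the variance reaches 50% at df = 8 and is higher for smaller df.
for d in (4, 8, 30):
    print(d, round(rse_of_variance(d), 1))
```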
Standard
When applicable for complex surveys, df should be eight
or higher. If df are fewer than eight, then the rate or count
should be flagged for statistical review by the clearance
official. This review will result in either the presentation or
suppression of the rate or count.
Discussion
The NCHS data presentation standards for rates and counts
will be applied to all NCHS publications and used by all NCHS
analyses and resulting products. Using these standards, some
estimates will be identified as unreliable and suppressed,
particularly for large reports and tables. However, when
the standards for rates and counts are used for shorter,
more focused reports, specific estimates that do not meet
the standards may be reported after being evaluated
individually by the analyst and clearance official. Some
estimates identified as unreliable based on the standards
may be important and can be interpreted appropriately
in the context of measures of precision and other subject-
specific information. In these cases, the estimate could be
presented. Because report objectives and subject-specific
factors vary widely, justification for presenting an unreliable
estimate should be provided by the analyst, and final
determination should be made by the analyst and clearance
official on a case-by-case basis. In all publications, unreliable
estimates, whether presented or suppressed, should be
identified with a footnote.
Many NCHS data products include SEs alongside the
estimate so that data users can assess the precision of the
point estimates, although some large cross-cutting reports
and shorter publications do not. Whenever space permits,
appropriate CIs should be provided, rather than just SEs,
because CIs obtained using appropriate assumptions more
accurately describe the variability than a typical Wald (or
normal) CI calculated using the estimate and its SE.
Estimates from sample surveys with fewer than eight df
should be flagged for statistical review because sufficient df
are needed for reliable CIs. Statistical review by a clearance
official of flagged estimates will consider factors such
as the estimate’s sample size, availability of alternative
CI approaches, and df. The review will also consider the
recommendations of the analyst, results of any supplemental
or sensitivity analyses, the report’s objectives and format
(including the ability to present CIs or other measures
of precision), and other estimates in the report. In some
large reports, this process may be automated to ease the
production process, with all flagged estimates suppressed
without review. In all publications, estimates from sample
surveys based on fewer than eight df, whether presented or
suppressed, should be identified with a footnote.
Age-adjusted estimates are often produced for national
statistics. Age adjustment allows for a comparison
of outcomes between two groups with differing age
distributions, since many health outcomes are highly
correlated with age (14,15). Instances may occur in which
the age-adjusted estimate will not meet the presentation
criteria, but the crude estimate does, or vice versa. In these
cases, the estimate that meets the presentation criteria will
be shown, and the one that does not will be suppressed.
D. Degrees of freedom
The degrees of freedom (df) for a complex sample
survey are the independent pieces of information
on which an estimate is based. Sample persons or
establishments within a given primary sampling unit
are not independent. In some complex, multistage
surveys, df can be calculated by subtracting the
number of clusters or strata from the number of
primary sampling units (27,28).
For the NCHS data presentation standards for rates and
counts, a minimum sample size and effective sample size
(when applicable) are needed for both the numerator and
denominator. These minimums ensure the validity of the
CI methods where coverage can be inadequate for small
samples. From simulation results (13), small sample sizes
were generally observed along with large interval widths.
However, in some of these instances, the coverage of the CI
was less than 95%.
These data presentation standards are appropriate for
rates and counts. Standards for proportions were described
previously (4). The NCHS standards were not developed to
apply to other estimators, such as percentiles or means, or to
model-based estimates other than those from the Poisson-
distributed vital rates. Although the principles considered
by the workgroup for rates and counts, and previously for
proportions, can be considered for other estimators—
including the evaluation of effective sample size, CIs, and df,
when appropriate, to guide decisions—no specific thresholds
for these estimators are provided by these standards.
Further, alternative methods exist for calculating CIs for
rates and counts, as well as more precise approximations
to the variance of ratio X / Y, when simplifying assumptions
(Appendix I) are not met. Thresholds for the CI standards
were determined using the CI methods described in this
report. Although other CI methods may be useful for other
purposes, such as hypothesis testing or graphic display, the
evaluations and simulations used to set the presentation
thresholds may not be appropriate for these intervals.
In addition to precision, other factors not addressed here
affect the quality of the estimates, including measurement
error and response rates, and other dimensions of data
quality, such as timeliness, relevance, granularity, and
confidentiality. Effective understanding of data quality is
essential for making data-driven decisions. The recent data
quality framework issued by the Federal Committee on
Statistical Methodology sets guidance on documenting and
reporting data quality so that users can determine whether
data are fit for their purpose, including the quality of data
published as tabular estimates (29). Twelve quality dimensions
within three domains of quality (utility, objectivity, and
integrity) compose the Data Quality Framework. Consistent
with the Data Quality Framework, particularly its dimension
on accuracy, the NCHS data presentation standards for rates
and counts are transparent criteria that allow data users to
know that rates and counts produced by NCHS meet certain
thresholds of statistical reliability.
References
1. Centers for Disease Control and Prevention. Children’s
mental health remains a public health concern. Twitter.
February 23, 2022. Available from: https://twitter.com/
CDCMMWR/status/1497013493819707399/photo/1.
2. National Center for Health Statistics. Health, United
States, 2019. Hyattsville, MD. 2021. DOI: https://dx.doi.
org/10.15620/cdc:100685.
3. Klein RJ, Proctor SE, Boudreault MA, Turczyn KM.
Healthy People 2010 criteria for data suppression.
Healthy People 2010 Statistical Notes; no 24. Hyattsville,
MD: National Center for Health Statistics. 2002.
4. Parker JD, Talih M, Malec DJ, Beresovsky V, Carroll M,
Gonzalez JF Jr, et al. National Center for Health Statistics
data presentation standards for proportions. National
Center for Health Statistics. Vital Health Stat 2(175).
2017.
5. Xu JQ, Murphy SL, Kochanek KD, Arias E. Deaths: Final
data for 2019. National Vital Statistics Reports; vol 70 no
8. Hyattsville, MD: National Center for Health Statistics.
2021. DOI: https://dx.doi.org/10.15620/cdc:106058.
6. Martin JA, Hamilton BE, Osterman MJK. Births in
the United States, 2020. NCHS Data Brief, no 418.
Hyattsville, MD: National Center for Health Statistics.
2021. DOI: https://dx.doi.org/10.15620/cdc:109213.
7. Davis D, Cairns C. Emergency department visit rates
for motor vehicle crashes by selected characteristics:
United States, 2017–2018. NCHS Data Brief, no 410.
Hyattsville, MD: National Center for Health Statistics.
2021. DOI: https://dx.doi.org/10.15620/cdc:106460.
8. Santo L, Okeyode T. National Ambulatory Medical Care
Survey: 2018 national summary tables. National Center
for Health Statistics. 2021. Available from: https://
www.cdc.gov/nchs/data/ahcd/namcs_summary/2018-
namcs-web-tables-508.pdf.
9. Lucas JW, Benson V. Tables of summary health statistics
for the U.S. population: 2018 National Health Interview
Survey. National Center for Health Statistics. 2019.
Available from: https://www.cdc.gov/nchs/nhis/SHS/
tables.htm.
10. Roberts H, Kruszon-Moran D, Ly KN, Hughes E, Iqbal K,
Jiles RB, Holmberg SD. Prevalence of chronic hepatitis
B virus (HBV) infection in U.S. households: National
Health and Nutrition Examination Survey (NHANES),
1988–2012. Hepatology. 63(2):388–97. 2016. DOI:
https://dx.doi.org/10.1002/hep.28109.
11. Martinez GM, Daniels K, Febo-Vazquez I. Fertility of men
and women aged 15–44 in the United States: National
Survey of Family Growth, 2011–2015. National Health
Statistics Reports; no 113. Hyattsville, MD: National
Center for Health Statistics. 2018. Available from:
https://www.cdc.gov/nchs/data/nhsr/nhsr113.pdf.
12. Centers for Disease Control and Prevention. CDC
WONDER. https://wonder.cdc.gov. 2022.
13. Talih M, Irimata KE, Zhang G, Parker JD. Evaluation of the
National Center for Health Statistics data presentation
standards for rates from vital statistics and sample
surveys. National Center for Health Statistics. Vital Health
Stat 2(198). 2023. DOI: https://dx.doi.org/10.15620/
cdc:123462.
14. Curtin LR, Klein RJ. Direct standardization (age-adjusted
death rates). Healthy People 2000 Statistical Notes; no
6. Hyattsville, MD: National Center for Health Statistics.
1995.
15. Anderson RN, Rosenberg HM. Age standardization of
death rates: Implementation of the year 2000 standard.
National Vital Statistics Reports; vol 47 no 3. Hyattsville,
MD: National Center for Health Statistics. 1998.
16. U.S. Census Bureau. Understanding and using American
Community Survey data: What all data users need to
know. 2020. Available from: https://www.census.gov/
content/dam/Census/library/publications/2020/acs/
acs_general_handbook_2020.pdf.
17. Akinbami LJ, Santo L, Williams S, Rechtsteiner EA,
Strashny A. Characteristics of asthma visits to physician
offices in the United States: 2012–2015 National
Ambulatory Medical Care Survey. National Health
Statistics Reports; no 128. Hyattsville, MD: National
Center for Health Statistics. 2019.
18. Brillinger DR. The natural variability of vital rates and
associated statistics. Biometrics 42(4):693–734. 1986.
19. National Center for Health Statistics. Vital statistics of
the United States: Mortality, 1999. Technical appendix.
Hyattsville, MD. 2004. Available from:
https://www.cdc.gov/nchs/data/statab/techap99.pdf.
20. Curtin SC, Tejada-Vera B, Anderson RN. Death rates
by marital status for leading causes of death: United
States, 2010–2019. National Vital Statistics Reports; vol
70 no 10. Hyattsville, MD: National Center for Health
Statistics. 2021. DOI: https://dx.doi.org/10.15620/
cdc:109161.
21. Kish L. Survey sampling. New York, NY: John Wiley &
Sons, Inc. 1965.
22. RTI International. SUDAAN Language Manual, vol 1 and
2 (Release 11) [computer software]. 2012.
23. Lumley T. Complex surveys: A guide to analysis using R.
New York, NY: Wiley & Sons, Inc. 2010.
24. Bickel PJ, Doksum KA. Mathematical statistics: Basic
ideas and selected topics. Vol 1, 2nd ed. New York, NY:
Chapman & Hall/CRC. 2015.
25. Casella G, Berger RL. Statistical inference. 2nd ed.
Boston, MA: Cengage Learning. 2001.
26. Valliant R, Rust KF. Degrees of freedom approximations
and rules-of-thumb. J Off Stat 26(4):585–602. 2010.
27. Korn EL, Graubard BI. Chapter 5: Additional issues in
variance estimation. In: Korn EL, Graubard BI. Analysis
of health surveys. New York, NY: John Wiley & Sons, Inc.
192–234. 1999.
28. National Center for Health Statistics. NHANES Tutorial:
Module 5–Reliability of estimates. 2022. Available
from: https://wwwn.cdc.gov/nchs/nhanes/tutorials/
reliabilityofestimates.aspx.
29. Federal Committee on Statistical Methodology. A
framework for data quality. FCSM 20-04. 2020.
30. Chiang CL. Standard error of the age-adjusted death
rate. Vital Statistics–Special Reports; vol 47 no 9.
National Center for Health Statistics. Washington, DC:
Public Health Service. 1961.
31. Garwood F. Fiducial limits for the Poisson distribution.
Biometrika 28(3–4):437–42. 1936.
32. Fay MP, Feuer EJ. Confidence intervals for directly
standardized rates: A method based on the gamma
distribution. Stat Med 16(7):791–801. 1997.
33. Morris JK, Tan J, Fryers P, Bestwick J. Evaluation of
stability of directly standardized rates for sparse data
using simulation methods. Popul Health Metr 16(1):19.
2018.
34. Ng HKT, Filardo G, Zheng G. Confidence interval
estimating procedures for standardized incidence rates.
Comput Stat Data Anal 52(7):3501–16. 2008.
35. Tiwari RC, Clegg LX, Zou Z. Efficient interval estimation
for age-adjusted cancer rates. Stat Methods Med Res
15(6):547–69. 2006.
36. Talih M, Anderson RN, Parker JD. Evaluation of four
gamma-based methods for calculating confidence
intervals for age-adjusted mortality rates when data are
sparse. Popul Health Metr 20(1):13. 2022. DOI: https://
dx.doi.org/10.1186/s12963-022-00288-1.
37. U.S. Census Bureau. Chapter 5: Data quality in the
ACS PUMS. In: Understanding and using the American
Community Survey public use microdata sample files:
What data users need to know. 2021. Available from:
https://www.census.gov/content/dam/Census/library/
publications/2021/acs//acs_pums_handbook_2021_
ch05.pdf.
38. Ely DM, Driscoll AK. Infant mortality in the United
States, 2018: Data from the period linked birth/infant
death file. National Vital Statistics Reports; vol 69 no 7.
Hyattsville, MD: National Center for Health Statistics.
2020.
39. National Center for Health Statistics. Public use data file
documentation: 2015 cohort linked birth/infant death
data set. Available from: https://ftp.cdc.gov/pub/
Health_Statistics/NCHS/Dataset_Documentation/DVS/
cohortlinked/LinkCO15Guide.pdf.
Appendix I. Sample Size and
Confidence Interval Calculations for
Rates and Counts
This appendix provides specific formulas and expressions
used in the sample size and confidence interval (CI)
presentation criteria for rates and counts reported in the
National Center for Health Statistics (NCHS) vital statistics
and health care surveys. Although some of the analytic steps
may apply to rates and counts from other sources, including
other population health surveys at NCHS, the underlying
assumptions were determined only in the context of NCHS
vital statistics and health care surveys.
Sample Size
For rates $R = X / Y$ that are estimated using $r = x / y$, the
numerator, based on sample size $n_x$, is the number of
events, and the denominator, based on sample size $n_y$, is
the population at risk. When calculating the rate $r$, $x$ and
$y$ may be the same as $n_x$ or $n_y$, or they may be calculated
using sample weights or other analytic weights. When either
the numerator or denominator is estimated from a complex
survey, the effective sample sizes, $n_{x\_eff}$ and $n_{y\_eff}$, should
be calculated. Rates for the total population or a specific
population subset, such as adults aged 18 and over, are
often referred to as crude rates.
Age-adjusted rates can be expressed as $r = \sum_i w_i r_i$, where
$r_i = x_i / y_i$ are the age-specific rates for age groups
$i = 1, 2, \ldots, K$, and $w_i$ denotes the relative proportions for age
group $i$ in the reference (standard) population. For age-adjusted
rates, the corresponding numerator sample sizes
$n_{x\_i}$ are the age-specific sample sizes, and the corresponding
denominator sample sizes, $n_{y\_i}$, are the age-specific
population sizes. The numerator sample size is the sum of
the age-specific sample sizes, $\sum_i n_{x\_i}$, and the denominator
sample size is the sum of the age-specific population sizes,
$\sum_i n_{y\_i}$. As for crude rates, when either the numerator
or denominator is estimated from a complex survey, the
effective sample sizes for each age group, $n_{x\_eff\_i}$ and
$n_{y\_eff\_i}$, should be calculated.
When calculating the sample size for a count, $x$ may be
the same as $n_x$ or may be adjusted by sample weights or
calculated using other analytic methods. For counts from
a complex survey, the effective sample size is the number
of observations in the sample adjusted by the design effect
(DEFF), $n_{x\_eff}$.
Confidence Intervals
For the NCHS data presentation standards for rates and
counts, different approaches for calculating CIs are used
for vital statistics and complex surveys, as well as according
to the source of the denominator, to ensure 95% coverage.
These methods differ due to underlying assumptions about
statistical distributions and the need to include sampling
variability when calculating CIs.
CIs for Counts and Age-specific and
Crude Vital Rates, Where Population
Denominator is Free of Sampling
Variability
When the numerator count is enumerated from vital
statistics, the number of events (as in deaths or births)
will be assumed to come from a Poisson distribution (5,6).
When the population denominator count is a decennial
census or postcensal or intercensal population estimate
that is relatively free of sampling variability, it is treated as
a constant without variation in calculations. Although the
actual number of events or population denominator count
is recorded and relatively free from sampling variability,
natural variability exists in the realized value (18,30).
These expressions use the following notation:
Rate (usually multiplied by 100,000 and expressed as a
rate per 100,000 population): R = X / Y
Total number of events on which rate is based: X
Total population on which rate is based: Y
As stated previously, the number of events $X$ is assumed
to be Poisson-distributed, with mean and variance given $Y$
equal to $\lambda Y$, where $\lambda$ is the true underlying rate. Exact 95%
CI limits for the rate can be derived using a well-known
relationship between the Poisson and gamma distributions
(31,32). Suppose $X = x$ events are observed in a population
of size $Y = y$. Then a gamma-distributed random variable $Z$
exists with mean $r = x / y$ and variance $v = x / y^2$ such that
$$\Pr(R \ge r) = \Pr(Z \le \lambda)$$
As a result of this relationship:
The lower limit $L(r)$ of the 95% CI is obtained from
the 0.025-quantile of a gamma distribution with
shape parameter $\alpha = x = r^2 / v$ and scale parameter
$\beta = 1 / y = v / r$.
The upper limit $U(r)$ of the 95% CI is obtained from the
0.975-quantile of a gamma distribution with shape
parameter $\alpha = x + 1 = r'^2 / v'$ and scale parameter
$\beta = 1 / y = v' / r'$, where the mean $r'$ and variance $v'$ are
based on a unit increment to the observed number of
events $x$:
$$r' = r + 1 / y = (x + 1) / y$$
$$v' = v + 1 / y^2 = (x + 1) / y^2$$
Quantiles of the gamma distribution can be calculated
using commonly available spreadsheet programs or
statistical software (Excel or SAS) that include an inverse
gamma function, although users must ensure the correct
parameterization is used, because some software programs
may expect the rate $1 / \beta$ instead of the scale parameter $\beta$
to be supplied by the user. To avoid confusion, users should
calculate the upper and lower CI limits using the standard
gamma distribution (with $\beta = 1$), so that
$L(r) = L(x) / y$, where $L(x)$ is the 0.025-quantile of a standard
gamma with $\alpha = x$
$U(r) = U(x + 1) / y$, where $U(x + 1)$ is the 0.975-quantile of a
standard gamma with $\alpha = x + 1$
Note that the quantiles $L(x)$ and $U(x + 1)$ in the last two
formulas are also used to calculate the lower and upper
limits of the exact 95% CI for the mean number of events
$E(X)$ when $X = x$.
In Excel, the function GAMMA.INV (probability, alpha, beta),
with beta set to 1, returns the quantile of the standard
gamma distribution for a given probability between 0 and 1.
For 95% CI, the probability associated with the lower limit is
0.05/2 = 0.025, and with the upper limit 1 – (0.05/2) = 0.975.
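The gamma interval for a crude or age-specific rate can be sketched in Python using only the standard library. This is an illustrative sketch, not NCHS code: the helper functions gamma_cdf and gamma_quantile are hypothetical stand-ins for a library inverse gamma function (such as GAMMA.INV in Excel, with beta set to 1), the event count and population are made-up values, and the code assumes at least one event (x ≥ 1).

```python
import math


def gamma_cdf(shape, z):
    """CDF of the standard gamma (scale = 1) at z, computed from the power
    series of the regularized lower incomplete gamma function."""
    if z <= 0.0:
        return 0.0
    term = 1.0 / shape
    total = term
    n = 0
    while term > 1e-17 * total and n < 100000:
        n += 1
        term *= z / (shape + n)
        total += term
    return total * math.exp(shape * math.log(z) - z - math.lgamma(shape))


def gamma_quantile(p, shape):
    """Quantile of the standard gamma, found by bisection on gamma_cdf."""
    lo, hi = 0.0, shape + 10.0 * math.sqrt(shape) + 20.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gamma_cdf(shape, mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def crude_rate_ci(x, y):
    """Gamma 95% CI for a rate x / y: L(x) / y and U(x + 1) / y."""
    lower = gamma_quantile(0.025, x) / y
    upper = gamma_quantile(0.975, x + 1) / y
    return lower, upper


x, y = 10, 50000  # hypothetical number of events and population
lower, upper = crude_rate_ci(x, y)
rate = x / y
rel_width = (upper - lower) / rate * 100.0
print(f"rate per 100,000: {rate * 1e5:.1f}")
print(f"95% CI per 100,000: {lower * 1e5:.2f} to {upper * 1e5:.2f}")
print(f"relative CI width: {rel_width:.1f}%")
```

With x = 10 the relative width comes out at roughly 136%, in line with the correspondence between 10 events and the CI threshold noted in the standard. Bisection is used here only to keep the sketch dependency-free; a statistical package's inverse gamma function would normally be preferred.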
CIs for Age-adjusted Vital Rates, Where
Population Denominator is Free of
Sampling Variability
No exact 95% CI for the true underlying age-adjusted rate
λ' is known. Instead, an approximate 95% CI can be derived
under the assumption that the Poisson-gamma relationship
for crude rates holds approximately for age-adjusted rates
as well (32).
These expressions use the following notation:
Age-adjusted rate (usually multiplied by 100,000 and
expressed as a rate per 100,000 population): $R = \sum_i w_i R_i$
Standard population weight: $w_i$, such that $\sum_i w_i = 1$
Age-specific rate for the ith age group: $R_i = X_i / Y_i$
Total number of events for the ith age group on which the
age-specific rate is based: $X_i$
Total population for the ith age group on which the age-
specific rate is based: $Y_i$
Suppose $X_i = x_i$ events are observed for the age-specific
populations, each of size $Y_i = y_i$. It is assumed that a
gamma-distributed random variable $Z$ exists with mean
$$r = \sum_i (w_i / y_i)\, x_i$$
and variance
$$v = \sum_i (w_i / y_i)^2\, x_i$$
such that
$$\Pr(R \ge r) \approx \Pr(Z \le \lambda')$$
As a result of this assumption:
The lower limit $L(r)$ of the approximate 95% CI is obtained
from the 0.025-quantile of a gamma distribution with
shape parameter $\alpha = r^2 / v$ and scale parameter
$\beta = v / r$.
The upper limit $U(r)$ is obtained from the 0.975-quantile
of a gamma distribution with shape parameter $\alpha = r'^2 / v'$
and scale parameter $\beta = v' / r'$, where the mean $r'$ is based
on a unit increment to the observed number of deaths $x_k$
in the age group with the largest value of $w_i / y_i$:
$$r' = r + \kappa = \sum_{i \ne k} (w_i / y_i)\, x_i + (w_k / y_k)(x_k + 1)$$
where $\kappa = \max_i (w_i / y_i)$, and the variance is $v' = v + \kappa^2$.
This approach to calculating the upper CI for the age-adjusted
rate is known to be overly conservative, but to date no
other method has been able to maintain nominal coverage
(coverage of 95% or more) in very sparse data (32–36).
As before, calculation of the upper and lower limits can use
the standard gamma distribution ($\beta = 1$):
$$L(r) = \frac{L(r^2 / v)}{r / v} \qquad [2]$$
where $L(r^2 / v)$ is the 0.025-quantile of a standard gamma
with $\alpha = r^2 / v$, and
$$U(r) = \frac{U(r'^2 / v')}{r' / v'} \qquad [3]$$
where $U(r'^2 / v')$ is the 0.975-quantile of a standard gamma
with $\alpha = r'^2 / v'$.
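The approximate gamma interval for an age-adjusted rate, as described in formulas [2] and [3], can be sketched in Python as follows. The sketch is an illustration under stated assumptions rather than NCHS code: gamma_cdf and gamma_quantile are hypothetical standard-library stand-ins for a library inverse gamma function, the age-specific counts, populations, and weights are made-up values, and at least one event is assumed so that the variance v is positive.

```python
import math


def gamma_cdf(shape, z):
    """CDF of the standard gamma (scale = 1) at z, computed from the power
    series of the regularized lower incomplete gamma function."""
    if z <= 0.0:
        return 0.0
    term = 1.0 / shape
    total = term
    n = 0
    while term > 1e-17 * total and n < 100000:
        n += 1
        term *= z / (shape + n)
        total += term
    return total * math.exp(shape * math.log(z) - z - math.lgamma(shape))


def gamma_quantile(p, shape):
    """Quantile of the standard gamma, found by bisection on gamma_cdf."""
    lo, hi = 0.0, shape + 10.0 * math.sqrt(shape) + 20.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gamma_cdf(shape, mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def age_adjusted_ci(x, y, w):
    """Age-adjusted rate r with approximate 95% gamma CI limits.

    x, y, w are sequences of age-specific events, populations, and
    standard population weights (weights summing to 1).
    """
    coef = [wi / yi for wi, yi in zip(w, y)]
    r = sum(c * xi for c, xi in zip(coef, x))
    v = sum(c * c * xi for c, xi in zip(coef, x))
    kappa = max(coef)                      # largest w_i / y_i
    r_up, v_up = r + kappa, v + kappa ** 2  # unit increment in that age group
    lower = gamma_quantile(0.025, r * r / v) / (r / v)                # [2]
    upper = gamma_quantile(0.975, r_up ** 2 / v_up) / (r_up / v_up)   # [3]
    return r, lower, upper


# Hypothetical data for three age groups.
x = [3, 12, 40]
y = [40000, 30000, 10000]
w = [0.5, 0.3, 0.2]
r, lower, upper = age_adjusted_ci(x, y, w)
print(f"age-adjusted rate per 100,000: {r * 1e5:.1f}")
print(f"approximate 95% CI per 100,000: {lower * 1e5:.1f} to {upper * 1e5:.1f}")
```

With a single age group and weight 1, the shape parameters reduce to x and x + 1 and the scale to 1 / y, so the sketch returns the same limits as the gamma interval for a crude rate.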
CIs for Crude and Age-specific Vital Rates,
With Population Denominator Estimates
From American Community Survey or
Current Population Survey
For rates where the population estimate used in the
denominator is obtained from the American Community
Survey (ACS) or the Current Population Survey (CPS), such as
death rates for specified Hispanic subgroups, or by education
level or marital status, CI can be calculated by adjusting
the upper and lower bounds of the interval for the extra
variation from the survey. This adjustment can be made for
both crude and age-adjusted rates.
These expressions use the following notation:
Rate (usually multiplied by 100,000 and expressed as a
rate per 100,000 population): R = X / Y
Total number of events on which rate is based: X
Total population on which rate is based: Y
Let $\mu_Y$ and $\sigma_Y^2$ denote the mean and variance of the
denominator population $Y$. Let the number of events $X$ be
Poisson-distributed, with mean and variance equal to $\lambda_X$,
and assume that $X$ and $Y$ are independent. Using first-order
Taylor series approximations (also known as the delta
method), the mean and variance are given by
$E(R) \approx \lambda_X / \mu_Y$ and
$$\operatorname{var}(R) \approx \left(\frac{\lambda_X}{\mu_Y}\right)^2 \left(\frac{1}{\lambda_X} + \frac{\sigma_Y^2}{\mu_Y^2}\right) \qquad [4]$$
The sample mean and variance are calculated using the
method of moments, yielding $r = x / y$,
$$v = \left(\frac{x}{y}\right)^2 \left(\frac{1}{x} + \frac{s_y^2}{y^2}\right) \qquad [5]$$
and
$$\frac{v}{r^2} = \frac{1}{x} + \frac{s_y^2}{y^2} \qquad [6]$$
where $s_y^2$ is the value of the design-based sample variance
of denominator population $Y$ evaluated at $Y = y$. Standard
errors (SEs) for ACS estimates are published by the U.S.
Census Bureau for selected population estimates for
combinations of race and ethnicity, marital status, and
education level groups, and by the number of years on which
the rate is based (19,37).
The generalized variance function (GVF) model may
sometimes be assumed, for example with CPS-estimated
totals, simplifying the calculation of the relative variance for
$Y$ (19):
$$\frac{\sigma_Y^2}{\mu_Y^2} = f\left(a + \frac{b}{\mu_Y}\right) \qquad [7]$$
Using the GVF model, the sample variance and sample
relative variance of $R = X / Y$ are given by
$$v = \left(\frac{x}{y}\right)^2 \left(\frac{1}{x} + f\left(a + \frac{b}{y}\right)\right) \qquad [8]$$
$$\frac{v}{r^2} = \frac{1}{x} + f\left(a + \frac{b}{y}\right) \qquad [9]$$
For CPS-estimated totals, the parameters a and b are
estimated by fitting a model to a group of related estimates
and their estimated relative variances, and f is a factor that
depends on whether the population estimate is based on
demographic analysis or CPS and the number of years used.
The following 100(1 – α)% CI that includes the extra sampling
variability from the survey is recommended for crude
and age-specific vital rates with population denominator
estimates from ACS or CPS:
$$\exp\left\{\ln(r) \pm t_{\alpha/2,\,df} \sqrt{\frac{v}{r^2}}\right\} \qquad [10]$$
The degrees of freedom (df) for the Student's t critical value
$t_{\alpha/2,\,df}$ are given by $\min(x, n_y, n_{y\_eff}) - 1$, where $n_y$
and $n_{y\_eff}$ are the sample size and effective sample size,
respectively, from the survey. The CI just described is referred
to as the logarithmic (log) Student's t CI.
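A minimal Python sketch of the log Student's t CI follows, using the relative variance in formula [6] or, under the GVF model, formula [9]. The function names and all input values are hypothetical. Because the Python standard library has no Student's t quantile function, the critical value is passed in as an argument and is assumed to come from a t table or statistical software.

```python
import math


def relative_variance(x, y, se_y):
    """Sample relative variance v / r^2 = 1 / x + s_y^2 / y^2 (formula [6])."""
    return 1.0 / x + (se_y / y) ** 2


def relative_variance_gvf(x, y, a, b, f):
    """Sample relative variance under the GVF model (formula [9])."""
    return 1.0 / x + f * (a + b / y)


def log_t_df(x, n_y, n_y_eff):
    """Degrees of freedom: min(x, n_y, n_y_eff) - 1."""
    return min(x, n_y, n_y_eff) - 1


def log_t_ci(x, y, rel_var, t_crit):
    """Log Student's t CI (formula [10]) for the rate r = x / y."""
    r = x / y
    half_width = t_crit * math.sqrt(rel_var)
    return r, r * math.exp(-half_width), r * math.exp(half_width)


# Hypothetical inputs: 61 events, denominator estimate 200,000 with SE 8,000,
# survey sample size 3,000 and effective sample size 1,500.
x, y, se_y = 61, 200000, 8000
df = log_t_df(x, 3000, 1500)          # 60
t_crit = 2.000                        # approximate t(0.975, 60) from a t table
r, lower, upper = log_t_ci(x, y, relative_variance(x, y, se_y), t_crit)
print(df)
print(f"{r * 1e5:.1f} ({lower * 1e5:.1f} to {upper * 1e5:.1f}) per 100,000")
```

Because the interval is symmetric on the log scale, the limits are r divided and multiplied by the same factor, so the interval is asymmetric around r on the original scale.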
CIs for Age-adjusted Vital Rates, With
Population Denominator Estimates From
ACS or CPS
For crude and age-specific rates where the population
denominator is obtained from ACS or CPS, such as for
specified Hispanic subgroups, or by education level or
marital status, the CI for age-adjusted rates can be calculated
by adjusting the upper and lower bounds of the interval for
the extra variation from the survey.
These expressions use the following notation:
Age-adjusted rate (usually multiplied by 100,000 and
expressed as a rate per 100,000 population): $R = \sum_i w_i R_i$
Standard population weight: $w_i$, such that $\sum_i w_i = 1$
Age-specific rate for the ith age group: $R_i = X_i / Y_i$
Total number of events for the ith age group on which the
age-specific rate is based: $X_i$
Total population for the ith age group on which the age-
specific rate is based: $Y_i$
Let $\mu_i$ and $\sigma_i^2$ denote the mean and variance of the
denominator population $Y_i$. Let the $X_i$ be independently
Poisson-distributed, with mean and variance equal to $\lambda_i$.
As before, the sample means and variances for the $R_i$ are
given by $r_i = x_i / y_i$ and
$$v_i = \left(\frac{x_i}{y_i}\right)^2\left(\frac{1}{x_i} + \frac{s_{y_i}^2}{y_i^2}\right) \qquad [11]$$
where the $s_{y_i}^2$ are the realizations of the design-based sample
variances of the age-specific populations $Y_i$ evaluated at
$Y_i = y_i$.
If the GVF model can be assumed, as with CPS-estimated
totals, the age-specific sample variances are calculated as
$$v_i = \left(\frac{x_i}{y_i}\right)^2\left[\frac{1}{x_i} + f\left(a + \frac{b}{y_i}\right)\right] \qquad [12]$$
Whether or not the GVF model can be assumed, the sample
mean and variance for the age-adjusted rate are given by
$$r = \sum_i w_i r_i = \sum_i \left(\frac{w_i}{y_i}\right) x_i$$
and
$$v = \sum_i w_i^2 v_i = \sum_i \left(\frac{w_i}{y_i}\right)^2\left(\frac{v_i}{r_i^2}\right) x_i^2$$
where the relative sample variances $v_i / r_i^2$ are calculated as
shown.
As before, the following 100(1 – α)% CI is recommended
for age-adjusted vital rates with population denominator
estimates from ACS or CPS and not assumed constant:
$$\exp\left\{\ln(r) \pm t_{\alpha/2,\,df}\sqrt{\frac{v}{r^2}}\right\} \qquad [13]$$
The df for the Student’s t critical value $t_{\alpha/2,\,df}$ are given by
$$df = \min\left(\sum_i x_i,\ \sum_i n_{y\_i},\ \sum_i n_{y\_eff\_i}\right) - 1$$
where $\sum_i x_i$ is the total number of events in the numerator
and $\sum_i n_{y\_i}$ and $\sum_i n_{y\_eff\_i}$ are the denominator sample size
and effective sample size, respectively. The CI just described
is referred to as the log Student’s t CI.
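The age-adjusted calculation can be sketched in Python as follows. This is an illustration under stated assumptions, not NCHS production code: the function names and the three-age-group inputs are hypothetical, the age-specific variances use the design-based form of formula [11], and the critical value `t_crit` is supplied by the caller.

```python
import math


def age_specific_variance(x_i, y_i, s2_y_i):
    """Formula [11]: v_i = (x_i / y_i)**2 * (1/x_i + s2_y_i / y_i**2)."""
    return (x_i / y_i) ** 2 * (1.0 / x_i + s2_y_i / y_i ** 2)


def age_adjusted_rate_and_variance(w, x, y, s2_y):
    """r = sum(w_i * x_i / y_i) and v = sum(w_i**2 * v_i)."""
    r = sum(w_i * x_i / y_i for w_i, x_i, y_i in zip(w, x, y))
    v = sum(w_i ** 2 * age_specific_variance(x_i, y_i, s2_i)
            for w_i, x_i, y_i, s2_i in zip(w, x, y, s2_y))
    return r, v


def age_adjusted_df(x, n_y, n_y_eff):
    """df = min(total events, total sample size, total effective size) - 1."""
    return min(sum(x), sum(n_y), sum(n_y_eff)) - 1


def log_t_ci(r, v, t_crit):
    """Formula [13]: exp(ln(r) +/- t_crit * sqrt(v / r**2))."""
    half_width = t_crit * math.sqrt(v / r ** 2)
    return r * math.exp(-half_width), r * math.exp(half_width)


# Hypothetical inputs for three age groups, for illustration only.
w = [0.3, 0.5, 0.2]                      # standard population weights, sum to 1
x = [40, 120, 300]                       # events by age group
y = [200_000.0, 350_000.0, 150_000.0]    # survey-estimated populations
s2_y = [5_000.0 ** 2, 7_000.0 ** 2, 4_000.0 ** 2]  # design-based variances of y
r, v = age_adjusted_rate_and_variance(w, x, y, s2_y)
df = age_adjusted_df(x, n_y=[900, 1500, 700], n_y_eff=[600, 1000, 450])
lower, upper = log_t_ci(r, v, t_crit=1.97)  # t_crit is an assumed value
print(df, 1e5 * r, 1e5 * lower, 1e5 * upper)
```

To use the GVF form of formula [12] instead, `age_specific_variance` would replace `s2_y_i / y_i**2` with `f * (a + b / y_i)`.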
CIs for Age-specific and Crude Vital Rates,
Where Denominator is Estimated From
Another Population Survey
NCHS currently does not produce rates from vital statistics
where the numerator is the number of events from
vital statistics and the population estimate used in the
denominator is obtained from a survey other than ACS or
CPS. However, such rates could be considered for some
purposes, such as the number of deaths for a particular
cause per population with a certain condition, where the
denominator would come from a population health survey
such as the National Health Interview Survey (NHIS).
These expressions use the following notation:
Rate (usually multiplied by 100,000 and expressed as a
rate per 100,000 population): $R = X / Y$
Total number of events on which rate is based: X
Total population on which rate is based: Y
Let $\mu_Y$ and $\sigma_Y^2$ denote the mean and variance of the
denominator population $Y$. Let the number of events $X$ be
Poisson-distributed, with mean and variance equal to $\lambda_X$,
and assume that $X$ and $Y$ are independent. Using first-order
Taylor series approximations (delta method) (25), the mean
and variance are $E(R) \approx \lambda_X / \mu_Y$ and
$$\operatorname{var}(R) \approx \left(\frac{\lambda_X}{\mu_Y}\right)^2\left(\frac{1}{\lambda_X} + \frac{\sigma_Y^2}{\mu_Y^2}\right) \qquad [14]$$
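The step from the delta method to formula [14] can be written out as follows. This is a sketch of the standard first-order expansion of $g(X, Y) = X / Y$ about $(\lambda_X, \mu_Y)$, using only the independence and Poisson assumptions stated above; the name $g$ is introduced here for illustration.

$$
\begin{aligned}
\frac{\partial g}{\partial X}\Big|_{(\lambda_X,\,\mu_Y)} &= \frac{1}{\mu_Y}, \qquad
\frac{\partial g}{\partial Y}\Big|_{(\lambda_X,\,\mu_Y)} = -\frac{\lambda_X}{\mu_Y^2}, \\
\operatorname{var}(R) &\approx \frac{1}{\mu_Y^2}\operatorname{var}(X) + \frac{\lambda_X^2}{\mu_Y^4}\operatorname{var}(Y)
= \frac{\lambda_X}{\mu_Y^2} + \frac{\lambda_X^2\,\sigma_Y^2}{\mu_Y^4} \\
&= \left(\frac{\lambda_X}{\mu_Y}\right)^2\left(\frac{1}{\lambda_X} + \frac{\sigma_Y^2}{\mu_Y^2}\right).
\end{aligned}
$$

The second line uses $\operatorname{var}(X) = \lambda_X$ for a Poisson count and $\operatorname{var}(Y) = \sigma_Y^2$.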
The sample mean and variance for $R$ are calculated using the
method of moments, yielding $r = x / y$,
$$v = \left(\frac{x}{y}\right)^2\left(\frac{1}{x} + \frac{s_y^2}{y^2}\right) \qquad [15]$$
and
$$\frac{v}{r^2} = \frac{1}{x} + \frac{s_y^2}{y^2} \qquad [16]$$
where $s_y^2$ is the value of the design-based sample variance of
denominator population $Y$ evaluated at $Y = y$.
The following 100(1 – α)% CI is recommended, for example,
with α = 0.05:
$$\exp\left\{\ln(r) \pm t_{\alpha/2,\,df}\sqrt{\frac{v}{r^2}}\right\} \qquad [17]$$
The df for the Student’s t critical value $t_{\alpha/2,\,df}$ are given by
$df = \min(x, n_y, n_{y\_eff}) - 1$,
where $n_y$ and $n_{y\_eff}$ are the denominator sample size and
effective sample size, respectively. The CI just described is
referred to as the log Student’s t CI.
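A compact sketch of formulas [16] and [17] in Python follows. The function name, the `per` scaling argument, and the numeric inputs are hypothetical; `se_y` stands for the design-based standard error of the survey-estimated denominator (the square root of $s_y^2$), and `t_crit` is the caller-supplied critical value for df = min(x, n_y, n_y_eff) – 1.

```python
import math


def rate_ci_survey_denominator(x, y, se_y, t_crit, per=100_000):
    """Return (rate, lower, upper), scaled by `per`, using formulas [16]-[17]."""
    r = x / y
    rel_var = 1.0 / x + (se_y / y) ** 2   # v / r**2 = 1/x + s_y**2 / y**2
    half_width = t_crit * math.sqrt(rel_var)
    return per * r, per * r * math.exp(-half_width), per * r * math.exp(half_width)


# Hypothetical inputs: 85 deaths among an estimated 2.4 million people with a
# condition, where the survey estimate has a standard error of 150,000.
n_y, n_y_eff = 1800, 1100
df = min(85, n_y, n_y_eff) - 1   # here the event count drives the df
rate, lower, upper = rate_ci_survey_denominator(
    x=85, y=2_400_000.0, se_y=150_000.0, t_crit=1.99)  # t_crit is assumed
print(df, round(rate, 2), round(lower, 2), round(upper, 2))
```

Setting `se_y=0.0` recovers the case of a denominator treated as constant, where the relative variance is 1/x.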
CIs for Age-adjusted Vital Rates, Where
Denominator is Estimated From Another
Population Survey
Just as it may be of interest to consider crude and age-
specific rates where the numerator is the number of vital
events and the denominator is obtained from a survey other
than ACS or CPS (such as NHIS), consider age adjusting such
rates to obtain, for example, the age-adjusted number of
deaths for a particular cause per population with a certain
health condition.
These expressions use the following notation:
Age-adjusted rate (usually multiplied by 100,000 and
expressed as a rate per 100,000 population): $R = \sum_i w_i R_i$
Standard population weight: $w_i$, such that $\sum_i w_i = 1$
Age-specific rate for the ith age group: $R_i = X_i / Y_i$
Total number of events for the ith age group on which the
age-specific rate is based: $X_i$
Total population for the ith age group on which the
age-specific rate is based: $Y_i$
Let $\mu_i$ and $\sigma_i^2$ denote the mean and variance of the
denominator population $Y_i$. Let the $X_i$ be independently
Poisson-distributed, with mean and variance equal to $\lambda_i$. As before,
the sample means and variances for the $R_i$ are given by
$r_i = x_i / y_i$ and
$$v_i = \left(\frac{x_i}{y_i}\right)^2\left(\frac{1}{x_i} + \frac{s_{y_i}^2}{y_i^2}\right) \qquad [18]$$
where the $s_{y_i}^2$ are the realizations of the design-based sample
variances of the age-specific populations $Y_i$ evaluated at
$Y_i = y_i$.
The sample mean and variance for the age-adjusted rates
are
$$r = \sum_i w_i r_i = \sum_i \left(\frac{w_i}{y_i}\right) x_i$$
and
$$v = \sum_i w_i^2 v_i = \sum_i \left(\frac{w_i}{y_i}\right)^2\left(\frac{v_i}{r_i^2}\right) x_i^2$$
As before, the following 100(1 – α)% CI is recommended, for
example, with α = 0.05:
$$\exp\left\{\ln(r) \pm t_{\alpha/2,\,df}\sqrt{\frac{v}{r^2}}\right\} \qquad [19]$$
The df for the Student’s t critical value $t_{\alpha/2,\,df}$ are given by
$$df = \min\left(\sum_i x_i,\ \sum_i n_{y\_i},\ \sum_i n_{y\_eff\_i}\right) - 1$$
where $\sum_i x_i$ is the total number of events in the numerator
and $\sum_i n_{y\_i}$ and $\sum_i n_{y\_eff\_i}$ are the denominator sample size
and effective sample size, respectively. The CI just described
is referred to as the log Student’s t CI.
CIs for Infant Death and Infant Mortality
Rates
For the final reports of deaths in the United States (5),
including the infant death rate or death rate for infants
under age 1 year, CI is calculated using the number of
infant deaths in a calendar year as the numerator, and the
census population of age under 1 year as of July 1 for that
calendar year as the denominator, the latter being treated
as a constant in calculations. For infant death rates, CI can be
calculated as above for other vital rates.
For infant mortality rates (IMR) based on live births in the
denominator, the numerator is the number of deaths of
infants (under age 1 year) during a given time period, and
the denominator is the number of live births during that
period. For period-linked IMRs, the numerator is the number
of deaths among infants in a given year, whether the birth
occurred during that year or in the previous year, which
have been linked to their corresponding birth record. The
denominator for the period-linked IMRs is the number of
births occurring during that year (38). For cohort-linked
IMRs, the numerator is the number of deaths for infants
born in a given year, whether the death occurred in the
same year or the next year, which have been linked to
their corresponding birth record (39). The denominator for
cohort-linked IMRs is the number of births in that year.
For IMRs and period- and cohort-linked IMRs, CI can be
calculated using the following notation:
IMR (usually multiplied by 100,000 and expressed as a
rate per 100,000 live births): $R = X / Y$
Total number of infant deaths on which the IMR is
based: X
Total number of live births on which the IMR is based: Y
Both the numerator and denominator can be assumed to
follow Poisson distributions, with underlying rates $\lambda_X$ and $\lambda_Y$.
As a result, first-order approximations (delta method) (25)
for the mean and variance of the IMR can be calculated,
assuming independence of the numerator and denominator.
This approach is conservative, as $X$ and $Y$ will be positively
correlated. Because a positive covariance enters the
approximate variance of the ratio with a negative sign,
dropping the covariance term under the assumption of
independence makes the variance and resulting CI larger.
Consequently, $E(R) \approx \lambda_X / \lambda_Y$ and
$$\operatorname{var}(R) \approx \left(\frac{\lambda_X}{\lambda_Y}\right)^2\left(\frac{1}{\lambda_X} + \frac{1}{\lambda_Y}\right) \qquad [20]$$
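To see why ignoring the correlation is conservative, the same first-order expansion can be written with the covariance retained. This is the standard delta-method result for a ratio, shown here as a sketch; it is not a formula numbered in this report.

$$
\operatorname{var}(R) \approx \left(\frac{\lambda_X}{\lambda_Y}\right)^2\left(\frac{1}{\lambda_X} + \frac{1}{\lambda_Y} - \frac{2\operatorname{cov}(X, Y)}{\lambda_X \lambda_Y}\right)
$$

When $\operatorname{cov}(X, Y) > 0$, the last term is negative, so setting it to zero as in formula [20] can only increase the approximate variance.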
The sample mean and variance for $R$ are calculated using the
method of moments, yielding:
$$r = x / y$$
$$v = \left(\frac{x}{y}\right)^2\left(\frac{1}{x} + \frac{1}{y}\right) \qquad [21]$$
$$\frac{v}{r^2} = \frac{1}{x} + \frac{1}{y} \qquad [22]$$
Use of the following 100(1 – α)% CI is recommended, for
example, with α = 0.05:
$$\exp\left\{\ln(r) \pm t_{\alpha/2,\,df}\sqrt{\frac{v}{r^2}}\right\} \qquad [23]$$
The df for the Student’s t critical value $t_{\alpha/2,\,df}$ are given by
$df = \min(x, y) - 1$. The CI just described is referred to as the
log Student’s t CI.
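Formulas [22] and [23] translate directly into a few lines of Python. The sketch below uses hypothetical function names and made-up counts; the `per` argument follows the scaling given in the notation above, and `t_crit` is an assumed, caller-supplied critical value for df = min(x, y) – 1.

```python
import math


def imr_relative_variance(deaths, births):
    """Formula [22]: v / r**2 = 1/x + 1/y."""
    return 1.0 / deaths + 1.0 / births


def imr_log_t_ci(deaths, births, t_crit, per=100_000):
    """Formula [23] applied to r = deaths / births, scaled by `per`."""
    r = deaths / births
    half_width = t_crit * math.sqrt(imr_relative_variance(deaths, births))
    return per * r, per * r * math.exp(-half_width), per * r * math.exp(half_width)


# Hypothetical counts, for illustration only.
deaths, births = 45, 9_200
df = min(deaths, births) - 1          # df = min(x, y) - 1
rate, lower, upper = imr_log_t_ci(deaths, births, t_crit=2.02)  # assumed t value
print(df, round(rate, 1), round(lower, 1), round(upper, 1))
```

With far more births than infant deaths, the 1/y term is small and the df are driven by the number of deaths.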
CIs for Rates and Counts Estimated From
Sample Surveys
The number of events obtained from health care surveys,
such as the National Ambulatory Medical Care Survey, can
be assumed to follow a Poisson distribution. However, for
variance estimation, the design-based sampling distribution
is generally used, along with asymptotic distributional
assumptions (21,26). Based on evaluation of alternative
distributional assumptions and implementations (Appendix
II), the log Student’s t CI, with adaptations for complex
surveys, is used for the presentation standards for survey-
based rates, including age-adjusted rates.
CIs for rates where the denominator population value
$y$ is free of sampling variability can be obtained using the
following expression, where α = 0.05:
$$\frac{1}{y}\exp\left\{\ln(x) \pm t_{\alpha/2,\,df}\sqrt{\frac{s_x^2}{x^2}}\right\} \qquad [24]$$
In this expression, $df = \min(n_x, n_{x\_eff}) - 1$, where
$\min(n_x, n_{x\_eff})$ is the minimum of the sample size and
effective sample size (when applicable). For complex surveys,
$n_{x\_eff} = n_x / D_{x\_eff}$, where $D_{x\_eff}$ is the (possibly
average) DEFF for $x$ in that survey. The numerator sample SE
value is $s_x$ and $y$ is the population value, assumed constant.
When the population estimate in the denominator is also
obtained from a survey (with sampling error), the suggested
CIs for rates can be obtained using the following expression,
where α = 0.05:
$$\exp\left\{\ln\left(\frac{x}{y}\right) \pm t_{\alpha/2,\,df}\sqrt{\frac{s_x^2}{x^2} + \frac{s_y^2}{y^2}}\right\} \qquad [25]$$
With the addition of the sampling variability from the
denominator, $df = \min(n_x, n_{x\_eff}, n_y, n_{y\_eff}) - 1$, $s_x$ is the
numerator sample SE value, and $s_y$ is the denominator
sample SE value.
As above, if both X and Y are estimated using a complex
survey sample, then the sample sizes used to determine
df should be the minimum of the sample size and effective
sample size for each survey.
The CI expressions ignore possible correlation between
X and Y that could arise from sample selection. Under
the assumption that such correlation would be positive,
the CI width will be larger (more conservative) when the
correlation is ignored.
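The survey-based expressions [24] and [25] and their df can be sketched in Python as below. All names and numbers are hypothetical. The effective sample size is computed as n / DEFF as described above; the result is left unrounded, which is an assumption of this sketch, and `t_crit` is supplied by the caller for the resulting df.

```python
import math


def effective_sample_size(n, deff):
    """n_eff = n / DEFF for a complex survey."""
    return n / deff


def survey_rate_df(n_x, deff_x, n_y=None, deff_y=1.0):
    """df = min of the sample sizes and effective sample sizes, minus 1."""
    sizes = [n_x, effective_sample_size(n_x, deff_x)]
    if n_y is not None:  # denominator also estimated from a survey
        sizes += [n_y, effective_sample_size(n_y, deff_y)]
    return min(sizes) - 1


def survey_rate_ci(x, se_x, y, t_crit, se_y=0.0):
    """Formula [25]; with se_y = 0 it gives the same bounds as formula [24]."""
    rel_var = (se_x / x) ** 2 + (se_y / y) ** 2
    half_width = t_crit * math.sqrt(rel_var)
    r = x / y
    return r * math.exp(-half_width), r * math.exp(half_width)


# Hypothetical estimates, for illustration only.
x, se_x = 1_500_000.0, 120_000.0      # estimated visits and their SE
y, se_y = 60_000_000.0, 900_000.0     # estimated population and its SE
df = survey_rate_df(n_x=400, deff_x=2.5, n_y=30_000, deff_y=4.0)
lower, upper = survey_rate_ci(x, se_x, y, t_crit=1.98, se_y=se_y)  # assumed t
print(df, 100 * x / y, 100 * lower, 100 * upper)  # rate per 100 population
```

Passing `se_y=0.0` and omitting `n_y` corresponds to a denominator that is free of sampling variability.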
CIs for counts can be obtained by simplifying the above
expressions and setting the denominator to a constant of 1,
with terms defined as above, where α = 0.05:
$$\exp\left\{\ln(x) \pm t_{\alpha/2,\,df}\sqrt{\frac{s_x^2}{x^2}}\right\} \qquad [26]$$
where $df = \min(n_x, n_{x\_eff}) - 1$ and $\min(n_x, n_{x\_eff})$ is the
minimum of the sample size and effective sample size (when applicable).
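For counts, formula [26] reduces to a one-line calculation. The Python below is a small sketch with a hypothetical function name and made-up inputs; `t_crit` is again supplied by the caller for df = min(n_x, n_x_eff) – 1.

```python
import math


def count_log_t_ci(x, se_x, t_crit):
    """Formula [26]: exp(ln(x) +/- t_crit * sqrt(se_x**2 / x**2))."""
    half_width = t_crit * se_x / x   # sqrt(s_x**2 / x**2) = s_x / x for x > 0
    return x * math.exp(-half_width), x * math.exp(half_width)


# Hypothetical estimated count and standard error, for illustration only.
lower, upper = count_log_t_ci(x=250_000.0, se_x=40_000.0, t_crit=2.0)
print(round(lower), round(upper))
```

The resulting interval is asymmetric about x: the lower bound stays above zero and the upper bound lies farther from x than the lower bound does.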
Appendix II. Design Effects for
National Center for Health Statistics
Surveys and Selected Census Surveys
Consulting the following documentation for surveys is
recommended when calculating design effects (DEFFs),
because suggested approaches can differ among surveys
and for specific analytic purposes.
National Health and Nutrition Examination Survey
NHANES Survey Methods and Analytic Guidelines:
https://wwwn.cdc.gov/nchs/nhanes/analyticguidelines.aspx
National Health Interview Survey
Data, Questionnaires and Related Documentation—
Methods: https://www.cdc.gov/nchs/nhis/methods.htm
National Survey of Family Growth
Questionnaire, Datasets, and Related Documentation:
https://www.cdc.gov/nchs/nsfg/nsfg_questionnaires.htm
National Health Care Surveys
https://www.cdc.gov/nchs/dhcs/dhcs_surveys.htm
American Community Survey
U.S. Census Bureau Methodology:
https://www.census.gov/programs-surveys/acs/methodology.html
Current Population Survey
U.S. Census Bureau Methodology:
https://www.census.gov/programs-surveys/cps/technical-documentation/methodology.html
Methods also may be updated with methodological
developments and design changes. Standard software
can produce multiple DEFFs, and users should consult
the survey and software documentation to identify the
appropriate DEFF. For National Center for Health Statistics
data presentation standards for rates and counts, DEFFs for
totals or counts are used. An evaluation comparing the DEFF
for totals and counts with the DEFF for rows, which shows
better performance of the DEFF for totals and counts, is found in
Talih et al. (13).
Vital and Health Statistics
Series Descriptions
Active Series
Series 1. Programs and Collection Procedures
Reports describe the programs and data systems of the
National Center for Health Statistics, and the data collection
and survey methods used. Series 1 reports also include
definitions, survey design, estimation, and other material
necessary for understanding and analyzing the data.
Series 2. Data Evaluation and Methods Research
Reports present new statistical methodology including
experimental tests of new survey methods, studies of vital and
health statistics collection methods, new analytical techniques,
objective evaluations of reliability of collected data, and
contributions to statistical theory. Reports also include
comparison of U.S. methodology with those of other countries.
Series 3. Analytical and Epidemiological Studies
Reports present data analyses, epidemiological studies, and
descriptive statistics based on national surveys and data
systems. As of 2015, Series 3 includes reports that would
have previously been published in Series 5, 10–15, and 20–23.
Discontinued Series
Series 4. Documents and Committee Reports
Reports contain findings of major committees concerned with
vital and health statistics and documents. The last Series 4
report was published in 2002; these are now included in
Series 2 or another appropriate series.
Series 5. International Vital and Health Statistics Reports
Reports present analytical and descriptive comparisons of
U.S. vital and health statistics with those of other countries.
The last Series 5 report was published in 2003; these are now
included in Series 3 or another appropriate series.
Series 6. Cognition and Survey Measurement
Reports use methods of cognitive science to design, evaluate,
and test survey instruments. The last Series 6 report was
published in 1999; these are now included in Series 2.
Series 10. Data From the National Health Interview Survey
Reports present statistics on illness; accidental injuries;
disability; use of hospital, medical, dental, and other services;
and other health-related topics. As of 2015, these are included
in Series 3.
Series 11. Data From the National Health Examination Survey, the
National Health and Nutrition Examination Surveys, and
the Hispanic Health and Nutrition Examination Survey
Reports present 1) estimates of the medically defined
prevalence of specific diseases in the United States and the
distribution of the population with respect to physical,
physiological, and psychological characteristics and 2)
analysis of relationships among the various measurements.
As of 2015, these are included in Series 3.
Series 12. Data From the Institutionalized Population Surveys
The last Series 12 report was published in 1974; these reports
were included in Series 13, and as of 2015 are in Series 3.
Series 13. Data From the National Health Care Survey
Reports present statistics on health resources and use of
health care resources based on data collected from health
care providers and provider records. As of 2015, these reports
are included in Series 3.
Series 14. Data on Health Resources: Manpower and Facilities
The last Series 14 report was published in 1989; these reports
were included in Series 13, and are now included in Series 3.
Series 15. Data From Special Surveys
Reports contain statistics on health and health-related topics
from surveys that are not a part of the continuing data systems
of the National Center for Health Statistics. The last Series 15
report was published in 2002; these reports are now included
in Series 3.
Series 16. Compilations of Advance Data From Vital and Health
Statistics
The last Series 16 report was published in 1996. All reports
are available online; compilations are no longer needed.
Series 20. Data on Mortality
Reports include analyses by cause of death and demographic
variables, and geographic and trend analyses. The last Series
20 report was published in 2007; these reports are now
included in Series 3.
Series 21. Data on Natality, Marriage, and Divorce
Reports include analyses by health and demographic
variables, and geographic and trend analyses. The last Series
21 report was published in 2006; these reports are now
included in Series 3.
Series 22. Data From the National Mortality and Natality Surveys
The last Series 22 report was published in 1973. Reports from
sample surveys of vital records were included in Series 20 or
21, and are now included in Series 3.
Series 23. Data From the National Survey of Family Growth
Reports contain statistics on factors that affect birth rates,
factors affecting the formation and dissolution of families, and
behavior related to the risk of HIV and other sexually
transmitted diseases. The last Series 23 report was published
in 2011; these reports are now included in Series 3.
Series 24. Compilations of Data on Natality, Mortality, Marriage, and
Divorce
The last Series 24 report was published in 1996. All reports
are available online; compilations are no longer needed.
For answers to questions about this report or for a list of reports published
in these series, contact:
Information Dissemination Staff
National Center for Health Statistics
Centers for Disease Control and Prevention
3311 Toledo Road, Room 4551, MS P08
Hyattsville, MD 20782
Tel: 1–800–CDC–INFO (1–800–232–4636)
TTY: 1–888–232–6348
Internet: https://www.cdc.gov/nchs
Online request form: https://www.cdc.gov/info
For e-mail updates on NCHS publication releases, subscribe
online at: https://www.cdc.gov/nchs/email-updates.htm.
For more NCHS Series Reports, visit:
https://www.cdc.gov/nchs/products/series.htm