Outcomes & Impact
Better data. Better decisions
Wei Zhang Ph.D.
Research Statistician
Texas Children's Hospital Outcomes & Impact Service (TCHOIS)
Assistant Professor
Congenital Heart Surgery, Baylor College of Medicine
Introduction to Sampling Methods
Overview
Purpose of Sampling
Some Definitions
Sample Designing Process
Importance of Probability Sampling
Four Commonly Used Probability Sampling Techniques
Sample Size Determination
Purpose of Sampling
Who = Target Population
Bronchiolitis or Sepsis
What = Parameter
Characteristics of population
Problem: Cannot study the whole population
Solution: Sample
Subset of “who”
Calculate a statistic for “what”
http://korbedpsych.com/R06Sample.html
Some Definitions
Observation Unit
Target Population
Study Population or Sampling Population
Sampling Frame
Sample
Sampling Unit
http://keywordsuggest.org/gallery/659530.html
Advantages of Sampling
Fewer Resources
More Accuracy
Reduced Inspection Fatigue
Disadvantages of Sampling
May not be representative
Chance of over or under estimation
Associated with both sampling and non-sampling errors
Why a Sample May Fail to Be Representative
Sampling population does not reflect the target population
Sample size too small
http://www.slideshare.net/krishna1988/sampling-techniques-market-research
[Figure labels: Medical expertise; Medical expertise and IT; Medical expertise and statistical consultation; Medical expertise]
Sampling Techniques
Non-probability Sampling: units have an unknown or unequal chance of being selected
Convenience sampling
Judgement sampling
Snowball sampling
Quota sampling
Probability Sampling: every unit has a known, nonzero chance of being selected (equal under simple random sampling)
Simple random sampling
Systematic sampling
Stratified sampling
Clustered sampling
Why Probability Sampling?
Avoid selection bias
Able to assess how representative the sample is, given the sample size
Simple Random Sampling (SRS)
Similar to drawing numbered balls from
a bag.
Each unit in the target population is
equally likely to be selected.
Simple Random Sampling - How to Do It
List all units in the sampling population
Generate a random number per unit
Use Excel: e.g. “=RANDBETWEEN(1,100)” if there are
100 patients in the sampling population (ties are possible; “=RAND()” avoids them)
To sample 10, take the patients with the
10 smallest numbers.
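The same steps can be sketched in Python as a rough illustration. The function name `srs_by_random_keys` and the placeholder MRNs are hypothetical, not part of the original slides; the sketch assumes the sampling frame fits in memory as a list.

```python
import random

def srs_by_random_keys(frame, n, seed=None):
    """Simple random sample: attach a random number to every unit,
    then keep the n units with the smallest numbers."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    keyed = [(rng.random(), unit) for unit in frame]
    keyed.sort(key=lambda pair: pair[0])
    return [unit for _, unit in keyed[:n]]

# Hypothetical sampling frame of 100 placeholder MRNs
frame = [f"MRN{i:03d}" for i in range(1, 101)]
sample = srs_by_random_keys(frame, 10, seed=42)
print(sample)
```

Recording the seed alongside the sample makes it possible to reproduce the selection later, for example during an audit.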
Simple Random Sampling - Example
Retrospective Chart Review on 30 Day Readmission Rate
Sampling Population: A disease group defined by certain
ICD codes
Sampling Frame: A list of MRNs pulled by these ICD
codes from the EMR system
N patients were selected randomly
m out of N were readmitted within 30 days
Rate = m/N*100%
Simple Random Sampling - Pros and Cons
Pros:
Very simple technique
Based on probability law
No personal bias
Cons:
Does not work well when population is heterogeneous
Less efficient
Need to get the whole list before sampling
Systematic Random Sampling
Used when there is no list, or the list is in roughly random order
Results comparable to simple random sampling
http://www.mathcaptain.com/statistics/systematic-sampling.html
Systematic Random Sampling - How to Do it
Given that patients arrive in no specific order,
Include the first “n” patients every day
Sample every “k”th patient
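A minimal sketch of the "every k-th patient" rule follows. The random starting position is an assumption added here (a common convention, not stated on the slide), and `systematic_sample` and the arrival list are hypothetical names for illustration.

```python
import random

def systematic_sample(units, k, seed=None):
    """Take every k-th unit, starting from a random position
    among the first k units (assumed convention)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    rng = random.Random(seed)
    start = rng.randrange(k)  # 0 <= start < k
    return units[start::k]

# Hypothetical arrival order: patients numbered 1..100
arrivals = list(range(1, 101))
picked = systematic_sample(arrivals, k=10, seed=1)
print(picked)
```

With 100 arrivals and k = 10 this always yields 10 patients, each exactly 10 positions apart.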
Systematic Sampling - Example
NSQIP Sampling Algorithm
8-day cycle to ensure that cases have an equal chance of being
selected
Operative log provides a list of surgical cases in a cycle
Apply inclusion and exclusion rules to the log
Select first 35 cases in consecutive order
Consecutive order: date of operation, in room time, OR room
number
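The selection steps above could be sketched roughly as follows. This is only an illustration of "filter, sort, take the first 35"; it is not the official NSQIP algorithm, and the field names (`op_date`, `in_room_time`, `or_room`), the `is_eligible` rule, and the log itself are made up.

```python
from datetime import date, time

def select_cycle_cases(log, is_eligible, limit=35):
    """Apply inclusion/exclusion rules, order the remaining cases,
    and keep the first `limit` cases in that order."""
    eligible = [case for case in log if is_eligible(case)]
    eligible.sort(key=lambda c: (c["op_date"], c["in_room_time"], c["or_room"]))
    return eligible[:limit]

# Made-up operative log for one 8-day cycle
log = [
    {
        "case_id": i,
        "op_date": date(2018, 1, 1 + i % 8),
        "in_room_time": time(7 + i % 5, 0),
        "or_room": 1 + i % 4,
        "excluded": i % 10 == 0,  # hypothetical exclusion flag
    }
    for i in range(50)
]
selected = select_cycle_cases(log, lambda c: not c["excluded"])
print(len(selected))
```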
Systematic Sampling - Pros and Cons
Pros:
Easy to implement
Can be used without the whole list of units
Cons:
Not in general a simple random sample
May yield bias if there are periodic features
Stratified Random Sampling (STRS)
Heterogeneous Sampling Population (SP)
Divide SP into “K” number of homogeneous subgroups called
strata
Sample n_1, n_2, …, n_K units from the 1st, 2nd, …, Kth strata by simple
random sampling
https://www.pinterest.com/pin/410179478533148229/
Stratified Random Sampling - Allocation of Sample Size
Proportional allocation
Optimum allocation
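The two allocation rules can be illustrated with a short sketch. Proportional allocation gives each stratum a share of the sample equal to its share of the population. For optimum allocation, the sketch assumes the equal-cost special case (Neyman allocation), where a stratum's share is proportional to its size times its standard deviation. The stratum sizes and standard deviations below are made-up illustration values.

```python
def proportional_allocation(n, sizes):
    """n_h = n * N_h / N"""
    total = sum(sizes.values())
    return {h: n * size / total for h, size in sizes.items()}

def neyman_allocation(n, sizes, sds):
    """n_h = n * N_h * S_h / sum_j(N_j * S_j), assuming equal cost per unit."""
    weights = {h: sizes[h] * sds[h] for h in sizes}
    total = sum(weights.values())
    return {h: n * w / total for h, w in weights.items()}

# Made-up strata: population sizes and LOS standard deviations (days)
sizes = {"simple": 600, "complex": 400}
sds = {"simple": 1.0, "complex": 3.5}

prop = proportional_allocation(100, sizes)
opt = neyman_allocation(100, sizes, sds)
print(prop)
print(opt)
```

With these inputs, proportional allocation follows the 60/40 population split, while the variance-weighted rule shifts most of the sample to the more variable stratum. In practice the results are rounded to whole units.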
Stratified Random Sampling - Example
Appendectomy LOS Study Stratified by Simple and Complex
To study post-op length of stay of 1000 appendectomy patients
60% are simple
LOS of complex cases had much larger variation
Allocate a sample of 100 as 30 simple and 70 complex
The overall average LOS is the weighted average of the simple and
complex stratum means, weighted by population share (0.6 and 0.4)
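As a small worked sketch of the weighted average: each stratum mean is weighted by the stratum's share of the population (60% simple, 40% complex), not by its share of the sample. The stratum mean LOS values below are hypothetical.

```python
def stratified_mean(stratum_means, population_shares):
    """Overall mean = sum over strata of (population share * stratum mean)."""
    if abs(sum(population_shares.values()) - 1.0) > 1e-9:
        raise ValueError("population shares must sum to 1")
    return sum(population_shares[h] * stratum_means[h] for h in stratum_means)

shares = {"simple": 0.6, "complex": 0.4}   # from the population split
means = {"simple": 1.5, "complex": 5.0}    # hypothetical mean LOS in days
overall = stratified_mean(means, shares)
print(round(overall, 2))
```

Using the sample shares (30/70) as weights instead would overstate the overall LOS, because complex cases are deliberately oversampled.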
Stratified Random Sampling - Pros and Cons
Pros:
More representative
Higher precision than simple random sampling
Administratively easier
Each stratum can be analyzed separately
Cons:
Stratification needs to be done properly
Division into homogeneous strata with multiple characteristics may
be difficult
Cluster Sampling
Divide the sampling population into “K” number of subgroups
called clusters
Take a simple random sample of clusters
Observe all elements within the clusters in the sample
Stratified vs. Cluster
http://keydifferences.com/difference-between-stratified-and-cluster-sampling.html
Cluster Sampling - Example
Patients grouped by zip codes
Simple random sample from a list of zip codes
Collect information of all patients within the selected zip codes
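The zip code example might look like the sketch below. The patient records, placeholder zip labels, and the function name `cluster_sample` are hypothetical and only illustrate the two stages: randomly pick clusters, then keep everyone inside them.

```python
import random
from collections import defaultdict

def cluster_sample(patients, n_clusters, seed=None):
    """Randomly select whole clusters (zip codes) and return
    every patient in the selected clusters."""
    by_zip = defaultdict(list)
    for patient in patients:
        by_zip[patient["zip"]].append(patient)
    rng = random.Random(seed)
    chosen_zips = rng.sample(sorted(by_zip), n_clusters)
    sampled = [p for z in chosen_zips for p in by_zip[z]]
    return chosen_zips, sampled

# Made-up data: 200 patients spread evenly over 8 placeholder zip codes
patients = [{"id": i, "zip": f"ZIP{i % 8:02d}"} for i in range(200)]
chosen_zips, sampled = cluster_sample(patients, n_clusters=3, seed=7)
print(chosen_zips, len(sampled))
```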
Cluster Sampling - Pros and Cons
Pros
Reduces cost
No sampling frame of individual units necessary (only a list of clusters)
Cons
Decreased precision
Sample Size and Sampling Error
Standard deviation (std)
describes, on average, how much each unit differs from the sample mean
95% confidence interval
a range of values that you can be 95% certain contains the true
mean of the population
Margin of error
Half the width of the confidence interval
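These three quantities can be computed together, as in the sketch below. It assumes a normal approximation with z = 1.96 for the 95% interval (for small samples a t-based multiplier would be wider), and the LOS values are hypothetical.

```python
import math
import statistics

def mean_with_ci95(values, z=1.96):
    """Return sample mean, sample std, margin of error, and 95% CI
    using the normal approximation: margin = z * std / sqrt(n)."""
    n = len(values)
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # sample standard deviation
    margin = z * std / math.sqrt(n)
    return mean, std, margin, (mean - margin, mean + margin)

# Hypothetical length-of-stay values in days
los = [2.0, 3.5, 1.0, 4.0, 2.5, 3.0, 5.5, 2.0, 3.0, 4.5]
mean, std, margin, ci = mean_with_ci95(los)
print(mean, std, margin, ci)
```

Because the margin shrinks with the square root of n, halving the margin of error takes roughly four times as many units.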
Determine Sample Size
Nature of population: size, heterogeneous/homogeneous
Goal of study
Sampling technique
Desired precision and reliability
Financial and resource constraints
One Sample Proportion
Rate of rapid transfer to a higher level of care
Rate of interventions
False positive rate of sepsis diagnosis
Percentage of bronchiolitis patients who went to ICU
Online Tool:
https://select-statistics.co.uk/calculators/sample-size-calculator-population-proportion/
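For readers who want to see the arithmetic, here is a sketch of a standard textbook formula for estimating one proportion: n0 = z² p(1-p) / e², with an optional finite population correction. It is not claimed to reproduce the online tool exactly, and the inputs (p = 0.5, margin 0.05, population 1000) are illustration values.

```python
import math

def sample_size_proportion(p, margin, z=1.96, population=None):
    """Sample size to estimate a proportion p within +/- margin
    at the confidence level implied by z (1.96 for 95%)."""
    n0 = z**2 * p * (1 - p) / margin**2
    if population is not None:
        # finite population correction
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)

n_large = sample_size_proportion(p=0.5, margin=0.05)
n_finite = sample_size_proportion(p=0.5, margin=0.05, population=1000)
print(n_large, n_finite)  # 385 278
```

Using p = 0.5 is the conservative choice when the true rate is unknown, since it maximizes p(1-p).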
One Sample Mean
Average LOS of the sepsis or bronchiolitis
population
Online Tool:
https://select-statistics.co.uk/calculators/sample-size-calculator-population-mean/
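A comparable sketch for one mean uses the textbook formula n = (z · std / e)², where e is the desired margin of error. The standard deviation of 3 days and margin of 0.5 days are hypothetical inputs; the online tool may apply additional corrections.

```python
import math

def sample_size_mean(std, margin, z=1.96):
    """Sample size to estimate a mean within +/- margin,
    given an assumed standard deviation."""
    return math.ceil((z * std / margin) ** 2)

n_mean = sample_size_mean(std=3.0, margin=0.5)
print(n_mean)  # 139
```

The standard deviation has to come from prior data or a pilot sample, so the result is only as good as that guess.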
Two Sample Proportion Comparison
Any reduction in readmission rate after intervention?
Does the education decrease the rate of chest x-rays in the
acute asthma population?
Compare the proportion of patients who received the 1st fluid bolus
within 20 minutes in 2017 and 2018.
Online Tool:
https://select-statistics.co.uk/calculators/sample-size-calculator-two-proportions/
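A rough sketch of a common per-group sample size formula for comparing two proportions (two-sided test, equal group sizes, normal approximation) is shown below. The rates 20% and 10%, alpha = 0.05, and power = 0.80 are hypothetical inputs, and the online tool may use a slightly different variant.

```python
import math
from statistics import NormalDist

def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Per-group n = (z_alpha/2 + z_beta)^2 * [p1(1-p1) + p2(1-p2)] / (p1 - p2)^2"""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)

n_per_group = sample_size_two_proportions(0.20, 0.10)
print(n_per_group)  # 197
```

The required size grows quickly as the two rates get closer, because the difference enters the formula squared.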
Two Sample Mean Comparison
Any decrease in LOS after intervention?
Online Tool:
https://select-statistics.co.uk/calculators/sample-size-calculator-two-means/
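For two means, a common per-group formula (two-sided test, equal group sizes, a shared standard deviation, normal approximation) can be sketched as follows. The standard deviation of 3 days and the 1-day difference are hypothetical inputs, not values from the slides.

```python
import math
from statistics import NormalDist

def sample_size_two_means(std, difference, alpha=0.05, power=0.80):
    """Per-group n = 2 * (z_alpha/2 + z_beta)^2 * std^2 / difference^2"""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * std**2 / difference**2)

n_los = sample_size_two_means(std=3.0, difference=1.0)
print(n_los)  # 142
```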
Summary
Thoroughly think through sampling design process
Well defined target population
Sampling population and sampling frame to reflect target population
Sampling algorithm adapted to the questions and population
Sample size to balance estimation errors and practical constraints