Bootstrapping

Code for quiz 12.

Load the R packages we will use.
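
The package list itself is not shown here, so the following is a minimal sketch under two assumptions: that infer supplies the pipeline functions used below (rep_sample_n, specify, generate), and that the congress_age dataset comes from the fivethirtyeight package.

```r
# Assumed packages: the original list of packages is not shown.
library(infer)            # rep_sample_n(), specify(), generate(), calculate(); also re-exports %>%
library(fivethirtyeight)  # assumed source of the congress_age dataset
```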

What is the average age of members that have served in Congress?

- Set the random seed generator to 123
- Take a sample of 100 from the dataset congress_age and assign it to congress_age_100

set.seed(123)
congress_age_100 <- congress_age  %>% 
  rep_sample_n(size = 100)

Construct the confidence interval

1. Use specify to indicate the variable from congress_age_100 that you are interested in

congress_age_100  %>% 
  specify(response = age)
Response: age (numeric)
# A tibble: 100 x 1
     age
   <dbl>
 1  53.1
 2  54.9
 3  65.3
 4  60.1
 5  43.8
 6  57.9
 7  55.3
 8  46  
 9  42.1
10  37  
# ... with 90 more rows

2. Generate 1000 replicates of your sample of 100

congress_age_100  %>% 
  specify(response = age)  %>% 
  generate(reps = 1000, type = "bootstrap")
Response: age (numeric)
# A tibble: 100,000 x 2
# Groups:   replicate [1,000]
   replicate   age
       <int> <dbl>
 1         1  42.1
 2         1  71.2
 3         1  45.6
 4         1  39.6
 5         1  56.8
 6         1  71.6
 7         1  60.5
 8         1  56.4
 9         1  43.3
10         1  53.1
# ... with 99,990 more rows
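
To make the resampling concrete, here is a rough base R sketch of what a bootstrap replicate is: a sample of the same size as the original, drawn from it with replacement. It is not how infer implements generate, and the ages vector is a hypothetical stand-in for the 100 sampled ages, not the real data.

```r
# Hypothetical stand-in for the 100 sampled ages (not the real congress_age data).
set.seed(123)
ages <- round(runif(100, min = 25, max = 90), 1)

# One bootstrap replicate: 100 draws from the sample, with replacement,
# so some ages appear more than once and others not at all.
one_replicate <- sample(ages, size = length(ages), replace = TRUE)

# Repeat 1000 times; each column of the resulting matrix is one replicate.
replicates <- replicate(1000, sample(ages, size = length(ages), replace = TRUE))
dim(replicates)
```

The 100 values in each of 1000 replicates correspond to the 100,000 rows in the output above.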

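3. Calculate the mean for each replicate

bootstrap_distribution_mean_age <- congress_age_100 %>%
  specify(response = age) %>%
  generate(reps = 1000, type = "bootstrap") %>%
  calculate(stat = "mean")

bootstrap_distribution_mean_age

The result is a tibble with one row per replicate: a replicate column and a stat column holding the mean age of that replicate.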

4. Visualize the bootstrap distribution
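
No code is shown for this step. One possible sketch uses visualize() from infer, which draws a histogram of the bootstrap means. The setup lines rebuild the objects from the earlier steps so the block runs on its own; the package choices and the name bootstrap_plot are assumptions.

```r
library(infer)
library(fivethirtyeight)  # assumed source of congress_age

# Rebuild the sample and the bootstrap distribution from the earlier steps.
set.seed(123)
congress_age_100 <- congress_age %>%
  rep_sample_n(size = 100)

bootstrap_distribution_mean_age <- congress_age_100 %>%
  specify(response = age) %>%
  generate(reps = 1000, type = "bootstrap") %>%
  calculate(stat = "mean")

# Histogram of the 1000 bootstrap means (bootstrap_plot is a hypothetical name).
bootstrap_plot <- visualize(bootstrap_distribution_mean_age)
bootstrap_plot
```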

Calculate the 95% confidence interval using the percentile method
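
The percentile method takes the 2.5th and 97.5th percentiles of the bootstrap means as the interval endpoints. A sketch using get_confidence_interval() from infer follows; the setup lines repeat the earlier steps so the block runs on its own, and the name congress_ci is an assumption.

```r
library(infer)
library(fivethirtyeight)  # assumed source of congress_age

# Rebuild the sample and the bootstrap distribution from the earlier steps.
set.seed(123)
congress_age_100 <- congress_age %>%
  rep_sample_n(size = 100)

bootstrap_distribution_mean_age <- congress_age_100 %>%
  specify(response = age) %>%
  generate(reps = 1000, type = "bootstrap") %>%
  calculate(stat = "mean")

# 95% percentile interval: the middle 95% of the bootstrap means.
# congress_ci is a hypothetical name for this sketch.
congress_ci <- bootstrap_distribution_mean_age %>%
  get_confidence_interval(level = 0.95, type = "percentile")

congress_ci
```

The interval can be overlaid on the histogram by adding shade_confidence_interval(endpoints = congress_ci) to the visualize() call.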

Calculate the observed point estimate of the mean and assign it to obs_mean_age
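
One way to do this, sketched under the same package assumptions as above, is to run specify() and calculate() on the sample directly, without the generate() step, and then extract the single value with dplyr::pull().

```r
library(infer)
library(fivethirtyeight)  # assumed source of congress_age

# Rebuild the sample of 100 from the first step.
set.seed(123)
congress_age_100 <- congress_age %>%
  rep_sample_n(size = 100)

# Mean age of the original sample (no resampling), pulled out as a plain number.
obs_mean_age <- congress_age_100 %>%
  specify(response = age) %>%
  calculate(stat = "mean") %>%
  dplyr::pull()

obs_mean_age
```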