Sampling

A short description of the post.

  1. Load the R packages we will use.
  1. Quiz questions
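The post does not show which packages it loads. A minimal sketch, assuming the two that the ModernDive sampling chapter relies on: tidyverse (for dplyr and ggplot2) and moderndive (for the `bowl` data and `rep_sample_n()`).

```r
# Assumed package set; the original post does not list it
library(tidyverse)   # dplyr verbs, the pipe, and ggplot2
library(moderndive)  # bowl dataset and rep_sample_n()

# Quick look at the virtual bowl: one row per ball, with its color
glimpse(bowl)
```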

Question 7.2.4 in ModernDive, with different sample sizes and a different number of replicates

Modify the code for comparing different sample sizes from the virtual bowl

Segment 1: sample size 28

1.a) Take 1150 samples of size 28 instead of 1000 replicates of size 25 from the bowl dataset. Assign the output to virtual_samples_28
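Step 1.a could look like the sketch below. The seed value is hypothetical and is there only so the sketch gives the same draw on every run; the post does not state one.

```r
library(tidyverse)
library(moderndive)

set.seed(76)  # hypothetical seed, not taken from the post

# Draw 1150 samples (replicates) of 28 balls each from the virtual bowl
virtual_samples_28 <- bowl %>%
  rep_sample_n(size = 28, reps = 1150)

virtual_samples_28
```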

1.b) Compute the 1150 replicates of proportion red
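One way to carry out step 1.b, following the same group-then-summarize pattern spelled out for segment 3. The sampling from 1.a is repeated so the sketch runs on its own, and the seed is a hypothetical value.

```r
library(tidyverse)
library(moderndive)

set.seed(76)  # hypothetical seed, not taken from the post

virtual_samples_28 <- bowl %>%
  rep_sample_n(size = 28, reps = 1150)

# One row per replicate: count the red balls, then divide by the sample size
virtual_prop_red_28 <- virtual_samples_28 %>%
  group_by(replicate) %>%
  summarize(red = sum(color == "red")) %>%
  mutate(prop_red = red / 28)

virtual_prop_red_28
```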

1.c) Plot distribution of virtual_prop_red_28 via a histogram; use labs to label the x axis and create a title
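A sketch of the histogram for step 1.c. The axis label and title are assumptions that mirror the wording given for the size-53 plot, the `binwidth` and `boundary` values are illustrative choices, and the seed is hypothetical.

```r
library(tidyverse)
library(moderndive)

set.seed(76)  # hypothetical seed, not taken from the post

virtual_prop_red_28 <- bowl %>%
  rep_sample_n(size = 28, reps = 1150) %>%
  group_by(replicate) %>%
  summarize(red = sum(color == "red")) %>%
  mutate(prop_red = red / 28)

# Histogram of the 1150 sample proportions; labels are assumed by analogy
hist_28 <- ggplot(virtual_prop_red_28, aes(x = prop_red)) +
  geom_histogram(binwidth = 0.05, boundary = 0.4, color = "white") +
  labs(x = "Proportion of 28 balls that were red", title = "28")

hist_28
```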


Segment 2: sample size = 53

2.a) Take 1150 samples of size 53 instead of 1000 replicates of size 50. Assign the output to virtual_sample_53
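Step 2.a is the same call with a larger sample size. A minimal sketch, keeping the object name exactly as written in the step; the seed is hypothetical.

```r
library(tidyverse)
library(moderndive)

set.seed(76)  # hypothetical seed, not taken from the post

# Draw 1150 samples (replicates) of 53 balls each from the virtual bowl
virtual_sample_53 <- bowl %>%
  rep_sample_n(size = 53, reps = 1150)

virtual_sample_53
```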

2.b) Compute resulting 1150 replicates of proportion red
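Step 2.b might be written as follows. The result is stored as virtual_prop_red_53, the name the plotting step uses; the sampling is repeated so the sketch is self-contained, and the seed is hypothetical.

```r
library(tidyverse)
library(moderndive)

set.seed(76)  # hypothetical seed, not taken from the post

virtual_sample_53 <- bowl %>%
  rep_sample_n(size = 53, reps = 1150)

# One row per replicate: count the red balls, then divide by the sample size
virtual_prop_red_53 <- virtual_sample_53 %>%
  group_by(replicate) %>%
  summarize(red = sum(color == "red")) %>%
  mutate(prop_red = red / 53)

virtual_prop_red_53
```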

2.c) Plot distribution of virtual_prop_red_53 via a histogram. Use labs to

- label x axis = “Proportion of 53 balls that were red”
- create title = “53”
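A sketch of step 2.c using the x-axis label and title given in the step. The `binwidth` and `boundary` values are illustrative choices and the seed is hypothetical.

```r
library(tidyverse)
library(moderndive)

set.seed(76)  # hypothetical seed, not taken from the post

virtual_prop_red_53 <- bowl %>%
  rep_sample_n(size = 53, reps = 1150) %>%
  group_by(replicate) %>%
  summarize(red = sum(color == "red")) %>%
  mutate(prop_red = red / 53)

# Histogram of the 1150 sample proportions with the labels from the step
hist_53 <- ggplot(virtual_prop_red_53, aes(x = prop_red)) +
  geom_histogram(binwidth = 0.05, boundary = 0.4, color = "white") +
  labs(x = "Proportion of 53 balls that were red", title = "53")

hist_53
```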


Segment 3: Sample size = 118

3.a) Take 1150 samples of size 118 instead of 1000 replicates of size 100. Assign the output to virtual_samples_118
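Step 3.a, sketched with a sample size of 118 to match the segment heading and the object name. The seed is hypothetical.

```r
library(tidyverse)
library(moderndive)

set.seed(76)  # hypothetical seed, not taken from the post

# Draw 1150 samples (replicates) of 118 balls each from the virtual bowl
virtual_samples_118 <- bowl %>%
  rep_sample_n(size = 118, reps = 1150)

virtual_samples_118
```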

3.b) Compute resulting 1150 replicates of proportion red

- Start with virtual_samples_118
- Group_by replicate THEN
- Create variable red equal to the sum of all the red balls
- Create variable prop_red equal to variable red / 118
- Assign the output to virtual_prop_red_118
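The steps listed for 3.b translate fairly directly into a dplyr pipeline. A minimal sketch, repeating the sampling so that it runs on its own; the seed is hypothetical.

```r
library(tidyverse)
library(moderndive)

set.seed(76)  # hypothetical seed, not taken from the post

virtual_samples_118 <- bowl %>%
  rep_sample_n(size = 118, reps = 1150)

virtual_prop_red_118 <- virtual_samples_118 %>%  # start with the samples
  group_by(replicate) %>%                        # group by replicate, then
  summarize(red = sum(color == "red")) %>%       # red = number of red balls
  mutate(prop_red = red / 118)                   # prop_red = red / 118

virtual_prop_red_118
```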

3.c) Plot distribution of virtual_prop_red_118 via a histogram; use labs to label the x axis and create a title
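A sketch of the histogram for step 3.c. The axis label and title are assumptions that mirror the wording given for the size-53 plot, the `binwidth` and `boundary` values are illustrative choices, and the seed is hypothetical.

```r
library(tidyverse)
library(moderndive)

set.seed(76)  # hypothetical seed, not taken from the post

virtual_prop_red_118 <- bowl %>%
  rep_sample_n(size = 118, reps = 1150) %>%
  group_by(replicate) %>%
  summarize(red = sum(color == "red")) %>%
  mutate(prop_red = red / 118)

# Histogram of the 1150 sample proportions; labels are assumed by analogy
hist_118 <- ggplot(virtual_prop_red_118, aes(x = prop_red)) +
  geom_histogram(binwidth = 0.05, boundary = 0.4, color = "white") +
  labs(x = "Proportion of 118 balls that were red", title = "118")

hist_118
```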


Calculate the standard deviation of each of the three sets of 1150 values of prop_red
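The three standard deviations could be computed as sketched here. The helper `prop_red_for()` is a hypothetical convenience function, not something from the post, and the seed is hypothetical too, so the exact values will differ slightly from the output shown in the post.

```r
library(tidyverse)
library(moderndive)

set.seed(76)  # hypothetical seed, not taken from the post

# Hypothetical helper: draw `reps` samples of `sample_size` balls and
# return the proportion of red balls in each replicate
prop_red_for <- function(sample_size, reps = 1150) {
  bowl %>%
    rep_sample_n(size = sample_size, reps = reps) %>%
    group_by(replicate) %>%
    summarize(red = sum(color == "red")) %>%
    mutate(prop_red = red / sample_size)
}

virtual_prop_red_28  <- prop_red_for(28)
virtual_prop_red_53  <- prop_red_for(53)
virtual_prop_red_118 <- prop_red_for(118)

# Standard deviation of the 1150 proportions for each sample size
sd_28  <- virtual_prop_red_28  %>% summarize(sd = sd(prop_red))
sd_53  <- virtual_prop_red_53  %>% summarize(sd = sd(prop_red))
sd_118 <- virtual_prop_red_118 %>% summarize(sd = sd(prop_red))

sd_28
sd_53
sd_118
```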

#n = 28

# A tibble: 1 x 1
      sd
   <dbl>
1 0.0907

#n = 53

# A tibble: 1 x 1
      sd
   <dbl>
1 0.0643

#n = 118

# A tibble: 1 x 1
      sd
   <dbl>
1 0.0432

The distribution with sample size n = 118 has the smallest standard deviation around the estimated proportion of red balls.