
This function works with the output of 'stratify()'. The user provides the number of units they wish to sample from their population dataset. The function then reports how many units to sample from each stratum and generates one recruitment list per stratum. The lists can either be saved to .csv files in any given directory or accessed later from the returned object.

Usage

recruit(
  stratify_output,
  guided = TRUE,
  sample_size = NULL,
  save_as_csv = FALSE
)

Arguments

stratify_output

output from 'stratify()', of S3 class 'generalizeR_stratify'

guided

logical; defaults to TRUE. Whether the function should run in guided mode, asking questions and behaving interactively throughout. If set to FALSE, values must be provided for the other arguments below.

sample_size

defaults to NULL. The number of units to sample. Must be provided if guided is set to FALSE.

save_as_csv

logical; defaults to FALSE. If guided is set to FALSE, specifies whether to save the recruitment lists as .csv files in the working directory.

Value

A three-element list containing the recruitment lists, the recruitment table, and the corresponding kable.

Details

This function, and the others in this package, are designed to mimic the website https://www.thegeneralizer.org/ based on the papers referenced below.
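
The example output notes that recruitment numbers are proportional to stratum size and are rounded so that they sum to the requested sample size. The sketch below illustrates one common way to do this, largest-remainder rounding, using the stratum sizes from the recruitment table in the example. It is an assumption for illustration, not necessarily the rule 'recruit()' uses internally. In particular, three strata here have the same fractional part, so which of them is rounded up depends on tie-breaking and may differ from the table.

```r
# Stratum sizes from the recruitment table in the example (324 units in total).
stratum_sizes <- c(153, 75, 39, 57)
sample_size <- 72

# Exact (fractional) quota for each stratum under proportional allocation.
quota <- sample_size * stratum_sizes / sum(stratum_sizes)

# Largest-remainder rounding (an illustrative choice, not confirmed to be
# the package's method): take the floor of each quota, then hand the
# leftover units to the strata with the largest fractional parts.
allocation <- floor(quota)
leftover <- sample_size - sum(allocation)
if (leftover > 0) {
  bump <- order(quota - allocation, decreasing = TRUE)[seq_len(leftover)]
  allocation[bump] <- allocation[bump] + 1
}

# The rounded allocation always adds up to the requested sample size.
sum(allocation) == sample_size
```

Simple rounding of each quota would give 34, 17, 9, and 13, which sums to 73 rather than 72. This is why some adjustment of this kind is needed.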

References

Tipton, E. (2014). Stratified sampling using cluster analysis: A sample selection strategy for improved generalizations from experiments. Evaluation Review, 37(2), 109-139.

Tipton, E. (2014). How generalizable is your experiment? An index for comparing experimental samples and populations. Journal of Educational and Behavioral Statistics, 39(6), 478-501.

Examples

library(tidyverse)
#> ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
#>  dplyr     1.1.4      readr     2.1.5
#>  forcats   1.0.0      stringr   1.5.1
#>  lubridate 1.9.3      tibble    3.2.1
#>  purrr     1.0.2      tidyr     1.3.1
#> ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
#>  dplyr::filter() masks stats::filter()
#>  dplyr::lag()    masks stats::lag()
#>  Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

selection_covariates <- c("total", "pct_black_or_african_american", "pct_white",
                          "pct_female", "pct_free_and_reduced_lunch")
strat_output <- stratify(generalizeR:::inference_pop, guided = FALSE, n_strata = 4,
                         variables = selection_covariates, idvar = "ncessch")
#> 
#> This might take a little while. Please bear with us.
#> 
#> Calculated distance matrix.
#>  
#> iteration: 1 --> total WCSS: 338.562  -->  squared norm: 1.40551
#> iteration: 2 --> total WCSS: 205.024  -->  squared norm: 0.138412
#> iteration: 3 --> total WCSS: 203.903  -->  squared norm: 0.0419178
#> iteration: 4 --> total WCSS: 203.775  -->  squared norm: 0.0313237
#> iteration: 5 --> total WCSS: 203.729  -->  squared norm: 0
#>  
#> ===================== end of initialization 1 =====================
#>  

recruit(strat_output, guided = FALSE, sample_size = 72, save_as_csv = FALSE)
#> 
#> The 'generalizeR_stratify' object you've supplied consists of 324 population units 
#> divided into 4 strata along these variables:
#> 
#> total, pct_black_or_african_american, pct_white, pct_female, pct_free_and_reduced_lunch.
#> 
#> 4 recruitment lists have been generated, one per stratum. Each list contains the ID 
#> information for the units, which have been ranked in order of desirability.
#> 
#> The following table (also shown in the Viewer pane to the right) displays the stratum 
#> sizes, their proportion relative to the total population size, and consequent 
#> recruitment number for each stratum. Ideally, units should be recruited across strata 
#> according to these numbers. Doing so will lead to the least amount of bias and no 
#> increase in standard errors. Note that the recruitment numbers have been rounded to 
#> integers in such a way as to ensure their sum equals the desired total sample size.
#> 
#> Recruitment Table
#>                      Stratum 1 Stratum 2 Stratum 3 Stratum 4
#>     Population Units   153.000    75.000     39.00    57.000
#>  Sampling Proportion     0.472     0.231      0.12     0.176
#>   Recruitment Number    34.000    16.000      9.00    13.000
#> Attempt to recruit units starting from the top of each recruitment list. If you are 
#> unsuccessful in recruiting a particular unit, move on to the next one in the list and 
#> continue until you have reached the ideal recruitment number in each stratum.
#> 
#> If you have stored the output of 'recruit()' in an object, you can use it to access 
#> these lists by typing the name of the object followed by '$recruitment_lists'.
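
If you want the lists written somewhere other than the working directory, one option is to leave save_as_csv = FALSE and write the files yourself. The sketch below assumes that '$recruitment_lists' is a list of data frames, one per stratum. This page does not confirm that structure, so inspect your own object with 'str()' first. The stand-in data, the 'rank' column, and the file names are all hypothetical.

```r
# Stand-in for recruit_output$recruitment_lists. The real element is assumed
# (not confirmed here) to be a list of data frames, one per stratum.
recruitment_lists <- list(
  data.frame(rank = 1:3, ncessch = c("A001", "A002", "A003")),
  data.frame(rank = 1:2, ncessch = c("B001", "B002"))
)

# Write one CSV per stratum into a directory of your choice (tempdir() here).
out_dir <- tempdir()
csv_paths <- file.path(
  out_dir,
  sprintf("recruitment_list_stratum_%d.csv", seq_along(recruitment_lists))
)
for (i in seq_along(recruitment_lists)) {
  write.csv(recruitment_lists[[i]], csv_paths[i], row.names = FALSE)
}
```

With a real object, replace the stand-in list with the '$recruitment_lists' element of your stored 'recruit()' output and set 'out_dir' to the target folder.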