Appendix D (Sample Size Determination)

Recommended steps

  • Determine how many nested cells (subgroups) the analysis will use, and the minimum number of cases each cell needs to support substantive analysis.
  • Have the survey sponsor specify the desired level of precision.
  • Convert the sponsor's desired precision, expressed as a 95% confidence interval, into a sampling variance of the mean, \(var\left(\bar{y}\right)\).
    • Example: the survey sponsor wants a 95% confidence interval with a total width of .08 around the statistic of interest. Since the half-width of a 95% confidence interval (CI) is \(\frac{1}{2}\left(95\%\ \text{CI}\right)=1.96\left(se(\bar{y})\right)\), this formula can be rearranged to calculate the precision (sampling variance of the mean) from the confidence interval: \(var(\bar{y})=\left(se(\bar{y})\right)^2=\left(\frac{.5\left(95\%\ \text{CI}\right)}{1.96}\right)^2=\left(\frac{.04}{1.96}\right)^2=.0004165\).
  • Obtain an estimate of \(S^2\), the population element variance.
    • If the statistic of interest is not a proportion, find an estimate of \(S^2\) from a previous survey of the same target population.
    • If the statistic of interest is a proportion, the sampler can use the expected value of the proportion (\(p\)), even if it is a guess, to estimate \(S^2\) with the formula \(S^2=p(1-p)\).
  • Estimate the needed number of completed interviews for a simple random sample (SRS) by dividing the estimate of \(S^2\) by the sampling variance of the mean.
    • Example: the obtained estimate of \(S^2\) is .6247. Therefore the needed number of completed interviews for an SRS (\(n_{srs}\)) is: \(n_{srs}=\frac{.6247}{.0004165}=1,499.88\approx1,500\).
  • Multiply the number of completed interviews by the design effect to account for a non-SRS design.
    • Example: the design effect of a stratified clustered sample is 1.25. Taking the design effect into account, the number of completed interviews for this complex (i.e., stratified clustered) sample is: \(n_{complex}=n_{srs}\times d_{eff}=1,500\times 1.25=1,875\).
  • The sample size must account for three additional factors:
    • Not all sampled elements will want to participate in the survey (i.e., response rate).
    • Not all sampled elements, given the target population, will be eligible to participate (i.e., eligibility rate).
    • The frame will likely fail to cover all elements in the survey population (i.e., coverage rate).
  • Calculate the necessary sample size by dividing the number of completed interviews by the product of the expected response rate, eligibility rate, and coverage rate.
    • The sampler can estimate these three rates by looking at the rates obtained in previous surveys with the same survey population and survey design.
    • Example: the expected response rate is 75%, the expected eligibility rate is 90%, and the expected coverage rate is 95%. Therefore, the necessary sample size is: \(n_{final}=\frac{n_{complex}}{\text{Response rate}\times \text{Eligibility rate}\times \text{Coverage rate}}=\frac{1,875}{.75\times .9\times .95}=2,923.97\approx 2,924\).
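
The steps above can be chained into a short calculation. The following is a minimal Python sketch, not part of the original guidelines. The function and parameter names (`variance_from_ci_width`, `element_variance_for_proportion`, `required_sample_size`) are hypothetical, and it assumes that the sponsor states precision as the total width of a 95% confidence interval and that each intermediate count is rounded up to a whole interview, as in the worked examples.

```python
import math

Z_95 = 1.96  # critical value for a 95% confidence interval


def variance_from_ci_width(ci_width, z=Z_95):
    """Convert the total width of a confidence interval into var(y-bar)."""
    half_width = ci_width / 2
    return (half_width / z) ** 2


def element_variance_for_proportion(p):
    """Estimate S^2 for a proportion as p(1 - p)."""
    return p * (1 - p)


def required_sample_size(ci_width, s2, deff=1.0,
                         response_rate=1.0, eligibility_rate=1.0,
                         coverage_rate=1.0):
    """Return (n_srs, n_complex, n_final), rounding up at each step.

    Rounding up at every step mirrors the worked examples; it is an
    assumption of this sketch, not a rule stated in the text.
    """
    var_mean = variance_from_ci_width(ci_width)
    n_srs = math.ceil(s2 / var_mean)       # completed interviews under SRS
    n_complex = math.ceil(n_srs * deff)    # inflate by the design effect
    rates = response_rate * eligibility_rate * coverage_rate
    n_final = math.ceil(n_complex / rates)  # sampled elements to draw
    return n_srs, n_complex, n_final


if __name__ == "__main__":
    # Values taken from the examples in this appendix.
    print(required_sample_size(0.08, 0.6247, deff=1.25,
                               response_rate=0.75,
                               eligibility_rate=0.90,
                               coverage_rate=0.95))
```

For a proportion, `element_variance_for_proportion(p)` can supply the `s2` argument. Using \(p=.5\) gives the largest possible value of \(S^2\) (.25), which is a conservative choice when no better guess is available.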