Sample Size in Quantitative Research

 

Before collecting data, researchers should run a power analysis to determine the sample size required to detect an effect if one exists. The probability of detecting a true effect is termed statistical power (Field, 2009, p. 551). Without a large enough sample, a real effect may go undetected because the study lacks the power to detect it. There are several methods for determining the required sample size:

 

Benchmarks:

  • Benchmarks are recommended minimum sample sizes for a given analysis (e.g., factor analysis should have at least 300 observations) (Tabachnick & Fidell, 2007, p. 613).
  • Issue: benchmarks do not take into account prior effect sizes, the normality (or lack thereof) of the data, or the amount of missing data.

Ratios: 

  • Ratios provide an estimate of the minimum sample size based on the ratio of cases to predictors (independent variables). For example, Howell (2010, p. 533) mentions a 10:1 ratio, or 10 observations for each predictor; however, he also points out that Harris (1985) notes there is no empirical evidence for the 10:1 rule.
  • Issue: in regression, if there are only a few predictors, the sample size implied by the ratio could be too small to provide adequate power (Harris, 1985).

Power analyses:

  • Power analyses are calculations (done either by hand or with a computer program) that researchers can use to determine the minimum sample size needed for adequate statistical power (for a review of statistical power analyses, see Cohen, 1992a).
  • The researcher must enter a prior estimate of the effect size, the error probability (e.g., .05), the desired power, and the number of groups, predictors, or covariates. The program then determines a sample size based on these parameters.
  • To run a power analysis, the researcher needs an estimate of the expected effect size (based on prior empirical studies or theory).
  • Some software packages do not allow researchers to run power analyses on advanced statistical analyses (e.g., structural equation modeling (SEM) or confirmatory factor analysis (CFA)).
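The inputs listed above can be illustrated with a small calculation. The following is a minimal sketch, not the procedure used by any particular program: it assumes a two-sided comparison of two independent group means and uses the large-sample normal approximation, in which the required number of cases per group is roughly 2 × ((z for 1 − α/2 + z for power) / d)², where d is the standardized effect size. The function name `n_per_group` and its defaults are chosen here for illustration only.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate cases needed per group (two groups, two-sided test).

    Assumption: large-sample normal approximation. Exact calculations based
    on the t distribution give slightly larger values.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # critical value for the error probability
    z_power = standard_normal.inv_cdf(power)          # quantile for the desired power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Smaller expected effects require larger samples.
for d in (0.2, 0.5, 0.8):
    print(f"d = {d}: about {n_per_group(d)} cases per group")
```

Because the sketch relies on the normal approximation, its results should be treated as a lower bound; dedicated software that uses exact distributions will usually report a somewhat larger sample size.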

Here is a tutorial of an a priori power analysis for a one-way analysis of variance (ANOVA) using G*Power:

 

 

Monte Carlo simulation studies:

  • Monte Carlo simulations are conducted when researchers have an estimate of the population parameters or the expected effect size. Their main benefit is that researchers can vary the values of particular parameters and determine the sample size needed under a variety of conditions (e.g., normal or non-normal data, missing data, and higher or lower parameter estimates).
  • The main drawback of a Monte Carlo simulation is that the procedure is computationally and labor intensive (Paxton, Curran, Bollen, & Kirby, 2001).
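The logic of a Monte Carlo power study can be sketched in a few lines. The example below is an illustration under stated assumptions rather than a recommended design: it simulates a simple two-group mean comparison, lets the sample size, the rate of data missing completely at random, and the shape of the distribution vary, and estimates power as the proportion of replications in which the test rejects the null hypothesis. The function `simulate_power`, its parameters, and the use of a Welch-type statistic with a normal critical value are all choices made here for brevity.

```python
import math
import random
from statistics import NormalDist


def mean_and_variance(values):
    """Return the sample mean and the unbiased sample variance."""
    m = math.fsum(values) / len(values)
    v = math.fsum((x - m) ** 2 for x in values) / (len(values) - 1)
    return m, v


def simulate_power(n_per_group, effect_size, *, missing_rate=0.0,
                   skewed=False, alpha=0.05, n_reps=1000, seed=1):
    """Estimate power for a two-group mean comparison by simulation."""
    rng = random.Random(seed)
    # Assumption: a normal critical value stands in for the exact t value.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)

    def draw(shift):
        # Outcome with mean 0 and SD 1 before the group shift is added.
        if skewed:
            value = rng.expovariate(1.0) - 1.0  # right-skewed alternative
        else:
            value = rng.gauss(0.0, 1.0)
        return value + shift

    def sample(shift):
        # Each case is lost completely at random with probability missing_rate.
        return [draw(shift) for _ in range(n_per_group)
                if rng.random() >= missing_rate]

    rejections = 0
    for _ in range(n_reps):
        group_a, group_b = sample(0.0), sample(effect_size)
        if len(group_a) < 2 or len(group_b) < 2:
            continue  # too few complete cases; counted as a non-rejection
        mean_a, var_a = mean_and_variance(group_a)
        mean_b, var_b = mean_and_variance(group_b)
        se = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
        if abs(mean_b - mean_a) / se > z_crit:
            rejections += 1
    return rejections / n_reps


# Compare conditions: complete normal data, 20% missing, and skewed data.
for n in (20, 64, 100):
    print(n,
          simulate_power(n, 0.5),
          simulate_power(n, 0.5, missing_rate=0.2),
          simulate_power(n, 0.5, skewed=True))
```

Each estimate carries simulation error that shrinks as `n_reps` grows, which is one reason these studies become computationally expensive once many conditions are crossed.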

 

Special cases:

Although power analyses can be used for many univariate and multivariate analyses, they sometimes cannot be used for advanced statistical analyses such as CFA and SEM, which are large-sample techniques. In these cases, there are three approaches to determining sample size for CFA/SEM:

 

Benchmarks:

  • Barrett (2007) recommends a minimum sample size of 200.
  • Issue: benchmarks are based on normally distributed continuous data; when the data are non-normal, a larger sample size is needed (Kline, 2010).

Ratios:

  • Jackson (2003) recommends a minimum sample size based on an observation to parameter ratio, 20:1, whereas Bentler and Chou (1987) recommend a ratio of 10:1.
  • Issue: the ratios do not consider normality, the strength of the path coefficients, or the amount of missing data (Muthén & Muthén, 2002; Wolf, Harrington, Clark, & Miller, 2013).
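To see what these ratios imply in practice, the number of estimated parameters has to be counted first. The sketch below assumes a simple CFA in which each indicator loads on exactly one factor, one loading per factor is fixed to 1 for identification, the factors are free to correlate, and there are no correlated residuals or mean structure. Both function names are hypothetical and serve only to illustrate the arithmetic.

```python
def cfa_free_parameters(n_indicators, n_factors):
    """Count free parameters in a simple-structure CFA (assumed setup)."""
    if n_factors < 1 or n_indicators < n_factors:
        raise ValueError("need at least one indicator per factor")
    loadings = n_indicators - n_factors  # one loading per factor is fixed to 1
    residual_variances = n_indicators
    factor_variances = n_factors
    factor_covariances = n_factors * (n_factors - 1) // 2
    return loadings + residual_variances + factor_variances + factor_covariances


def n_from_ratio(n_parameters, cases_per_parameter):
    """Minimum sample size implied by an observation-to-parameter ratio."""
    return n_parameters * cases_per_parameter


# Example: three correlated factors measured by nine indicators in total.
q = cfa_free_parameters(n_indicators=9, n_factors=3)
print("free parameters:", q)
print("10:1 ratio:", n_from_ratio(q, 10))
print("20:1 ratio:", n_from_ratio(q, 20))
```

Under these assumptions the two ratios give minimum sample sizes that differ by a factor of two for the same model, and neither reflects the data conditions named in the issue above.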

Monte Carlo simulations:

  • In a Monte Carlo simulation, the sample size, the path coefficients of the indicators, the amount of missing data, and the normality of the data can be varied to find the sample size that provides adequate power and to assess potential biases.

 

References:

Barrett, P. (2007). Structural equation modelling: Adjudging model fit. Personality and Individual Differences, 42, 815-824.

Bentler, P. M., & Chou, C. P. (1987). Practical issues in structural modeling. Sociological Methods & Research, 16, 78-117.

Cohen, J. (1992a). Statistical power analysis. Current Directions in Psychological Science, 1, 98-101.

Field, A. (2009). Discovering statistics using SPSS (3rd ed.). London, England: Sage.

Harris, R.J. (1985). A primer of multivariate statistics (2nd ed.). New York, NY: Academic Press.

Howell, D. C. (2010). Statistical methods for psychology (7th ed.). Belmont, CA: Wadsworth Cengage Learning.

Jackson, D.L. (2003). Revisiting sample size and number of parameter estimates: Some support for the N:q hypothesis. Structural Equation Modeling, 10, 128-141.

Kline, R. (2010). Principles and practice of structural equation modeling (3rd ed.). New York, NY: Guilford.

Muthén, L.K. & Muthén, B.O. (2002). How to use a Monte Carlo study to decide on sample size and determine power. Structural Equation Modeling, 9, 599-620.

Paxton, P., Curran, P.J., Bollen, K.A., & Kirby, J. (2001). Monte Carlo experiments: Design and implementation. Structural Equation Modeling, 8, 287-312.

Tabachnick, B.G., & Fidell, L.S. (2007). Using multivariate statistics (5th ed.). New York, NY: Pearson Education, Inc.

Wolf, E.J., Harrington, K.M., Clark, S.L., & Miller, M.W. (2013). Sample size requirements for structural equation models: An evaluation of power, bias, and solution propriety. Educational and Psychological Measurement, 73, 913-934.

 

Further readings:

Cohen, J. (1992b). A power primer. Psychological Bulletin, 112, 155-159.

 

-Cohen (1992b) provides an overview of statistical power, factors that affect power, and power tables and effect size indexes for eight widely used analyses in psychology.

 


This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.