Sampling Designs for MLM

Overview of Sampling Designs for Multilevel Models

Multilevel models (a.k.a. hierarchical linear models, nested models, mixed models, random-effect models, etc.) are regression models in which parameters are allowed to vary at more than one level. The good news is that their concepts and results are very similar to those of conventional linear regression. Multilevel models are needed, however, when the data are nested or clustered.

To illustrate, consider data with a hierarchical structure, where lower-level units are nested in higher-level units (e.g., students are nested within schools, schools within states, and states within countries). Most national datasets have this kind of structure. Consequently, the individual observations are no longer independent, and ignoring the structure can bias parameter estimates and standard errors.

A narrowed population, known as the sampling frame population, is frequently used when researchers would like to generalize to a larger population (e.g., U.S. high school students) (Hancock & Mueller, 2010). For example, in a multilevel study where a researcher samples high schools and then looks at students within those schools, the researcher needs to decide:

How many schools? How many students per school? Should the same sample size be drawn from each school?

These questions are related: a large number of units at one level can partly make up for a smaller number at another level. A small number of higher-level units, however, can be a problem. It is generally better to have a larger number of higher-level units than a larger number of lower-level units nested within each of them (Tolmie, Muijs, & McAteer, 2011).

Issues related to sample size are critically important in multilevel modeling, which requires a somewhat more complex sampling design than other analyses and involves several stages of selection. The researcher should identify and justify both the sampling strategy and the type of data collection.

Here are some sampling strategies that have been used in multilevel modeling data collection:

1. Stratified sampling

Compared to simple random sampling (SRS) and systematic sampling, stratified sampling yields samples that are more representative with respect to the strata (distinct subpopulations or categories). With this method, members of the sampling frame are split into mutually exclusive categories (strata), and elements are then sampled from each category to ensure that members of each stratum are represented. The increase in precision comes from the guarantee that different types of people will be included in the sample: it becomes nearly impossible to obtain a “bad,” non-representative sample with an extreme parameter estimate (Hancock & Mueller, 2010).

Stratification can be used at any or all levels of a multistage sampling design. In general, any time stratification is used as part of the sampling design and the response variable is homogeneous within strata, the estimates from the sample will be more precise than if a sample of the same size had been obtained through SRS (O'Connell & McCoach, 2008).
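The precision gain described above can be sketched numerically. The snippet below is a minimal illustration, not a method taken from the cited sources: it uses the standard large-population approximations Var_SRS = (within + between) / n and Var_stratified = within / n for proportionate stratification, and every number in it (stratum shares, means, variances, n) is hypothetical.

```python
def mean_variance_srs_vs_stratified(shares, means, variances, n):
    """Return (var_srs, var_stratified) for the sample mean.

    shares    -- population share W_h of each stratum (should sum to 1)
    means     -- stratum means of the response variable
    variances -- within-stratum variances of the response variable
    n         -- total sample size (finite population correction ignored)
    """
    overall_mean = sum(w * m for w, m in zip(shares, means))
    # Within-stratum and between-stratum components of the population variance
    within = sum(w * v for w, v in zip(shares, variances))
    between = sum(w * (m - overall_mean) ** 2 for w, m in zip(shares, means))
    var_srs = (within + between) / n  # SRS is exposed to both components
    var_stratified = within / n       # proportionate stratification removes the between part
    return var_srs, var_stratified


# Hypothetical strata: homogeneous inside, different from each other
shares = [0.5, 0.3, 0.2]
means = [60.0, 70.0, 80.0]
variances = [25.0, 25.0, 25.0]

var_srs, var_stratified = mean_variance_srs_vs_stratified(shares, means, variances, n=400)
print(f"SRS: {var_srs:.4f}  stratified: {var_stratified:.4f}")
```

The more the strata differ from one another relative to the variation inside them, the larger the gap between the two variances becomes.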


1. Proportionate Allocation

As the most widely used method, proportionate allocation fits the idea of “representative sampling,” of samples which are “miniatures of the population,” and the notion that the “different parts of the population should be appropriately represented in the sample” (Kish, 1965). As shown in Table 1, the size of the sample drawn from each district (proportionate stratified sample) is proportional to that district’s representation in the whole region or target population.

The researcher applies the same sampling fraction to every stratum, so each stratum’s share of the sample matches its share of the target population. This equal probability of selection method (EPSEM) gives every element in the population an equal chance of being recruited, which typically leaves a smaller margin of error than a disproportionate allocation that is not optimized for precision. The assumption is that the variances and data collection costs are the same across the strata. Proportionate allocation is used when the purpose of the research is to estimate a population’s parameters (Daniel, 2011).
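As a minimal sketch of proportionate allocation, the function below splits a total sample across strata in proportion to their population sizes. The district sizes and the total sample of 200 are hypothetical and are not the values from Table 1; the largest-remainder rounding is one reasonable choice for obtaining whole numbers, not something prescribed by the cited sources.

```python
def proportionate_allocation(stratum_sizes, n):
    """Allocate a total sample of n across strata in proportion to stratum size."""
    total = sum(stratum_sizes.values())
    exact = {h: n * size / total for h, size in stratum_sizes.items()}
    # Round down first, then give leftover units to the largest fractional parts
    alloc = {h: int(x) for h, x in exact.items()}
    leftover = n - sum(alloc.values())
    by_remainder = sorted(exact, key=lambda h: exact[h] - alloc[h], reverse=True)
    for h in by_remainder[:leftover]:
        alloc[h] += 1
    return alloc


# Hypothetical district populations
districts = {"District 1": 4000, "District 2": 3000, "District 3": 2500, "District 4": 500}
sample = proportionate_allocation(districts, n=200)

for name, n_h in sample.items():
    # Under proportionate allocation the sampling fraction is the same everywhere
    print(name, n_h, n_h / districts[name])
```

With these figures every district ends up with the same sampling fraction (0.02), which is what makes the design EPSEM.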


2. Disproportionate stratified sampling:

With disproportionate stratified sampling, the sample size drawn from each stratum/category is not proportional to its representation in the total population. As a result, not every unit in the population has an equal chance of being recruited into the stratified sample. For some studies, disproportionate stratified sampling may be more appropriate than proportionate stratified sampling (Daniel, 2011).

There are three sub-types of disproportionate stratified sampling, distinguished by the purpose of the allocation (Daniel, 2011):

2.1 To facilitate within-strata analyses

2.2 To facilitate between-strata analyses

2.3 To facilitate optimum allocation (the optimization of costs, the optimization of precision, or the optimization of both precision and costs)

Table 2 presents an example of disproportionate stratified sampling that aims to facilitate within-strata analysis. Under proportionate allocation, the sample from District 4 would be so small that it might not support a meaningful, detailed analysis within District 4. As such, researchers may choose to oversample this small stratum.
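One possible way to oversample a small stratum is to impose a minimum sample size per stratum and spread the rest of the sample proportionately. The sketch below is an assumption about how this could be done, not the procedure behind Table 2; the district sizes, the total of 200, and the floor of 30 are all hypothetical.

```python
def round_to_total(exact, total):
    """Round fractional allocations so that they sum exactly to `total`."""
    alloc = {h: int(x) for h, x in exact.items()}
    leftover = total - sum(alloc.values())
    by_remainder = sorted(exact, key=lambda h: exact[h] - alloc[h], reverse=True)
    for h in by_remainder[:leftover]:
        alloc[h] += 1
    return alloc


def allocate_with_minimum(stratum_sizes, n, minimum):
    """Proportionate allocation in which no stratum receives fewer than `minimum` units."""
    if minimum * len(stratum_sizes) > n:
        raise ValueError("total sample is too small for the requested minimum")
    fixed = {}                      # strata raised to the minimum
    remaining = dict(stratum_sizes)  # strata still allocated proportionately
    while True:
        budget = n - sum(fixed.values())
        total = sum(remaining.values())
        shares = {h: budget * size / total for h, size in remaining.items()}
        too_small = [h for h, share in shares.items() if share < minimum]
        if not too_small:
            break
        for h in too_small:
            fixed[h] = minimum
            del remaining[h]
    alloc = round_to_total(shares, n - sum(fixed.values()))
    alloc.update(fixed)
    return alloc


# Hypothetical district populations; District 4 would get only 10 units proportionately
districts = {"District 1": 4000, "District 2": 3000, "District 3": 2500, "District 4": 500}
oversampled = allocate_with_minimum(districts, n=200, minimum=30)
print(oversampled)
```

Because District 4 is now over-represented, estimates for the whole region would need design weights to undo the oversampling.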


Table 3 presents an example of disproportionate stratified sampling that aims to facilitate between-strata analysis, i.e., a comparative analysis across the districts. The number of units sampled from each stratum is equal, so that no stratum is left with a sample too small for comparison.
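A minimal sketch of equal allocation follows, together with the selection probabilities and design weights it implies. The district sizes and the total of 200 are hypothetical (not the values from Table 3), and the weight N_h / n_h is the usual inverse-probability weight, shown here only to make the unequal chances of selection concrete.

```python
def equal_allocation(stratum_sizes, n):
    """Give every stratum the same sample size and report the resulting weights."""
    per_stratum = n // len(stratum_sizes)
    design = {}
    for h, size in stratum_sizes.items():
        if per_stratum > size:
            raise ValueError(f"{h} has fewer than {per_stratum} units")
        design[h] = {
            "n": per_stratum,
            "selection_prob": per_stratum / size,  # differs across strata
            "weight": size / per_stratum,          # units represented by each respondent
        }
    return design


# Hypothetical district populations
districts = {"District 1": 4000, "District 2": 3000, "District 3": 2500, "District 4": 500}
design = equal_allocation(districts, n=200)

for name, d in design.items():
    print(name, d["n"], d["selection_prob"], d["weight"])

# Weighted sample counts add back up to the population size
represented = sum(d["n"] * d["weight"] for d in design.values())
print(represented)
```

A respondent in the largest district stands for many more people than one in the smallest, which is why pooled estimates from this design should be weighted.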


Table 4 presents an example of optimum allocation disproportionate stratified sampling. Compared with proportionate stratified sampling, this allocation, which takes (a) data collection costs and (b) research precision into account, helps researchers achieve higher accuracy (a smaller margin of error). This procedure is most appropriate for a study where the strata differ in data collection costs and in the variability of the variables of interest. Researchers can apply optimum allocation focusing on cost only, precision only, or both cost and precision together.

The hypothetical data in Table 4 represent the data collection cost per unit for each district. If the researcher takes only collection cost into consideration, the district with the lowest collection cost receives the largest sample. If collection costs are unavailable, the researcher may instead use the variability of the variable of interest as the criterion (Neyman allocation).

When both collection cost and variability data are available, the researcher may optimize for both cost and precision, using the following formula as a criterion:


The researcher will then recruit a larger sample in districts with higher values of this criterion, as seen in column 9.
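A sketch of optimum allocation is given below. It assumes the standard textbook criterion, in which the sample size for stratum h is proportional to N_h * S_h / sqrt(c_h) (N_h is the stratum size, S_h the standard deviation of the variable of interest, and c_h the per-unit collection cost); whether this is exactly the criterion used for Table 4 is an assumption. Leaving out the costs gives Neyman allocation. All district sizes, standard deviations, and costs are hypothetical.

```python
import math


def optimum_allocation(sizes, n, sds=None, costs=None):
    """Allocate n with n_h proportional to N_h * S_h / sqrt(c_h).

    sds=None treats all standard deviations as equal (cost-only allocation);
    costs=None treats all unit costs as equal (Neyman allocation).
    """
    criterion = {}
    for h, size in sizes.items():
        sd = sds[h] if sds is not None else 1.0
        cost = costs[h] if costs is not None else 1.0
        criterion[h] = size * sd / math.sqrt(cost)
    total = sum(criterion.values())
    # Simple rounding; the result may differ from n by a unit or two in general
    return {h: round(n * value / total) for h, value in criterion.items()}


# Hypothetical inputs: population size, standard deviation, and cost per unit
sizes = {"District 1": 4000, "District 2": 3000, "District 3": 2500, "District 4": 500}
sds = {"District 1": 10, "District 2": 20, "District 3": 12, "District 4": 40}
costs = {"District 1": 4, "District 2": 16, "District 3": 9, "District 4": 25}

both = optimum_allocation(sizes, n=490, sds=sds, costs=costs)  # cost and precision
neyman = optimum_allocation(sizes, n=300, sds=sds)             # precision only
print(both)
print(neyman)
```

Strata that are large, highly variable, and cheap to survey receive the most sample; small, homogeneous, or expensive strata receive the least.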


2. Two-stage and Multistage cluster sampling

Most sampling for multilevel modeling is multistage in nature. In order to make valid inferences about the target population of interest, the sample needs to be designed and recruited meticulously to guarantee appropriate representation at all levels. Two-stage and multistage sampling represent a more complex form of cluster sampling in which larger clusters are further divided into smaller, more specific groupings for the purposes of surveying. Multistage sampling can be easier to implement and can create a more representative sample of the population than a single random sampling technique (Daniel, 2011).

With two-stage and multistage sampling, investigators repeat two basic steps. First, the first-stage clusters, or primary sampling units (PSUs), are defined and divided into second-stage clusters, or secondary sampling units. Then, second-stage units are selected. The sampling procedure may differ from stage to stage (e.g., simple random sampling at one stage and stratified sampling at another) (Daniel, 2011).

For example, suppose there are 50,000 students (N) in 100 schools, and a researcher would like to select a sample of 500 students (n). The researcher can select a sample of students directly or a sample of schools (clusters), using an overall sampling fraction of 1/100 (500/50,000). In a two-stage design, the researcher can use any pair of stage-level sampling fractions whose product is 1/100, e.g., 1/20 * 1/5 or 1/10 * 1/10.

1/20 for schools and 1/5 for students in school;

1/20 * 100 schools = 5 schools

1/5 * (50,000 students / 100 schools) = 100 students per school

As a result, 5 schools with 100 students per school are selected

1/10 for schools and 1/10 for students in school;

1/10 * 100 schools = 10 schools

1/10 * (50,000 students / 100 schools) = 50 students per school

As a result, 10 schools with 50 students per school are selected

Concentrating the sample in a limited number of schools helps decrease cost, but it typically increases the variance of the survey estimates.
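The arithmetic of the two designs above, and a rough sense of the variance penalty, can be sketched as follows. The design effect formula 1 + (m - 1) * ICC is the usual approximation for equal-sized clusters and is not part of the original example; the intraclass correlation of 0.10 is purely an assumed value for illustration.

```python
from fractions import Fraction


def two_stage_plan(n_students, n_schools, f_school, f_student):
    """Return (schools sampled, students per sampled school, overall sampling fraction)."""
    students_per_school = Fraction(n_students, n_schools)
    schools = f_school * n_schools
    per_school = f_student * students_per_school
    return int(schools), int(per_school), f_school * f_student


def design_effect(cluster_size, icc):
    """Approximate variance inflation relative to SRS for equal-sized clusters."""
    return 1 + (cluster_size - 1) * icc


plan_a = two_stage_plan(50_000, 100, Fraction(1, 20), Fraction(1, 5))
plan_b = two_stage_plan(50_000, 100, Fraction(1, 10), Fraction(1, 10))
print(plan_a)  # 5 schools, 100 students each
print(plan_b)  # 10 schools, 50 students each

ASSUMED_ICC = 0.10  # hypothetical intraclass correlation, not from the source
for schools, per_school, _ in (plan_a, plan_b):
    deff = design_effect(per_school, ASSUMED_ICC)
    effective_n = schools * per_school / deff
    print(schools, per_school, round(deff, 1), round(effective_n, 1))
```

Under this assumed correlation, both plans sample 500 students, but the plan with more schools and fewer students per school loses less precision to clustering.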

The systematic probability-proportional-to-size (PPS) technique is an unequal-probability selection method, usually used in multistage cluster (or stratified) sampling for population-level studies. PPS is also sometimes called “unequal probability sampling” because the chance that a sampling unit is chosen depends on its size. PPS is used when sampling units vary in size. If sampling units are selected with equal probability and the same number of elements is then drawn from each, an element in a larger sampling unit is actually less likely to be selected for the survey than an element in a smaller sampling unit. With PPS, a larger sampling unit is more likely to be chosen than a smaller one, which offsets this imbalance and thus reduces standard error and bias (Hancock & Mueller, 2010). Weighting techniques can also be used after the sample is obtained, but using PPS at the outset builds the adjustment into the design and can reduce the need for weighting later.
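Systematic PPS selection is commonly implemented by laying the units end to end on a cumulative size scale and stepping through it at a fixed interval from a random start. The sketch below follows that common recipe; the function name, the school labels, and the enrollment figures are hypothetical, and the `start` argument exists only to make the example reproducible.

```python
import random


def systematic_pps(sizes, k, start=None, seed=None):
    """Select k units with probability proportional to size.

    sizes -- mapping of unit name to its measure of size (e.g., enrollment)
    k     -- number of units to select
    start -- optional fixed starting point in [0, interval); random if omitted
    """
    total = sum(sizes.values())
    interval = total / k
    if start is None:
        start = random.Random(seed).random() * interval
    # Selection points are spaced one interval apart along the cumulative scale
    points = iter(start + i * interval for i in range(k))
    point = next(points, None)

    selected = []
    cumulative = 0
    for unit, size in sizes.items():
        cumulative += size
        # A unit is selected once for every selection point that falls inside it,
        # so a unit larger than the interval can appear more than once
        while point is not None and point < cumulative:
            selected.append(unit)
            point = next(points, None)
    return selected


# Hypothetical school enrollments (total 4,000; interval 2,000 when k = 2)
schools = {"A": 200, "B": 800, "C": 400, "D": 600, "E": 1000, "F": 1000}
chosen = systematic_pps(schools, k=2, start=500)
print(chosen)
```

Because each school occupies a stretch of the scale equal to its enrollment, a school with 1,000 students is five times as likely to contain a selection point as a school with 200.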

For instance, TIMSS (Trends in International Mathematics and Science Study) uses a two-stage sampling procedure to ensure a representative sample of students (Gonzalez & Miles, 2001). First, a national list of all eligible schools is compiled and the schools are assigned to predetermined strata. Using the PPS technique, approximately 150 schools are then randomly selected from all secondary schools in each participating country, with the probability of a school being selected proportional to its number of eighth-grade students. Stratification by region and urbanization is used to ensure that urban and rural schools in all states are represented. At the second sampling stage, one or two eligible classrooms of eighth-grade students within each sampled school are randomly selected (Gonzalez & Miles, 2001).

The PPS sampling technique is used where the probability of selecting a PSU should depend on the size of the PSU. If an equal number of final units is drawn from each PSU (e.g., 10 students at each campus) and the PSUs are drawn with equal probability, then students at smaller institutions will have a higher probability of being included in the sample, and the final sample might over-represent students at small campuses. A sampling design should result in approximately equal selection probabilities for the lowest-level units. The probability of selecting a PSU should therefore be directly proportional to its size, i.e., smaller PSUs have smaller chances of being in the sample and vice versa (Hancock & Mueller, 2010).
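The effect described above can be checked with a few lines of arithmetic. The sketch below compares a student's overall chance of selection when campuses are drawn with equal probability and when they are drawn with PPS, with 10 students taken from each sampled campus in both cases. Campus names and enrollments are hypothetical, and the PPS probability is approximated as (number of PSUs) * size / total, which assumes no campus is large enough to be selected with certainty.

```python
def student_inclusion_probabilities(psu_sizes, n_psus, take, pps):
    """Overall selection probability of a student in each PSU.

    psu_sizes -- mapping of PSU name to number of students
    n_psus    -- number of PSUs to select
    take      -- fixed number of students drawn from each selected PSU
    pps       -- True for PPS selection of PSUs, False for equal probability
    """
    total = sum(psu_sizes.values())
    probs = {}
    for psu, size in psu_sizes.items():
        if pps:
            p_psu = n_psus * size / total     # proportional to size
        else:
            p_psu = n_psus / len(psu_sizes)   # same for every PSU
        p_within = take / size                # chance of being drawn inside the PSU
        probs[psu] = p_psu * p_within
    return probs


# Hypothetical campus enrollments
campuses = {"Campus A": 500, "Campus B": 1000, "Campus C": 1500,
            "Campus D": 3000, "Campus E": 4000}

equal_prob = student_inclusion_probabilities(campuses, n_psus=2, take=10, pps=False)
pps_prob = student_inclusion_probabilities(campuses, n_psus=2, take=10, pps=True)

for name in campuses:
    print(name, round(equal_prob[name], 4), round(pps_prob[name], 4))
```

With equal-probability selection, students at the smallest campus are eight times as likely to be sampled as students at the largest; with PPS, the size terms cancel and every student has the same chance.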


Daniel, J. (2011). Sampling essentials: Practical guidelines for making sampling choices. Sage Publications.

Gonzalez, E.J., & Miles, J.A. (2001). TIMSS 1999 user guide for the international database. International Association for the Evaluation of Educational Achievement. Boston, MA.

Hancock, G. R., & Mueller, R. O. (Eds.). (2010). The reviewer’s guide to quantitative methods in the social sciences. Routledge.

Kish, L. (1965). Survey sampling. Wiley.

O'Connell, A. A., & McCoach, D. B. (Eds.). (2008). Multilevel modeling of educational data. IAP.

Tolmie, A., Muijs, D., & McAteer, E. (2011). Quantitative methods in educational and social research using SPSS. McGraw-Hill International.



This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.