Koziol, N. A. (2015). A comparison of population-averaged and cluster-specific approaches in the context of unequal probabilities of selection (Doctoral dissertation). Retrieved from Digital Commons.
Sampling designs of large-scale, federally funded studies are typically complex, involving multiple design features (e.g., clustering, unequal probabilities of selection). Researchers must account for these features in order to obtain unbiased point estimators and make valid inferences about population parameters. Single-level (i.e., population-averaged) and multilevel (i.e., cluster-specific) methods provide two alternatives for modeling clustered data. Single-level methods rely on the use of adjusted variance estimators to account for dependency due to clustering, whereas multilevel methods incorporate the dependency into the specification of the model.
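The distinction can be written out for the simplest case, a linear model with a random intercept. This is only an illustrative sketch: the symbols ($y_{ij}$ for the outcome of unit $i$ in cluster $j$, $\mathbf{x}_{ij}$, $\boldsymbol{\beta}$, $u_j$, $e_{ij}$, $\tau^2$, $\sigma^2$, $\rho$) are assumed notation and are not taken from the dissertation.

```latex
% Multilevel (cluster-specific) model: the dependency is part of the model,
% entering through a cluster-level random intercept u_j.
y_{ij} = \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_j + e_{ij},
\qquad u_j \sim N(0, \tau^2), \quad e_{ij} \sim N(0, \sigma^2)

% Single-level (population-averaged) model: only the mean is specified;
% within-cluster dependency is handled afterwards by an adjusted
% (e.g., cluster-robust) variance estimator for \hat{\boldsymbol{\beta}}.
E(y_{ij} \mid \mathbf{x}_{ij}) = \mathbf{x}_{ij}^{\top}\boldsymbol{\beta}

% Intraclass correlation implied by the random-intercept model.
\rho = \frac{\tau^2}{\tau^2 + \sigma^2}
```

In this linear case the two specifications share the same regression coefficients and differ in how the clustering is handled; with nonlinear link functions the population-averaged and cluster-specific coefficients generally have different interpretations.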
Although the literature comparing single-level and multilevel approaches is vast, comparisons have been limited to the context in which all sampling units are selected with equal probabilities (thus circumventing the need for sampling weights). Weighted multilevel modeling is more complex than weighted single-level modeling, and fully flexible methods for estimating weighted multilevel models have only recently been developed. Both approaches are used in practice, but researchers are left with minimal guidance as to which approach is most appropriate.
The goal of this study was to determine under what conditions single-level and multilevel estimators outperform one another (with respect to bias, mean square error, coverage, and root mean square error) in the context of a two-stage sampling design with unequal probabilities of selection. Monte Carlo simulation methods were used to evaluate the impact of several factors, including population model, informativeness of the design, distribution of the outcome variable, intraclass correlation coefficient, cluster size, and estimation method. Results indicated that the unweighted estimators performed similarly across conditions, whereas the weighted single-level estimators tended to outperform the weighted multilevel estimators, particularly under non-ideal sample conditions. Multilevel weight approximation methods did not perform well when the design was informative.
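The kind of Monte Carlo comparison described above can be sketched with a deliberately simplified example. This is not the dissertation's simulation design. It assumes a random-intercept population, Poisson sampling at both stages with logistic selection probabilities that depend on the outcome (an informative design), and it compares only an unweighted and an inverse-probability-weighted estimator of a population mean, not the single-level and multilevel regression estimators studied in the dissertation. All function names, parameter names, and default values are hypothetical.

```python
import numpy as np


def expit(x):
    """Logistic function mapping real values to probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def run_simulation(n_reps=200, n_clusters=200, cluster_size=20,
                   icc=0.2, informativeness=1.0, seed=0):
    """Bias and RMSE of unweighted vs. weighted means under two-stage sampling.

    Assumed setup (illustrative only): y_ij = u_j + e_ij with total variance 1,
    so the intraclass correlation equals `icc`. Selection probabilities rise
    with u_j (stage 1) and e_ij (stage 2) when `informativeness` > 0.
    """
    rng = np.random.default_rng(seed)
    tau, sigma = np.sqrt(icc), np.sqrt(1.0 - icc)
    errors = {"unweighted": [], "weighted": []}

    for _ in range(n_reps):
        # Generate a finite population of clusters and units.
        u = rng.normal(0.0, tau, size=n_clusters)
        e = rng.normal(0.0, sigma, size=(n_clusters, cluster_size))
        y = u[:, None] + e
        target = y.mean()  # finite-population mean being estimated

        # Stage 1: select clusters; stage 2: select units within them.
        p_cluster = expit(informativeness * u)
        p_unit = expit(informativeness * e)
        cluster_in = rng.random(n_clusters) < p_cluster
        sampled = (rng.random(y.shape) < p_unit) & cluster_in[:, None]

        # Sampling weight = inverse of the overall selection probability.
        w = 1.0 / (p_cluster[:, None] * p_unit)
        ys, ws = y[sampled], w[sampled]

        errors["unweighted"].append(ys.mean() - target)
        # Weighted (Hajek-type) mean: weighted total over sum of weights.
        errors["weighted"].append(np.sum(ws * ys) / np.sum(ws) - target)

    summary = {}
    for name, err in errors.items():
        err = np.asarray(err)
        summary[name] = {
            "bias": float(err.mean()),
            "rmse": float(np.sqrt(np.mean(err ** 2))),
        }
    return summary


results = run_simulation()
print(results)
```

Under these assumptions the unweighted mean should show a clear positive bias, because units with larger outcomes are more likely to be selected, while the weighted mean should be close to unbiased. Setting `informativeness=0.0` makes all selection probabilities equal, in which case the two estimators coincide.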
Because single-level and multilevel approaches each have advantages and disadvantages, researchers are advised to validate their findings by analyzing the data with more than one method. Convergence across methods lends support to the findings, whereas divergence provides a starting point for identifying potentially unreliable results. Ultimately, the appropriateness of a statistical method depends on the researcher’s aims, so even a seemingly well-performing approach may not be suitable.
Adviser: James A. Bovaird