Statistics, Department of


First Advisor

Kent Eskridge

Date of this Version

August 2023

Karnik, K.N. (2023). Exploring Experimental Design and Multivariate Analysis Techniques for Evaluating Community Structure of Bacteria in Microbiome Data (Doctoral dissertation, University of Nebraska-Lincoln).


A DISSERTATION Presented to the Faculty of The Graduate College at the University of Nebraska In Partial Fulfillment of Requirements For the Degree of Doctor of Philosophy, Major: Statistics, Under the Supervision of Professor Kent Eskridge. Lincoln, Nebraska: August, 2023

Copyright © 2023 Kelsey Karnik


The gut microbiome plays a crucial role in human health, and by working collaboratively with microbiologists, we aim to further our understanding of the human gut and its impact on health. Promoting a diverse microbiome is emphasized throughout the microbiology literature, and involving a statistician in designing experiments that relate gut bacteria to a measured health outcome is crucial for ensuring valid and accurate results. By adopting new experimental design and analysis methods, researchers can begin to gain a deeper understanding of how the genetics of our food affect the composition of taxa within the gut microbiome. This dissertation is structured around three main objectives, demonstrating how applying new experimental design techniques and multivariate analysis methodologies could benefit domain-specific researchers throughout the scientific process. First, this work developed a new experimental design structure for assigning treatments to well-plates. Second, multivariate analysis methods were used to analyze the data, creating new polymicrobial traits that introduce a community taxonomic effect into genome-wide association study (GWAS) models. Finally, the effects of experimental parameters on statistical optimality criteria were explored. Our randomizations and experimental design structure exhibited increased efficiency over a design that included only replicate effects. After we analyzed our taxonomic abundance data and decomposed the variability in multiple formats, our new pseudo-multivariate phenotypes were included in our collaborators' GWAS models. We found that 57% of the calculated polymicrobial traits were included in the GWAS models. Over half of the polymicrobial traits used as responses had either a direct or a related overlap with a univariate taxon on the same Major Effect Loci; some of these unique and helpful relationships were explored in more depth with respect to taxonomic functions within the microbiome.
Lastly, we developed a function that calculates the composite optimality criteria to compare design optimality for a multivariate linear mixed model with a covariance structure on the random genetic effects. In the future, similar models and optimal design functions could help researchers improve their experimental design layouts by leveraging their knowledge of genetic relationships in our diets and the relationships between taxa in the gut.

Advisor: Kent Eskridge