Statistics, Department of


First Advisor

Stephen D. Kachman

Date of this Version

Summer 8-2016


D.F. Wilson-Wells. Methods to Account for Breed Composition in a Bayesian GWAS Method which Utilizes Haplotype Clusters. Ph.D. thesis, University of Nebraska-Lincoln, Lincoln, Nebraska, 2017.


A DISSERTATION Presented to the Faculty of The Graduate College at the University of Nebraska In Partial Fulfillment of Requirements For the Degree of Doctor of Philosophy, Major: Statistics, Under the Supervision of Professor Stephen D. Kachman. Lincoln, Nebraska: August, 2016

Copyright (c) 2016 Danielle Faye Wilson-Wells


In livestock, prediction of an animal’s genetic merit using genomic information is becoming increasingly common. The models used to make these predictions typically assume that we are sampling from a homogeneous population. However, in both commercial and experimental populations the sire and dam of an individual may be a mixture of different breeds. Haplotype models can capture this population structure.

Two models based on breed-specific haplotype clusters were developed to account for differences across multiple breeds. The first model uses the breed composition of the individual, while the second uses the breed composition of the sire and dam. Haplotype clusters were modeled as hidden states in a hidden Markov model, with the genomic effects associated with loci located on the unobserved clusters. As in the Bayes C model, the genomic effects at the loci are given a mixture prior consisting of a multivariate normal distribution and a point mass at zero.
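The mixture prior described above can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function name `sample_effects`, the parameters `pi`, `sigma`, and `dim`, and the choice of independent normal components (a diagonal covariance) are all assumptions made for illustration only.

```python
import random


def sample_effects(n_loci, dim, pi, sigma, rng):
    """Draw one effect vector per locus from a point-mass/normal mixture.

    Assumptions (hypothetical, not taken from the dissertation):
      - pi is the prior probability that a locus has no effect,
      - the normal component has independent N(0, sigma^2) entries,
      - dim is the number of effects modeled at each locus.
    """
    effects = []
    for _ in range(n_loci):
        if rng.random() < pi:
            # Point mass at zero: the locus is excluded from the model.
            effects.append([0.0] * dim)
        else:
            # Normal component: the locus carries a nonzero effect vector.
            effects.append([rng.gauss(0.0, sigma) for _ in range(dim)])
    return effects


rng = random.Random(0)  # fixed seed so the draw is reproducible
effects = sample_effects(n_loci=1000, dim=2, pi=0.95, sigma=1.0, rng=rng)

# Count the loci that received a nonzero effect vector.
n_nonzero = sum(any(e != 0.0 for e in row) for row in effects)
print(f"{n_nonzero} of {len(effects)} loci have nonzero effects")
```

With `pi` close to one, most loci are assigned exactly zero, which is why a prior of this form tends to concentrate the estimated signal on a small number of loci.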

The performance of the first model was evaluated in a composite beef cattle population, representing various fractions of several breeds, using five weight traits, seven carcass traits, and two other traits related to calving on 6,552 cattle genotyped for 99,827 mapped SNPs. The performance of the second model was evaluated in a two-way cross population, derived from two independent lines, using age of puberty records on 1,654 swine genotyped for 48,408 mapped SNPs. Both models were also evaluated in a simulated composite population of two lines with 12,500 individuals and 61,255 mapped SNPs.

Overall, the breed-specific haplotype models led to larger and more clearly observed estimated QTL. However, the prediction accuracy of the haplotype models was typically lower than that of the traditional Bayesian GWAS models. Therefore, while our ability to locate QTL was increased, the traditional models remain the preferred choice for predicting an animal’s genetic merit because of their higher prediction accuracy.