Improving the Accuracy of Genomic Predictions: Investigation of Training Methods and Data Pooling

Johnna Lynn Baller, University of Nebraska - Lincoln


One of the primary factors in the response to selection is the accuracy of selection. This study focused on methodologies to predict breeding values (BV) accurately within multi- and single-step genomic evaluations. The effects of cross-validation method, dependent variable, and genotyping strategy on the accuracy of genomic BV were assessed using multi-step prediction in real and simulated data. In both cases, random clustering led to the largest estimated accuracies compared with clusters based on k-means, k-medoids, and principal component analysis, but differences in bias were not detected. Using deregressed estimated BV (EBV) to estimate SNP effects led to larger accuracies and smaller standard errors than using adjusted phenotypes. Randomly genotyping animals instead of selectively genotyping the top 25% was associated with the highest accuracies and the least bias.

Genetic improvement of economically relevant traits (ERT) should be the goal of breeding programs. Although generally absent in seedstock herds, ERT are routinely collected within commercial sectors; therefore, pooling data was proposed to include commercial information in a cost-effective manner. Pooling involves collecting tissue samples from a group of animals and then combining the DNA to be genotyped as one. The accuracy of EBV when pooled data were used within single-step analysis was investigated through simulation. For a single trait, pool sizes of 2, 10, 20, or 50 did not generally lead to differences in EBV accuracy compared to using individual data when pools were constructed to minimize phenotypic variation. Low-accuracy sires benefited the most from pooling, while EBV for the pools could be used for management purposes. For a bivariate analysis, pool sizes of at least 20 were recommended in combination with minimizing phenotypic variation. Additionally, if pools were constructed to minimize phenotypic variation, pooling could be used across a range of genetic correlations (0.1, 0.4, and 0.7) and ways in which missing values arise (randomly missing records or sequential culling). Collectively, these results suggest pooling can be used to include commercial data within genetic evaluations.
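
As a rough illustration of what constructing pools to minimize phenotypic variation could look like, the Python sketch below sorts animals by phenotype and cuts the sorted list into consecutive groups of a fixed size. This is an assumption made for illustration only; the abstract does not state how pools were formed in the dissertation. The function names `build_pools` and `pool_record` are hypothetical, as is the choice to represent a pool by its mean phenotype and mean allele dosage per SNP.

```python
from statistics import mean


def build_pools(phenotypes, pool_size):
    """Group animal indices into pools of similar phenotype.

    Assumption: sorting by phenotype and taking consecutive chunks is one
    simple way to keep within-pool phenotypic variation small. It is not
    necessarily the procedure used in the dissertation.
    """
    if pool_size < 1:
        raise ValueError("pool_size must be at least 1")
    # Indices of animals ordered from lowest to highest phenotype.
    order = sorted(range(len(phenotypes)), key=lambda i: phenotypes[i])
    # Consecutive chunks; the last pool may be smaller than pool_size.
    return [order[i:i + pool_size] for i in range(0, len(order), pool_size)]


def pool_record(pool, phenotypes, genotypes):
    """Return (mean phenotype, mean allele dosage per SNP) for one pool.

    Assumption: genotypes[i] is a list of 0/1/2 allele dosages for animal i,
    and a pooled genotype is approximated by the average dosage.
    """
    pooled_phenotype = mean(phenotypes[i] for i in pool)
    n_snps = len(genotypes[0])
    pooled_dosage = [mean(genotypes[i][s] for i in pool) for s in range(n_snps)]
    return pooled_phenotype, pooled_dosage
```

With four animals and a pool size of 2, `build_pools` would place the two lowest-phenotype animals in one pool and the two highest in another, and `pool_record` would then summarize each pool as a single record.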

Subject Area

Animal sciences|Agriculture

Recommended Citation

Baller, Johnna Lynn, "Improving the Accuracy of Genomic Predictions: Investigation of Training Methods and Data Pooling" (2020). ETD collection for University of Nebraska-Lincoln. AAI28258515.