Jeremy T. Howard 0000-0001-6216-3540
Date of this Version
Published in Journal of Animal Breeding and Genetics 135 (2018), pp. 251–262.
Simulated and swine industry data sets were used to assess the impact of removing older data on the predictive ability of selection candidate estimated breeding values (EBV) when using single-step genomic best linear unbiased prediction (ssGBLUP). The simulated data comprised thirty replicates designed to mimic the structure of swine data sets. For the simulated data, varying amounts of data were truncated based on the number of ancestral generations back from the selection candidates. The swine data sets consisted of phenotypic and genotypic records for three traits across two breeds on animals born from 2003 to 2017. Phenotypes and genotypes were iteratively removed 1 year at a time based on the year an animal was born. For the swine data sets, correlations between corrected phenotypes (Cp) and EBV were used to evaluate the predictive ability on young animals born in 2016–2017. In the simulated data set, keeping data from two or more ancestral generations back resulted in no statistically significant difference (p-value > 0.05) in the reduction in the true breeding value at generation 15 compared to utilizing all available data. Across swine data sets, removing phenotypes from animals born prior to 2011 resulted in a negligible change or a slight numerical increase in the correlation between Cp and EBV. Truncating data is therefore a way to alleviate computational issues without negatively impacting the predictive ability of selection candidate EBV.
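The year-by-year truncation and the correlation-based evaluation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the record layout (`birth_year` field), the function names, and the toy records are assumptions made for illustration, and the ssGBLUP step that would produce EBV from each truncated data set is deliberately left out, since it depends on external genetic evaluation software.

```python
from math import sqrt


def truncate_by_birth_year(records, first_year_kept):
    """Keep only records of animals born in first_year_kept or later.

    `records` is a list of dicts with a hypothetical 'birth_year' key.
    """
    return [r for r in records if r["birth_year"] >= first_year_kept]


def truncation_scenarios(records, first_year, last_cutoff_year):
    """Yield (cutoff_year, subset), removing one more birth year at each step."""
    for cutoff in range(first_year, last_cutoff_year + 1):
        yield cutoff, truncate_by_birth_year(records, cutoff)


def pearson(x, y):
    """Pearson correlation, e.g. between corrected phenotypes (Cp) and EBV."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Toy records, invented purely to show the mechanics (not real data).
toy_records = [
    {"id": 1, "birth_year": 2003},
    {"id": 2, "birth_year": 2010},
    {"id": 3, "birth_year": 2011},
    {"id": 4, "birth_year": 2015},
]

# One scenario per cutoff year; each subset would be passed to the
# (omitted) ssGBLUP evaluation to obtain EBV for the validation animals.
scenario_sizes = [
    (cutoff, len(subset))
    for cutoff, subset in truncation_scenarios(toy_records, 2003, 2005)
]
```

In an actual analysis, each truncated subset would feed a separate genetic evaluation, and `pearson` would then be applied to the Cp and EBV of the young validation animals to compare predictive ability across cutoffs.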
Includes Supplemental Table S1.
Applied Statistics Commons, Biostatistics Commons, Design of Experiments and Sample Surveys Commons, Genetics and Genomics Commons, Meat Science Commons, Other Mathematics Commons, Statistical Models Commons, Vital and Health Statistics Commons