Agronomy and Horticulture, Department of
Document Type
Article
Date of this Version
11-18-2024
Citation
Powadi A, Jubery TZ, Tross MC, Schnable JC and Ganapathysubramanian B (2024) Disentangling genotype and environment specific latent features for improved trait prediction using a compositional autoencoder. Front. Plant Sci. 15:1476070. doi: 10.3389/fpls.2024.1476070
Abstract
In plant breeding and genetics, predictive models traditionally rely on compact representations of high-dimensional data, often obtained with methods such as Principal Component Analysis (PCA) and, more recently, autoencoders (AE). However, these methods do not separate genotype-specific features from environment-specific features, which limits their ability to accurately predict traits influenced by both genetic and environmental factors. We hypothesize that disentangling these representations into genotype-specific and environment-specific components can enhance predictive models. To test this, we developed a compositional autoencoder (CAE) that decomposes high-dimensional data into distinct genotype-specific and environment-specific latent features. The CAE framework employs a hierarchical architecture within an autoencoder to separate these entangled latent features. Applied to a maize diversity panel dataset, the CAE demonstrated superior modeling of environmental influences and outperformed PCA, Partial Least Squares Regression (PLSR), and vanilla autoencoders, improving predictive performance 7-fold for the ‘Days to Pollen’ trait and 10-fold for ‘Yield’. By disentangling latent features, the CAE provides a powerful tool for precision breeding and genetic research. This work significantly enhances trait prediction models, advancing agricultural and biological sciences.
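The idea of splitting a latent representation into a genotype-specific part and an environment-specific part can be illustrated with a small toy sketch. This is not the authors' CAE: the abstract does not give the architecture, so the sketch assumes untrained linear encoders and uses simple averaging as a stand-in for whatever the hierarchical model learns. The function names (`encode`, `mean_vectors`, `compose_latents`), the latent sizes, and the random data are all hypothetical. The decoder and the training loop are omitted.

```python
import random

def encode(x, weights):
    """Toy linear encoder: one latent value per weight row (assumed, untrained)."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def mean_vectors(vectors):
    """Element-wise mean of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def compose_latents(samples, w_geno, w_env):
    """Build one latent vector per (genotype, environment) sample.

    samples maps (genotype, environment) -> feature vector.
    The genotype part is shared by all samples of the same genotype,
    the environment part by all samples grown in the same environment.
    """
    geno_groups, env_groups = {}, {}
    for (g, e), x in samples.items():
        geno_groups.setdefault(g, []).append(encode(x, w_geno))
        env_groups.setdefault(e, []).append(encode(x, w_env))
    geno_code = {g: mean_vectors(v) for g, v in geno_groups.items()}
    env_code = {e: mean_vectors(v) for e, v in env_groups.items()}
    # List concatenation: [genotype part] followed by [environment part].
    return {(g, e): geno_code[g] + env_code[e] for (g, e) in samples}

# Hypothetical toy data: 2 genotypes x 2 environments, 6 features each.
rng = random.Random(0)
n_features, k_geno, k_env = 6, 2, 3
w_geno = [[rng.uniform(-1, 1) for _ in range(n_features)] for _ in range(k_geno)]
w_env = [[rng.uniform(-1, 1) for _ in range(n_features)] for _ in range(k_env)]
samples = {(g, e): [rng.random() for _ in range(n_features)]
           for g in ("G1", "G2") for e in ("E1", "E2")}

latents = compose_latents(samples, w_geno, w_env)
```

In this sketch, samples of the same genotype share the first `k_geno` latent entries and samples from the same environment share the remaining `k_env` entries, so a downstream trait model could use either part on its own. A real disentangling autoencoder would learn both encoders and a decoder jointly from a reconstruction loss.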
Included in
Agricultural Science Commons, Agriculture Commons, Agronomy and Crop Sciences Commons, Botany Commons, Horticulture Commons, Other Plant Sciences Commons, Plant Biology Commons
Comments
Open access.