Article

A comparison of classical and machine learning-based phenotype prediction methods on simulated data and three plant species

FRONTIERS IN PLANT SCIENCE (2022)

Journal

FRONTIERS IN PLANT SCIENCE

Volume 13, Issue -, Pages -

Publisher

FRONTIERS MEDIA SA

DOI: 10.3389/fpls.2022.932512

Keywords

phenotype prediction; genomic selection; plant phenotyping; machine learning; Arabidopsis thaliana

Category

Plant Sciences

Funding

Federal Ministry of Education and Research (BMBF), Germany
[01|S21038]


Summary

Genomic selection is an important tool for breeders, allowing them to select plants accurately based on genotype data and improve breeding programs. This study compared 12 different phenotype prediction models on simulated and real-world data, finding that some established methods performed well on simulated data but further research is needed for real-world data.

Abstract

Genomic selection is an integral tool for breeders to accurately select plants directly from genotype data, leading to faster and more resource-efficient breeding programs. Several prediction methods have been established in the last few years. These range from classical linear mixed models to complex non-linear machine learning approaches, such as Support Vector Regression, and modern deep learning-based architectures. Many of these methods have been extensively evaluated on different crop species with varying outcomes. In this work, our aim is to systematically compare 12 different phenotype prediction models, ranging from basic genomic selection methods to more advanced deep learning-based techniques. More importantly, we assess the performance of these models on simulated phenotype data as well as on real-world data from Arabidopsis thaliana and two breeding datasets from soy and corn. The synthetic phenotypic data allow us to analyze all prediction models, and especially the selected markers, under controlled and predefined settings. We show that Bayes B and linear regression models with sparsity constraints perform best under different simulation settings with respect to explained variance. Further, we can confirm results from other studies that more complex neural network-based architectures are not superior to well-established methods for phenotype prediction. However, on real-world data, for which several prediction models yield comparable results with slight advantages for Elastic Net, this picture is less clear, suggesting considerable room for future research.
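To make the kind of experiment described in the abstract concrete, the following is a minimal sketch, not the authors' pipeline. It simulates additive phenotypes from a few causal markers, fits a linear model with a sparsity constraint (a textbook coordinate-descent Elastic Net), and scores it by explained variance on held-out samples. The simulation design (uniform 0/1/2 genotypes, additive effects, Gaussian noise), all function names, and all parameter values are assumptions chosen for illustration.

```python
import random
import statistics


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def simulate_phenotypes(n_samples, n_markers, causal_effects, noise_sd, rng):
    """Hypothetical simulator: genotypes coded 0/1/2, additive causal effects.

    causal_effects maps a marker index to its assumed effect size.
    """
    genotypes, phenotypes = [], []
    for _ in range(n_samples):
        row = [rng.choice((0, 1, 2)) for _ in range(n_markers)]
        signal = sum(effect * row[j] for j, effect in causal_effects.items())
        genotypes.append(row)
        phenotypes.append(signal + rng.gauss(0.0, noise_sd))
    return genotypes, phenotypes


def fit_elastic_net(X, y, alpha=0.05, l1_ratio=0.5, n_iter=100):
    """Coordinate descent for
    (1/2n)*||y - Xw||^2 + alpha*l1_ratio*||w||_1 + 0.5*alpha*(1-l1_ratio)*||w||^2
    on centered data; returns (weights, intercept)."""
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    col_sq = [sum(Xc[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    weights = [0.0] * p
    residual = [yi - y_mean for yi in y]  # residual for weights == 0
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # monomorphic marker carries no information
            # Correlation of marker j with the partial residual (marker j removed).
            rho = sum(Xc[i][j] * residual[i] for i in range(n)) / n
            rho += col_sq[j] * weights[j]
            new_w = soft_threshold(rho, alpha * l1_ratio)
            new_w /= col_sq[j] + alpha * (1.0 - l1_ratio)
            delta = new_w - weights[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * Xc[i][j]
                weights[j] = new_w
    intercept = y_mean - sum(w * m for w, m in zip(weights, col_means))
    return weights, intercept


def predict(X, weights, intercept):
    return [intercept + sum(w * x for w, x in zip(weights, row)) for row in X]


def explained_variance(y_true, y_pred):
    """1 - Var(y - y_hat) / Var(y), using population variances."""
    errors = [t - p for t, p in zip(y_true, y_pred)]
    return 1.0 - statistics.pvariance(errors) / statistics.pvariance(y_true)


if __name__ == "__main__":
    rng = random.Random(0)
    effects = {2: 1.5, 7: -1.0, 11: 0.8}  # assumed causal markers and effects
    X_train, y_train = simulate_phenotypes(200, 30, effects, 0.5, rng)
    X_test, y_test = simulate_phenotypes(100, 30, effects, 0.5, rng)
    weights, intercept = fit_elastic_net(X_train, y_train)
    score = explained_variance(y_test, predict(X_test, weights, intercept))
    selected = [j for j, w in enumerate(weights) if w != 0.0]
    print("explained variance on held-out samples:", round(score, 3))
    print("markers with non-zero weight:", selected)
```

Because the causal markers are known in simulated data, the non-zero weights can be checked directly against them, which is the sense in which synthetic phenotypes allow the selected markers to be analyzed under controlled settings. Real genotype data would additionally involve linkage between markers and population structure, neither of which this sketch models.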
