4.6 Article

Bayesian functional regression as an alternative statistical analysis of high-throughput phenotyping data of modern agriculture

Journal

PLANT METHODS
Volume 14, Issue -, Pages -

Publisher

BMC
DOI: 10.1186/s13007-018-0314-7

Keywords

Hyperspectral data; Functional regression analysis; Bayesian functional regression; Functional data; Bayesian Ridge Regression

Background: Modern agriculture uses hyperspectral cameras that record hundreds of reflectance values at discrete narrow bands, measured in several environments. Recently, Montesinos-Lopez et al. (Plant Methods 13(4): 1-23, 2017a. https://doi.org/10.1186/s13007-016-0154-2; Plant Methods 13(62): 1-29, 2017b. https://doi.org/10.1186/s13007-017-02124) proposed using functional regression analysis (a form of functional data analysis) to reduce the dimensionality of the bands and thus decrease the computational cost. The purpose of this paper is to discuss the advantages and disadvantages that functional regression analysis offers when analyzing hyperspectral image data. We provide a brief review of functional regression analysis and examples that illustrate the methodology. We highlight critical elements of model specification: (i) the type and number of basis functions, (ii) the degree of the polynomial, and (iii) the methods used to estimate regression coefficients. We also show how functional data analyses can be integrated into Bayesian models. Finally, we include an in-depth discussion of the challenges and opportunities presented by functional regression analysis.

Results: We used seven model-methods: one with the conventional model (M1), three using the B-splines model (M2, M4, and M6), and three using the Fourier basis model (M3, M5, and M7). The data set comprises 976 wheat lines under irrigated environments with 250 wavelengths. Under a Bayesian Ridge Regression (BRR), we compared the prediction accuracy and the implementation time (in seconds) of the seven proposed model-methods for different numbers of basis functions. Our results, as well as previously analyzed data (Montesinos-Lopez et al. 2017a, 2017b), support that around 23 basis functions are enough. Concerning the degree of the polynomial in the context of B-splines, degree 3 approximates most of the curves very well. Two satisfactory types of basis are the Fourier basis for periodic curves and the B-splines model for non-periodic curves. Under nine different bases, the seven model-methods showed similar prediction accuracy. Regarding implementation time, the fewer the basis functions, the shorter the implementation time required. Methods M2, M3, M6 and M7 were around 3.4 times faster than methods M1, M4 and M5.

Conclusions: In this study, we promote the use of functional regression modeling for analyzing high-throughput phenotypic data and indicate the advantages and disadvantages of its implementation. In addition, many key elements that are needed to understand and implement this statistical technique appropriately are provided using a real data set. We provide details for implementing Bayesian functional regression using the developed genomic functional regression (GFR) package. In summary, we believe this paper is a good guide for breeders and scientists interested in using functional regression models for implementing prediction models when their data are curves.
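The dimension reduction described in the abstract can be illustrated with a minimal Python sketch. It is not the GFR package and not the authors' code. It assumes simulated curves on a rescaled wavelength axis, a Fourier basis with one period over the observed range, trapezoid quadrature, and a ridge estimate with a fixed penalty `lam` standing in for Bayesian Ridge Regression (which would estimate the variance components instead of fixing their ratio). All function names are hypothetical; only the sizes (976 lines, 250 wavelengths, 23 basis functions) come from the abstract.

```python
import numpy as np


def fourier_basis(t, n_basis):
    """Evaluate a Fourier basis (constant, then sin/cos pairs) at the points t."""
    t = np.asarray(t, dtype=float)
    omega = 2.0 * np.pi / (t.max() - t.min())  # one full period over the observed range
    s = t - t.min()
    cols = [np.ones_like(s)]
    k = 1
    while len(cols) < n_basis:
        cols.append(np.sin(k * omega * s))
        if len(cols) < n_basis:
            cols.append(np.cos(k * omega * s))
        k += 1
    return np.column_stack(cols)  # shape (len(t), n_basis)


def trapezoid_weights(t):
    """Quadrature weights so that sum(w * f(t)) approximates the integral of f."""
    d = np.diff(t)
    w = np.zeros_like(t, dtype=float)
    w[:-1] += d / 2.0
    w[1:] += d / 2.0
    return w


def functional_covariates(X, t, n_basis):
    """Map curves X (n x m) to Z (n x n_basis), z_ik ~ integral of x_i(t) * phi_k(t) dt."""
    return (X * trapezoid_weights(t)) @ fourier_basis(t, n_basis)


def ridge_fit(Z, y, lam):
    """Ridge estimate with a fixed penalty (assumption: stands in for BRR)."""
    z_mean, y_mean = Z.mean(axis=0), y.mean()
    Zc = Z - z_mean
    coef = np.linalg.solve(Zc.T @ Zc + lam * np.eye(Z.shape[1]), Zc.T @ (y - y_mean))
    return y_mean - z_mean @ coef, coef


# Simulated data only: the sizes follow the abstract, the values do not.
rng = np.random.default_rng(0)
n, m, n_basis = 976, 250, 23
t = np.linspace(0.0, 1.0, m)                      # hypothetical rescaled wavelength axis
scores = rng.normal(size=(n, 5))
X = scores @ fourier_basis(t, 5).T + 0.05 * rng.normal(size=(n, m))
beta = 2.0 * np.sin(2 * np.pi * t) + np.cos(4 * np.pi * t)  # assumed true beta(t)
y = (X * trapezoid_weights(t)) @ beta + 0.3 * rng.normal(size=n)

# 250 wavelengths per line are reduced to 23 functional covariates.
Z = functional_covariates(X, t, n_basis)
train, test = slice(0, 780), slice(780, n)
intercept, coef = ridge_fit(Z[train], y[train], lam=1.0)
y_hat = intercept + Z[test] @ coef
accuracy = np.corrcoef(y[test], y_hat)[0, 1]      # Pearson correlation on held-out lines
print(Z.shape, round(float(accuracy), 3))
```

Replacing `fourier_basis` with a cubic B-spline basis would give the non-periodic variant mentioned in the abstract; the rest of the sketch would stay the same, because the regression only sees the reduced matrix `Z`.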

Recommended

Article Genetics & Heredity

Multi-trait genomic-enabled prediction enhances accuracy in multi-year wheat breeding trials

Abelardo Montesinos-Lopez, Daniel E. Runcie, Maria Itria Ibba, Paulino Perez-Rodriguez, Osval A. Montesinos-Lopez, Leonardo A. Crespo, Alison R. Bentley, Jose Crossa

Summary: Implementing genomic-based prediction models in genomic selection involves understanding how to evaluate prediction accuracy from different models and methods using multi-trait data. This study compared prediction accuracy using six large multi-trait wheat datasets and found that a corrected Pearson's correlation method was more accurate than the traditional method. For grain yield, using a multi-trait model yielded higher prediction performance compared to a single-trait model, with the benefits increasing as genetic correlations between traits strengthen.

G3-GENES GENOMES GENETICS (2021)

Article Genetics & Heredity

Bayesian multitrait kernel methods improve multienvironment genome-based prediction

Osval Antonio Montesinos-Lopez, Jose Cricelio Montesinos-Lopez, Abelardo Montesinos-Lopez, Juan Manuel Ramirez-Alcaraz, Jesse Poland, Ravi Singh, Susanne Dreisigacker, Leonardo Crespo, Sushismita Mondal, Velu Govidan, Philomin Juliana, Julio Huerta Espino, Sandesh Shrestha, Rajeev K. Varshney, Jose Crossa

Summary: This study explores Bayesian multitrait kernel methods for genomic prediction and finds that the Gaussian kernel method outperforms traditional methods in prediction performance, capturing nonlinear patterns more efficiently. Evaluating multiple kernels to select the best one is recommended.

G3-GENES GENOMES GENETICS (2022)

Article Plant Sciences

Using an incomplete block design to allocate lines to environments improves sparse genome-based prediction in plant breeding

Osval Antonio Montesinos-Lopez, Abelardo Montesinos-Lopez, Ricardo Acosta, Rajeev K. Varshney, Alison Bentley, Jose Crossa

Summary: Genomic selection is a predictive method used in plant breeding that trains machine learning models with a reference population to predict new lines. This study proposes using incomplete block designs for allocating lines to locations, which outperforms random allocation in terms of predictive performance.

PLANT GENOME (2022)

Article Plant Sciences

Comparing gradient boosting machine and Bayesian threshold BLUP for genome-based prediction of categorical traits in wheat breeding

Osval Antonio Montesinos-Lopez, Henry Nicole Gonzalez, Abelardo Montesinos-Lopez, Maria Daza-Torres, Morten Lillemo, Jose Cricelio Montesinos-Lopez, Jose Crossa

Summary: Genomic selection is a predictive methodology that is changing plant breeding. In this study, the performance of two algorithms (TGBLUP and GBM) was compared on wheat datasets, and GBM outperformed TGBLUP in terms of prediction accuracy. Further research is encouraged to explore the virtues of GBM in genomic selection.

PLANT GENOME (2022)

Article Genetics & Heredity

A General-Purpose Machine Learning R Library for Sparse Kernels Methods With an Application for Genome-Based Prediction

Osval Antonio Montesinos Lopez, Brandon Alejandro Mosqueda Gonzalez, Abel Palafox Gonzalez, Abelardo Montesinos Lopez, Jose Crossa

Summary: This paper presents a new software package (SKM) for implementing six popular supervised machine learning algorithms with the optional use of sparse kernels, as well as a function for computing seven different kernels. SKM focuses on user simplicity and computational efficiency, providing a user-friendly format for algorithms and reducing resources needed for kernel machine learning methods.

FRONTIERS IN GENETICS (2022)

Article Genetics & Heredity

Partial Least Squares Enhances Genomic Prediction of New Environments

Osval A. Montesinos-Lopez, Abelardo Montesinos-Lopez, Kismiantini, Armando Roman-Gallardo, Keith Gardner, Morten Lillemo, Roberto Fritsche-Neto, Jose Crossa

Summary: Improved prediction of future seasons or new environments is crucial for plant breeding. This study demonstrates that the partial least squares regression method outperforms the Bayesian genomic best linear unbiased predictor method in predicting future seasons or new environments.

FRONTIERS IN GENETICS (2022)

Article Genetics & Heredity

A Comparison of Three Machine Learning Methods for Multivariate Genomic Prediction Using the Sparse Kernels Method (SKM) Library

Osval A. Montesinos-Lopez, Abelardo Montesinos-Lopez, Bernabe Cano-Paez, Carlos Moises Hernandez-Suarez, Pedro C. Santana-Mancilla, Jose Crossa

Summary: Genomic selection has revolutionized the way plant breeders select genotypes, using statistical machine learning models to predict phenotypic values of new lines. Multi-trait genomic prediction models leverage correlated traits to improve accuracy. This paper compares the performance of three multi-trait methods and finds that their performance varies under different predictors.

Article Genetics & Heredity

Multi-trait genome prediction of new environments with partial least squares

Osval A. Montesinos-Lopez, Abelardo Montesinos-Lopez, David Alejandro Bernal Sandoval, Brandon Alejandro Mosqueda-Gonzalez, Marco Alberto Valenzo-Jimenez, Jose Crossa

Summary: The genomic selection methodology has revolutionized plant breeding by using statistical machine learning algorithms to predict candidate individuals. However, it faces challenges when predicting future seasons or new environments. This study compared the performance of the multi-trait partial least square (MT-PLS) regression method with the Bayesian Multi-trait Genomic Best Linear Unbiased Predictor (MT-GBLUP) method and found that MT-PLS outperforms MT-GBLUP in predicting future seasons or new environments.

FRONTIERS IN GENETICS (2022)

Article Plant Sciences

Enviromic-based kernels may optimize resource allocation with multi-trait multi-environment genomic prediction for tropical Maize

Raysa Gevartosky, Humberto Fanelli Carvalho, Germano Costa-Neto, Osval A. Montesinos-Lopez, Jose Crossa, Roberto Fritsche-Neto

Summary: This study aimed to design optimized training sets for genomic prediction in multi-trait multi-environment trials and to assess how those methods may increase accuracy while reducing phenotyping costs. The combined use of genomic and enviromic data efficiently designs optimized training sets for genomic prediction, improving the response to selection per dollar invested.

BMC PLANT BIOLOGY (2023)

Article Genetics & Heredity

Integrating Parental Phenotypic Data Enhances Prediction Accuracy of Hybrids in Wheat Traits

Osval A. Montesinos-Lopez, Alison R. Bentley, Carolina Saint Pierre, Leonardo Crespo-Herrera, Josafhat Salinas Ruiz, Patricia Edwigis Valladares-Celis, Abelardo Montesinos-Lopez, Jose Crossa

Summary: Genomic selection (GS) is a revolutionary plant breeding method that allows the selection of candidate genotypes without the need for field phenotypic evaluation. This study investigated the genomic prediction accuracy of wheat hybrids by incorporating covariates with parental phenotypic information into the model. The results showed that the models with parental information outperformed those without parental information, and the inclusion of covariates significantly improved prediction accuracy compared to marker information. However, the use of parental phenotypic information as covariates is expensive and not always available.

Article Biotechnology & Applied Microbiology

Two simple methods to improve the accuracy of the genomic selection methodology

Osval A. Montesinos-Lopez, Kismiantini, Abelardo Montesinos-Lopez

Summary: Genomic selection (GS) is revolutionizing plant and animal breeding, but its practical implementation faces challenges due to uncontrolled factors. To improve prediction accuracy, this paper proposes two methods: reformulating GS as a binary classification problem, and applying postprocessing to adjust the classification threshold. Both methods outperformed the conventional regression model, with the postprocessing method showing better results.

BMC GENOMICS (2023)

Article Genetics & Heredity

Multimodal deep learning methods enhance genomic prediction of wheat breeding

Abelardo Montesinos-Lopez, Carolina Rivera, Francisco Pinto, Francisco Pinera, David Gonzalez, Mathew Reynolds, Paulino Perez-Rodriguez, H. Li, Osval A. Montesinos-Lopez, Jose Crossa

Summary: By comparing a novel deep learning (DL) method with conventional genomic prediction (GP) models, this study found that the DL method predicts phenotypes from genomic data more accurately in plant breeding research and can account for the complexity of genotype-environment interaction. However, traditional GP models can also achieve high accuracy in certain situations.

G3-GENES GENOMES GENETICS (2023)

Article Plant Sciences

Efficacy of plant breeding using genomic information

Osval A. Montesinos-Lopez, Alison R. Bentley, Carolina Saint Pierre, Leonardo Crespo-Herrera, Leonardo Rebollar-Ruellas, Patricia Edwigis Valladares-Celis, Morten Lillemo, Abelardo Montesinos-Lopez, Jose Crossa

Summary: Genomic selection (GS), proposed by Meuwissen et al. more than 20 years ago, is revolutionizing plant and animal breeding. In our study of 14 real datasets, we found that the average gain in prediction accuracy when genomic information is considered was 26.31%. The quality of the markers and relatedness of the individuals can greatly impact the increase in prediction accuracy.

PLANT GENOME (2023)

Article Environmental Sciences

Yield Adjustment Using GPR-Derived Spatial Covariance Structure in Cassava Field: A Preliminary Investigation

Afolabi Agbona, Osval A. Montesinos-Lopez, Mark E. Everett, Henry Ruiz-Guzman, Dirk B. Hays

Summary: Many aspects of below-ground plant performance are not fully understood, including their spatial and temporal dynamics in relation to environmental factors. In this study, Ground-Penetrating Radar (GPR) was evaluated for its potential in normalizing spatial heterogeneity and estimating fresh root yield in a cassava field trial. The results showed that the GPR-based autoregressive (AR) model outperformed other models, indicating the potential of GPR in non-destructive yield estimation and field spatial heterogeneity normalization in root and tuber crop programs.

REMOTE SENSING (2023)

Article Agronomy

Designing optimal training sets for genomic prediction using adversarial validation with probit regression

Osval Montesinos-Lopez, Kismiantini, Abelardo Montesinos-Lopez

Summary: Genomic selection is revolutionizing animal and plant breeding, but its implementation faces challenges due to mismatch in training and testing set distributions. This research used the adversarial validation method with probit regression to address the distribution mismatch and select optimal training sets. Evaluations showed that the proposed method effectively detected the mismatch and outperformed existing methods, achieving higher prediction accuracy.

PLANT BREEDING (2023)
