Article
Multidisciplinary Sciences
Iqra Babar, Hamdi Ayed, Sohail Chand, Muhammad Suhail, Yousaf Ali Khan, Riadh Marzouki
Summary: This study proposes new estimators for the shrinkage parameter in multiple linear regression models with multicollinearity. The new estimators outperform existing ones in scenarios with high and severe multicollinearity. An empirical application using a tobacco dataset supports the superiority of the new estimators.
Article
Multidisciplinary Sciences
Mahdi Roozbeh, Arta Rouhi, Nur Anisah Mohamed, Fatemeh Jahadi
Summary: Classical regression approaches are not suitable for analyzing high-dimensional datasets with more explanatory variables than observations, as the results can be misleading. In this study, we propose using modern techniques like support vector regression, symmetry functional regression, ridge, and lasso regression methods to analyze such data. We introduce a generalized support vector regression approach that improves the performance of support vector regression by accurately estimating the penalty parameter using cross-validation. We evaluate the efficiency of the proposed estimators based on three criteria and apply them to real and simulated high-dimensional datasets.
Article
Forestry
Seongsu Park, Byung-Dae Park, Yongku Kim
Summary: This study analyzed the influence of manufacturing variables on the properties of MDF panels using Lasso and Elastic-net regression. The results showed that panel density and modulus of rupture were closely related to specific manufacturing variables. However, it is still unclear which variables predominantly affect the formaldehyde emission of MDF.
EUROPEAN JOURNAL OF WOOD AND WOOD PRODUCTS
(2023)
Article
Public, Environmental & Occupational Health
Sewwandi Bandara, Wakana Oishi, Syun-suke Kadoya, Daisuke Sano
Summary: The majority of viral outbreaks are fast-spreading events established within 2-10 hours, depending on the decay rates of viruses, which determine the critical time interval for successful transmission between humans. By calculating the decay rate values for different surfaces and aerosols, we determined the best estimations for respiratory viruses, including SARS-CoV-2, SARS-CoV, MERS-CoV, influenza viruses, and RSV. The decay rate values in aerosols for these viruses were found to be 4.83 ± 5.70, 0.40 ± 0.24, 0.11 ± 0.04, 2.43 ± 5.94, and 1.00 ± 0.50 h⁻¹, respectively. The choice of regression model varied based on virus type, with Bayesian regression performing better for SARS-CoV-2 and influenza viruses, and ridge regression performing better for SARS-CoV and MERS-CoV. Simulation using improved estimations can facilitate the discovery of effective non-pharmaceutical interventions for virus control.
INTERNATIONAL JOURNAL OF HYGIENE AND ENVIRONMENTAL HEALTH
(2023)
Article
Chemistry, Multidisciplinary
Sangsung Park, Sunghae Jun
Summary: This paper proposes a statistical method for quantitative patent analysis and applies it to analyze drone technology. By transforming patent documents into structured data using text mining techniques and applying Bayesian additive regression trees for analysis, technology scenarios for drones are constructed.
APPLIED SCIENCES-BASEL
(2022)
Article
Engineering, Electrical & Electronic
Jiahao Xu, Sai Tang, Pengyan Li, Hexu Zhang
Summary: Based on literature review and empirical analysis, it is found that the increase in grain output is mainly attributed to the increase in sown area of grain crops. The use of agricultural fertilizers and the increase in rural electricity consumption are also driving factors. However, the impact of total power of agricultural machinery is limited, and natural disasters have a certain negative impact on food production.
JOURNAL OF SENSORS
(2022)
Article
Green & Sustainable Science & Technology
Vincent Tanoe, Saul Henderson, Amir Shahirinia, Mohammad Tavakoli Bina
Summary: This study uses frequentist statistical methods and machine learning techniques to conduct multivariate linear regression analysis on wind speed data, examining the differences between the Bayesian and non-Bayesian approaches and identifying key features. The results show that the Bayesian and non-Bayesian methods yield similar coefficient estimates and parameters, with daily data giving strong coefficient estimates and higher R-squared values than hourly and weekly data.
JOURNAL OF RENEWABLE AND SUSTAINABLE ENERGY
(2021)
Article
Automation & Control Systems
Fouzi Douak, Noureddine Ghoggali, Rachid Hedjam, Mohamed Lamine Mekhalfi, Nabil Benoudjit, Farid Melgani
Summary: This work introduces a new algorithm that combines nonlinear kernel regressors with optimization based on a multi-objective genetic algorithm to improve techniques used in spectroscopic data regression analysis. The algorithm simultaneously optimizes multiple complementary objectives for better outlier detection.
JOURNAL OF CHEMOMETRICS
(2021)
Article
Environmental Sciences
Shu Ji, Chen Gu, Xiaobo Xi, Zhenghua Zhang, Qingqing Hong, Zhongyang Huo, Haitao Zhao, Ruihong Zhang, Bin Li, Changwei Tan
Summary: In this study, the correlation between canopy reflectance spectrum and leaf area index (LAI) of rice was analyzed. Estimation models based on characteristic bands and vegetation indices were able to accurately predict the LAI of rice, meeting the requirements of large-scale statistical monitoring in the field.
Article
Economics
Luc Bauwens, Guillaume Chevillon, Sebastien Laurent
Summary: Two recent studies have found conditions for large dimensional networks or systems to generate long memory. Building on these findings, we propose a multivariate methodology for modeling and forecasting series with long range dependence. By incorporating long memory properties in a vector autoregressive system of order 1, and applying Bayesian estimation or ridge regression, we outperform univariate time series long memory models in forecasting daily volatility for 250 U.S. company stocks over twelve years. This empirical validation supports the theoretical results that long memory can be sourced from marginalization within a large dimensional system.
JOURNAL OF ECONOMETRICS
(2023)
Article
Plant Sciences
Ce Liu, Xiaoxiao Liu, Yike Han, Xi'ao Wang, Yuanyuan Ding, Huanwen Meng, Zhihui Cheng
Summary: Genomic prediction was applied in cucumber breeding, demonstrating high predictive ability of GCA models for cucumber traits. Non-additive effects significantly influenced trait prediction, with a relatively higher proportion of additive-by-additive genetic variance components.
FRONTIERS IN PLANT SCIENCE
(2021)
Article
Multidisciplinary Sciences
Adewale F. Lukman, Emmanuel Adewuyi, Kristofer Mansson, B. M. Golam Kibria
Summary: A new estimator is proposed in this study to address the issue of multicollinearity in Poisson regression models, and both simulation experiments and real-life application demonstrate its superior performance compared to other estimators.
SCIENTIFIC REPORTS
(2021)
Article
Multidisciplinary Sciences
Faisal Maqbool Zahid, Shahla Faisal, Christian Heumann
Summary: In high-dimensional settings, multiple imputation (MI) is challenging. A semi-compatible imputation model is proposed that relaxes the lasso penalty and uses a ridge penalty to address instability and convergence issues. The proposed approach shows superior performance to existing MI techniques in simulation studies and real-life datasets while addressing compatibility problems.
Article
Geosciences, Multidisciplinary
Jianfeng Sun, Tiesheng Yan, Jinshu Hu, Chao Ma, Jiajun Gao, Hui Xu
Summary: This paper proposes a susceptibility assessment model based on the frequency ratio (FR) coupled with multiple regression analysis to overcome the challenges of complex parameters and limited historical hazards data. Five coupled models were established and verified, with FR-PLSR and FR-RR showing better performance. The recommended models provide an effective approach for high-accuracy landslide assessment and prevention at the slope scale.
Article
Remote Sensing
Jin Xu, Lindi J. Quackenbush, Timothy A. Volk, Stephen Stehman
Summary: This study evaluated the uncertainty of shrub willow health characterization based on unmanned aerial systems (UAS) data. The results showed that regression models built at different spatial scales could be applied across time, space, and scales. The study also quantified the uncertainty of model parameters and found that the uncertainty increased as pixel size increased. The findings provide guidance for future experimental design to save resources.
INTERNATIONAL JOURNAL OF APPLIED EARTH OBSERVATION AND GEOINFORMATION
(2022)
Article
Genetics & Heredity
Abelardo Montesinos-Lopez, Daniel E. Runcie, Maria Itria Ibba, Paulino Perez-Rodriguez, Osval A. Montesinos-Lopez, Leonardo A. Crespo, Alison R. Bentley, Jose Crossa
Summary: Implementing genomic-based prediction models in genomic selection involves understanding how to evaluate prediction accuracy from different models and methods using multi-trait data. This study compared prediction accuracy using six large multi-trait wheat datasets and found that a corrected Pearson's correlation method was more accurate than the traditional method. For grain yield, using a multi-trait model yielded higher prediction performance compared to a single-trait model, with the benefits increasing as genetic correlations between traits strengthen.
G3-GENES GENOMES GENETICS
(2021)
Article
Genetics & Heredity
Osval Antonio Montesinos-Lopez, Jose Cricelio Montesinos-Lopez, Abelardo Montesinos-Lopez, Juan Manuel Ramirez-Alcaraz, Jesse Poland, Ravi Singh, Susanne Dreisigacker, Leonardo Crespo, Sushismita Mondal, Velu Govidan, Philomin Juliana, Julio Huerta Espino, Sandesh Shrestha, Rajeev K. Varshney, Jose Crossa
Summary: This study explores Bayesian multitrait kernel methods for genomic prediction and finds that the Gaussian kernel method outperforms traditional methods in prediction performance, capturing nonlinear patterns more efficiently. Evaluating multiple kernels to select the best one is recommended.
G3-GENES GENOMES GENETICS
(2022)
Article
Plant Sciences
Osval Antonio Montesinos-Lopez, Abelardo Montesinos-Lopez, Ricardo Acosta, Rajeev K. Varshney, Alison Bentley, Jose Crossa
Summary: Genomic selection is a predictive method used in plant breeding that trains machine learning models with a reference population to predict new lines. This study proposes using incomplete block designs for allocating lines to locations, which outperforms random allocation in terms of predictive performance.
Article
Plant Sciences
Osval Antonio Montesinos-Lopez, Henry Nicole Gonzalez, Abelardo Montesinos-Lopez, Maria Daza-Torres, Morten Lillemo, Jose Cricelio Montesinos-Lopez, Jose Crossa
Summary: Genomic selection is a predictive methodology that is changing plant breeding. In this study, the performance of two algorithms (TGBLUP and GBM) was compared on wheat datasets, and GBM outperformed TGBLUP in terms of prediction accuracy. Further research is encouraged to explore the virtues of GBM in genomic selection.
Article
Genetics & Heredity
Osval Antonio Montesinos Lopez, Brandon Alejandro Mosqueda Gonzalez, Abel Palafox Gonzalez, Abelardo Montesinos Lopez, Jose Crossa
Summary: This paper presents a new software package (SKM) for implementing six popular supervised machine learning algorithms with the optional use of sparse kernels, as well as a function for computing seven different kernels. SKM focuses on user simplicity and computational efficiency, providing a user-friendly format for algorithms and reducing resources needed for kernel machine learning methods.
FRONTIERS IN GENETICS
(2022)
Article
Genetics & Heredity
Osval A. Montesinos-Lopez, Abelardo Montesinos-Lopez, Kismiantini, Armando Roman-Gallardo, Keith Gardner, Morten Lillemo, Roberto Fritsche-Neto, Jose Crossa
Summary: Improved prediction of future seasons or new environments is crucial for plant breeding. This study demonstrates that the partial least squares regression method outperforms the Bayesian genomic best linear unbiased predictor method in predicting future seasons or new environments.
FRONTIERS IN GENETICS
(2022)
Article
Genetics & Heredity
Osval A. Montesinos-Lopez, Abelardo Montesinos-Lopez, Bernabe Cano-Paez, Carlos Moises Hernandez-Suarez, Pedro C. Santana-Mancilla, Jose Crossa
Summary: Genomic selection has revolutionized the way plant breeders select genotypes, using statistical machine learning models to predict phenotypic values of new lines. Multi-trait genomic prediction models leverage correlated traits to improve accuracy. This paper compares the performance of three multi-trait methods and finds that their performance varies under different predictors.
Article
Genetics & Heredity
Osval A. Montesinos-Lopez, Abelardo Montesinos-Lopez, David Alejandro Bernal Sandoval, Brandon Alejandro Mosqueda-Gonzalez, Marco Alberto Valenzo-Jimenez, Jose Crossa
Summary: The genomic selection methodology has revolutionized plant breeding by using statistical machine learning algorithms to predict candidate individuals. However, it faces challenges when predicting future seasons or new environments. This study compared the performance of the multi-trait partial least square (MT-PLS) regression method with the Bayesian Multi-trait Genomic Best Linear Unbiased Predictor (MT-GBLUP) method and found that MT-PLS outperforms MT-GBLUP in predicting future seasons or new environments.
FRONTIERS IN GENETICS
(2022)
Article
Plant Sciences
Raysa Gevartosky, Humberto Fanelli Carvalho, Germano Costa-Neto, Osval A. Montesinos-Lopez, Jose Crossa, Roberto Fritsche-Neto
Summary: This study aimed to design optimized training sets for genomic prediction in multi-trait multi-environment trials and to assess how those methods may increase accuracy while reducing phenotyping costs. The combined use of genomic and enviromic data efficiently designs optimized training sets for genomic prediction, improving the response to selection per dollar invested.
Article
Genetics & Heredity
Osval A. Montesinos-Lopez, Alison R. Bentley, Carolina Saint Pierre, Leonardo Crespo-Herrera, Josafhat Salinas Ruiz, Patricia Edwigis Valladares-Celis, Abelardo Montesinos-Lopez, Jose Crossa
Summary: Genomic selection (GS) is a revolutionary plant breeding method that allows the selection of candidate genotypes without the need for field phenotypic evaluation. This study investigated the genomic prediction accuracy of wheat hybrids by incorporating covariates with parental phenotypic information into the model. The results showed that the models with parental information outperformed those without parental information, and the inclusion of covariates significantly improved prediction accuracy compared to marker information. However, the use of parental phenotypic information as covariates is expensive and not always available.
Article
Biotechnology & Applied Microbiology
Osval A. Montesinos-Lopez, Kismiantini, Abelardo Montesinos-Lopez
Summary: Genomic selection (GS) is revolutionizing plant and animal breeding, but its practical implementation faces challenges due to uncontrolled factors. To improve prediction accuracy, this paper proposes two methods: reformulating GS as a binary classification problem, and applying postprocessing to adjust the classification threshold. Both methods outperformed the conventional regression model, with the postprocessing method showing better results.
Article
Genetics & Heredity
Abelardo Montesinos-Lopez, Carolina Rivera, Francisco Pinto, Francisco Pinera, David Gonzalez, Mathew Reynolds, Paulino Perez-Rodriguez, H. Li, Osval A. Montesinos-Lopez, Jose Crossa
Summary: By comparing a novel deep learning (DL) method with conventional genomic prediction (GP) models, this study found that the DL method has higher accuracy in predicting phenotypes in plant breeding research and can account for the complexity of genotype-environment interaction. However, traditional GP models can also achieve high accuracy in certain situations.
G3-GENES GENOMES GENETICS
(2023)
Article
Plant Sciences
Osval A. Montesinos-Lopez, Alison R. Bentley, Carolina Saint Pierre, Leonardo Crespo-Herrera, Leonardo Rebollar-Ruellas, Patricia Edwigis Valladares-Celis, Morten Lillemo, Abelardo Montesinos-Lopez, Jose Crossa
Summary: Genomic selection (GS), proposed by Meuwissen et al. more than 20 years ago, is revolutionizing plant and animal breeding. In our study of 14 real datasets, we found that the average gain in prediction accuracy when genomic information is considered was 26.31%. The quality of the markers and relatedness of the individuals can greatly impact the increase in prediction accuracy.
Article
Environmental Sciences
Afolabi Agbona, Osval A. Montesinos-Lopez, Mark E. Everett, Henry Ruiz-Guzman, Dirk B. Hays
Summary: Many aspects of below-ground plant performance are not fully understood, including their spatial and temporal dynamics in relation to environmental factors. In this study, Ground-Penetrating Radar (GPR) was evaluated for its potential in normalizing spatial heterogeneity and estimating fresh root yield in a cassava field trial. The results showed that the GPR-based autoregressive (AR) model outperformed other models, indicating the potential of GPR in non-destructive yield estimation and field spatial heterogeneity normalization in root and tuber crop programs.
Article
Agronomy
Osval Montesinos-Lopez, Kismiantini, Abelardo Montesinos-Lopez
Summary: Genomic selection is revolutionizing animal and plant breeding, but its implementation faces challenges due to a mismatch between the training and testing set distributions. This research used the adversarial validation method with probit regression to address the distribution mismatch and select optimal training sets. Evaluations showed that the proposed method effectively detected the mismatch and outperformed existing methods, achieving higher prediction accuracy.