Article
Multidisciplinary Sciences
Omid Bazgir, James Lu
Summary: Robust and accurate survival prediction is crucial in pharmacogenomics; however, current machine learning tools lack predictive performance and model interpretability. In this study, we extend the application of REFINED-CNN to survival prediction using high-dimensional RNA sequencing data. We show that the REFINED-CNN survival model can be easily adapted to new tasks with low patient numbers, and that it can provide both local and global interpretations of feature importance in survival prediction.
Review
Genetics & Heredity
Bader Arouisse, Tom P. J. M. Theeuwen, Fred A. van Eeuwijk, Willem Kruijer
Summary: The advances in high-throughput phenotyping have led to a greater number of secondary traits being observed, posing a challenge to improving genomic prediction for the target trait. Existing methods have limitations when dealing with a large number of secondary traits, emphasizing the need for novel approaches to enhance prediction accuracy.
FRONTIERS IN GENETICS
(2021)
Article
Business, Finance
Caio Vigo Pereira
Summary: Using several frameworks and datasets with an increasing number of predictors, the study shows that efficient portfolios can be built. Compared with previous studies that used naive OLS and low-dimensional information sets, better out-of-sample results are achieved by considering large conditioning information sets and applying variable selection, shrinkage methods, and factor models.
INTERNATIONAL REVIEW OF FINANCIAL ANALYSIS
(2021)
Article
Mathematics, Applied
Siwei Xia, Yuehan Yang, Hu Yang
Summary: Portfolio selection, a fundamental problem in finance, is addressed in this paper with considerations of dimensionality and market complexities. The focus is on a passive portfolio management strategy, specifically index tracking, taking into account factors such as no short sales, volatility, transaction costs, and limited effective samples. An effective method combining the nonconcave SCAD penalty with a nonnegativity constraint is proposed for high-dimensional sparse portfolio selection. Statistical properties and the Multiplicative Updates algorithm are studied, and comprehensive comparisons with existing nonnegative methods are provided through simulations and empirical analysis, revealing the superior performance of the proposed method.
APPLIED MATHEMATICS AND COMPUTATION
(2023)
Article
Chemistry, Analytical
Eric Martial Taguem, Luisa Mennicken, Anne-Claude Romain
Summary: The study developed a quantile regression model for methane estimation using MOS gas sensors at a municipal solid waste treatment plant subject to biogas leakages. Data processing involved drift correction, the addition of interaction terms, PCA, and log transformation. The field-calibrated model demonstrated the effectiveness of the data processing methods and highlighted the caution needed when using models with MOS gas sensors.
SENSORS AND ACTUATORS B-CHEMICAL
(2021)
Article
Biochemical Research Methods
Delora Baptista, Pedro G. Ferreira, Miguel Rocha
Summary: This article critically reviews recent studies that have used deep learning methods to predict drug response in cancer cell lines and introduces the characteristics of DL and the architectures used in these studies. It also provides an overview of publicly available drug screening data resources and discusses the limitations of these methods.
BRIEFINGS IN BIOINFORMATICS
(2021)
Article
Mathematical & Computational Biology
Yuzhe Zhang, Xu Zhang, Hong Zhang, Aiyi Liu, Catherine C. Liu
Summary: Motivated by the diagnosis of COVID-19 using 2D image biomarkers, this study proposes a novel latent matrix-factor regression model. The model conducts dimension reduction respecting the geometric characteristics of the matrix covariate to avoid iteration and maintain structural information. The proposed approach outperforms existing methods and efficiently predicts COVID-19 in simulation experiments and on a real-world dataset.
STATISTICS IN MEDICINE
(2023)
Article
Multidisciplinary Sciences
Faisal Maqbool Zahid, Shahla Faisal, Christian Heumann
Summary: Multiple imputation (MI) is challenging in high-dimensional settings. A semi-compatible imputation model is proposed that relaxes the lasso penalty and uses a ridge penalty to address instability and convergence issues. The proposed approach shows superior performance to existing MI techniques in simulation studies and real-life datasets while addressing compatibility problems.
Article
Engineering, Electrical & Electronic
Boning Liu, Yan Zhao, Xiaomeng Jiang, Shigang Wang
Summary: In recent years, kernel methods have been extensively studied. This study introduces a 3-D Epanechnikov Mixture Regression (EMR) based on the Epanechnikov kernel (EK) for image coding. The research improves the EM algorithm with mean square error optimization and proposes an Adaptive Mode Selection (AMS) algorithm for optimal modeling that combines Gaussian and Epanechnikov kernels.
Article
Biochemical Research Methods
Di He, Lei Xie
Summary: CLEIT is a novel network framework that aims to address challenges in predicting genotype-phenotype associations by integrating multiple incoherent omics data, learning latent representations of high-level domains, and leveraging unlabeled data to improve the generalizability of predictive models. Compared with state-of-the-art methods, it demonstrates effectiveness in predicting anti-cancer drug sensitivity from somatic mutations with the assistance of gene expression data.
Article
Statistics & Probability
Jingwen Tu, Hu Yang, Chaohui Guo, Jing Lv
Summary: This article introduces a high-dimensional semiparametric model averaging approach for predicting the conditional quantile of the response variable. The proposed method estimates and selects important model weights, and its finite-sample performance is evaluated through simulations and real data analysis.
STATISTICAL PAPERS
(2021)
Article
Multidisciplinary Sciences
Zhi Zhao, Shixiong Wang, Manuela Zucknick, Tero Aittokallio
Summary: Researchers developed a mix-lasso model that accurately predicts drug response and identifies tissue-specific predictive features, addressing the limitations of current statistical models in leveraging various cancer tissues and multi-omics profiles.
Article
Statistics & Probability
Shiyi Tang, Jiali Zheng
Summary: This paper proposes a penalized estimation method for finite mixtures of ultra-high dimensional regression models. A two-step procedure is used for order selection and variable selection, with a specific EM algorithm maximizing the penalized log-likelihood function. Numerical simulations and an empirical study validate the effectiveness of the method.
COMMUNICATIONS IN STATISTICS-THEORY AND METHODS
(2022)
Article
Biochemistry & Molecular Biology
Alexander L. R. Lubbock, Leonard A. Harris, Vito Quaranta, Darren R. Tyson, Carlos F. Lopez
Summary: Thunor is an open-source software platform designed to manage, analyze, and visualize large dose-dependent cell proliferation datasets. It supports both end-point and time-based proliferation assays, providing a user-friendly interface with interactive plots.
NUCLEIC ACIDS RESEARCH
(2021)
Review
Biochemical Research Methods
Fangfang Xia, Jonathan Allen, Prasanna Balaprakash, Thomas Brettin, Cristina Garcia-Cardona, Austin Clyde, Judith Cohn, James Doroshow, Xiaotian Duan, Veronika Dubinkina, Yvonne Evrard, Ya Ju Fan, Jason Gans, Stewart He, Pinyi Lu, Sergei Maslov, Alexander Partin, Maulik Shukla, Eric Stahlberg, Justin M. Wozniak, Hyunseung Yoo, George Zaki, Yitan Zhu, Rick Stevens
Summary: To enable personalized cancer treatment, machine learning models have been developed to predict drug response based on tumor and drug features. This study used machine learning to analyze five publicly available cell line-based data sets and rigorously evaluated the model generalizability between different studies. The results showed that a multitasking deep neural network achieved the best cross-study generalizability, with models trained on the CTRP data set providing the most accurate predictions on testing data, and the gCSI data set being the most predictable among the cell line data sets.
BRIEFINGS IN BIOINFORMATICS
(2022)