Editorial Material
Biochemistry & Molecular Biology
Maxwell A. Sherman
Summary: It is crucial for the biomedical community to protect the privacy of participants in genomic studies, and the accurate and efficient implementation of secure genotype imputation offers practical ways to safeguard sensitive genomic data for various bioinformatics applications.
Article
Multidisciplinary Sciences
Rui Marcalo, Sonya Neto, Miguel Pinheiro, Ana J. Rodrigues, Nuno Sousa, Manuel A. S. Santos, Paula Simao, Carla Valente, Lilia Andrade, Alda Marques, Gabriela R. Moura
Summary: This study reveals a high genetic heterogeneity for COVID-19 susceptibility and severity across global populations, and it suggests that the prognosis of patients with COPD is not related to genetic risk.
Article
Biotechnology & Applied Microbiology
Xiutao Pan, Zhong Li, Shengwei Qin, Minzhe Yu, Hang Hu
Summary: The novel method scLRTC, based on low-rank tensor completion, shows superior performance in imputing dropout values in scRNA-seq data compared to state-of-the-art tools. It excels in restoring gene expression levels and achieving accurate cell classification results on both simulated and real datasets.
Article
Multidisciplinary Sciences
Chih Chuan Shih, Jieqi Chen, Ai Shan Lee, Nicolas Bertin, Maxime Hebrard, Chiea Chuen Khor, Zheng Li, Joanna Hui Juan Tan, Wee Yang Meah, Su Qin Peh, Shi Qi Mok, Kar Seng Sim, Jianjun Liu, Ling Wang, Eleanor Wong, Jingmei Li, Aung Tin, Ching-Yu Cheng, Chew-Kiat Heng, Jian-Min Yuan, Woon-Puay Koh, Seang Mei Saw, Yechiel Friedlander, Xueling Sim, Jin Fang Chai, Yap Seng Chong, Sonia Davila, Liuh Ling Goh, Eng Sing Lee, Tien Yin Wong, Neerja Karnani, Khai Pang Leong, Khung Keong Yeo, John C. Chambers, Su Chi Lim, Rick Siow Mong Goh, Patrick Tan, Rajkumar Dorajoo
Summary: Genomic researchers are increasingly using commercial cloud service providers (CSPs) to manage data and analytics needs. However, without adequate security controls, the risk of unauthorized access to cloud-stored data may be higher. The Research Assets Provisioning and Tracking Online Repository (RAPTOR) by the Genome Institute of Singapore is a cloud-native genomics data repository and analytics platform that implements a five-safes framework to provide security and governance controls to data contributors and users, ensuring compliance with regulations.
Article
Biochemical Research Methods
Kristiina Ausmees, Carl Nettelblad
Summary: The study investigates the benefits of an imputation method based on haplotype frequencies in ancient DNA analysis, showing improved accuracy and ability to capture rare variation at lower coverages. The software prophaser is optimized for parallel processing on GPUs, offering reasonable runtimes in experiments.
Article
Engineering, Electrical & Electronic
Hanxuan Dong, Fan Ding, Huachun Tan, Yuankai Wu, Qin Li, Bin Ran
Summary: A novel tensor completion method is proposed in this paper for imputing missing data in the origin-destination matrices of rail transit. By establishing an OD-matrix tensor and extracting similarity matrices, the method successfully achieves accurate imputation of missing data.
IET INTELLIGENT TRANSPORT SYSTEMS
(2021)
Article
Genetics & Heredity
Alison R. Barton, Maxwell A. Sherman, Ronen E. Mukamel, Po-Ru Loh
Summary: This study leveraged haplotype sharing in the UK Biobank to impute exome-wide variants and identified significant associations involving rare protein-altering variants. The research revealed significant associations in multiple genes and proposed allelic series containing multiple "likely-causal" variants.
Article
Computer Science, Information Systems
Liqiao Yang, Jifei Miao, Kit Ian Kou
Summary: This paper proposes a method that applies a quaternion matrix framework to image completion and approximates rank with a new quaternion matrix logarithmic norm. Unlike traditional methods that handle RGB channels separately and may destroy the image structure, this method uses a pure quaternion matrix to preserve the image structure. The logarithmic norm is used to improve the accuracy of rank estimation, and experimental results show that this approach achieves superior performance in color image completion.
INFORMATION SCIENCES
(2022)
Article
Biochemistry & Molecular Biology
Zhichao Li, Xiaosen Jiang, Mingyan Fang, Yong Bai, Siyang Liu, Shujia Huang, Xin Jin
Summary: The Chinese Millionome Database (CMDB) contains low-coverage whole-genome sequencing (WGS) data from 141,431 unrelated healthy Chinese individuals, covering 9.04 million single nucleotide variants (SNVs) with allele frequency information. The CMDB is the most representative and comprehensive Chinese population genome database to date, housing data from a multi-ethnic Chinese population with wide geographical distribution.
NUCLEIC ACIDS RESEARCH
(2023)
Article
Computer Science, Artificial Intelligence
Guorui Li, Guang Guo, Sancheng Peng, Cong Wang, Shui Yu, Jianwei Niu, Jianli Mo
Summary: This paper introduces a new approach to solve the low-rank matrix completion problem. By designing a new non-convex Schatten capped p norm, which balances between the rank and nuclear norm of the matrix, a matrix completion method is proposed. Through extensive experiments in image inpainting, the proposed method is shown to improve the accuracy of matrix completion compared with existing methods.
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
(2022)
Article
Multidisciplinary Sciences
Adityanarayanan Radhakrishnan, George Stefanakis, Mikhail Belkin, Caroline Uhler
Summary: This paper proposes an infinite width neural network framework for matrix completion, which is simple, fast, and flexible. The effectiveness of the framework is demonstrated through competitive results in applications such as virtual drug screening and image inpainting/reconstruction.
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA
(2022)
Article
Psychiatry
Bettina Konte, James T. R. Walters, Dan Rujescu, Sophie E. Legge, Antonio F. Pardinas, Dan Cohen, Munir Pirmohamed, Jari Tiihonen, Annette M. Hartmann, Jan P. Bogers, Jan van der Weide, Karen van der Weide, Anu Putkonen, Eila Repo-Tiihonen, Tero Hallikainen, Ed Silva, Oddur Ingimarsson, Engilbert Sigurdsson, James L. Kennedy, Patrick F. Sullivan, Marcella Rietschel, Gerome Breen, Hreinn Stefansson, Kari Stefansson, David A. Collier, Michael C. O'Donovan, Ina Giegling
Summary: The study suggests that the HLA-DQB1 gene plays a significant role in clozapine-induced agranulocytosis and neutropenia, indicating the involvement of immune system factors. Using local ancestry estimates can help identify risk variants and improve the prediction of hematological adverse effects.
TRANSLATIONAL PSYCHIATRY
(2021)
Article
Radiology, Nuclear Medicine & Medical Imaging
Ming Fan, You Zhang, Zhenyu Fu, Maosheng Xu, Shiwei Wang, Sangma Xie, Xin Gao, Yue Wang, Lihua Li
Summary: The DMC method proposed in this study significantly improved the imputation performance by integrating tumor histological and radiomics data. It showed better prediction performance compared to other methods, indicating its potential in tumor characterization and patient management.
Article
Environmental Sciences
Wei Zhang, Jian Yang, Qiang Li, Jingran Lin, Huaizong Shao, Guomin Sun
Summary: Cooperative electromagnetic data annotation is a crucial step in signal processing applications. This study proposes a low-rank matrix recovery approach for cooperative annotation, which effectively exploits the correlation of electromagnetic signals and observations from multiple receivers.
Article
Multidisciplinary Sciences
Sho Hosoya, Sota Yoshikawa, Mana Sato, Kiyoshi Kikuchi
Summary: The study found that the prediction accuracy of genomic selection for standard length, body weight, and testes weight in tiger pufferfish was within an acceptable range when using 4000 or 1200 SNPs. However, predictive abilities decreased with fewer than 1200 SNPs due to reduced accuracy in estimating genetic relationships among individuals.
SCIENTIFIC REPORTS
(2021)
Article
Biology
Christopher A. German, Janet S. Sinsheimer, Jin Zhou, Hua Zhou
Summary: The availability of longitudinal data from electronic health records and wearable devices has opened up new research questions. In many studies, individual variability of a longitudinal outcome is as important as the mean. This article proposes a scalable method, WiSER, for estimating and inferring the effects of predictors on within-subject variance. It is robust and computationally efficient.
Article
Mathematical & Computational Biology
Shanpeng Li, Ning Li, Hong Wang, Jin Zhou, Hua Zhou, Gang Li
Summary: This paper addresses the computational barriers in semiparametric joint models for longitudinal and competing risk survival data, and proposes customized linear scan algorithms to reduce computational complexities and significantly speed up the existing methods.
COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE
(2022)
Article
Statistics & Probability
Jocelyn T. Chi, Eric C. Chi
Summary: We introduce a user-friendly computational framework for implementing robust versions of structured regression methods. The framework allows robust regression with the L2 criterion for additional structural constraints, without requiring complex tuning procedures. It can be used to identify heterogeneous subpopulations and can incorporate nonrobust structured regression solvers. We provide convergence guarantees for the framework and demonstrate its flexibility with examples.
JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS
(2022)
Article
Computer Science, Artificial Intelligence
Xiaoqian Liu, Eric C. Chi
Summary: The paper introduces a newly proposed regularizer called the generalized minimax concave (GMC) penalty, which maintains the convexity of the objective function. The paper focuses on signal recovery with the linearly involved GMC penalty and presents a new method for setting the matrix parameter and solving the penalty. The paper also analyzes the desirable properties of the solution path and applies the linearly involved GMC penalty to 1-D signal recovery and matrix regression, demonstrating its superior performance compared to the total variation (TV) regularizer.
PATTERN RECOGNITION LETTERS
(2022)
Article
Computer Science, Artificial Intelligence
Min Zhang, Gal Mishne, Eric C. Chi
Summary: The paper introduces a new method for constructing row and column affinities even when data are missing by leveraging a co-clustering technique. The method solves the optimization problem for multiple pairs of cost parameters and fills in missing values with increasingly smooth estimates. This approach takes advantage of the coupled similarity structure among both the rows and columns of a data matrix.
STATISTICAL ANALYSIS AND DATA MINING
(2022)
Article
Computer Science, Artificial Intelligence
Xinkai Zhou, Jin J. Zhou, Hua Zhou
Summary: The study introduces a highly efficient statistical method for analyzing very large longitudinal datasets, showing significant advantages over traditional methods.
STATISTICAL ANALYSIS AND DATA MINING
(2022)
Article
Statistics & Probability
Kenneth Lange, Hua Zhou
Summary: Nan Laird has made a significant impact on computational statistics, particularly in the areas of the expectation-maximisation algorithm and longitudinal modelling. This article revisits the derivation of some of her most useful algorithms, using the perspective of the minorisation-maximisation principle. The MM principle allows for a more straightforward implementation of the classical EM algorithm and suggests the potential for faster convergence in entirely new algorithms, particularly in high-dimensional settings.
INTERNATIONAL STATISTICAL REVIEW
(2022)
Article
Plant Sciences
Ryan Buck, Diego Ortega-Del Vecchyo, Catherine Gehring, Rhett Michelson, Dulce Flores-Renteria, Barbara Klein, Amy V. Whipple, Lluvia Flores-Renteria
Summary: This study evaluates the formation, structure, and maintenance of a multispecies interbreeding network, and finds that gene flow in syngameons can increase genetic diversity, facilitate colonization of new environments, and contribute to hybrid speciation. The study also demonstrates that participation in syngameons can maintain morphological and genetic distinctiveness at species boundaries, while allowing for extensive gene flow in sympatric areas.
Article
Mathematics, Applied
Joong-Ho Won, Teng Zhang, Hua Zhou
Summary: This paper studies an optimization problem on the sum of traces of matrix quadratic forms in m semiorthogonal matrices, which can be considered as a generalization of the synchronization of rotations. The paper shows that its semidefinite programming relaxation solves the original nonconvex problems exactly with high probability under an additive noise model with small noise in the order of O(m^{1/4}). In addition, it shows that the sufficient condition for global optimality considered in a previous paper is also necessary under a similar small noise condition.
SIAM JOURNAL ON OPTIMIZATION
(2022)
Article
Statistics & Probability
Xiaoqian Liu, Eric C. Chi, Kenneth Lange
Summary: Building on previous research, this article focuses on estimation in robust structured regression under the L2E criterion. The authors propose a new algorithm for updating the regression coefficients using the majorization-minimization (MM) principle, which achieves faster convergence compared to the existing method. They also simplify and accelerate the estimation process by reparameterizing the model and estimating precision using a modified Newton's method. Additionally, the authors introduce distance-to-set penalties for constrained estimation, resulting in improved performance in coefficient estimation and structure recovery. The proposed tactics are validated through simulation examples and a real data application.
Article
Statistics & Probability
Qiang Heng, Hua Zhou, Eric C. Chi
Summary: Proximal Markov chain Monte Carlo is a novel approach that combines Bayesian computation with convex optimization to popularize the use of nondifferentiable priors in Bayesian statistics. This article extends the paradigm of proximal MCMC by introducing a new class of nondifferentiable priors called epigraph priors. The proposed method enables automated regularization parameter selection and achieves simultaneous calibration of mean, scale, and regularization parameters in a fully Bayesian framework.
JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS
(2023)
Article
Statistics & Probability
Maoran Xu, Hua Zhou, Yujie Hu, Leo L. Duan
Summary: In statistical applications, it is common to encounter parameters supported on a varying or unknown dimensional space. To address this, we propose a new generative process for the prior: starting from a continuous random variable, we transform it into a varying-dimensional space using the proximal mapping. This allows us to directly exploit popular frequentist regularizations and algorithms, while providing a principled and probabilistic uncertainty estimation.
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
(2023)
Article
Statistics & Probability
Qiang Heng, Eric C. Chi, Yufeng Liu
Summary: In this article, a robust Tucker decomposition estimator called Tucker-L2E, based on the L2 criterion, is presented to enhance the robustness against outliers. Numerical experiments demonstrate that Tucker-L2E has stronger recovery performance in challenging high-rank scenarios compared to existing alternatives. The appropriate Tucker-rank can be selected in a data-driven manner using cross-validation or hold-out validation. The practical effectiveness of Tucker-L2E is validated on real data applications in fMRI tensor denoising, PARAFAC analysis of fluorescence data, and feature extraction for classification of corrupted images.
Article
Genetics & Heredity
Santiago G. Medina-Munoz, Diego Ortega-Del Vecchyo, Luis Pablo Cruz-Hervert, Leticia Ferreyra-Reyes, Lourdes Garcia-Garcia, Andres Moreno-Estrada, Aaron P. Ragsdale
Summary: This study used high-coverage whole-genome data and existing genomes from Latin America to infer the complex evolutionary history of Latin American populations. The models developed in this study provide a more accurate prediction of genetic variation in admixed populations and can be a valuable resource for future studies.
AMERICAN JOURNAL OF HUMAN GENETICS
(2023)