Article
Statistics & Probability
Hongyuan Cao, Jun Chen, Xianyang Zhang
Summary: The article introduces a method that uses auxiliary information to improve the statistical power of large-scale multiple testing in high-dimensional inference. The method builds on a two-group mixture model with structural relationship constraints and applies an optimal rejection rule to control the false discovery rate, which enhances power. Its advantages are verified through theoretical and empirical analysis.
ANNALS OF STATISTICS
(2022)
Article
Statistics & Probability
Xianyang Zhang, Jun Chen
Summary: This article introduces an FDR control procedure that can incorporate covariate information in large-scale inference problems. The proposed procedure is implemented using a fast algorithm and has been shown to have asymptotic validity even in cases of misspecified models and weakly dependent p-values. Extensive simulations demonstrate that the method improves upon existing approaches in terms of flexibility, robustness, power, and computational efficiency. The method is applied to omics datasets from genomics studies to identify features associated with clinical and biological phenotypes, and shows superiority, particularly in sparse signal scenarios.
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
(2022)
Article
Biochemical Research Methods
Belinda Phipson, Choon Boon Sim, Enzo R. Porrello, Alex W. Hewitt, Joseph Powell, Alicia Oshlack
Summary: Single-cell RNA sequencing (scRNA-seq) is widely used for profiling cell transcriptomes. The propeller method is a robust approach that leverages biological replication to find significant differences in cell type proportions. It performs well in various scenarios.
Article
Statistics & Probability
Hsin-Cheng Huang, Noel Cressie, Andrew Zammit-Mangion, Guowen Huang
Summary: The study introduces a new EFDR-CS procedure for incomplete data defined on irregular small areas, using conditional simulation to estimate the signal and combining M p-values using copulas for hypothesis testing. This procedure is demonstrated through a simulation study and applications to real data in the Asia-Pacific region and Middle East, Afghanistan, and Pakistan.
JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS
(2021)
Article
Automation & Control Systems
Molei Liu, Yin Xia, Kelly Cho, Tianxi Cai
Summary: Identifying informative predictors in a high-dimensional regression model is crucial for association analysis and predictive modeling. Signal detection often fails in high-dimensional settings due to limited sample size, but meta-analyzing multiple studies can help improve power. Integrative analysis of high-dimensional data from different studies poses challenges, especially with data sharing constraints, but a new method called DSILT is proposed for signal detection without sharing individual-level data. The method incorporates proper estimation and debiasing procedures to construct test statistics for specific covariates, and a multiple testing procedure is developed to control false discovery rate and identify significant effects. Simulation studies show the proposed testing procedure performs well in controlling false discoveries and achieving power.
JOURNAL OF MACHINE LEARNING RESEARCH
(2021)
Article
Management
Ron Berman, Christophe Van den Bulte
Summary: The study finds that about 70% of the effects tested in website A/B experiments are true nulls, which leads to high false discovery rates. Decision makers should be aware that roughly one in five interventions achieving significance at the 5% significance level may be ineffective in practice.
MANAGEMENT SCIENCE
(2021)
Article
Multidisciplinary Sciences
Otilia Menyhart, Boglarka Weltz, Balazs Gyorffy
Summary: Scientists across disciplines face the challenge of evaluating multiple hypotheses simultaneously, which requires consideration of statistical testing and confidence measures. Various strategies exist to address the issue of multiple hypothesis testing, with one approach being the use of multiple-testing correction methods.
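As an illustration of one widely used multiple-testing correction, the sketch below implements the Benjamini-Hochberg step-up procedure in plain Python. It is not taken from the article summarized above; the function name `benjamini_hochberg` and the example p-values are assumptions made for illustration, and the procedure is stated under its usual assumption of independent (or positively dependent) p-values.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (illustrative sketch).

    Returns a list of booleans, True where the hypothesis is rejected
    at target false discovery rate `alpha`.
    """
    m = len(p_values)
    if m == 0:
        return []

    # Indices of the p-values in ascending order of p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k (1-based) with p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank

    # Reject every hypothesis whose p-value is among the k_max smallest.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical p-values, for illustration only.
    p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(p, alpha=0.05))
```

Because the procedure is step-up, a hypothesis can be rejected even when its own p-value exceeds its rank-based threshold, as long as a larger p-value further down the sorted list meets its threshold.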
Article
Mathematics, Interdisciplinary Applications
Iram Mushtaq, Qin Zhou, Xuemin Zi
Summary: In the era of big data, high-dimensional data arrive in streams, which makes timely and accurate decisions crucial. Identifying individuals whose behavior deviates from the norm has become particularly important. The authors propose a large-scale dynamic testing system based on false discovery rate (FDR) control to detect as many irregular behavioral patterns as possible. By leveraging the sequential nature of data streams, they develop a screening-assisted procedure that filters and tests streams sequentially. Simulation studies and a real-data example show that the proposed method is accurate and powerful.
JOURNAL OF SYSTEMS SCIENCE & COMPLEXITY
(2023)
Article
Health Care Sciences & Services
Lauren J. Beesley, Irina Bondarenko, Michael R. Elliot, Allison W. Kurian, Steven J. Katz, Jeremy M. G. Taylor
Summary: This paper describes how to generalize the sequential regression multiple imputation procedure to handle non-random missingness when missingness may depend on other variables. The method reduces bias in the final analysis compared to standard techniques, using approximation strategies involving inclusion of an offset in the imputation model.
STATISTICAL METHODS IN MEDICAL RESEARCH
(2021)
Article
Automation & Control Systems
Tijana Zrnic, Aaditya Ramdas, Michael Jordan
Summary: This study focuses on controlling the false discovery rate in asynchronous online testing, proposing a general framework that addresses dependency issues and improves existing algorithms. The use of conflict sets is highlighted as a way to better manage dependencies among test statistics.
JOURNAL OF MACHINE LEARNING RESEARCH
(2021)
Article
Biochemical Research Methods
Tristan Mary-Huard, Sarmistha Das, Indranil Mukhopadhyay, Stephane Robin
Summary: This study introduces the concept of composed hypothesis and rephrases the problem of testing complex hypotheses as a classification task, demonstrating that finding items for which the composed null hypothesis is rejected boils down to fitting a mixture model and classifying the items according to their posterior probabilities. The study showcases the efficiency and usefulness of the developed method in simulations and on two different applications, providing valuable biological insight.
Article
Multidisciplinary Sciences
Ludivine Obry, Cyril Dalmasso
Summary: In this study, we evaluated recent weighted multiple testing procedures for genome-wide association studies (GWAS) through a simulation study. We also introduced a new efficient procedure called wBHa, which prioritizes the detection of genetic variants with low minor allele frequencies while maximizing overall detection power. Our results demonstrated that wBHa outperformed other procedures in detecting rare variants while maintaining good overall power.
Article
Biochemical Research Methods
Arya Ebadi, Jack Freestone, William S. Noble, Uri Keich
Summary: Controlling the false discovery rate (FDR) in proteomics experiments using target decoy competition (TDC) only controls the average proportion of false discoveries. However, the actual proportion of false discoveries (FDP) may exceed the specified FDR threshold. We demonstrate this using real data and present two methods, FDP Stepdown and TDC Uniform Band, which help bridge the gap between controlling the expected FDR and the empirical FDP.
JOURNAL OF PROTEOME RESEARCH
(2023)
Article
Biochemical Research Methods
Kyowon Jeong, Philipp T. Kaulich, Wonhyeuk Jung, Jihyung Kim, Andreas Tholey, Oliver Kohlbacher
Summary: Top-down proteomics provides more comprehensive proteoform-level information, but reliable data analysis remains challenging. The conventional FDR estimation method may not work at the proteoform level, and the precursor deconvolution error rate should be taken into account.
Article
Physics, Multidisciplinary
Viola Meroni, Carlo De Michele
Summary: This study investigates the application of multiple testing corrections in complex network analysis. By comparing four different methods, it is found that false discovery rate correction is a better option.
PHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS
(2022)