Article
Biochemistry & Molecular Biology
Shin June Kim, Youngjae Oh, Jaesik Jeong
Summary: As technology advances, analyzing complex, large-scale data requires more advanced techniques. When comparing methods for analyzing omics data from two groups, controlling error rates such as the false discovery rate is crucial. Three methods were selected for a comparison study: Efron's one-dimensional approach and the two-dimensional approaches of Ploner and Kim. Variants of Ploner's approach were also included in the performance comparison on simulated and real data.
Article
Biochemical Research Methods
Arya Ebadi, Jack Freestone, William S. Noble, Uri Keich
Summary: Controlling the false discovery rate (FDR) in proteomics experiments with target-decoy competition (TDC) controls only the average proportion of false discoveries, so the actual false discovery proportion (FDP) in a given experiment may exceed the specified FDR threshold. We demonstrate this using real data and present two methods, FDP Stepdown and TDC Uniform Band, which help bridge the gap between controlling the FDR, which is an expectation, and controlling the realized FDP.
JOURNAL OF PROTEOME RESEARCH
(2023)
Article
Statistics & Probability
Gunnar Taraldsen, Jarle Tufto, Bo H. Lindqvist
Summary: The paper discusses the use of symmetry assumptions when actual prior knowledge is not easily accessible for complex models, which often results in an improper prior. It introduces a theoretical framework for statistics that includes both improper priors and improper posteriors, with a focus on the transformation from prior to posterior knowledge. Through examples such as Markov chain Monte Carlo simulations and data on tropical butterfly species, it illustrates how improper posteriors can arise naturally and be useful. The framework extends conventional Bayesian inference, defined by Kolmogorov's axioms, to accommodate new constructions based on Rényi's axioms for conditional probability spaces.
SCANDINAVIAN JOURNAL OF STATISTICS
(2022)
Article
Computer Science, Interdisciplinary Applications
Cristiano Villa, Stephen G. Walker
Summary: This paper presents a new perspective on the use of improper priors in Bayes factors for model comparison. It introduces an alternative approach that establishes the value of the constant in the Bayes factor by matching divergences between density functions. This method, unlike existing ones, does not require any input from the experimenter and is fully automated.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2022)
Review
Biochemical Research Methods
Mitra Ebrahimpoor, Jelle J. Goeman
Summary: Volcano plots are commonly used to select interesting discoveries, but they may lead to inflated false discovery rates. We demonstrate this issue with simulation experiments and data, and propose alternative approaches for multiple testing that do not inflate the false discovery rate.
BRIEFINGS IN BIOINFORMATICS
(2021)
Article
Biochemical Research Methods
Yan Liu, Hao Liang, Quan Zou, Zengyou He
Summary: The identification of essential proteins is an important problem in bioinformatics. Existing methods have limitations in providing context-free and easily interpretable quantifications of centrality values, specifying proper thresholds, and controlling the quality of reported essential proteins. To overcome these limitations, this study formulates the essential protein discovery problem as a multiple hypothesis testing problem and presents a significance-based method named SigEP. Experimental results demonstrate that SigEP outperforms competing algorithms.
IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS
(2022)
Article
Engineering, Electrical & Electronic
Mingye Ju, Can Ding, Wenqi Ren, Yi Yang
Summary: In this research, a new image dehazing technique called IDBP is developed using a robust atmospheric scattering model. It overcomes the limitations of existing techniques by incorporating multiple priors and the minimal information loss principle. The proposed technique consists of two modules, the atmospheric light estimation module and the multiple prior constraint module, and outperforms state-of-the-art alternatives in numerous experiments.
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY
(2022)
Article
Mathematical & Computational Biology
Xiaoya Sun, Yan Fu
Summary: The paper proposes a competition-based method, TDfdr, to accurately estimate the local false discovery rate (fdr) without relying on p-values or known null distributions. Extensive simulation studies and real biomedical tasks demonstrate that TDfdr has higher discovery power.
STATISTICS IN MEDICINE
(2023)
Article
Automation & Control Systems
Xintao Xia, Zhanrui Cai
Summary: This paper proposes a differentially private adaptive FDR control method that protects individual information while precisely controlling the FDR. It achieves privacy protection through a novel p-value transformation and uses a mirror peeling algorithm together with an optimal stopping technique. Empirical studies demonstrate that this method can better control FDR while reducing computation cost.
JOURNAL OF MACHINE LEARNING RESEARCH
(2023)
Article
Biochemical Research Methods
Dominik Madej, Long Wu, Henry Lam
Summary: This study introduces a new method called CDD for FDR estimation, utilizing a fixed empirical null score distribution. Benchmarking CDD against decoy-based PeptideProphet showed similar accuracy and stability in retrieving correct PSMs. This finding highlights the potential of Big Data approaches for statistical analysis in proteomics and questions the necessity of dataset-specific target-decoy searches.
JOURNAL OF PROTEOME RESEARCH
(2022)
Article
Statistics & Probability
Iris Ivy M. Gauran, Junyong Park, Ilia Rattsev, Thomas A. Peterson, Maricel G. Kann, DoHwan Park
Summary: In cancer research at the molecular level, understanding the role of somatic mutations in the initiation or progression of cancer is crucial. Recently, studying cancer somatic variants at the protein domain level has become important for uncovering functionally related mutations. The main challenge is to identify protein domain hotspots with significantly high mutation frequency.
ANNALS OF APPLIED STATISTICS
(2022)
Article
Statistics & Probability
Hongyuan Cao, Jun Chen, Xianyang Zhang
Summary: The article introduces a method to improve the statistical power of large-scale multiple testing by utilizing auxiliary information in high-dimensional statistical inference. By using a framework based on a two-group mixture model and imposing structural relationship constraints and an optimal rejection rule to control the false discovery rate, the method's power is enhanced. The advantages of the proposed method are verified through empirical and theoretical analysis.
ANNALS OF STATISTICS
(2022)
Article
Statistics & Probability
Sanat K. Sarkar, Zhigen Zhao
Summary: This paper extends previous research and proposes a novel framework for multiple testing of hypotheses using hypothesis-specific local false discovery rates. It captures the group structure of hypotheses more effectively by extending the standard two-class mixture model. The proposed methods show higher power in simulation studies and real-data applications.
ELECTRONIC JOURNAL OF STATISTICS
(2022)
Article
Management
Ron Berman, Christophe Van den Bulte
Summary: The study finds that a large share of the effects tested in website A/B experiments, about 70%, are true nulls, which leads to high false discovery rates. Decision makers should be aware that roughly one in five interventions achieving significance at the 5% significance level may be ineffective in practice.
MANAGEMENT SCIENCE
(2021)
Article
Statistics & Probability
Xianyang Zhang, Jun Chen
Summary: This article introduces an FDR control procedure that can incorporate covariate information in large-scale inference problems. The proposed procedure is implemented using a fast algorithm and has been shown to have asymptotic validity even in cases of misspecified models and weakly dependent p-values. Extensive simulations demonstrate that the method improves upon existing approaches in terms of flexibility, robustness, power, and computational efficiency. The method is applied to omics datasets from genomics studies to identify features associated with clinical and biological phenotypes, and shows superiority, particularly in sparse signal scenarios.
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
(2022)
Letter
Anthropology
Ornella Comandini, Stefano Cabras, Jude T. Ssensamba, Justine N. Bukenya, Alessandro Cipriano, Giovanni Carmignani, Gabriele Carmignani, Elisabetta Marini
AMERICAN JOURNAL OF PHYSICAL ANTHROPOLOGY
(2020)
Article
Anthropology
Magdalena Zeglen, Elisabetta Marini, Stefano Cabras, Lukasz Kryst, Rituparna Das, Anindita Chakraborty, Parasmani Dasgupta
AMERICAN JOURNAL OF HUMAN BIOLOGY
(2020)
Article
Physics, Multidisciplinary
Stefano Cabras
Article
Statistics & Probability
Stefano Cabras, Maria Eugenia Castellanos, Oliver Ratmann
Summary: A general and computationally inexpensive Monte Carlo framework for obtaining p-values asymptotically uniformly distributed in [0, 1] is proposed for models with intractable likelihoods. Numerical investigations show favorable power properties in detecting actual model discrepancies compared to other diagnostic approaches. The technique is illustrated on analytically tractable examples and a complex tuberculosis transmission model.
Article
Mathematics
Gonzalo Garcia-Donato, Maria Eugenia Castellanos, Alicia Quiros
Summary: This paper introduces the basic concepts of the Bayesian approach to variable selection, emphasizing model choice, the choice of prior over the model space, and sampling algorithms. Applications to genetics and cardiology demonstrate the importance of this approach in expanding access to Bayesian methods for variable selection.
Article
Nutrition & Dietetics
Silvia Stagi, Alfredo Irurtia, Joaquim Rosales Rafel, Stefano Cabras, Roberto Buffa, Marta Carrasco-Marginet, Jorge Castizo-Olier, Elisabetta Marini
Summary: The study aimed to analyze the association between specific bioelectric impedance vector analysis (BIVA) and dual-energy X-ray absorptiometry (DXA) in assessing segmental body composition, using DXA as the reference technique. Results showed good agreement between DXA and BIVA, with specific BIVA shown to balance the effect of body size on bioelectrical measurements in both whole-body and segmental approaches. Specific BIVA is a promising technique for monitoring segmental body composition changes in sport science and clinical applications.
CLINICAL NUTRITION
(2021)
Article
Mathematics
Stefano Cabras
Summary: The proposed approach combines deep learning techniques and a Bayesian model to estimate the evolution of COVID-19 in Spain. It provides a suitable description of count time series and can predict future evolution or the consequences of scenarios.
Article
Biology
Gonzalo Garcia-Donato, Stefano Cabras, Maria Eugenia Castellanos
Summary: This study addresses covariate selection and model uncertainty in Cox regression using a probabilistic approach within a Bayesian framework. The selection of a suitable prior for the model parameters is a critical element; we derive the conventional prior approach and propose a comprehensive implementation for automatic selection. Our simulation studies and real applications demonstrate improvements over existing literature. To enhance reproducibility and appeal to practitioners, a web application that requires minimal statistical knowledge is developed to implement the proposed approach.
Article
Economics
Stefano Cabras, Marco Delogu, J. D. Tena
Summary: This study examines the impact of scheduled tasks on the productivity of working teams, and how team size and external conditions may modify this impact. The research finds that participating in UEFA Champions League matches significantly affects the performance of teams in domestic league matches, especially for small teams. Additionally, away teams tend to react more conservatively by increasing their probability of drawing.
Article
Geosciences, Multidisciplinary
Stefano Cabras, Sun He
Summary: The growing population density in cities requires fast and accurate urban transportation to meet citizens' travel needs. This study proposes a Bayesian spatial-temporal model for predicting station occupancy in urban subway transportation. The model provides point estimates of daily passenger flow, a reliable assessment of uncertainty, and an understanding of traffic features. It also meets the prediction accuracy standards of the Beijing Metro enterprise. The model is currently implemented at Beijing Metro Group Ltd for daily train scheduling.
SPATIAL STATISTICS
(2023)
Article
Economics
Stefano Cabras, J. D. Tena
Summary: This study proposes a methodology to identify tacit organizational incentives based on direct observations of institutional reactions to operational decisions. The results suggest the presence of institutional incentives for referees in Spanish football to take gradual decisions in order to achieve the expected outcome of the game. The implications of these findings in organizations are discussed.
MANAGERIAL AND DECISION ECONOMICS
(2023)
Article
Mathematics, Interdisciplinary Applications
Maria Eugenia Castellanos, Gonzalo Garcia-Donato, Stefano Cabras
Summary: The study investigates variable selection in the presence of censoring in the response, using model selection and an objective Bayesian perspective to address the issue. It demonstrates that a new methodology specifically developed for survival problems outperforms standard approaches in terms of variable selection.