Article

What is an optimal value of k in k-fold cross-validation in discrete Bayesian network analysis?

COMPUTATIONAL STATISTICS (2021)

Journal

COMPUTATIONAL STATISTICS

Volume 36, Issue 3, Pages 2009-2031

Publisher

SPRINGER HEIDELBERG

DOI: 10.1007/s00180-020-00999-9

Keywords

Model validation; Classification error; randomized subsets; sample size

Category

Statistics & Probability

Funding

U.S. Forest Service, Pacific Northwest Research Station
University of Melbourne, Australia

Summary

This study examines how the number of folds k and the sample size affect validation results for Bayesian network models. Classification error decreases as sample size and k increase, and generally levels out at k = 10.

Abstract

Cross-validation using randomized subsets of data, known as k-fold cross-validation, is a powerful means of testing the success rate of models used for classification. However, few if any studies have explored how values of k (number of subsets) affect validation results in models tested with data of known statistical properties. Here, we explore conditions of sample size, model structure, and variable dependence affecting validation outcomes in discrete Bayesian networks (BNs). We created 6 variants of a BN model with known properties of variance and collinearity, along with data sets of n = 50, 500, and 5000 samples, and then tested classification success and evaluated CPU computation time with seven levels of folds (k = 2, 5, 10, 20, n - 5, n - 2, and n - 1). Classification error declined with increasing n, particularly in BN models with high multivariate dependence, and declined with increasing k, generally levelling out at k = 10, although k = 5 sufficed with large samples (n = 5000). Our work supports the common use of k = 10 in the literature, although in some cases k = 5 would suffice with BN models having independent variable structures.
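
The k-fold procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes a naive Bayes classifier as the simplest stand-in for a discrete BN with an independent variable structure, uses synthetic data rather than the paper's model variants, and all function names (`make_data`, `fit_naive_bayes`, `predict`, `kfold_error`) are hypothetical.

```python
import random
from collections import Counter, defaultdict


def make_data(n, seed=0):
    """Synthetic discrete data (an assumption): binary class, three 3-state features."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        y = rng.randint(0, 1)
        # Each feature copies the class with probability 0.7, else takes a random state.
        x = tuple(y if rng.random() < 0.7 else rng.randint(0, 2) for _ in range(3))
        rows.append((x, y))
    return rows


def fit_naive_bayes(train):
    """Count class frequencies and per-class feature-state frequencies."""
    class_counts = Counter(y for _, y in train)
    feat_counts = defaultdict(Counter)  # (feature index, class) -> state counts
    for x, y in train:
        for j, v in enumerate(x):
            feat_counts[(j, y)][v] += 1
    return class_counts, feat_counts


def predict(model, x, n_states=3):
    """Return the class with the highest Laplace-smoothed posterior score."""
    class_counts, feat_counts = model
    total = sum(class_counts.values())
    best, best_score = None, -1.0
    for y, cy in class_counts.items():
        score = cy / total
        for j, v in enumerate(x):
            score *= (feat_counts[(j, y)][v] + 1) / (cy + n_states)
        if score > best_score:
            best, best_score = y, score
    return best


def kfold_error(rows, k, seed=0):
    """Classification error pooled over k randomized, non-overlapping folds."""
    if not 2 <= k <= len(rows):
        raise ValueError("k must be between 2 and the number of samples")
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # every sample lands in exactly one fold
    errors = 0
    for fold in folds:
        held_out = set(fold)
        train = [rows[i] for i in idx if i not in held_out]
        model = fit_naive_bayes(train)
        errors += sum(predict(model, rows[i][0]) != rows[i][1] for i in fold)
    return errors / len(rows)


if __name__ == "__main__":
    data = make_data(500)
    for k in (2, 5, 10, 20):
        print(k, round(kfold_error(data, k), 3))
```

Running the loop for several values of n and k gives a small-scale analogue of the comparison the abstract reports. The error values depend on the synthetic data and should not be read as reproducing the paper's results.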

Recommended

Article Radiology, Nuclear Medicine & Medical Imaging

Risks of feature leakage and sample size dependencies in deep feature extraction for breast mass classification

Ravi K. Samala, Heang-Ping Chan, Lubomir Hadjiiski, Mark A. Helvie

Summary: This study examines the risk of feature leakage and its dependence on sample size when using a pretrained deep convolutional neural network (DCNN) for breast mass classification. The simulation study and analysis on training and independent test sets reveal that feature leakage can lead to large generalization errors, emphasizing the importance of evaluation on unseen test cases for realistic performance assessment in clinical implementation.

MEDICAL PHYSICS (2021)

Article Health Care Sciences & Services

Estimation of required sample size for external validation of risk models for binary outcomes

Menelaos Pavlou, Chen Qu, Rumana Z. Omar, Shaun R. Seaman, Ewout W. Steyerberg, Ian R. White, Gareth Ambler

Summary: This paper investigates the sample size requirements for validation studies with binary outcomes to estimate measures of predictive performance, providing various estimators which perform well even when normality assumptions are violated.

STATISTICAL METHODS IN MEDICAL RESEARCH (2021)

Article Psychology, Multidisciplinary

Sample Size Requirements for Applying Diagnostic Classification Models

Sedat Sen, Allan S. Cohen

Summary: The study investigates the effects of sample size, test length, number of attributes, and base rate of mastery on item parameter recovery and classification accuracy of four DCMs. Results show that larger sample size and longer test length lead to more precise estimates of item parameters, but the recovery decreases as the number of attributes increases. The DINA and DINO models demonstrate higher item parameter recovery and classification accuracy.

FRONTIERS IN PSYCHOLOGY (2021)

Article Biology

Sample size estimation for cancer randomized trials in the presence of heterogeneous populations

Derek Dinart, Carine Bellera, Virginie Rondeau

Summary: A key issue in clinical trial design is accurately estimating the number of subjects needed, especially in multicenter or biomarker-stratified designs where the treatment effect size may vary. Limited research exists on determining sample size for such trials, highlighting the importance of considering baseline hazards and treatment effects heterogeneity to avoid bias in sample size estimates. Many current methods only account for one type of heterogeneity, lacking the ability to simultaneously address both sources of variation.

BIOMETRICS (2022)

Article Mathematical & Computational Biology

Bayesian sample size determination in basket trials borrowing information between subsets

Haiyan Zheng, Michael J. Grayling, Pavel Mozgunov, Thomas Jaki, James M. S. Wason

Summary: Basket trials are increasingly used for evaluating new treatments in different patient subgroups. This paper proposes a Bayesian approach to determine sample size in basket trials, allowing information borrowing between similar subsets. The proposed approach yields comparable sample sizes for circumstances of no borrowing, and significantly reduces sample size when borrowing is enabled between commensurate subtrials. Examples and simulation studies demonstrate the feasibility and effectiveness of the proposed methodology.

BIOSTATISTICS (2023)

Review Medicine, General & Internal

How to calculate sample size in animal and human studies

Xinlian Zhang, Phillipp Hartmann

Summary: The calculation of required sample size is crucial in designing both animal and human studies. This review defines key terms related to sample size determination, such as mean, standard deviation, statistical hypothesis testing, type I/II error, power, direction of effect, effect size, expected attrition, corrected sample size, and allocation ratio. It also provides practical examples of sample size calculations based on pilot studies, similar larger studies, or estimated effect sizes per Cohen and Sawilowsky if no previous studies are available.

FRONTIERS IN MEDICINE (2023)

Article Surgery

Predictors of failure to reach target sample size in surgical randomized trials

David Chadow, N. Bryce Robinson, Gianmarco Cancelli, Giovanni Soletti, Katia Audisio, Mohamed Rahouma, Roberto Perezgrovas, Mario Gaudino

Summary: It is estimated that 25-30% of randomized controlled trials fail to reach their target sample size. Factors such as multicentre design, publication year, and commercial sponsor are inversely associated with failure to reach the target sample size. A substantial proportion of surgical trials fail to reach the target sample size, but there is an improving trend.

BRITISH JOURNAL OF SURGERY (2022)

Article Mathematical & Computational Biology

Minimum sample size for external validation of a clinical prediction model with a continuous outcome

Lucinda Archer, Kym I. E. Snell, Joie Ensor, Mohammed T. Hudda, Gary S. Collins, Richard D. Riley

Summary: Clinical prediction models offer personalized outcome predictions for patient counseling and decision making, with external validation crucial for assessing model performance. Proposed criteria aim to determine minimum sample size needed for external validation of a clinical prediction model, considering factors like proportion of variance explained and agreement between predicted and observed values. The recommendations provide a framework for estimating precision and ensuring adequate sample sizes in future validation studies.

STATISTICS IN MEDICINE (2021)

Article Computer Science, Artificial Intelligence

A proxy learning curve for the Bayes classifier

Addisson Salazar, Luis Vergara, Enrique Vidal

Summary: In this paper, a theoretical learning curve is derived for the multi-class Bayes classifier, fitting general multivariate parametric models, and providing an estimate of the reduction in error probability with increased training set size. It does not depend on model parameters but relies on the training set size and feature vector dimension. This curve is useful in determining appropriate training set sizes in practice.

PATTERN RECOGNITION (2023)

Article Health Care Sciences & Services

External validation of clinical prediction models: simulation-based sample size calculations were more reliable than rules-of-thumb

Kym I. E. Snell, Lucinda Archer, Joie Ensor, Laura J. Bonnett, Thomas P. A. Debray, Bob Phillips, Gary S. Collins, Richard D. Riley

Summary: Rules-of-thumb for sample size in external validation of clinical prediction models may not be precise, with factors like the linear predictor (LP) distribution affecting precision of performance estimates. A tailored simulation-based approach can offer more flexibility and reliability in determining sample size requirements for validation.

JOURNAL OF CLINICAL EPIDEMIOLOGY (2021)

Review Geriatrics & Gerontology

Sample size calculation of clinical trials in geriatric medicine

Graziella D'Arrigo, Stefanos Roumeliotis, Claudia Torino, Giovanni Tripepi

Summary: A crucial step in planning a randomized clinical trial (RCT) is the calculation of sample size, which determines the optimal number of patients needed to ensure the study has enough power to detect differences in specific endpoints between study arms. This calculation involves inputting variables such as the expected effect size, alpha error (α), beta error (β), and the allocation ratio in order to determine the number of participants allocated to each arm of the RCT.

AGING CLINICAL AND EXPERIMENTAL RESEARCH (2021)

Article Medicine, General & Internal

Error Consistency for Machine Learning Evaluation and Validation with Application to Biomedical Diagnostics

Jacob Levman, Bryan Ewenson, Joe Apaloo, Derek Berger, Pascal N. N. Tyrrell

Summary: Supervised machine learning classification is widely used in industry and research. The article introduces an enhanced technique for hold-out validation, which assesses the consistency of mistakes made by the learning algorithm. This technique can improve the evaluation and design of reliable and predictable AI models.

DIAGNOSTICS (2023)

Article Computer Science, Artificial Intelligence

Maximum Decentral Projection Margin Classifier for High Dimension and Low Sample Size problems

Zhiwang Zhang, Jing He, Jie Cao, Shuqing Li

Summary: Compared with easy feature creation or generation in data analysis, manual data labeling requires significant time and effort in most cases. Despite the potential improvement provided by automated data labeling, manual checking and verification is still necessary. Data mining and machine learning often encounter High Dimension and Low Sample Size (HDLSS) data, where traditional classifiers struggle due to data piling and approximate equidistance. This paper proposes a Maximum Decentral Projection Margin Classifier (MDPMC) within the framework of a Support Vector Classifier (SVC), effectively addressing issues related to data piling and approximate equidistance, as demonstrated by experimental results on real HDLSS datasets.

NEURAL NETWORKS (2023)

Article Mathematical & Computational Biology

Minimum sample size for external validation of a clinical prediction model with a binary outcome

Richard D. Riley, Thomas P. A. Debray, Gary S. Collins, Lucinda Archer, Joie Ensor, Maarten van Smeden, Kym I. E. Snell

Summary: External validation is crucial in examining the performance of prediction models, but current studies often face issues with small sample sizes. To address this, determining the minimum sample size needed for a new external validation study with precise estimation calculations is proposed, taking into account calibration, discrimination, and clinical utility measures.

STATISTICS IN MEDICINE (2021)

Article Mathematical & Computational Biology

Sample size re-estimation for covariate-adaptive randomized clinical trials

Xin Li, Wei Ma, Feifang Hu

Summary: Combining covariate-adaptive randomization (CAR) with sample size re-estimation (SSR) in clinical trials has become increasingly popular due to its advantages in statistical efficiency and cost reduction. However, adjustments are necessary to protect the accuracy of the combined design, and this article provides a framework for the application of SSR in CAR trials and studies the underlying theoretical properties. Numerical studies show that the advantages of CAR and SSR can be further improved in terms of power and sample size.

STATISTICS IN MEDICINE (2021)

Article Engineering, Industrial

Calibrating experts' probabilistic assessments for improved probabilistic predictions

A. M. Hanea, G. F. Nane

SAFETY SCIENCE (2019)

Article Ecology

Weighting and aggregating expert ecological judgments

Victoria Hemming, Anca M. Hanea, Terry Walshe, Mark A. Burgman

ECOLOGICAL APPLICATIONS (2020)

Article Engineering, Multidisciplinary

Improving expert forecasts in reliability: Application and evidence for structured elicitation protocols

Victoria Hemming, Nicholas Armstrong, Mark A. Burgman, Anca M. Hanea

QUALITY AND RELIABILITY ENGINEERING INTERNATIONAL (2020)

Article Public, Environmental & Occupational Health

Uncertainty Quantification with Experts: Present Status and Research Needs

Anca M. Hanea, Victoria Hemming, Gabriela F. Nane

Summary: Expert elicitation is used when data is lacking and important decisions need to be made. When designing expert elicitation, practitioners aim to balance best practices with practical constraints. The choices made impact time and effort investment, data quality, expert engagement, result defensibility, and decision acceptability.

RISK ANALYSIS (2022)

Article Public, Environmental & Occupational Health

What is a Good Calibration Question?

Victoria Hemming, Anca M. Hanea, Mark A. Burgman

Summary: The study suggests that weighted aggregation outperforms equal weights on the combined CM score, but not on statistical accuracy. Experts were unable to adapt their knowledge across different domains, and in-sample validation on irrelevant questions did not accurately predict out-of-sample performance.

RISK ANALYSIS (2022)

Article Biodiversity Conservation

Predicting species and community responses to global change using structured expert judgement: An Australian mountain ecosystems case study

James S. Camac, Kate D. L. Umbers, John W. Morgan, Sonya R. Geange, Anca Hanea, Rachel A. Slatyer, Keith L. McDougall, Susanna E. Venn, Peter A. Vesk, Ary A. Hoffmann, Adrienne B. Nicotra

Summary: Conservation managers are facing challenges in making decisions to protect biodiversity in the Australian Alps due to climate change impacts. Expert predictions suggest that by 2050, most alpine vegetation communities will decrease in extent, while woodlands and heathlands are expected to increase. The responses of alpine plants vary greatly, while animal species are predicted to decline or remain stable.

GLOBAL CHANGE BIOLOGY (2021)

Article Multidisciplinary Sciences

Mathematically aggregating experts' predictions of possible futures

A. M. Hanea, D. P. Wilkinson, M. McBride, A. Lyon, D. van Ravenzwaaij, F. Singleton Thorn, C. Gray, D. R. Mandel, A. Willcox, E. Gould, E. T. Smith, F. Mody, M. Bush, F. Fidler, H. Fraser, B. C. Wintle

Summary: Structured protocols provide a transparent and systematic way to aggregate probabilistic predictions from multiple experts. By using mathematical rules for aggregation, the objectivity and quality of predictions can be enhanced and measured through accuracy, calibration, and informativeness. Performance-based weighted aggregation can be effective when experts' performance can be scored beforehand, while other aggregation methods informed by measurable proxies for good performance can also be considered.

PLOS ONE (2021)

Article Public, Environmental & Occupational Health

Balancing the Elicitation Burden and the Richness of Expert Input When Quantifying Discrete Bayesian Networks

Martine J. Barons, Steven Mascaro, Anca M. Hanea

Summary: Structured expert judgment (SEJ) is a structured method for obtaining estimates from groups of experts, aiming to minimize cognitive frailties. When the number of quantities required is large, imputation methods can be used for unelicited quantities. InterBeta is effective in interpolating conditional probability tables to reduce expert burden.

RISK ANALYSIS (2022)

Article Public, Environmental & Occupational Health

Co-designing and building an expert-elicited non-parametric Bayesian network model: demonstrating a methodology using a Bonamia Ostreae spread risk case study

Anca M. Hanea, Zoe Hilton, Ben Knight, Andrew P. Robinson

Summary: This article introduces the development and use of probabilistic models, particularly Bayesian networks (BN), for supporting risk-based decision making. It also highlights the promise of codesign and nonparametric Bayesian networks (NPBNs) in achieving a balance between model complexity and ease of development. A case study on the local spread of a marine pathogen is presented to demonstrate the process of codesigning, building, quantifying, and validating an NPBN model using structured expert judgment (SEJ).

RISK ANALYSIS (2022)

Editorial Material Biology

Reimagining peer review as an expert elicitation process

Alexandru Marcoci, Ans Vercammen, Martin Bush, Daniel G. Hamilton, Anca Hanea, Victoria Hemming, Bonnie C. Wintle, Mark Burgman, Fiona Fidler

Summary: Journal peer review plays an important role in regulating the flow of ideas in academic disciplines. However, research shows that editors cannot accurately identify the best experts for peer review. To prevent biases and uneven power distributions, introducing greater transparency and structure into the process is crucial.

BMC RESEARCH NOTES (2022)

Editorial Material Public, Environmental & Occupational Health

Bayesian networks for risk analysis and decision support

Anca M. Hanea, Annemarie Christophersen, Sandra Alday

RISK ANALYSIS (2022)

Article Mathematics, Interdisciplinary Applications

Improving the Computation of Brier Scores for Evaluating Expert-Elicited Judgements

Gayan Dharmarathne, Anca Hanea, Andrew P. Robinson

Summary: Structured expert judgment (SEJ) is a suite of techniques used to elicit expert predictions in situations where data are too expensive or impossible to obtain. The quality of expert predictions can be assessed using Brier scores and calibration questions. Research recommends using mixed-effects models to improve expert Brier scores and related operations.

FRONTIERS IN APPLIED MATHEMATICS AND STATISTICS (2021)

Review Public, Environmental & Occupational Health

Levee System Reliability Modeling: The Length Effect and Bayesian Updating

Kathryn Roscoe, Anca Hanea, Ruben Jongejan, Ton Vrouwenvelder

SAFETY (2020)

Article Geosciences, Multidisciplinary

Bayesian Network Modeling and Expert Elicitation for Probabilistic Eruption Forecasting: Pilot Study for Whakaari/White Island, New Zealand

Annemarie Christophersen, Natalia Deligne, Anca M. Hanea, Lauriane Chardot, Nicolas Fournier, Willy P. Aspinall

FRONTIERS IN EARTH SCIENCE (2018)
