Article

External validation of clinical prediction models: simulation-based sample size calculations were more reliable than rules-of-thumb

Journal

JOURNAL OF CLINICAL EPIDEMIOLOGY
Volume 135, Pages 79-89

Publisher

ELSEVIER SCIENCE INC
DOI: 10.1016/j.jclinepi.2021.02.011

Keywords

Sample size; External validation; Clinical prediction model; Calibration and discrimination; Net benefit; Simulation

Funding

  1. National Institute for Health Research School for Primary Care Research (NIHR SPCR)
  2. Evidence Synthesis Working Group - National Institute for Health Research School for Primary Care Research (NIHR SPCR) [390]
  3. Netherlands Organisation for Health Research and Development [91617050]
  4. National Institute for Health Research [PDF-2015-08-044]
  5. Cancer Research UK [C49297/A27294]
  6. NIHR Biomedical Research Centre, Oxford
  7. NIHR [PDF2014-10872]
  8. National Institute for Health Research (NIHR)
  9. National Institute for Health Research [PDF-2015-08-044] (funding source: researchfish)

Rules-of-thumb for the sample size needed to externally validate a clinical prediction model can be imprecise, because factors such as the distribution of the model's linear predictor (LP) affect the precision of performance estimates. A tailored simulation-based approach offers more flexibility and reliability when determining the sample size required for validation.

Introduction: Sample size rules-of-thumb for external validation of clinical prediction models suggest at least 100 events and 100 non-events. Such blanket guidance is imprecise, and not specific to the model or validation setting. We investigate factors affecting precision of model performance estimates upon external validation, and propose a more tailored sample size approach.

Methods: Simulation of logistic regression prediction models to investigate factors associated with precision of performance estimates. Then, explanation and illustration of a simulation-based approach to calculate the minimum sample size required to precisely estimate a model's calibration, discrimination and clinical utility.

Results: Precision is affected by the model's linear predictor (LP) distribution, in addition to number of events and total sample size. Sample sizes of 100 (or even 200) events and non-events can give imprecise estimates, especially for calibration. The simulation-based calculation accounts for the LP distribution and (mis)calibration in the validation sample. Application identifies 2430 required participants (531 events) for external validation of a deep vein thrombosis diagnostic model.

Conclusion: Where researchers can anticipate the distribution of the model's LP (e.g., based on development sample, or a pilot study), a simulation-based approach for calculating sample size for external validation offers more flexibility and reliability than rules-of-thumb.

(c) 2021 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/)
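
The general idea of a simulation-based sample size calculation can be illustrated with a minimal Python sketch. This is not the authors' code. It assumes, purely for illustration, that the LP is normally distributed in the validation population, that the model is well calibrated there (outcomes are drawn from the LP itself), and that precision is judged by the mean 95% confidence interval width of the calibration slope. All function names, the LP parameters, and the target width are hypothetical choices.

```python
import math
import random
import statistics


def expit(x):
    """Inverse logit, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def calibration_slope(lp, y, iters=25):
    """Fit logit P(y=1) = a + b*lp by Newton-Raphson.

    Returns the slope b and its model-based standard error.
    """
    a, b = 0.0, 1.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, obs in zip(lp, y):
            p = expit(a + b * x)
            w = p * (1.0 - p)
            g0 += obs - p          # score for the intercept
            g1 += (obs - p) * x    # score for the slope
            h00 += w               # information matrix entries
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a += da
        b += db
        if abs(da) + abs(db) < 1e-10:
            break
    # Var(b) is the slope entry of the inverse information matrix
    return b, math.sqrt(h00 / det)


def mean_ci_width(n, lp_mean, lp_sd, n_sims=100, seed=1):
    """Average 95% CI width of the calibration slope at sample size n."""
    rng = random.Random(seed)
    widths = []
    for _ in range(n_sims):
        # Assumption: LP ~ Normal(lp_mean, lp_sd) in the validation setting
        lp = [rng.gauss(lp_mean, lp_sd) for _ in range(n)]
        # Assumption: the model is well calibrated, so P(y=1) = expit(LP)
        y = [1 if rng.random() < expit(x) else 0 for x in lp]
        _, se = calibration_slope(lp, y)
        widths.append(2 * 1.96 * se)
    return statistics.mean(widths)


def minimum_sample_size(target_width, lp_mean, lp_sd, candidates):
    """Smallest candidate n whose mean CI width meets the target, else None."""
    for n in candidates:
        if mean_ci_width(n, lp_mean, lp_sd) <= target_width:
            return n
    return None


if __name__ == "__main__":
    # Hypothetical inputs: LP ~ N(-1.5, 1.0), target CI width of 0.4
    for n in (200, 400, 800, 1600):
        print(n, round(mean_ci_width(n, -1.5, 1.0), 3))
    print(minimum_sample_size(0.4, -1.5, 1.0, (200, 400, 800, 1600)))
```

In practice the same loop could be repeated for other measures named in the abstract, such as discrimination or net benefit, and the largest of the resulting sample sizes taken as the requirement. A different LP distribution, or deliberate miscalibration, would be introduced by changing how `lp` and `y` are generated.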

Recommended

Article Primary Health Care

Predicting the risk of acute kidney injury in primary care: derivation and validation of STRATIFY-AKI

Constantinos Koshiaris, Lucinda Archer, Sarah Lay-Flurrie, Kym I. E. Snell, Richard D. Riley, Richard Stevens, Amitava Banerjee, Juliet A. Usher-Smith, Andrew Clegg, Rupert A. Payne, Margaret Ogden, F. D. Richard Hobbs, Richard J. McManus, James P. Sheppard

Summary: This study developed a prediction model to estimate the risk of acute kidney injury (AKI) in people potentially indicated for antihypertensive treatment. The model showed good accuracy in identifying high-risk patients and provided reassurance for the majority of low-risk patients, indicating the safety and appropriateness of antihypertensive treatment.

BRITISH JOURNAL OF GENERAL PRACTICE (2023)

Review Clinical Neurology

Prognostic factors associated with outcome following an epidural steroid injection for disc-related sciatica: a systematic review and narrative synthesis

Alan Nagington, Nadine E. Foster, Kym Snell, Kika Konstantinou, Siobhan Stynes

Summary: This systematic review aimed to identify prognostic factors associated with outcomes following epidural steroid injection (ESI) for patients with imaging confirmed disc-related sciatica. The review found limited evidence and low quality studies regarding prognostic factors for this treatment. Future well-designed prospective cohort studies are needed to determine these factors.

EUROPEAN SPINE JOURNAL (2023)

Article Multidisciplinary Sciences

Comparison of Bayesian methods for incorporating adult clinical trial data to improve certainty of treatment effect estimates in children

Ruth Walker, Bob Phillips, Sofia Dias

Summary: Recruiting children for randomized clinical trials poses challenges, resulting in less certainty about the safety and effectiveness of treatments compared to adults. However, it is possible to use adult evidence to better understand the effectiveness of treatments in children, and there are various statistical methods available for conducting these analyses.

PLOS ONE (2023)

Review Medicine, General & Internal

Multivariable prediction models for atrial fibrillation after cardiac surgery: a systematic review protocol

Kara G. Fields, Jie Ma, Tatjana Petrinic, Hassan Alhassan, Anthony Eze, Ankith Reddy, Mona Hedayat, Rui Providencia, Gregory Y. H. Lip, Jonathan P. Bedford, David A. Clifton, Oliver C. Redfern, Benjamin O'Brien, Peter J. Watkinson, Gary S. Collins, Jochen D. Muehlschlegel

Summary: This systematic review aims to critically appraise the methodology and risk of bias in the development and validation of multivariable prediction models for atrial fibrillation after cardiac surgery (AFACS). The findings will be disseminated to improve future studies and provide a clinically useful risk estimation tool.

BMJ OPEN (2023)

Article Medicine, General & Internal

Association between surgeon volume and patient outcomes after elective shoulder replacement surgery using data from the National Joint Registry and Hospital Episode Statistics for England: population based cohort study

Epaminondas Markos Valsamis, Gary S. Collins, Rafael Pinedo-Villanueva, Michael R. Whitehouse, Amar Rangan, Adrian Sayers, Jonathan L. Rees

Summary: This study aimed to investigate the association between surgeon volume and patient outcomes after elective shoulder replacement surgery. The results showed that surgeons with an average annual volume of more than 10.4 procedures had lower rates of revision surgery and reoperation, lower risk of serious adverse events, and shorter hospital stays.

BMJ-BRITISH MEDICAL JOURNAL (2023)

Article Pediatrics

Development of treatment-decision algorithms for children evaluated for pulmonary tuberculosis: an individual participant data meta-analysis

Kenneth S. Gunasekera, Olivier Marcy, Johanna Munoz, Elisa Lopez-Varela, Moorine P. Sekadde, Molly F. Franke, Maryline Bonnet, Shakil Ahmed, Farhana Amanullah, Aliya Anwar, Orvalho Augusto, Rafaela Baroni Aurilio, Sayera Banu, Iraj Batool, Annemieke Brands, Kevin P. Cain, Lucia Carratala-Castro, Maxine Caws, Eleanor S. Click, Lisa M. Cranmer, Alberto L. Garcia-Basteiro, Anneke C. Hesseling, Julie Huynh, Senjuti Kabir, Leonid Lecca, Anna Mandalakas, Farai Mavhunga, AyeAye Myint, Kyaw Myo, Dorah Nampijja, Mark P. Nicol, Patrick Orikiriza, Megan Palmer, Clemax Couto Sant'Anna, Sara Ahmed Siddiqui, Jonathan P. Smith, Rinn Song, Nguyen Thuy Thuong Thuong, Vibol Ung, Marieke M. van der Zalm, Sabine Verkuijl, Kerri Viney, Elisabetta G. Walters, Joshua L. Warren, Heather J. Zar, Ben J. Marais, Stephen M. Graham, Thomas P. A. Debray, Ted Cohen, James A. Seddon

Summary: This study evaluated the performance of current diagnostic algorithms for pediatric tuberculosis and developed evidence-based algorithms using prediction modeling to assist in treatment decision-making. The existing treatment-decision algorithms had highly variable diagnostic performance. The scoring system derived from the prediction model that included clinical and chest x-ray features had higher sensitivity and lower specificity, while the scoring system derived from clinical features only also accurately diagnosed tuberculosis to some extent.

LANCET CHILD & ADOLESCENT HEALTH (2023)

Article Mathematical & Computational Biology

Stability of clinical prediction models developed using statistical or machine learning methods

Richard D. Riley, Gary S. Collins

Summary: Clinical prediction models estimate an individual's risk of a particular health outcome. Many models are developed using small datasets, leading to instability in the model and its predictions. Researchers should examine instability at the model development stage and propose instability plots and measures to assess model reliability and inform critical appraisal, fairness, and validation requirements.

BIOMETRICAL JOURNAL (2023)

Article Mathematical & Computational Biology

Regularized parametric survival modeling to improve risk prediction models

J. Hoogland, T. P. A. Debray, M. J. Crowther, R. D. Riley, J. Inthout, J. B. Reitsma, A. H. Zwinderman

Summary: This study proposes a method that combines flexible parametric survival modeling and regularization to improve risk prediction models for time-to-event data. By introducing different penalty terms, the models can be regularized to enhance prediction accuracy and model performance.

BIOMETRICAL JOURNAL (2023)

Review Health Care Sciences & Services

Systematic review highlights high risk of bias of clinical prediction models for blood transfusion in patients undergoing elective surgery

Paula Dhiman, Jie Ma, Victoria N. Gibbs, Alexandros Rampotas, Hassan Kamal, Sahar S. Arshad, Shona Kirtley, Carolyn Doree, Michael F. Murphy, Gary S. Collins, Antony J. R. Palmer

Summary: This study conducted a systematic review and found that most blood transfusion prediction models have a high risk of bias and poor methodological quality, which need to be improved and enhanced in clinical practice.

JOURNAL OF CLINICAL EPIDEMIOLOGY (2023)

Article Health Care Sciences & Services

The estimand framework had implications in time to patient-reported outcomes deterioration analyses in cancer clinical trials

Francesco Cottone, Fabio Efficace, David Cella, Neil K. Aaronson, Johannes M. Giesinger, Jean-Baptiste Bachet, Christophe Louvet, Emilie Charton, Gary S. Collins, Amelie Anota

Summary: This study applied the estimand framework to analyze time to deterioration in patient-reported outcomes. The results showed significant differences in estimates depending on the statistical methods used, especially when considering death as a competing risk. Therefore, the Fine-Gray competing risks model should be considered to reflect the patient's experience of the disease and treatment burden.

JOURNAL OF CLINICAL EPIDEMIOLOGY (2023)

Review Psychiatry

Predicting antipsychotic-induced weight gain in first episode psychosis - A field-wide systematic review and meta-analysis of non-genetic prognostic factors

Ita Fitzgerald, Laura J. Sahm, Amy Byrne, Jean O'Connell, Joie Ensor, Ciara Ni Dhubhlaing, Sarah O'Dwyer, Erin K. Crowley

Summary: This study systematically explored the impact of non-genetic prognostic factors on the variable prognosis of antipsychotic-induced weight gain (AIWG). The results showed that age, baseline BMI, and sex had nonsignificant effects on AIWG prognosis. However, the trend of early BMI increase was identified as the most clinically significant prognostic factor, which should be included in AIWG management guidance.

EUROPEAN PSYCHIATRY (2023)

Article Rheumatology

Global, regional, and national burden of low back pain, 1990-2020, its attributable risk factors, and projections to 2050: a systematic analysis of the Global Burden of Disease Study 2021

Manuela L. Ferreira, Katie de Luca, Lydia M. Haile, Jaimie Steinmetz, Garland Culbreth, Marita Cross, Jacek A. Kopec, Paulo H. Ferreira, Fiona M. Blyth, Rachelle Buchbinder, Jan Hartvigsen, Ai-Min Wu, Saeid Safiri, Anthony Woolf, Gary S. Collins, Kanyin Liane Ong, Stein Emil Vollset, Amanda E. Smith, Jessica A. Cruz, Kai Glenn Fukutaki, Semagn Mekonnen Abate, Mitra Abbasifard, Mohsen Abbasi-Kangevari, Zeinab Abbasi-Kangevari, Ahmed Abdelalim, Aidin Abedi, Hassan Abidi, Qorinah Estiningtyas Sakilah Adnani, Ali Ahmadi, Rufus Olusola Akinyemi, Abayneh Tadesse Alamer, Adugnaw Zeleke Alem, Yousef Alimohamadi, Mansour Abdullah Alshehri, Mohammed Mansour Alshehri, Hosam Alzahrani, Saeed Amini, Sohrab Amiri, Hubert Amu, Catalina Liliana Andrei, Tudorel Andrei, Benny Antony, Jalal Arabloo, Judie Arulappan, Ashokan Arumugam, Tahira Ashraf, Seyyed Shamsadin Athari, Nefsu Awoke, Sina Azadnajafabad, Till Winfried Baernighausen, Lope H. Barrero, Amadou Barrow, Akbar Barzegar, Lindsay M. Bearne, Isabela M. Bensenor, Alemshet Yirga Berhie, Bharti Bhandari Bhandari, Vijayalakshmi S. Bhojaraja, Ali Bijani, Belay Boda Abule Bodicha, Srinivasa Rao Bolla, Javier Brazo-Sayavera, Andrew M. Briggs, Chao Cao, Periklis Charalampous, Vijay Kumar Chattu, Flavia M. Cicuttini, Benjamin Clarsen, Sarah Cuschieri, Omid Dadras, Xiaochen Dai, Lalit Dandona, Rakhi Dandona, Azizallah Dehghan, Takele Gezahegn G. Demie, Edgar Denova-Gutierrez, Syed Masudur Rahman Dewan, Samath Dhamminda Dharmaratne, Mandira Lamichhane Dhimal, Meghnath Dhimal, Daniel Diaz, Mojtaba Didehdar, Lankamo Ena Digesa, Mengistie Diress, Hoa Thi Do, Linh Phuong Doan, Michael Ekholuenetale, Muhammed Elhadi, Sharareh Eskandarieh, Shahriar Faghani, Jawad Fares, Ali Fatehizadeh, Getahun Fetensa, Irina Filip, Florian Fischer, Richard Charles Franklin, Balasankar Ganesan, Belete Negese Belete Gemeda, Motuma Erena Getachew, Ahmad Ghashghaee, Tiffany K. 
Gill, Mahaveer Golechha, Pouya Goleij, Bhawna Gupta, Nima Hafezi-Nejad, Arvin Haj-Mirzaian, Pawan Kumar Hamal, Asif Hanif, Netanja Harlianto, Hamidreza Hasani, Simon Hay, Jeffrey J. Hebert, Golnaz Heidari, Mohammad Heidari, Reza Heidari-Soureshjani, Mbuzeleni Mbuzeleni Hlongwa, Mohammad-Salar Hosseini, Alexander Kevin Hsiao, Ivo Iavicoli, Segun Emmanuel Ibitoye, Irena M. Ilic, Milena Ilic, Sheikh Mohammed Shariful Islam, Manthan Dilipkumar Janodia, Ravi Prakash Jha, Har Ashish Jindal, Jost B. Jonas, Gebisa Guyasa Kabito, Himal Kandel, Rimple Jeet Kaur, Vikash Ranjan Keshri, Yousef Saleh Khader, Ejaz Ahmad Khan, Md Jobair Khan, Moien AB Khan, Hamid Reza Khayat Kashani, Jagdish Khubchandani, Yun Jin Kim, Adnan Kisa, Jitka Klugarova, Ali-Asghar Kolahi, Hamid Reza Koohestani, Ai Koyanagi, G. Anil Kumar, Narinder Kumar, Tea Lallukka, Savita Lasrado, Wei-Chen Lee, Yo Han Lee, Ata Mahmoodpoor, Jeadran N. Malagon-Rojas, Mohammad-Reza Malekpour, Reza Malekzadeh, Narges Malih, Man Mohan Mehndiratta, Entezar Mehrabi Nasab, Ritesh G. Menezes, Alexios-Fotios A. Mentis, Mohamed Kamal Mesregah, Ted R. Miller, Mohammad Mirza-Aghazadeh-Attari, Maryam Mobarakabadi, Yousef Mohammad, Esmaeil Mohammadi, Shafiu Mohammed, Ali H. Mokdad, Sara Momtazmanesh, Lorenzo Monasta, Mohammad Ali Moni, Ebrahim Mostafavi, Christopher J. L. Murray, Tapas Sadasivan Nair, Javad Nazari, Seyed Aria Nejadghaderi, Subas Neupane, Sandhya Neupane Kandel, Cuong Tat Nguyen, Ali Nowroozi, Hassan Okati-Aliabad, Emad Omer, Abderrahim Oulhaj, Mayowa Owolabi, Songhomitra Panda-Jonas, Anamika Pandey, Eun-Kee Park, Shrikant Pawar, Paolo Pedersini, Jeevan Pereira, Mario F. P. Peres, Ionela-Roxana Petcu, Mohammadreza Pourahmadi, Amir Radfar, Shahram Rahimi-Dehgolan, Vafa Rahimi-Movaghar, Mosiur Rahman, Amir Masoud Rahmani, Nazanin Rajai, Chythra R. Rao, Vahid Rashedi, Mohammad-Mahdi Rashidi, Zubair Ahmed Ratan, David Laith Rawaf, Salman Rawaf, Andre M. N. 
Renzaho, Negar Rezaei, Zahed Rezaei, Leonardo Roever, Guilherme de Andrade Ruela, Basema Saddik, Amirhossein Sahebkar, Sana Salehi, Francesco Sanmarchi, Sadaf G. Sepanlou, Saeed Shahabi, Shayan Shahrokhi, Elaheh Shaker, Mohammadbagher Shamsi, Mohammed Shannawaz, Saurab Sharma, Maryam Shaygan, Rahim Ali Sheikhi, Jeevan K. Shetty, Rahman Shiri, Siddharudha Shivalli, Parnian Shobeiri, Migbar Mekonnen Sibhat, Ambrish Singh, Jasvinder A. Singh, Helen Slater, Marco Solmi, Ranjani Somayaji, Ker-Kan Tan, Rekha Thapar, Seyed Abolfazl Tohidast, Sahel Valadan Tahbaz, Rohollah Valizadeh, Tommi Juhani Vasankari, Narayanaswamy Venketasubramanian, Vasily Vlassov, Bay Vo, Yuan-Pang Wang, Taweewat Wiangkham, Lalit Yadav, Ali Yadollahpour, Seyed Hossein Yahyazadeh Jabbari, Lin Yang, Fereshteh Yazdanpanah, Naohiro Yonemoto, Mustafa Z. Younis, Iman Zare, Armin Zarrintan, Mohammad Zoladl, Theo Vos, Lyn M. March

Summary: This study provides the most up-to-date global, regional, and national data on the prevalence and years lived with disability (YLDs) for low back pain. It reveals that low back pain is the leading cause of YLDs globally and projects that there will be over 800 million cases of low back pain worldwide by 2050. Challenges persist in obtaining primary country-level data on low back pain, highlighting the need for more high-quality data to improve accuracy and monitoring of the condition.

LANCET RHEUMATOLOGY (2023)

Article Medical Informatics

Predicting 10-year breast cancer mortality risk in the general female population in England: a model development and validation study

Ash Kieran Clift, Gary S. Collins, Simon Lord, Stavros Petrou, David Dodwell, Michael Brady, Julia Hippisley-Cox

Summary: This study aims to develop a prognostic model that accurately predicts the 10-year risk of breast cancer mortality in female individuals without breast cancer at baseline.

LANCET DIGITAL HEALTH (2023)
