Article

Mind the gap: Performance metric evaluation in brain-age prediction

Journal

HUMAN BRAIN MAPPING
Volume 43, Issue 10, Pages 3113-3129

Publisher

WILEY
DOI: 10.1002/hbm.25837

Keywords

brain-age prediction; machine learning; neuroimaging; statistics

Funding

  1. Collaboratory on Research Definitions for Reserve and Resilience in Cognitive Aging and Dementia [5R24AG061421-03]
  2. Deutsche Forschungsgemeinschaft [FR 3709/1-2, HA7070/2-2, HA7070/3, HA7070/4]
  3. ERA-net Cofound
  4. Fondation Leenaards
  5. European Research Council [802998]
  6. HDH Wills 1965 Charitable Trust [1117747]
  7. Helse Sor-Ost RHF [2015073]
  8. Interdisciplinary Center for Clinical Research of the Jena University hospital [AMSP 07]
  9. Interdisciplinary Center for Clinical Research of the Medical Faculty of Munster [MzH 3/020/20]
  10. Medical Research Council [G1001354, MR/R024790/2]
  11. Norges Forskningsrad [223273, 249795, 273345, 276082]
  12. Swiss National Science Foundation [32003B_135679, 32003B_159780, 324730_192755, CRSK-3_190185, PZ00P3_193658]


Estimating age based on neuroimaging-derived data is a popular approach, but reported model accuracy varies considerably across studies. This study found that performance metrics for age prediction models depend on cohort and study-specific data characteristics: age range, sample size, and age-bias correction all affect the reported metrics. Evaluating prediction variance and age bias in uncorrected results provides important information about underlying model attributes.
Estimating age based on neuroimaging-derived data has become a popular approach to developing markers for brain integrity and health. While a variety of machine-learning algorithms can provide accurate predictions of age based on brain characteristics, there is significant variation in model accuracy reported across studies. We predicted age in two population-based datasets, and assessed the effects of age range, sample size, and age-bias correction on four model performance metrics: Pearson's correlation coefficient (r), the coefficient of determination (R²), Root Mean Squared Error (RMSE), and Mean Absolute Error (MAE). The results showed that these metrics vary considerably depending on cohort age range; r and R² values are lower when measured in samples with a narrower age range. RMSE and MAE are also lower in samples with a narrower age range, because errors (brain-age delta values) are smaller when predictions are closer to the mean age of the group. Across subsets with different age ranges, performance metrics improve with increasing sample size. Performance metrics further vary depending on prediction variance as well as the mean age difference between training and test sets, and age-bias corrected metrics indicate high accuracy, even for models showing poor initial performance. In conclusion, performance metrics used for evaluating age prediction models depend on cohort and study-specific data characteristics, and cannot be directly compared across different studies. Since age-bias corrected metrics generally indicate high accuracy, even for poorly performing models, inspection of uncorrected model results provides important information about underlying model attributes such as prediction variance.
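
The dependence of r, R², RMSE, and MAE on age range can be illustrated with a small simulation. The following is a minimal sketch and not the authors' pipeline: the toy prediction model (`simulate_predictions`, with hypothetical `slope` and `noise_sd` parameters), the uniform age distributions, and the linear age-bias correction fitted on the same sample are all assumptions made for illustration.

```python
import numpy as np


def performance_metrics(age, predicted):
    """Return r, R2, RMSE and MAE for predicted vs. chronological age."""
    age = np.asarray(age, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    delta = predicted - age  # brain-age delta (prediction error)
    ss_res = np.sum(delta ** 2)
    ss_tot = np.sum((age - age.mean()) ** 2)
    return {
        "r": np.corrcoef(age, predicted)[0, 1],
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": np.sqrt(np.mean(delta ** 2)),
        "MAE": np.mean(np.abs(delta)),
    }


def simulate_predictions(age, slope=0.7, noise_sd=5.0, seed=0):
    """Toy model (assumption): predictions are pulled toward the sample
    mean age (slope < 1 produces age bias) with added Gaussian noise."""
    rng = np.random.default_rng(seed)
    age = np.asarray(age, dtype=float)
    noise = rng.normal(0.0, noise_sd, age.size)
    return age.mean() + slope * (age - age.mean()) + noise


def correct_age_bias(age, predicted):
    """One common linear correction (assumption): regress the delta on age
    and subtract the fitted trend. Fitted in-sample here for brevity."""
    age = np.asarray(age, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    slope, intercept = np.polyfit(age, predicted - age, 1)
    return predicted - (slope * age + intercept)


rng = np.random.default_rng(42)
wide_age = rng.uniform(20, 80, 2000)    # wide age range
narrow_age = rng.uniform(45, 55, 2000)  # narrow age range

# Same toy model, different age ranges.
wide = performance_metrics(wide_age, simulate_predictions(wide_age))
narrow = performance_metrics(narrow_age, simulate_predictions(narrow_age))

# A deliberately poor model, before and after age-bias correction.
poor_pred = simulate_predictions(wide_age, slope=0.1)
poor = performance_metrics(wide_age, poor_pred)
poor_corrected = performance_metrics(
    wide_age, correct_age_bias(wide_age, poor_pred)
)

for name, m in [("wide", wide), ("narrow", narrow),
                ("poor", poor), ("poor, corrected", poor_corrected)]:
    print(name, {k: round(float(v), 3) for k, v in m.items()})
```

In this toy setting the narrow-range sample yields lower r and R² but also lower RMSE and MAE than the wide-range sample, and the corrected metrics for the poor model look far better than the uncorrected ones. Because the correction uses chronological age directly, the corrected metrics say little about how well the underlying model performs.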
