Journal
SLEEP
Volume 40, Issue 10, Pages -
Publisher
OXFORD UNIV PRESS INC
DOI: 10.1093/sleep/zsx139
Keywords
sleep stages; EEG; machine learning; big data
Funding
- NIH-NINDS [1K23NS090900]
- Department of Neurology (MGH)
- Milton Foundation
- American Sleep Medicine Foundation
- MGH-MIT Grand Challenge
- Center for Integration of Medicine and Innovative Technology
- National Research Foundation
- Prime Minister's Office, Singapore under its International Research Centres in Singapore Funding Initiative
Study Objectives: Automated sleep staging has been previously limited by a combination of clinical and physiological heterogeneity. Both factors are in principle addressable with large data sets that enable robust calibration. However, the impact of sample size remains uncertain. The objectives are to investigate the extent to which machine learning methods can approximate the performance of human scorers when supplied with sufficient training cases and to investigate how staging performance depends on the number of training patients, contextual information, model complexity, and imbalance between sleep stage proportions.
Methods: A total of 102 features were extracted from six electroencephalography (EEG) channels in routine polysomnography. Two thousand nights were partitioned into equal (n = 1000) training and testing sets for validation. We used epoch-by-epoch Cohen's kappa statistics to measure the agreement between classifier output and human scorer according to American Academy of Sleep Medicine scoring criteria.
Results: Epoch-by-epoch Cohen's kappa improved with increasing training EEG recordings until saturation occurred (n ≈ 300). The kappa value was further improved by accounting for contextual (temporal) information, increasing model complexity, and adjusting the model training procedure to account for the imbalance of stage proportions. The final kappa on the testing set was 0.68. Testing on more EEG recordings leads to kappa estimates with lower variance.
Conclusion: Training with a large data set enables automated sleep staging that compares favorably with human scorers. Because testing was performed on a large and heterogeneous data set, the performance estimate has low variance and is likely to generalize broadly.
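The agreement metric named in the abstract, epoch-by-epoch Cohen's kappa, can be illustrated with a short sketch. This is not the authors' code: the function name `cohens_kappa`, the stage labels, and the example sequences below are assumptions made purely for illustration. It computes kappa as (p_o - p_e) / (1 - p_e), where p_o is the observed fraction of epochs on which the two scorings agree and p_e is the agreement expected by chance from each scorer's stage proportions.

```python
from collections import Counter


def cohens_kappa(scorer, classifier):
    """Cohen's kappa between two equal-length sequences of stage labels.

    Each element is the stage assigned to one epoch. Hypothetical helper,
    written for illustration only.
    """
    if len(scorer) != len(classifier):
        raise ValueError("both sequences must cover the same epochs")
    n = len(scorer)
    if n == 0:
        raise ValueError("at least one epoch is required")

    # Observed agreement: fraction of epochs with identical labels.
    observed = sum(a == b for a, b in zip(scorer, classifier)) / n

    # Chance agreement: for each stage, the product of the two marginal
    # proportions, summed over stages.
    counts_a = Counter(scorer)
    counts_b = Counter(classifier)
    expected = sum(counts_a[s] * counts_b[s] for s in counts_a) / (n * n)

    if expected == 1.0:
        # Both sequences use one and the same stage throughout.
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Made-up example: eight epochs scored by a human and by a classifier.
human = ["W", "N1", "N2", "N2", "N3", "R", "R", "W"]
model = ["W", "N2", "N2", "N2", "N3", "R", "W", "W"]
kappa = cohens_kappa(human, model)
```

Because kappa subtracts chance agreement, it penalizes a classifier that simply predicts the most common stage, which is one reason it is preferred to raw accuracy when stage proportions are imbalanced.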