4.7 Article

Improving depression prediction using a novel feature selection algorithm coupled with context-aware analysis

Journal

JOURNAL OF AFFECTIVE DISORDERS
Volume 295, Pages 1040-1048

Publisher

ELSEVIER
DOI: 10.1016/j.jad.2021.09.001

Keywords

Depression prediction; Feature selection; Context-aware analysis; Maximal information coefficient; Support vector machine

Funding

  1. 'One Enterprise, One Technology' Research & Development Center of Industrial Enterprise of Shandong Province, China [165]
  2. National Natural Science Foundation of China [31701164]
  3. Natural Science Foundation of Hunan Province, China [2018JJ3238]
  4. Training Program for Excellent Young Innovators of Changsha, China [kq1802013]

The proposed two-stage feature selection algorithm showed excellent performance in depression prediction, extracting parsimonious subsets effectively in each case. Audio features were predominant in depression classification, while the contributions of the three feature categories to severity estimation were almost equal.
Background: Developing machine-learning-based methods that predict depression from long-term recordings is both important and challenging for the clinical diagnosis of depression.

Methods: We developed a novel two-stage feature selection algorithm and applied it to the high-dimensional feature set (over thirty thousand features) constructed by a context-aware analysis of the DAIC-WOZ data set, comprising audio, video, and semantic features. Prediction performance was compared with seven reference models. The preferred topics and the feature categories associated with the retained features were also analyzed.

Results: The proposed method selected parsimonious subsets (tens of features) in each prediction case. In depression classification, we obtained the best performance, with an F1-score of 0.96 (0.67), precision of 1.00 (0.63), and recall of 0.92 (0.71) on the development set (test set). In depression severity estimation, we achieved promising results, with an RMSE of 4.43 (5.11) and an MAE of 3.22 (3.98), differing only marginally from the best reference model (random forest with 'Selected-Text' features). The five most important topics related to depression were revealed. Audio features predominated over the other feature categories in depression classification, whereas the three feature categories contributed almost equally to severity estimation.

Limitations: The database we used should be extended with more depression samples. The second stage of feature selection is relatively time-consuming.

Conclusion: This depression recognition pipeline, together with the preferred topics and feature categories, is expected to be useful in supporting the diagnosis of psychological distress conditions.
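As a quick consistency check on the reported classification metrics, the F1-score can be recomputed from the stated precision and recall as their harmonic mean. The sketch below is illustrative only and is not the authors' evaluation code; the helper name `f1_score` and the dictionary layout are assumptions introduced here.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, keyed by data split.
reported = {
    "development": (1.00, 0.92),
    "test": (0.63, 0.71),
}

for split, (p, r) in reported.items():
    print(f"{split}: F1 = {f1_score(p, r):.2f}")
# development: F1 = 0.96
# test: F1 = 0.67
```

Both recomputed values agree with the F1-scores reported for the development and test sets.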
