Estimation of Parkinson's disease severity using speech features and extreme gradient boosting

Tunc H. C., Sakar C. O., Apaydin H., Serbes G., Gunduz A., Tutuncu M., et al.

MEDICAL & BIOLOGICAL ENGINEERING & COMPUTING, vol.58, no.11, pp.2757-2773, 2020 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 58 Issue: 11
  • Publication Date: 2020
  • Doi Number: 10.1007/s11517-020-02250-5
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, ABI/INFORM, Agricultural & Environmental Science Database, Applied Science & Technology Source, BIOSIS, Biotechnology Research Abstracts, Business Source Elite, Business Source Premier, CINAHL, Compendex, Computer & Applied Sciences, EMBASE, INSPEC, MEDLINE
  • Page Numbers: pp.2757-2773
  • Keywords: Unified Parkinson's Disease Rating Scale, UPDRS prediction, Machine learning, Telemonitoring, E-health, RATING-SCALE, WAVELET TRANSFORM, INTERRATER RELIABILITY, IMPAIRMENT, DISORDERS, FREQUENCY, DIAGNOSIS, GENDER
  • Yıldız Technical University Affiliated: Yes


In recent years, there has been increasing interest in building e-health systems. These systems, which deliver health services over the internet and communication technologies, aim to reduce the costs arising from outpatient visits. Some recent studies propose machine learning-based telediagnosis and telemonitoring systems for Parkinson's disease (PD). Motivated by studies showing the potential of speech disorders in PD telemonitoring systems, in this study we aim to estimate the severity of PD from patients' voice recordings, using the motor Unified Parkinson's Disease Rating Scale (UPDRS) as the evaluation metric. For this purpose, we apply various speech processing algorithms to the patients' voice signals and then use the extracted features as input to a two-stage estimation model. The first stage applies a wrapper-based feature selection algorithm, Boruta, to select the most informative speech features. The second stage feeds the selected features to a decision tree-based boosting algorithm, extreme gradient boosting, which has recently been applied successfully in many machine learning tasks owing to its generalization ability and speed. The feature selection analysis showed that the vibration pattern of the vocal folds is an important indicator of PD severity. We also investigate the effectiveness of using age and years since diagnosis as covariates together with the speech features. The lowest mean absolute error, 3.87, was obtained by combining these covariates and speech features with prediction-level fusion.
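As a rough illustration of the last two ideas in the abstract, the sketch below computes a mean absolute error and applies prediction-level fusion to two sets of model outputs. It is not the authors' implementation: the abstract does not state how the predictions were fused, so the weighted average used here is an assumption, the function names are hypothetical, and the numbers are toy values invented for illustration rather than data from the study.

```python
from statistics import fmean


def mean_absolute_error(y_true, y_pred):
    """Average of |true - predicted| over all samples."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return fmean(abs(t - p) for t, p in zip(y_true, y_pred))


def fuse_predictions(pred_a, pred_b, weight_a=0.5):
    """Prediction-level fusion, assumed here to be a weighted average
    of the outputs of two separately trained models."""
    if len(pred_a) != len(pred_b):
        raise ValueError("both prediction lists must have the same length")
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(pred_a, pred_b)]


# Toy motor-UPDRS values, invented for illustration only.
true_updrs = [20.0, 35.0, 12.0, 28.0]
speech_model_pred = [24.0, 30.0, 15.0, 25.0]      # hypothetical speech-feature model
covariate_model_pred = [18.0, 38.0, 11.0, 33.0]   # hypothetical age/years-since-diagnosis model

fused = fuse_predictions(speech_model_pred, covariate_model_pred)

print("speech only   :", mean_absolute_error(true_updrs, speech_model_pred))
print("covariate only:", mean_absolute_error(true_updrs, covariate_model_pred))
print("fused         :", mean_absolute_error(true_updrs, fused))
```

In this toy case the errors of the two models partly cancel, so the fused predictions have a lower error than either model alone. That is the general motivation for fusing at the prediction level, although it is not guaranteed for arbitrary data.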