Improved Space Forest: A Meta Ensemble Method


AMASYALI M. F.

IEEE TRANSACTIONS ON CYBERNETICS, vol.49, no.3, pp.816-826, 2019 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 49 Issue: 3
  • Publication Date: 2019
  • DOI: 10.1109/tcyb.2017.2787718
  • Journal Name: IEEE TRANSACTIONS ON CYBERNETICS
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Page Numbers: pp.816-826
  • Keywords: Bagging, classification, decision trees, ensemble, random forest, rotation forest, VC dimension
  • Yıldız Technical University Affiliated: Yes

Abstract

The performance of ensemble algorithms depends on the individual accuracy of the base learners and on the diversity of their results. The individual accuracy of a base learner is directly related to the similarity between the original training set and the base learner's training set. When the base learners are given a training set modified by randomly selecting features, classes, or samples, diversity is created but individual accuracy decreases. From this point of view, different ensemble algorithms can be seen as a choice between more accurate but less diverse base learners and more diverse but less accurate ones. We propose a meta ensemble method, named improved space forest, which adds generated and (hopefully) more accurate features to the original features. The new features are obtained from randomly selected original features. When the new features are more distinctive than the original ones, they are selected by the learners, so the ensemble may have more accurate base learners. At the same time, a different improved space is generated for each learner to create diversity. The proposed method can be used with different ensemble methods. We compared the original and improved space versions of the bagging, random forest, and rotation forest algorithms. The improved space versions generally give better or comparable results than the original ones. We also present a theoretical framework to analyze the individual accuracies and diversities of the base learners.
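
The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not say how the new features are generated, so the sketch assumes each generated feature is the difference of two randomly chosen original features. The base learner (a one-split decision stump), the bagging wrapper, and all function names (`make_recipe`, `apply_recipe`, `fit_stump`, `fit_improved_space_bagging`, `predict`) are likewise hypothetical choices made for brevity.

```python
import random
from collections import Counter

def make_recipe(n_features, n_new, rng):
    """Pick random feature pairs; each pair defines one generated feature (assumption)."""
    return [tuple(rng.sample(range(n_features), 2)) for _ in range(n_new)]

def apply_recipe(row, recipe):
    """Keep the original features and append the generated ones (here: differences)."""
    return list(row) + [row[i] - row[j] for i, j in recipe]

def fit_stump(X, y):
    """Exhaustive one-split learner: returns (feature, threshold, left_label, right_label)."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({r[f] for r in X}):
            left = [lab for r, lab in zip(X, y) if r[f] <= t]
            right = [lab for r, lab in zip(X, y) if r[f] > t]
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0] if right else l_lab
            errors = sum(lab != l_lab for lab in left) + sum(lab != r_lab for lab in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    return best[1:]

def fit_improved_space_bagging(X, y, n_learners=15, n_new=3, seed=0):
    """Bagging in which every learner sees its own randomly generated improved space."""
    rng = random.Random(seed)
    ensemble = []
    for _ in range(n_learners):
        recipe = make_recipe(len(X[0]), n_new, rng)   # a different space per learner
        idx = [rng.randrange(len(X)) for _ in X]      # bootstrap sample
        Xb = [apply_recipe(X[i], recipe) for i in idx]
        yb = [y[i] for i in idx]
        ensemble.append((recipe, fit_stump(Xb, yb)))
    return ensemble

def predict(ensemble, row):
    """Majority vote; each learner maps the row into its own improved space first."""
    votes = []
    for recipe, (f, t, l_lab, r_lab) in ensemble:
        z = apply_recipe(row, recipe)
        votes.append(l_lab if z[f] <= t else r_lab)
    return Counter(votes).most_common(1)[0][0]

# Toy data: the class depends on x0 - x1, so no single original feature separates it.
X = [(a, b) for a in range(4) for b in range(4) if a != b]
y = [1 if a > b else 0 for a, b in X]

# On the full data, the stump prefers the generated feature (index 2) over the originals.
X_improved = [apply_recipe(r, [(0, 1)]) for r in X]
stump = fit_stump(X_improved, y)

ensemble = fit_improved_space_bagging(X, y, n_learners=15, n_new=1, seed=0)
predictions = [predict(ensemble, r) for r in X]
```

In this toy setting the generated feature is more distinctive than either original feature, so the stump selects it, which mirrors the abstract's claim that learners pick up the new features when they help. Because the original features are kept, a learner is never forced to use a generated feature that turns out to be uninformative.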