Improved Space Forest: A Meta Ensemble Method

AMASYALI, Mehmet

doi:10.1109/tcyb.2017.2787718

AMASYALI M. F.

IEEE TRANSACTIONS ON CYBERNETICS, vol.49, no.3, pp.816-826, 2019 (SCI-Expanded)

Publication Type: Article / Full Article
Volume: 49 Issue: 3
Publication Date: 2019
DOI Number: 10.1109/tcyb.2017.2787718
Journal Name: IEEE TRANSACTIONS ON CYBERNETICS
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
Page Numbers: pp.816-826
Keywords: Bagging, classification, decision trees, ensemble, random forest, rotation forest, VC dimension
Yıldız Technical University Affiliated: Yes

Abstract

The performance of ensemble algorithms depends on the individual accuracy of the base learners and on the diversity of their results. The individual accuracy of a base learner is directly related to the similarity between the original training set and the base learner's training set. When base learners are given a training set modified by randomly selecting features, classes, or samples, diversity is created but individual accuracy decreases. From this point of view, different ensemble algorithms can be seen as a choice between more accurate but less diverse base learners and more diverse but less accurate ones. We propose a meta ensemble method, named improved space forest, which adds generated and (hopefully) more accurate features to the original features. The new features are obtained from randomly selected original features. When the new features are more distinctive than the original ones, the learners select them, so the ensemble may have more accurate base learners. To create diversity, however, a different improved space is generated for each learner. The proposed method can be used with different ensemble methods. We compared the original and improved space versions of the bagging, random forest, and rotation forest algorithms. The improved space versions generally produce results that are better than or comparable to those of the original ones. We also present a theoretical framework to analyze the individual accuracies and diversities of the base learners.
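
The idea of giving each base learner its own improved space can be illustrated with a minimal sketch. The abstract does not say how the new features are generated, so the sketch assumes a stand-in generator: each new feature is a random linear combination of a few randomly chosen original features. The names `improved_space` and `build_spaces` and the parameters `n_selected` and `n_new` are hypothetical and are not taken from the paper.

```python
import random


def improved_space(X, n_selected, n_new, rng):
    """Append n_new generated columns to every row of X.

    ASSUMPTION: each generated column is a random linear combination of
    n_selected randomly chosen original columns. This is only a stand-in
    for whatever feature generator the method actually uses.
    """
    n_features = len(X[0])
    # Randomly select the original features that feed the generator.
    chosen = rng.sample(range(n_features), n_selected)
    # One weight vector per generated feature.
    weights = [[rng.uniform(-1.0, 1.0) for _ in chosen] for _ in range(n_new)]
    augmented = []
    for row in X:
        new_values = [
            sum(w * row[j] for w, j in zip(ws, chosen)) for ws in weights
        ]
        # Original features are kept; generated ones are appended.
        augmented.append(list(row) + new_values)
    return augmented, chosen, weights


def build_spaces(X, n_learners, n_selected, n_new, seed=0):
    """Build a different improved space for each base learner."""
    rng = random.Random(seed)
    return [improved_space(X, n_selected, n_new, rng) for _ in range(n_learners)]


# Toy data: 3 samples with 4 original features each.
X = [
    [1.0, 2.0, 3.0, 4.0],
    [0.5, 0.0, -1.0, 2.0],
    [3.0, 1.0, 0.0, -2.0],
]
spaces = build_spaces(X, n_learners=3, n_selected=2, n_new=2)
```

In a full ensemble, each augmented matrix would be passed to its own base learner (for example, one tree of a bagging or random forest ensemble), and the stored `chosen` indices and `weights` would be reused to transform test samples into the same space before prediction.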