5. Uluslararası Uygulamalı İstatistik Kongresi (5th International Applied Statistics Congress), Istanbul, Türkiye, 21-23 May 2024, vol. 1, no. 1, pp. 315-324
Feature selection is an important step in machine learning and data preprocessing. Selecting the most informative features helps a model generalize better and reduces its complexity, so classification and prediction problems can be analyzed with more accurate and reliable results. In this study, Recursive Feature Elimination, Random Forest, and Boruta feature selection methods are used to identify the important features in a chronic kidney disease dataset. The datasets formed from the selected features are then analyzed with Support Vector Machines, k-Nearest Neighbors, and Naive Bayes classifiers, which are commonly used classification techniques in machine learning. Finally, the best-performing method in the study is identified by comparing classification accuracy values.
Keywords: Boruta, Classification, Feature selection, Random Forest, Recursive Feature Elimination
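
The workflow described in the abstract (select features, train each classifier on the reduced data, compare accuracies) could be sketched roughly as follows with scikit-learn. This is only an illustrative sketch under several assumptions that do not come from the study: the data are synthetic (generated with make_classification) rather than the chronic kidney disease dataset, logistic regression is assumed as the base estimator for Recursive Feature Elimination, the number of retained features (8) and the 5-fold cross-validation are arbitrary choices, and Boruta is left out because it requires a separate third-party package.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE, SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (assumption): NOT the chronic kidney disease dataset.
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=6, n_redundant=2, random_state=0
)

# Feature selectors, built fresh for every pipeline.
# Assumptions: logistic regression as the RFE base estimator, 8 features kept,
# and a median importance threshold for the Random Forest selector.
selectors = {
    "RFE": lambda: RFE(LogisticRegression(max_iter=1000), n_features_to_select=8),
    "RandomForest": lambda: SelectFromModel(
        RandomForestClassifier(n_estimators=50, random_state=0), threshold="median"
    ),
}

# The three classifiers named in the abstract, with default-like settings.
classifiers = {
    "SVM": lambda: SVC(kernel="rbf"),
    "kNN": lambda: KNeighborsClassifier(n_neighbors=5),
    "NaiveBayes": lambda: GaussianNB(),
}


def evaluate(X, y, cv=5):
    """Return mean cross-validated accuracy for each (selector, classifier) pair."""
    results = {}
    for sel_name, make_selector in selectors.items():
        for clf_name, make_classifier in classifiers.items():
            # Scaling and selection sit inside the pipeline so they are refit
            # on each training fold and never see the held-out fold.
            pipe = make_pipeline(StandardScaler(), make_selector(), make_classifier())
            scores = cross_val_score(pipe, X, y, cv=cv, scoring="accuracy")
            results[(sel_name, clf_name)] = scores.mean()
    return results


results = evaluate(X, y)
best = max(results, key=results.get)

for (sel_name, clf_name), acc in sorted(results.items()):
    print(f"{sel_name:>12} + {clf_name:<10} accuracy = {acc:.3f}")
print("Best combination:", best)
```

Keeping the selector inside the cross-validation pipeline is a deliberate choice in this sketch: selecting features on the full dataset before splitting would leak information from the test folds and inflate the reported accuracy.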