Journal on Artificial Intelligence, vol. 8, pp. 203-230, 2026 (Peer-Reviewed Journal)
Cardiovascular diseases remain one of the leading causes of mortality worldwide, making early and reliable diagnosis a critical challenge for modern healthcare systems. In this study, a systematic comparative performance analysis of widely used machine learning algorithms is conducted for the early detection of heart disease using tabular clinical data. Rather than proposing a novel model architecture, the primary objective is to provide a fair, reproducible, and clinically meaningful evaluation of commonly adopted classifiers under consistent experimental conditions. The Kaggle Heart Failure dataset is employed, and multiple machine learning models (including tuned Random Forest, tuned XGBoost, and a soft voting ensemble) are evaluated using a unified preprocessing pipeline, hyperparameter optimization strategy, and validation protocol. Model performance is assessed using multiple evaluation metrics, including accuracy, sensitivity, specificity, F1-score, and ROC–AUC, to capture both overall predictive performance and clinically relevant error trade-offs. The experimental results demonstrate that, although only moderate accuracy values are obtained, the evaluated models achieve strong ROC–AUC performance and balanced sensitivity–specificity characteristics, indicating robust discriminative capability across different decision thresholds. These findings highlight the limitations of relying solely on accuracy, particularly in class-imbalanced clinical datasets, and emphasize the importance of multi-metric evaluation for reliable clinical decision support. Overall, this study contributes a transparent and methodologically rigorous comparative framework that facilitates objective assessment of machine learning models for heart disease prediction and supports informed model selection in healthcare applications.
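To make the point about accuracy versus threshold-independent metrics concrete, the following is a minimal Python sketch of the kind of multi-metric evaluation the abstract describes. The labels and scores are made-up toy values, not data or results from the study, and the function names (`threshold_metrics`, `roc_auc`, `soft_vote`), the 0.5 decision threshold, and the equal-weight averaging in the soft vote are illustrative assumptions rather than the authors' implementation.

```python
from typing import Dict, List, Sequence


def threshold_metrics(y_true: Sequence[int], y_score: Sequence[float],
                      threshold: float = 0.5) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity and F1 at one decision threshold.

    Labels are binary, with 1 meaning "disease present".
    """
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    # Guard every ratio against an empty denominator.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0   # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0   # true negative rate
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (2 * precision * sensitivity / (precision + sensitivity)
          if (precision + sensitivity) else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """ROC-AUC as the probability that a random positive outranks a random
    negative (ties count as one half). Quadratic time; fine for a sketch."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def soft_vote(model_probs: Sequence[Sequence[float]]) -> List[float]:
    """Equal-weight soft voting: average each sample's positive-class
    probability across models (equal weights are an assumption here)."""
    return [sum(column) / len(column) for column in zip(*model_probs)]


# Toy, imbalanced example (hypothetical values): 8 negatives, 2 positives.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_score = [0.1, 0.2, 0.1, 0.3, 0.2, 0.1, 0.4, 0.3, 0.45, 0.35]

metrics = threshold_metrics(y_true, y_score)
auc = roc_auc(y_true, y_score)
print(metrics, auc)
```

On this toy input, no score reaches the 0.5 threshold, so accuracy is 0.8 while sensitivity is 0.0, yet ROC-AUC is 0.9375 because the positives are still ranked above almost all negatives. This is the kind of gap between single-threshold and threshold-independent metrics that motivates reporting several metrics side by side.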