Integrating feature engineering, genetic algorithm and tree-based machine learning methods to predict the post-accident disability status of construction workers


KOÇ K. , Ekmekcioğlu Ö., GÜRGÜN A. P.

Automation in Construction, vol.131, 2021 (Journal Indexed in SCI Expanded)

  • Publication Type: Article
  • Volume: 131
  • Publication Date: 2021
  • DOI Number: 10.1016/j.autcon.2021.103896
  • Title of Journal: Automation in Construction
  • Keywords: Artificial intelligence, Construction safety, Genetic algorithm, Machine learning, Occupational accident, Safety management, Tree-based ensemble models, Worker disability

Abstract

© 2021 Elsevier B.V. The construction industry is among the riskiest industries in the world. Hence, preliminary studies exploring the consequences of occupational accidents have received considerable attention in the research community. This study aims to develop a comprehensive framework to predict the post-accident disability status of construction workers. A dataset comprising 47,938 construction accidents recorded in Turkey was subjected to a detailed multi-step feature engineering approach, including data encoding, data scaling, dimension reduction, and data resampling. Predictions were performed with four tree-based ensemble machine learning models (Random Forest, XGBoost, AdaBoost, and Extra Trees), with a Genetic Algorithm (GA), a state-of-the-art optimization method, used for hyperparameter tuning. GA-XGBoost achieved the best predictive performance, with an accuracy of 0.8292 and an AUROC of 0.8120. The findings may aid in predicting construction workers' post-accident disability status, supporting a safer working environment and better productivity planning in construction projects.
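
To illustrate the general idea of GA-based hyperparameter tuning mentioned in the abstract, the following is a minimal sketch in plain Python. It is not the authors' implementation: the search space, the parameter bounds, the GA settings (population size, tournament selection, uniform crossover, mutation rate, elitism), and the `toy_fitness` function are all assumptions made for illustration. In a real pipeline, the fitness function would presumably train a tree-based model with the candidate hyperparameters and return a validation score such as accuracy or AUROC.

```python
import random

# Hypothetical search space (not from the paper): (low, high) bounds per hyperparameter.
SEARCH_SPACE = {
    "max_depth": (2, 12),          # integer-valued
    "n_estimators": (50, 500),     # integer-valued
    "learning_rate": (0.01, 0.5),  # real-valued
}
INT_PARAMS = {"max_depth", "n_estimators"}


def random_gene(name, rng):
    """Draw one hyperparameter value uniformly within its bounds."""
    low, high = SEARCH_SPACE[name]
    return rng.randint(low, high) if name in INT_PARAMS else rng.uniform(low, high)


def random_individual(rng):
    """An individual is one complete hyperparameter configuration."""
    return {name: random_gene(name, rng) for name in SEARCH_SPACE}


def toy_fitness(ind):
    """Stand-in for a validation score (assumption, for illustration only).

    Peaks at max_depth=6, n_estimators=300, learning_rate=0.1. A real
    fitness function would fit a model and evaluate it on held-out data.
    """
    return -(
        ((ind["max_depth"] - 6) / 10) ** 2
        + ((ind["n_estimators"] - 300) / 450) ** 2
        + ((ind["learning_rate"] - 0.1) / 0.49) ** 2
    )


def tournament(population, scores, rng, k=3):
    """Pick the fittest of k randomly chosen individuals."""
    contenders = rng.sample(range(len(population)), k)
    return population[max(contenders, key=lambda i: scores[i])]


def crossover(a, b, rng):
    """Uniform crossover: each gene is copied from one parent at random."""
    return {name: (a[name] if rng.random() < 0.5 else b[name]) for name in SEARCH_SPACE}


def mutate(ind, rng, rate=0.2):
    """Resample each gene within its bounds with probability `rate`."""
    return {
        name: (random_gene(name, rng) if rng.random() < rate else value)
        for name, value in ind.items()
    }


def genetic_search(fitness, pop_size=20, generations=15, seed=0):
    """Run the GA; return the best individual and the best score per generation."""
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        scores = [fitness(ind) for ind in population]
        best_idx = max(range(pop_size), key=lambda i: scores[i])
        history.append(scores[best_idx])
        # Elitism: carry the best individual over unchanged.
        children = [population[best_idx]]
        while len(children) < pop_size:
            parent_a = tournament(population, scores, rng)
            parent_b = tournament(population, scores, rng)
            children.append(mutate(crossover(parent_a, parent_b, rng), rng))
        population = children
    # Score the final generation as well.
    scores = [fitness(ind) for ind in population]
    best_idx = max(range(pop_size), key=lambda i: scores[i])
    history.append(scores[best_idx])
    return population[best_idx], history


if __name__ == "__main__":
    best, history = genetic_search(toy_fitness)
    print("best configuration:", best)
    print("best score per generation:", history)
```

Because the best individual is carried over unchanged in each generation, the best score in this sketch never decreases from one generation to the next. Replacing `toy_fitness` with a cross-validated model score turns the same loop into a tuner for an actual classifier, at the cost of one model fit per fitness evaluation.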