Predicting Cost Impacts of Nonconformances in Construction Projects Using Interpretable Machine Learning

Koç, Kerim; Budayan, Cenk; Ekmekcioğlu, Ömer; Tokdemir, Onur

doi:10.1061/jcemd4.coeng-13857

Koç K., Budayan C., Ekmekcioğlu Ö., Tokdemir O. B.

Journal of Construction Engineering and Management, vol. 150, no. 1, 2024 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Volume: 150, Issue: 1
Publication Date: 2024
DOI: 10.1061/jcemd4.coeng-13857
Journal Name: Journal of Construction Engineering and Management
Indexes Covering the Journal: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, Aerospace Database, Applied Science & Technology Source, Business Source Elite, Business Source Premier, Communication Abstracts, Compendex, Computer & Applied Sciences, ICONDA Bibliographic, INSPEC, Metadex, Public Affairs Index, DIALNET, Civil Engineering Abstracts
Keywords: Cost of quality, Explainable artificial intelligence, Nonconformance (NCR), Quality failures, Tree-based ensemble model
Affiliated with Yıldız Technical University: Yes

Abstract

Nonconformance (NCR) has long been a subject of research interest for its potential to yield information leading to a more productive environment in construction projects. Despite a variety of traditional attempts, a systematic understanding of how machine learning (ML) approaches can contribute to proactively detecting the severity of NCRs remains limited. This study aims to develop a data-driven ML framework to predict the cost impacts of NCRs (high severity versus low severity) in construction projects. To accomplish this aim, the random forest (RF) algorithm, reinforced with a metaheuristic hyperparameter-tuning strategy, namely the gravitational search algorithm (GSA), is adopted for the binary classification problem. Furthermore, this study incorporates Shapley additive explanations (SHAP) into the GSA-RF predictive framework to provide transparent interpretations and to tackle the inherent black-box nature of the ML rationale. The results reveal that the proposed model detects the severity of NCRs in terms of their cost impact with an overall AUROC value of 0.776 for the preseparated and blinded testing set. This indicates that the proposed model can be used confidently for newly introduced datasets from real-life cases. In addition, the SHAP analysis results emphasize the role of season, inadequate application procedure, and NCR type in detecting the severity of NCRs. Overall, this research not only makes an important contribution through its novel data-driven approaches but also provides insights for project managers concerning productivity improvements in the sector.
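
The abstract reports model quality as AUROC on a held-out testing set. As a reading aid, the following is a minimal sketch of how AUROC can be computed for a binary high-severity versus low-severity classifier, using the pairwise-ranking definition (the share of positive/negative pairs in which the positive case receives the higher score, with ties counted as one half). The function name `auroc` and the toy labels and scores are hypothetical illustrations, not the study's data, code, or results.

```python
def auroc(labels, scores):
    """Return AUROC via the pairwise-ranking definition.

    labels: iterable of 0/1 (assumed here: 1 = high-severity NCR, 0 = low-severity NCR)
    scores: predicted probability or score for class 1, same length as labels
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, for illustration only
toy_labels = [1, 0, 1, 0, 1, 0, 0, 1]
toy_scores = [0.9, 0.4, 0.65, 0.7, 0.8, 0.2, 0.5, 0.3]
toy_auroc = auroc(toy_labels, toy_scores)
print(toy_auroc)  # 0.75 (12 of 16 positive/negative pairs ranked correctly)
```

Under this definition, a value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a scale for reading the reported value of 0.776.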