Scenario-based automated data preprocessing to predict severity of construction accidents

Koç, Kerim; Gürgün, Aslı

doi:10.1016/j.autcon.2022.104351

Koç K., Gürgün A. P.

Automation in Construction, vol. 140, 2022 (SCI-Expanded)

Publication Type: Article / Full Article
Volume: 140
Publication Date: 2022
DOI: 10.1016/j.autcon.2022.104351
Journal Name: Automation in Construction
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, PASCAL, Aerospace Database, Communication Abstracts, ICONDA Bibliographic, INSPEC, Metadex, Civil Engineering Abstracts
Keywords: Automated pre-processing, Accident risk assessment, Occupational health and safety (OHS), Accident severity, Machine learning, Artificial intelligence, eXtreme gradient boosting (XGBoost), OCCUPATIONAL ACCIDENTS, CLASSIFICATION, GENERATION, INCIDENTS, EQUIPMENT, INDUSTRY, WORKERS
Yıldız Technical University Affiliated: Yes

Abstract

© 2022 Elsevier B.V. Occupational accidents are common in the construction industry, so prediction models that detect high-severity accidents would be useful. However, existing studies are limited and usually focus on selecting the most appropriate machine learning method rather than on identifying the most effective preprocessing pipeline to apply before prediction. In this study, a scenario-based automated preprocessing model that identifies the best scenario is developed to predict the severity of construction accidents. The results show that the highest prediction performance, an F1-score of 0.6092, was obtained with the following scenario combination: not removing missing data, not applying data binning, considering outliers, applying Min-Max scaling and one-hot encoding, and resampling the data with random oversampling. Permutation importance analysis of the XGBoost model indicates that year, cause material, age, past accidents, experience, and salary are the most influential attributes. This study contributes to society and practice through a model that helps prevent high-severity accidents, and to theory and technology through a novel preprocessing model that supports more reliable predictions.
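
The scenario search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The option names in SCENARIO_GRID, all function names, and the choice of a binary F1-score are assumptions made for illustration; the abstract names only the preprocessing steps and the winning combination. The sketch enumerates every combination of preprocessing choices and shows plain-Python versions of Min-Max scaling, one-hot encoding, random oversampling, and the F1-score.

```python
import itertools
import random

# Hypothetical scenario grid: the keys and option labels are illustrative,
# loosely mirroring the preprocessing steps listed in the abstract.
SCENARIO_GRID = {
    "drop_missing": [False, True],
    "binning": [False, True],
    "outlier_handling": [False, True],
    "scaler": ["none", "minmax"],
    "encoding": ["ordinal", "onehot"],
    "resampling": ["none", "random_oversampling"],
}

def enumerate_scenarios(grid):
    """Yield every combination of preprocessing options as a dict."""
    keys = list(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))

def min_max_scale(values):
    """Rescale numeric values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: avoid division by zero
    return [(v - lo) / (hi - lo) for v in values]

def one_hot(values):
    """Encode categories as 0/1 indicator vectors (categories sorted)."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]

def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority rows until all classes are equal in size."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        out_rows.extend(members + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels

def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def best_scenario(grid, evaluate):
    """Return the (scenario, score) pair that maximizes evaluate(scenario)."""
    scored = ((s, evaluate(s)) for s in enumerate_scenarios(grid))
    return max(scored, key=lambda pair: pair[1])
```

In a full pipeline, the hypothetical evaluate callback would apply the chosen steps to the training data, fit a classifier such as XGBoost, and return the F1-score on held-out data; best_scenario would then report the combination with the highest score. With six binary choices, this grid contains 64 scenarios.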