Deep Learning for Automated Sewer Defect Detection: Benchmarking YOLO and RT-DETR on the Istanbul Dataset


Oğurlu M., BAYRAM B., KULAVUZ B., BAKIRMAN T.

Applied Sciences (Switzerland), vol.15, no.20, 2025 (SCI-Expanded, Scopus)

  • Publication Type: Article / Full Article
  • Volume: 15, Issue: 20
  • Publication Date: 2025
  • DOI: 10.3390/app152011096
  • Journal Name: Applied Sciences (Switzerland)
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Aerospace Database, Agricultural & Environmental Science Database, Applied Science & Technology Source, Communication Abstracts, INSPEC, Metadex, Directory of Open Access Journals, Civil Engineering Abstracts
  • Keywords: deep learning, infrastructure inspection, object detection, RT-DETR, sewer defect detection, YOLO
  • Yıldız Technical University Affiliated: Yes

Abstract

The inspection and maintenance of urban sewer infrastructure remain critical challenges for megacities, where conventional manual inspection approaches are labor-intensive, time-consuming, and prone to human error. Although deep learning has been increasingly applied to sewer inspection, the field lacks both a publicly available large-scale dataset and a systematic evaluation of CNN and transformer-based models on real sewer footage. The primary aim of this study is to systematically evaluate and compare state-of-the-art deep learning architectures for automated sewer defect detection using a newly introduced dataset. We present the Istanbul Sewer Defect Dataset (ISWDS), comprising 13,491 expert-annotated images collected from Istanbul’s wastewater network and covering eight defect categories that account for approximately 90% of reported failures. The scientific novelty of this work lies in both the introduction of the ISWDS and the first systematic benchmarking of YOLO (v8/11/12) and RT-DETR (v1/v2) architectures under identical protocols on real sewer inspection footage. Experimental results demonstrate that RT-DETR v2 achieves the best performance (F1: 79.03%, Recall: 81.10%), significantly outperforming the best YOLO variant. While transformer-based architectures excel in detecting partially occluded defects and complex operational conditions, YOLO models provide computational efficiency advantages for resource-constrained deployments. Furthermore, a QGIS-based inspection tool integrating the best-performing models was developed to enable real-time video analysis and automated reporting. Overall, this study highlights the trade-offs between accuracy and efficiency, demonstrating that RT-DETR v2 is most suitable for server-based processing. In contrast, compact YOLO variants are more appropriate for edge deployment.