Cleaning landmark and geolocation dataset using deep learning methods (Turkish title: Coğrafi lokasyon ve yer sembolü veri kümelerinin derin öğrenme yaklaşımları ile iyileştirilmesi)


Taskin B., KARSLIGİL YAVUZ M. E.

29th IEEE Conference on Signal Processing and Communications Applications, SIU 2021, Virtual, Istanbul, Türkiye, 9 - 11 June 2021

  • Publication Type: Conference Paper / Full-Text Paper
  • Volume Number:
  • DOI Number: 10.1109/siu53274.2021.9478031
  • City of Publication: Virtual, Istanbul
  • Country of Publication: Türkiye
  • Keywords: Dataset cleaning, Image classification, Google Landmark Challenge, EfficientNet
  • Affiliated with Yıldız Technical University: Yes

Abstract

© 2021 IEEE. This paper presents a deep learning based approach to dataset cleaning, applied to the Google Landmark Challenge dataset v2, a noisy dataset consisting of photographs taken by people. The study also lists improvements to image classification and recognition on large noisy datasets, most of which follow from the dataset cleaning approach. The results are reported both quantitatively, with graphs of the reduction in the number of classes and the number of noisy images eliminated, and qualitatively, through visual analysis of those images. The study concludes that noisy samples can be removed from a dataset by using the confidence scores output by a deep learning network. For reproducibility, the paper also reports the specific threshold values that produced these results on this dataset with the described model architecture.
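
The idea of removing noisy samples by thresholding a network's confidence scores can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `clean_dataset`, the sample tuples, the class names, and both threshold values are hypothetical placeholders (the abstract does not state the paper's actual thresholds), and the second step, which drops classes left with too few images, is an assumed way in which the number of classes could shrink.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Each sample is (image_id, labeled_class, confidence), where confidence is the
# classifier's score for the labeled class. All values below are made up.
Sample = Tuple[str, str, float]


def clean_dataset(
    samples: List[Sample],
    confidence_threshold: float = 0.5,  # placeholder, not the paper's value
    min_images_per_class: int = 2,      # placeholder, not the paper's value
) -> Tuple[List[Sample], List[Sample]]:
    """Split samples into (kept, removed) using confidence scores."""
    by_class: Dict[str, List[Sample]] = defaultdict(list)
    removed: List[Sample] = []

    # Step 1: remove samples whose score for their own label is below the threshold.
    for sample in samples:
        _, label, confidence = sample
        if confidence >= confidence_threshold:
            by_class[label].append(sample)
        else:
            removed.append(sample)

    # Step 2 (assumption): remove classes left with too few images.
    kept: List[Sample] = []
    for label, members in by_class.items():
        if len(members) >= min_images_per_class:
            kept.extend(members)
        else:
            removed.extend(members)
    return kept, removed


samples = [
    ("img_001", "galata_tower", 0.93),
    ("img_002", "galata_tower", 0.88),
    ("img_003", "galata_tower", 0.12),  # e.g. an unrelated photo tagged as the landmark
    ("img_004", "maiden_tower", 0.71),
    ("img_005", "maiden_tower", 0.30),
]
kept, removed = clean_dataset(samples)
print(sorted(s[0] for s in kept))     # ['img_001', 'img_002']
print(sorted(s[0] for s in removed))  # ['img_003', 'img_004', 'img_005']
```

In practice the confidence values would come from a trained classifier's output scores for each image, and the thresholds would need to be tuned per dataset, which is presumably why the paper reports its own values.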