Cleaning landmark and geolocation dataset using deep learning methods (Turkish title: Coğrafi lokasyon ve yer sembolü veri kümelerinin derin öğrenme yaklaşımları ile iyileştirilmesi)


Taskin B., KARSLIGİL YAVUZ M. E.

29th IEEE Conference on Signal Processing and Communications Applications, SIU 2021, Virtual, Istanbul, Turkey, 9 - 11 June 2021

  • Publication Type: Conference Paper / Full Text
  • Doi Number: 10.1109/siu53274.2021.9478031
  • City: Virtual, Istanbul
  • Country: Turkey
  • Keywords: Dataset cleaning, Image classification, Google Landmark Challenge, EfficientNet

Abstract

© 2021 IEEE. This paper explains a deep learning based approach to dataset cleaning, using Google Landmark Challenge dataset v2, a noisy dataset of images photographed by people. The study also lists improvements to image classification and recognition for large noisy datasets, mainly revolving around this cleaning approach. The results are detailed both quantitatively, with graphs of the reduction in classes and the number of eliminated noisy images, and qualitatively, through visual analysis of those images. In conclusion, it is observed that noisy samples can be removed from a dataset by using the confidence score outputs of a deep learning network. For better reproducibility, the paper also reports the specific threshold values behind the results achieved on this dataset with the described model architecture.
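
The idea of removing noisy samples by their confidence scores can be illustrated with a minimal sketch. It is not the authors' implementation: the `Prediction` record, the `clean_dataset` function, the 0.5 score threshold, and the minimum of 2 images per class are hypothetical placeholders, and the paper's actual threshold values are not reproduced here. The sketch assumes a trained classifier has already produced, for each image, a confidence score for the label the image carries in the dataset.

```python
from collections import Counter
from typing import NamedTuple


class Prediction(NamedTuple):
    image_id: str
    label: int         # label assigned in the (noisy) dataset
    confidence: float  # model's score for that label, in [0, 1]


def clean_dataset(predictions, score_threshold=0.5, min_images_per_class=2):
    """Keep confidently classified samples, then drop sparse classes.

    Both default values are illustrative assumptions, not values
    taken from the paper.
    """
    # Step 1: discard samples whose confidence for their own label is low.
    confident = [p for p in predictions if p.confidence >= score_threshold]

    # Step 2: discard classes left with too few images after step 1.
    counts = Counter(p.label for p in confident)
    return [p for p in confident if counts[p.label] >= min_images_per_class]


# Toy scores standing in for real network outputs.
predictions = [
    Prediction("a.jpg", 0, 0.93),
    Prediction("b.jpg", 0, 0.88),
    Prediction("c.jpg", 0, 0.12),  # likely noise: low score for its label
    Prediction("d.jpg", 1, 0.71),
    Prediction("e.jpg", 1, 0.30),
]
kept = clean_dataset(predictions)
print([p.image_id for p in kept])  # ['a.jpg', 'b.jpg']
```

In this toy run, class 1 disappears entirely because only one of its images passes the score threshold, which mirrors the reduction in the number of classes that the abstract mentions alongside the removal of individual noisy images.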