Explainable address matching in online geocoding: filter-based feature selection and ensemble classification



Kilic B., Bayrak O. C., Gülgen F., Uzar A. M.

GeoInformatica, vol.30, no.1, 2026 (SCI-Expanded, Scopus)

  • Publication Type: Article / Full Article
  • Volume: 30 Issue: 1
  • Publication Date: 2026
  • DOI Number: 10.1007/s10707-025-00562-y
  • Journal Name: GeoInformatica
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Compendex, Geobase, INSPEC
  • Keywords: Address matching, Feature selection, Geocoding, Machine learning, Text similarity
  • Affiliated with Yıldız Technical University: Yes

Abstract

The growing adoption of location-based services and mobile technologies has led to the extensive accumulation of address-tagged data across both commercial and public platforms. Leading location-based services are predominantly developed by commercial companies. Their online geocoding and address-matching solutions, however, do not permit users to modify the underlying reference databases, which raises concerns about the accuracy of the geocoding process. In this study, we propose a feature selection framework aimed at enhancing online geocoding quality and overcoming the limitations of address matching. The proposed method integrates text similarity algorithms to improve address-matching results, achieving an accuracy gain of approximately 10–25% over the standard outputs of services such as Google Maps and ArcGIS Online. Unlike traditional approaches, this study uses the feature selection framework to ‘reverse-engineer’ and rectify the opaque decision-making processes of commercial geocoders. Among the fourteen feature selection methods evaluated, mutual information-based selection and minimum redundancy-maximum relevance were identified as the most effective. The findings indicate that character-based text similarity algorithms should be prioritized to further improve the accuracy of online geocoding outputs.
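
As a rough illustration of the idea summarized above, and not the authors' implementation, the following Python sketch computes one character-based and one token-based similarity score between an input address and the address returned by a geocoder, then ranks the two features by their mutual information with a match label. The function names, the equal-width binning, and the toy address pairs are hypothetical and chosen only for illustration; the abstract does not state which similarity algorithms, discretization scheme, or data were used.

```python
import math
from collections import Counter


def levenshtein_similarity(a: str, b: str) -> float:
    """Character-based similarity: 1 - edit_distance / length of longer string."""
    if not a and not b:
        return 1.0
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return 1.0 - prev[-1] / max(len(a), len(b))


def token_jaccard(a: str, b: str) -> float:
    """Token-based similarity: shared words divided by all distinct words."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def mutual_information(feature_bins, labels) -> float:
    """Mutual information (in bits) between a discretized feature and the labels."""
    n = len(labels)
    joint = Counter(zip(feature_bins, labels))
    px, py = Counter(feature_bins), Counter(labels)
    mi = 0.0
    for (x, y), count in joint.items():
        pxy = count / n
        mi += pxy * math.log2(pxy / ((px[x] / n) * (py[y] / n)))
    return mi


def to_bin(value: float, n_bins: int = 4) -> int:
    """Equal-width binning of a score in [0, 1] (an assumption of this sketch)."""
    return min(int(value * n_bins), n_bins - 1)


# Hypothetical similarity features; a real study would evaluate many more.
FEATURES = {
    "levenshtein": levenshtein_similarity,
    "token_jaccard": token_jaccard,
}


def rank_features(pairs, labels):
    """Return (feature_name, MI) tuples sorted from most to least informative."""
    scores = []
    for name, func in FEATURES.items():
        bins = [to_bin(func(query, returned)) for query, returned in pairs]
        scores.append((name, mutual_information(bins, labels)))
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Toy data, invented for illustration: (input address, geocoder output), label.
TOY_PAIRS = [
    ("12 example st apt 4", "12 example street apt 4"),
    ("7 sample ave", "7 sampel ave"),
    ("99 demo blvd", "3 other road"),
    ("5 test lane", "41 unrelated way"),
]
TOY_LABELS = [1, 1, 0, 0]  # 1 = correct match, 0 = wrong match

if __name__ == "__main__":
    for name, mi in rank_features(TOY_PAIRS, TOY_LABELS):
        print(f"{name}: {mi:.3f} bits")
```

A filter-based selector of this kind scores each feature independently of any classifier, so the ranking can be inspected directly before the retained features are passed to an ensemble model.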