Novel approaches on bulk-loading of large scale spatial datasets


KALAY M. U.

CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, vol.34, no.9, 2022 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 34 Issue: 9
  • Publication Date: 2022
  • DOI: 10.1002/cpe.6596
  • Journal Name: CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Aerospace Database, Applied Science & Technology Source, Communication Abstracts, Compendex, Computer & Applied Sciences, INSPEC, Metadex, zbMATH, Civil Engineering Abstracts
  • Keywords: adaptive bulk-loading, index performance, R-tree spatial index, space filling curves, INSERTION
  • Yıldız Technical University Affiliated: Yes

Abstract

Index structures that are bulk-loaded adaptively based on the query workload have been the subject of much database research, primarily for conventional databases holding one-dimensional data. Similar problems arise in spatial databases, and novel approaches are needed in these systems. In this study, we first consider adaptive indexing approaches in a broad sense and then specifically investigate some possible forms of adaptivity in the bulk-loading of large spatial datasets while retaining some core mechanics of well-known spatial data structures. In our first design, we examined how the internal configuration of the R-tree spatial index could be tuned to various workload properties, such as the workload's average aspect ratio. In our experiments, we observed that the Adaptive Sort-Tile-Recursive (STR)-tree performs better than the original STR-tree. Our second design, the main contribution of this paper, focuses on incremental indexing, another form of adaptivity. We concentrated on data-to-insight latency and loaded spatial data faster within this multi-stage process. To achieve this improvement, we used a Partitioned B-tree loaded with the Morton codes of spatial objects, compromising on spatial proximity. Finally, the response time of the first query in our design was 34% better than with the ordinary STR-tree index.
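
The abstract mentions loading a one-dimensional index with the Morton codes of spatial objects. As a rough illustration of that idea only, the Python sketch below interleaves the bits of quantized x and y coordinates into a single Z-order key and sorts points by it. The function names (`part1by1`, `morton2d`, `quantize`, `morton_sorted`), the 16-bit grid resolution, and the use of points rather than extended objects are all assumptions made for this example; it is not the paper's implementation and it does not model the Partitioned B-tree itself.

```python
def part1by1(n: int) -> int:
    """Spread the low 16 bits of n so that a zero bit follows each original bit."""
    n &= 0x0000FFFF
    n = (n | (n << 8)) & 0x00FF00FF
    n = (n | (n << 4)) & 0x0F0F0F0F
    n = (n | (n << 2)) & 0x33333333
    n = (n | (n << 1)) & 0x55555555
    return n


def morton2d(x: int, y: int) -> int:
    """Interleave two 16-bit grid coordinates into one 32-bit Morton (Z-order) key."""
    return part1by1(x) | (part1by1(y) << 1)


def quantize(v: float, lo: float, hi: float, bits: int = 16) -> int:
    """Map a coordinate in [lo, hi] to an integer grid cell (assumed 16-bit grid)."""
    cells = (1 << bits) - 1
    t = (v - lo) / (hi - lo)
    return min(cells, max(0, int(t * cells)))


def morton_sorted(points, bounds):
    """Return (key, point) pairs sorted by Morton key.

    bounds is (xmin, ymin, xmax, ymax) of the dataset; this is a hypothetical
    helper used only for this illustration.
    """
    xmin, ymin, xmax, ymax = bounds
    keyed = [
        (morton2d(quantize(x, xmin, xmax), quantize(y, ymin, ymax)), (x, y))
        for x, y in points
    ]
    keyed.sort(key=lambda kv: kv[0])
    return keyed
```

A sorted key list of this kind could be handed to any ordered one-dimensional structure. The trade-off the abstract refers to is visible here: points that are close in space usually receive nearby keys, but not always, so some spatial proximity is given up in exchange for a cheap sort-based load.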