Investigation the performance of ensemble clustering techniques in latest GPS velocity field of Turkey


Kılıç B., Özarpacı S., Yılmaz Y.

EGU23, Vienna, Austria, 23 - 28 April 2023, pp.1

  • Publication Type: Conference Paper / Summary Text
  • City: Vienna
  • Country: Austria
  • Page Numbers: pp.1
  • Yıldız Technical University Affiliated: Yes

Abstract

The primary active strike-slip faults in Turkey are the North and East Anatolian Faults (NAF and EAF) and the Dead Sea Fault. These transform boundaries result from several tectonic regimes: subduction of the oceanic lithosphere along the Hellenic and Cyprus arcs; continental collision in the Zagros/Caucasus and Black Sea regions; the related continental escape of Anatolia and extension in western Turkey; and the interactions of the Nubian, Arabian, and Eurasian plates, which are Turkey's main tectonic domains. To better understand these regimes and the resulting deformation, block modeling can be used to estimate slip rates on major faults and to calculate block motions. Prior to block modeling, clustering analysis may be used to group Global Positioning System (GPS) velocities when no a priori information is available.


Clustering analysis, an unsupervised learning technique, is essential for discovering the natural groupings in a set of multivariate data. Its aim is to explore the underlying structure of a data set based on chosen criteria, specific characteristics of the data, and different ways of comparing data points. Over the last ten years, many studies have determined and investigated cluster/block boundaries without any a priori information by considering the similarity of GPS-derived velocities. With the rapid progress of clustering methods, various partitioning, hierarchical, and distribution-based techniques, such as k-means, k-medoids, Balanced Iterative Reducing and Clustering using Hierarchies (BIRCH), Gaussian Mixture Model (GMM), and Hierarchical Agglomerative Clustering (HAC), have been used in geodetic studies to find acceptable solutions and to determine boundaries before block modeling.
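As an illustration of clustering velocities by similarity, the following minimal sketch runs four of the techniques named above on a synthetic velocity field using scikit-learn. The block means, noise level, number of clusters, and BIRCH threshold are all assumptions chosen for the example and are not taken from any published data set; k-medoids is omitted because it is not part of core scikit-learn.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, Birch, KMeans
from sklearn.metrics import adjusted_rand_score
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a GPS velocity field: (east, north) in mm/yr.
# The three block means and the 1.5 mm/yr scatter are hypothetical.
rng = np.random.default_rng(0)
block_means = np.array([[-20.0, 0.0], [0.0, 10.0], [15.0, -5.0]])
velocities = np.vstack(
    [mean + rng.normal(scale=1.5, size=(40, 2)) for mean in block_means]
)

# Standardize so that both velocity components carry equal weight.
X = StandardScaler().fit_transform(velocities)
k = 3  # assumed number of clusters for this sketch

labels = {
    "k-means": KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X),
    "BIRCH": Birch(n_clusters=k, threshold=0.2).fit_predict(X),
    "GMM": GaussianMixture(n_components=k, random_state=0).fit_predict(X),
    "HAC": AgglomerativeClustering(n_clusters=k, linkage="ward").fit_predict(X),
}

# Pairwise agreement between algorithms (adjusted Rand index, 1.0 = identical).
names = list(labels)
agreement = {
    (a, b): adjusted_rand_score(labels[a], labels[b])
    for i, a in enumerate(names)
    for b in names[i + 1:]
}
for pair, ari in agreement.items():
    print(pair, round(ari, 3))
```

On real velocity fields the algorithms generally disagree more than on well-separated synthetic blocks, so the pairwise agreement is a quick way to see how much the result depends on the choice of algorithm.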


Although a wide range of clustering techniques has been applied to GPS velocities, clustering has several common problems, including the inability of a single clustering algorithm to accurately determine the underlying structure of all data sets and the lack of consensus on a universal standard for selecting a clustering algorithm for a specific problem. To overcome these problems, ensemble clustering (consensus clustering) techniques, which combine the strengths of many individual clustering algorithms, have been introduced (Kılıç and Özarpacı, 2022). The objective of this study is therefore to explore the performance of ensemble clustering techniques for clustering GPS-derived horizontal velocities. To this end, we used newly published horizontal velocities inferred from a combination of a dense network of long-term GNSS observations in Turkey (Kurt et al., 2022). We then determined the number of clusters that best represents the data set using the gap statistic, and we clustered the GPS velocities using five different clustering techniques: BIRCH, k-means, mini-batch k-means, HAC, and spectral clustering. Next, we investigated the performance of three ensemble clustering techniques, the Cluster-based Similarity Partitioning Algorithm (CSPA), Hybrid Bipartite Graph Formulation (HBGF), and Meta-CLustering Algorithm (MCLA), each of which combines the strengths of the five individual clustering algorithms. The results show that the MCLA ensemble clustering algorithm can be used to determine cluster/block boundaries for this region and gives improved results compared with single clustering techniques.
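The workflow described in the abstract (choose the number of clusters, run five base algorithms, combine them) can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: the velocities are synthetic, all parameter values are assumptions, and the consensus step is a simple co-association scheme with average-linkage clustering. It shares the pairwise-similarity idea behind CSPA but replaces the graph partitioning step, and it does not implement HBGF or MCLA.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from sklearn.cluster import (AgglomerativeClustering, Birch, KMeans,
                             MiniBatchKMeans, SpectralClustering)
from sklearn.metrics import adjusted_rand_score
from sklearn.preprocessing import StandardScaler


def gap_statistic(X, k_max=6, n_refs=10, seed=0):
    """Choose k with the gap statistic, using uniform reference data
    drawn from the bounding box of X."""
    rng = np.random.default_rng(seed)
    lo, hi = X.min(axis=0), X.max(axis=0)

    def log_wk(data, k):
        # KMeans inertia is the pooled within-cluster sum of squares W_k.
        model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(data)
        return np.log(model.inertia_)

    gaps, errs = [], []
    for k in range(1, k_max + 1):
        ref = np.array([log_wk(rng.uniform(lo, hi, size=X.shape), k)
                        for _ in range(n_refs)])
        gaps.append(ref.mean() - log_wk(X, k))
        errs.append(ref.std() * np.sqrt(1.0 + 1.0 / n_refs))
    for k in range(1, k_max):
        # Smallest k such that Gap(k) >= Gap(k+1) - s(k+1).
        if gaps[k - 1] >= gaps[k] - errs[k]:
            return k
    return k_max


def base_partitions(X, k):
    """Label arrays from the five base algorithms named in the abstract."""
    models = [
        Birch(n_clusters=k, threshold=0.2),
        KMeans(n_clusters=k, n_init=10, random_state=0),
        MiniBatchKMeans(n_clusters=k, n_init=10, random_state=0),
        AgglomerativeClustering(n_clusters=k, linkage="ward"),
        SpectralClustering(n_clusters=k, affinity="rbf", random_state=0),
    ]
    return np.array([model.fit_predict(X) for model in models])


def coassociation_consensus(partitions, k):
    """Consensus labels from the co-association matrix (simplified stand-in
    for CSPA: average-linkage clustering instead of graph partitioning)."""
    # Fraction of base partitions that put each pair of sites together.
    co = np.mean([p[:, None] == p[None, :] for p in partitions], axis=0)
    dist = 1.0 - co
    np.fill_diagonal(dist, 0.0)
    tree = linkage(squareform(dist, checks=False), method="average")
    return fcluster(tree, t=k, criterion="maxclust")


# Synthetic (east, north) velocities in mm/yr for three hypothetical blocks.
rng = np.random.default_rng(0)
block_means = np.array([[-20.0, 0.0], [0.0, 10.0], [15.0, -5.0]])
velocities = np.vstack(
    [mean + rng.normal(scale=1.5, size=(40, 2)) for mean in block_means]
)
truth = np.repeat(np.arange(3), 40)

X = StandardScaler().fit_transform(velocities)
k_best = gap_statistic(X)
partitions = base_partitions(X, k_best)
consensus = coassociation_consensus(partitions, k_best)
print("k =", k_best, "ARI vs. synthetic blocks =",
      round(adjusted_rand_score(truth, consensus), 3))
```

The comparison against known block labels is only possible here because the data are synthetic; with observed velocities, internal validity measures or agreement with mapped faults would have to take that role.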