Comparison of K-means and Fuzzy C-means Clustering Algorithms on Water Quality Parameters: Case Study of Ergene Basin for 17 Stations

Arslan Çene G., Parim C., Çene E.

3rd International Congress of Engineering and Natural Sciences Studies, Ankara, Turkey, 24 - 25 May 2023, pp.62

  • Publication Type: Conference Paper / Summary Text
  • City: Ankara
  • Country: Turkey
  • Page Numbers: pp.62
  • Yıldız Technical University Affiliated: Yes


Water quality parameters are important measures of the health and safety of water sources, which can be affected by various natural and human-induced factors. Several parameters are used to assess water quality. The aim of this study is to group 17 water stations in the Ergene Basin, Türkiye, using the k-means and fuzzy c-means clustering algorithms, both of which are unsupervised machine learning methods. For this purpose, 15 water-related variables from the period 1985-2013 are used. Different numbers of clusters are examined with both algorithms, and the optimal number of clusters is found to be 4. These clusters are named high-quality water, slightly polluted water, polluted water, and highly polluted water. The selected water parameters are biochemical oxygen demand (BOD5), chloride (Cl-), dissolved oxygen (DO), Escherichia coli (EC), aluminum (Al), ammonium-nitrogen (NH4-N), nitrite-nitrogen (NO2-N), nitrate-nitrogen (NO3-N), orthophosphate (o-PO4), potential of hydrogen (pH), photovoltaics (pV), suspended solids (SS), temperature (T), total dissolved solids (TDS), and turbidity (Turb).
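The grouping step described above can be sketched as follows. This is only an illustration under stated assumptions: the data matrix is synthetic (the study's measurements are not reproduced here), the variables are standardized before clustering, the fuzzifier m = 2 is a common default rather than a value given in the abstract, and the within-cluster sum of squares is just one possible criterion for comparing cluster counts. The function names `kmeans` and `fuzzy_cmeans` are hypothetical helpers written for this sketch.

```python
import numpy as np

# Synthetic stand-in for the real data: 17 stations x 15 variables (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(17, 15))
# Standardize each variable so that measurement units do not dominate distances.
Z = (X - X.mean(axis=0)) / X.std(axis=0)


def kmeans(Z, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns hard labels and cluster centers."""
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(n_iter):
        dist = np.linalg.norm(Z[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Recompute each center; keep the old one if a cluster is empty.
        new_centers = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


def fuzzy_cmeans(Z, c, m=2.0, n_iter=200, tol=1e-6, seed=0):
    """Basic fuzzy c-means; returns the membership matrix U and centers."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(Z), c))
    U /= U.sum(axis=1, keepdims=True)  # memberships of each station sum to 1
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ Z) / W.sum(axis=0)[:, None]
        dist = np.linalg.norm(Z[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)  # guard against division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, centers


# Compare several cluster counts by within-cluster sum of squares.
for k in range(2, 7):
    lab, cen = kmeans(Z, k)
    wss = ((Z - cen[lab]) ** 2).sum()
    print(f"k={k}: within-cluster sum of squares = {wss:.2f}")

labels, km_centers = kmeans(Z, k=4)
U, fcm_centers = fuzzy_cmeans(Z, c=4)
fcm_labels = U.argmax(axis=1)  # hard assignment from the largest membership
```

Unlike k-means, fuzzy c-means gives each station a degree of membership in every cluster, so stations that sit between two quality classes can be identified from rows of `U` without a dominant entry.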

The cluster centers are used to identify the characteristics of the stations. The first cluster has the lowest average BOD5, Al, NO2-N, and T, and the highest average DO. The second cluster has the lowest average Cl-, EC, NH4-N, o-PO4, pV, SS, TDS, and Turb, and the highest average NO3-N, pH, and T. The third cluster has the lowest average DO and the highest average Cl-, EC, Al, NH4-N, NO2-N, o-PO4, and TDS. The fourth cluster has the lowest average NO3-N and pH, and the highest average BOD5, pV, SS, and Turb.
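A cluster profile of this kind can be derived by finding, for each variable, the cluster with the lowest and the highest center value. The sketch below shows one way to do that. The station data and the cluster labels are hypothetical placeholders, so the printed profile does not correspond to the results reported above; only the variable abbreviations are taken from the abstract.

```python
import numpy as np

variables = ["BOD5", "Cl-", "DO", "EC", "Al", "NH4-N", "NO2-N", "NO3-N",
             "o-PO4", "pH", "pV", "SS", "T", "TDS", "Turb"]

rng = np.random.default_rng(1)
X = rng.normal(size=(17, len(variables)))  # hypothetical station data
labels = np.arange(17) % 4                 # hypothetical cluster labels

# Cluster centers as per-cluster means of each variable.
centers = np.array([X[labels == j].mean(axis=0) for j in range(4)])

# For every variable, the index of the cluster with the lowest / highest mean.
lowest = centers.argmin(axis=0)
highest = centers.argmax(axis=0)

profile = {j: {"lowest": [], "highest": []} for j in range(4)}
for name, lo, hi in zip(variables, lowest, highest):
    profile[int(lo)]["lowest"].append(name)
    profile[int(hi)]["highest"].append(name)

for j, p in profile.items():
    print(f"cluster {j + 1}: lowest {p['lowest']}, highest {p['highest']}")
```

Each variable appears exactly once in a "lowest" list and once in a "highest" list, which matches the structure of the description above.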