An Application on Alternative Methods in Text Clustering and Notification Management


Thesis Type: Postgraduate

Institution Of The Thesis: Yildiz Technical University, Faculty Of Arts & Science, Department Of Statistics, Turkey

Approval Date: 2021

Thesis Language: Turkish

Student: Emre Rıdvan Muratlar

Supervisor: Doğan Yıldız

Abstract:

Notification management systems play an important role in CRM work. Customer feedback must be evaluated, complaints resolved, and customer satisfaction ensured. With the growing use of social media in recent years, most notifications now arrive through social media channels. When the volume of notifications is large, they need to be grouped automatically so that the strategy to apply to each group can be determined. Various text mining and machine learning algorithms are used for this purpose. When the data set is large, labelling it by hand creates a heavy workload. In such cases, clustering methods can be used to group similar items. Clustering text data is challenging because of its high dimensionality, which lowers cluster quality and increases algorithm run times. Different methods are being studied to address these problems. This thesis first discusses text mining processes and then examines the Spherical k-means and Mini-Batch k-means algorithms as alternatives to the k-means algorithm. In the final stage, tweets that tag a bank are processed in Python through data cleaning, stemming, tokenization, stopword elimination, and vectorization. After this text mining process, the text data are clustered with the k-means, Spherical k-means, and Mini-Batch k-means algorithms. The results are evaluated in terms of the sum of squared errors (SSE), the silhouette coefficient, and algorithm run times.
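The pipeline described in the abstract can be illustrated with a minimal, standard-library-only Python sketch. It is not the thesis implementation: the six toy sentences, the stopword list, the TF-IDF weighting variant, the fixed initial centroids, and all function names are assumptions made for illustration. Stemming, the Mini-Batch variant, and the silhouette coefficient are left out. The sketch covers tokenization, stopword elimination, vectorization, Spherical k-means (assignment by cosine similarity with unit-length centroids), and the SSE and run-time measurements.

```python
import math
import re
import time
from collections import Counter

# Toy stand-ins for tweets (assumption); the thesis data set is not reproduced.
DOCS = [
    "my card was blocked at the atm",
    "atm kept my card again",
    "card blocked, atm error",
    "the mobile app crashes on login",
    "cannot login to the mobile app",
    "app login error on mobile",
]
STOPWORDS = {"my", "was", "at", "the", "on", "to", "again"}  # hypothetical list


def tokenize(text):
    """Lowercase, keep ASCII alphabetic tokens, drop stopwords (no stemming)."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def tfidf_unit_vectors(docs):
    """Return dense TF-IDF vectors scaled to unit L2 length."""
    tokens = [tokenize(d) for d in docs]
    vocab = sorted({t for doc in tokens for t in doc})
    df = Counter(t for doc in tokens for t in set(doc))
    n = len(docs)
    vectors = []
    for doc in tokens:
        tf = Counter(doc)
        # Smoothed IDF; one common variant, chosen here as an assumption.
        vec = [tf[t] * (math.log((1 + n) / (1 + df[t])) + 1) for t in vocab]
        norm = math.sqrt(dot(vec, vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vectors


def spherical_kmeans(vectors, init_idx, max_iter=20):
    """Assign by cosine similarity; rescale each centroid to unit length."""
    centroids = [vectors[i][:] for i in init_idx]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            max(range(len(centroids)), key=lambda k: dot(v, centroids[k]))
            for v in vectors
        ]
        if new_labels == labels:  # converged: assignments did not change
            break
        labels = new_labels
        for k in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if not members:
                continue  # keep the old centroid for an empty cluster
            mean = [sum(col) / len(members) for col in zip(*members)]
            norm = math.sqrt(dot(mean, mean)) or 1.0
            centroids[k] = [m / norm for m in mean]
    return labels, centroids


def sse(vectors, labels, centroids):
    """Sum of squared Euclidean distances to the assigned centroid."""
    return sum(
        sum((x - c) ** 2 for x, c in zip(v, centroids[lab]))
        for v, lab in zip(vectors, labels)
    )


vectors = tfidf_unit_vectors(DOCS)
start = time.perf_counter()
# Fixed initial centroids (documents 0 and 3) keep the run deterministic.
labels, centroids = spherical_kmeans(vectors, init_idx=(0, 3))
elapsed = time.perf_counter() - start
total_sse = sse(vectors, labels, centroids)
print(labels, round(total_sse, 3), f"{elapsed:.6f}s")
```

On real tweet volumes a library implementation would be the practical choice; the sketch only makes the individual steps and the evaluation quantities concrete.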