Representation of Click-Stream DataSequences for Learning User Navigational Behavior by Using Embeddings

Ölmezoğulları E., AKTAŞ M. S.

IEEE Big Data, User Understanding from BigData Workshop, 12 December 2020

  • Publication Type: Conference Paper / Full Text
  • DOI Number: 10.1109/bigdata50022.2020.9378437
  • Keywords: User browsing graph-data, Clickstream data, User browsing behavior analysis, Unsupervised Machine Learning, Clustering, Embedding


User behavior can be identified by clustering website navigation patterns expressed as clickstream sequences. Representing clickstream sequences remains a challenging problem. The main difficulty is that these sequences, which consist mainly of dynamic URLs, must be represented in such a way that traditional clustering algorithms can group them. In this study, we present the application of embedding approaches to represent clickstream data sequences, enabling machine learning algorithms to learn users' navigational behavior on websites. Using the embedding representation, we propose an algorithm that takes clickstream data as input and produces clustered sequential patterns. We discuss the details of different representation algorithms and present their evaluation. We investigate the accuracy in finding the hidden clustered data sequences within the clickstream data. The results show that Word2Vec is a representation method that can lead to high-quality clustering of user navigational patterns.
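
The pipeline described in the abstract (represent each clickstream sequence as a vector, then cluster the vectors) can be illustrated with a minimal sketch. This is not the authors' algorithm: to stay dependency-free it substitutes a simple co-occurrence count embedding for Word2Vec, and a small k-means with deterministic farthest-first initialization for the clustering step. The function names, the URL normalization rule, the window size, and the sample sessions are all hypothetical choices made for illustration.

```python
import math
import re
from urllib.parse import urlsplit


def normalize_url(url):
    """Assumed rule: drop the query string and replace numeric path parts with {id}."""
    path = urlsplit(url).path
    return re.sub(r"/\d+", "/{id}", path) or "/"


def cooccurrence_embeddings(sessions, window=2):
    """Map each page to a vector of context counts (a crude stand-in for Word2Vec)."""
    vocab = sorted({page for s in sessions for page in s})
    index = {page: i for i, page in enumerate(vocab)}
    vectors = {page: [0.0] * len(vocab) for page in vocab}
    for s in sessions:
        for i, page in enumerate(s):
            lo, hi = max(0, i - window), min(len(s), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[page][index[s[j]]] += 1.0
    return vectors


def session_vector(session, vectors):
    """Sum the page vectors of a session and scale the result to unit length."""
    total = [sum(col) for col in zip(*(vectors[page] for page in session))]
    norm = math.sqrt(sum(v * v for v in total)) or 1.0
    return [v / norm for v in total]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Plain k-means; centroids start from a deterministic farthest-first pick."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical sessions: three shopping visits and three support visits.
raw_sessions = [
    ["/home", "/search?q=shoes", "/product/17", "/cart", "/checkout"],
    ["/home", "/product/42?ref=ad", "/cart", "/checkout"],
    ["/search?q=hats", "/product/8", "/cart"],
    ["/help", "/faq", "/contact"],
    ["/help", "/contact", "/ticket/301"],
    ["/faq", "/help", "/ticket/77"],
]
sessions = [[normalize_url(u) for u in s] for s in raw_sessions]
page_vectors = cooccurrence_embeddings(sessions)
points = [session_vector(s, page_vectors) for s in sessions]
labels = kmeans(points, k=2)
print(labels)
```

Normalizing dynamic URLs first keeps the vocabulary small, so that visits to different products map to the same page token. In a setting closer to the one the abstract reports, the co-occurrence step would be replaced by a trained Word2Vec model (for example, the `Word2Vec` class in gensim), with each session treated as a sentence and each normalized URL as a word.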