Scalable recommendation systems based on finding similar items and sequences

Uzun-Per M., Gurel A. V., Can A. B., AKTAŞ M. S.

CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, vol.34, no.20, 2022 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 34 Issue: 20
  • Publication Date: 2022
  • DOI Number: 10.1002/cpe.6841
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Aerospace Database, Applied Science & Technology Source, Communication Abstracts, Compendex, Computer & Applied Sciences, INSPEC, Metadex, zbMATH, Civil Engineering Abstracts
  • Keywords: airline ancillary services, apache spark, association rule mining, distributed systems, sequential pattern mining, PATTERNS
  • Yıldız Technical University Affiliated: Yes


The rapid growth in the airline industry, which started in 2009, continued until the COVID-19 era, with the annual number of passengers almost doubling in 10 years. This growth has increased competition among airline companies, whose profitability has decreased considerably. To increase their profitability, airlines have made services such as seat selection, excess baggage, and Wi-Fi access optional under the name of ancillary services. To the best of our knowledge, there is no recommendation system for recommending ancillary services for airline companies, nor is there a testing framework for comparing recommendation algorithms in terms of scalability and running time. In this paper, we propose a framework based on the Lambda architecture for recommendation systems that run on a big data processing platform. The proposed method utilizes association rule mining and sequential pattern mining algorithms designed for big data processing platforms. To facilitate testing of the proposed method, we implement a prototype application. We conduct an experimental study on the prototype to investigate the performance of the proposed methodology using accuracy, scalability, and latency metrics. The results indicate that the proposed method is useful and incurs negligible processing overhead.