Web Page Classification Using RNN



Büber E., DİRİ B.

Procedia Computer Science, vol.154, pp.62-72, 2019 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 154
  • Publication Date: 2019
  • DOI: 10.1016/j.procs.2019.06.011
  • Journal Name: Procedia Computer Science
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Compendex, INSPEC
  • Page Numbers: pp.62-72
  • Keywords: web page classification, classification, categorization, deep learning, RNN, transfer learning
  • Yıldız Technical University Affiliated: Yes

Abstract

Web page classification is an information retrieval application whose output can serve as a basis for many different application domains. In this study, a deep learning-based system was developed for the classification of web pages. A page is classified using the meta tag information it contains: the title, description and keywords tags. An RNN-based deep learning architecture was used during the tests. Transfer learning is the approach of building a machine learning model from pre-trained parameters in order to solve a problem, and its effect on the system was also examined. According to the results obtained, the success rate of the web page classification system is approximately 85%. Transfer learning was not observed to contribute significantly to the success rates. However, the use of transfer learning has …
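The abstract states that the classifier reads only the title, description and keywords meta tags. As an illustration of that input stage, the following is a minimal sketch using only the Python standard library. The names `MetaTagExtractor`, `extract_meta_text` and `to_tokens` are hypothetical, and the whitespace normalization and lowercase tokenization are assumptions; the abstract does not describe the preprocessing actually used in the study.

```python
import re
from html.parser import HTMLParser


class MetaTagExtractor(HTMLParser):
    """Collects the <title> text and the description/keywords meta tags.

    Hypothetical helper for illustration; not the authors' implementation.
    """

    def __init__(self):
        super().__init__()
        self.fields = {"title": "", "description": "", "keywords": ""}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before this call.
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr = dict(attrs)
            name = (attr.get("name") or "").lower()
            if name in ("description", "keywords"):
                self.fields[name] = (attr.get("content") or "").strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Only text inside <title> is kept; body text is ignored.
        if self._in_title:
            self.fields["title"] += data


def extract_meta_text(html):
    """Return a dict with the three meta fields, whitespace-normalized."""
    parser = MetaTagExtractor()
    parser.feed(html)
    parser.close()
    return {key: " ".join(value.split()) for key, value in parser.fields.items()}


def to_tokens(fields):
    """Join the fields and split into lowercase tokens (an assumed scheme)."""
    text = " ".join(fields[key] for key in ("title", "description", "keywords"))
    return re.findall(r"[a-z0-9]+", text.lower())
```

The resulting token list is the kind of sequence that could then be mapped to word indices or embeddings and fed to a recurrent model; that mapping, and the network itself, are outside the scope of this sketch.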