Application Identification via Network Traffic Classification


YAMANSAVASCILAR B., GÜVENSAN M. A., YAVUZ A. G., Karsligil M. E.

International Conference on Computing, Networking and Communications (ICNC), California, United States, 26 - 29 January 2017, pp.843-848

  • Publication Type: Conference Paper / Full-Text Paper
  • DOI: 10.1109/iccnc.2017.7876241
  • City of Publication: California
  • Country of Publication: United States
  • Page Numbers: pp.843-848
  • Keywords: Network Traffic Classification, Application-based, Machine Learning
  • Affiliated with Yıldız Technical University: Yes

Abstract

Recent developments in Internet technology have increased the importance of network traffic classification. In this study, we used machine-learning methods to identify applications through network traffic classification. Unlike existing studies, which classify applications into categories such as FTP or Instant Messaging, we tried to identify popular end-user applications such as Facebook, Twitter, Skype and many more individually. We are motivated by the fact that identifying individual applications is highly important for network security, QoS enforcement, and trend analysis. For our tests, we used the UNB ISCX Network Traffic dataset and our internal dataset, consisting of 14 and 13 well-known applications respectively. In our experiments, we evaluated four classification algorithms, namely J48, Random Forest, k-NN, and Bayes Net. With the complete set of 111 features, k-NN gave the best result for the ISCX dataset, with an accuracy of 93.94% at k = 1, and Random Forest gave the best result for the internal dataset, with an accuracy of 90.87%. During the course of this study, the initial set of features was successfully reduced to two sets of 12 features, one specific to each dataset, without compromising the success rate. Moreover, we observed a 2% increase in the success rate for the internal dataset. We believe that identifying individual applications with machine-learning methods is a viable solution, and we are currently investigating a two-tier approach to make it more resilient to in-category confusion.
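To make the k-NN setting with k = 1 concrete, the following is a minimal sketch of nearest-neighbour classification over flow feature vectors. It is not the authors' implementation: the two features (mean packet size and mean inter-arrival time), the sample values, and the labels `app_a` and `app_b` are hypothetical and chosen only for illustration. The study itself used 111 features, later reduced to 12 per dataset.

```python
import math


def nearest_label(sample, training):
    """Return the label of the training vector closest to `sample` (k = 1)."""
    best_label, best_dist = None, math.inf
    for features, label in training:
        dist = math.dist(sample, features)  # Euclidean distance
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out_accuracy(dataset):
    """Classify each flow using all the others; return the fraction correct."""
    correct = 0
    for i, (features, label) in enumerate(dataset):
        rest = dataset[:i] + dataset[i + 1:]
        if nearest_label(features, rest) == label:
            correct += 1
    return correct / len(dataset)


# Hypothetical flows: (mean packet size in bytes, mean inter-arrival time in ms).
# These values are invented for the sketch and do not come from either dataset.
FLOWS = [
    ((1200.0, 5.0), "app_a"),
    ((1180.0, 6.0), "app_a"),
    ((1250.0, 4.0), "app_a"),
    ((160.0, 20.0), "app_b"),
    ((150.0, 21.0), "app_b"),
    ((170.0, 19.0), "app_b"),
]

if __name__ == "__main__":
    print(nearest_label((1210.0, 5.5), FLOWS))
    print(leave_one_out_accuracy(FLOWS))
```

In practice the features would probably need to be scaled before computing distances, since a feature with a large numeric range (here, packet size) otherwise dominates the Euclidean distance.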