text2arff: A Text Representation Library


Can E., AMASYALI M. F.

24th Signal Processing and Communication Application Conference (SIU), Zonguldak, Türkiye, 16 - 19 May 2016, pp.197-200

  • Publication Type: Conference Paper / Full-Text Paper
  • DOI Number: 10.1109/siu.2016.7495711
  • City of Publication: Zonguldak
  • Country of Publication: Türkiye
  • Page Numbers: pp.197-200
  • Affiliated with Yıldız Technical University: Yes

Abstract

Which features are most important for text classification tasks? Several studies in automatic text categorization seek to answer this question. This paper presents a new version of Text2arff, a library for text representation, along with its new features (word2vec, word trajectories, etc.). The software is now also a Java library that can be used in users' own projects. In the experiments, the library is run on two sample datasets. The results show that the choice of text representation method has a larger effect than the choice of classification method. This result also emphasizes the importance of developing new text representation methods.