text2arff: A Text Representation Library


Can E., AMASYALI M. F.

24th Signal Processing and Communication Application Conference (SIU), Zonguldak, Turkey, 16 - 19 May 2016, pp.197-200

  • Publication Type: Conference Paper / Full Text
  • DOI Number: 10.1109/siu.2016.7495711
  • City: Zonguldak
  • Country: Turkey
  • Page Numbers: pp.197-200

Abstract

Which features are most important for text classification tasks? Several studies in automatic text categorization seek answers to this question. This paper presents a new version of Text2arff, a library for text representation, together with its new features (word2vec, word trajectories, etc.). The software is now also a Java library that can be used in users' own projects. In the experiments, the library is run on two sample datasets. The results show that the choice of text representation method has a larger effect than the choice of classification method. This result also emphasizes the importance of developing new text representation methods.