MS-TR: A Morphologically enriched sentiment Treebank and recursive deep models for compositional semantics in Turkish

Zeybek, Sultan; Koc, Ebubekir; Secer, Aydın

doi:10.1080/23311916.2021.1893621

Zeybek S., Koc E., Secer A.

COGENT ENGINEERING, vol. 8, no. 1, 2021 (ESCI)

Publication Type: Article / Full Article
Volume: 8 Issue: 1
Publication Date: 2021
DOI: 10.1080/23311916.2021.1893621
Journal Name: COGENT ENGINEERING
Journal Indexes: Emerging Sources Citation Index (ESCI), Scopus
Keywords: Recursive neural networks, sentiment analysis, sentiment treebank, opinion mining, morphologically rich languages
Yıldız Technical University Affiliated: Yes

Abstract

Recursive Deep Models are powerful models for learning compositional representations of text in many natural language processing tasks. However, they require structured input (i.e., a sentiment treebank) that encodes each sentence as a tree, so that the models can learn the latent semantics of words through recursive composition functions. In this paper, we present our contributions to the construction of a Turkish sentiment treebank. We introduce MS-TR, a Morphologically Enriched Sentiment Treebank, built for training Recursive Deep Models on compositional sentiment analysis for Turkish, a well-known morphologically rich language (MRL). We propose a semi-supervised automatic annotation method, a distant-supervision approach that uses the morphological features of words to infer the polarity of the inner nodes of MS-TR as positive or negative. The proposed annotation model has four annotation levels: morph-level, stem-level, token-level, and review-level. The contribution of each annotation level was tested on datasets from three different domains: product reviews, movie reviews, and the Turkish Natural Corpus essays. Comparative results were obtained with the Recursive Neural Tensor Network (RNTN) model operating over MS-TR and with conventional machine learning methods. In the experiments, RNTN achieved considerably higher accuracy than the baseline methods, which cannot accurately capture aggregated sentiment information.
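The abstract does not spell out the composition function. As a rough illustration only, the sketch below follows the commonly cited RNTN formulation, in which a parent node vector is computed from its two children by a bilinear tensor term plus a linear term, passed through tanh. The function name rntn_compose, the omission of a bias term, and the plain-list representation are assumptions made for this example and are not taken from the paper.

```python
import math

def rntn_compose(a, b, V, W):
    """Compose two child vectors (each of length d) into one parent vector.

    Assumed shapes (illustrative, not from the paper):
      V: d slices, each a (2d x 2d) matrix, together forming the tensor
      W: a (d x 2d) matrix for the standard linear term
    Each output unit k is tanh(c^T V[k] c + W[k] . c), where c = [a; b].
    """
    c = list(a) + list(b)          # concatenate the children
    n = len(c)
    parent = []
    for V_k, W_k in zip(V, W):
        # Bilinear interaction between every pair of child components
        bilinear = sum(c[i] * V_k[i][j] * c[j]
                       for i in range(n) for j in range(n))
        # Standard recursive-network linear term
        linear = sum(W_k[j] * c[j] for j in range(n))
        parent.append(math.tanh(bilinear + linear))
    return parent
```

Applied bottom-up over a binarized tree, the same function would produce a vector for every inner node, and a classifier on top of each vector could then predict that node's polarity. With V set to all zeros the function reduces to a plain recursive neural network layer.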