Turkish synonym identification from multiple resources: monolingual corpus, mono/bilingual online dictionaries, and WordNet


Yildiz T., DİRİ B., Yildirim S.

TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES, vol.25, pp.752-760, 2017

  • Volume: 25 Issue: 2
  • Publication Date: 2017
  • DOI: 10.3906/elk-1508-89
  • Journal Name: TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES
  • Pages: pp.752-760

Abstract

This study proposes a model that determines synonymy by combining several resources. The model extracts features from monolingual online dictionaries, a bilingual online dictionary, WordNet, and a monolingual Turkish corpus. Once it has built a candidate list for a given word, it uses those features to decide which candidates are synonyms. Each of these resources and approaches is evaluated. Using all features together with machine learning algorithms, the model achieves an F-measure of 81.4%. The study contributes to the literature by integrating several resources and by making the first attempt at a corpus-driven synonym detection system for Turkish.