FEATURE EXTRACTION FROM PATIENT DATA FOR DIAGNOSING PEDIATRIC PATIENTS


Altunsaçan G., Aslanyürek B., Aydın E.

International Congress of New Searches in Sciences, İstanbul, Turkey, 8-9 June 2024, pp.37

  • Publication Type: Conference Paper / Summary Text
  • City: İstanbul
  • Country: Turkey
  • Page Numbers: pp.37
  • Yıldız Technical University Affiliated: Yes

Abstract

Machine learning is widely used in healthcare to improve disease diagnosis and treatment and to support the work of healthcare professionals. Some studies rely on structured data, while others use unstructured data such as images, videos, and text. The large volume of unstructured data accumulated in electronic health records is a significant resource for advanced clinical research and for developing AI-based diagnostic and treatment systems. Symptoms, which are among the most critical indicators in disease diagnosis, are found within unstructured texts in the record systems of healthcare institutions, so extracting symptom-related features from medical texts is crucial for developing such systems. This study aims to extract symptoms related to pediatric diseases from Turkish medical texts using natural language processing (NLP) techniques. In the first stage, the text data were labeled with the assistance of an expert and then preprocessed through cleaning, tokenization, normalization, and stemming. In the second stage, the preprocessed texts were converted into numerical vectors using Term Frequency-Inverse Document Frequency (TF-IDF), One-Hot Encoding, and Bag of Words representations. In the final stage, these vectors were used to predict the symptoms mentioned in the medical texts with machine learning methods including logistic regression, k-nearest neighbors, decision trees, support vector machines, ensemble methods, and artificial neural networks. With the NLP approach developed in this study, the extracted symptoms can be used directly as features in studies aimed at diagnosing pediatric diseases.
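
The three stages described in the abstract (preprocessing, vectorization, prediction) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the example notes, the labels "fever" and "cough", the function names, the smoothed IDF formula, and the single-label framing are all assumptions made for illustration. Stemming is omitted, so inflected Turkish forms would not match here, and k-nearest neighbors stands in for the wider set of classifiers named in the abstract.

```python
import math
import re
from collections import Counter

# Hypothetical toy notes with symptom labels (NOT data from the study).
TRAIN = [
    ("Hastada yüksek ateş ve titreme var", "fever"),
    ("Ateş üç gündür devam ediyor", "fever"),
    ("Kuru öksürük ve balgam şikayeti", "cough"),
    ("Gece artan öksürük mevcut", "cough"),
]


def tokenize(text):
    """Lowercase and split into word tokens (crude stand-in for cleaning and normalization)."""
    return re.findall(r"\w+", text.lower())


def fit_idf(token_lists):
    """Smoothed inverse document frequency: log((1 + n) / (1 + df)) + 1 (an assumed variant)."""
    n = len(token_lists)
    df = Counter(tok for toks in token_lists for tok in set(toks))
    return {tok: math.log((1 + n) / (1 + d)) + 1.0 for tok, d in df.items()}


def tfidf(tokens, idf):
    """Bag-of-words counts weighted by IDF, then L2-normalized. Unknown tokens are dropped."""
    counts = Counter(tokens)  # the bag-of-words representation
    vec = {t: c * idf[t] for t, c in counts.items() if t in idf}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {t: v / norm for t, v in vec.items()} if norm else {}


def cosine(a, b):
    """Dot product of two sparse vectors; equals cosine similarity when both are unit length."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def predict(text, train_vecs, idf, k=1):
    """k-nearest-neighbors vote over the training vectors, ranked by cosine similarity."""
    query = tfidf(tokenize(text), idf)
    ranked = sorted(train_vecs, key=lambda pair: cosine(query, pair[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


token_lists = [tokenize(text) for text, _ in TRAIN]
idf = fit_idf(token_lists)
train_vecs = [(tfidf(toks, idf), label) for toks, (_, label) in zip(token_lists, TRAIN)]

print(predict("Çocukta ateş var", train_vecs, idf))
print(predict("öksürük şikayeti", train_vecs, idf))
```

Sparse dictionaries keep the sketch dependency-free. In practice, a library vectorizer and classifier, together with a Turkish-aware stemmer, would likely replace these hand-written functions.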