8th International Artificial Intelligence and Data Processing Symposium, IDAP 2024, Malatya, Türkiye, 21-22 September 2024
This study examines the performance of four large language models (Llama 2, Llama 3, and Mistral-based) in written doctor-patient communication for Turkish health counseling. The models were trained and fine-tuned on a patient-doctor question-answer dataset [1]. Performance was evaluated using ROUGE, Elo rating, winning percentage, and expert evaluation. The comparative analysis indicates that the SambaLingo-Turkish-Chat model performed well in terms of response accuracy and contextual relevance, while the Trendyol-LLM-7b-chat-v1.8 model proved more successful when the ethical aspects of the task were considered [14], [17]. The study demonstrates the potential of AI-powered virtual doctor assistants in Turkish healthcare services and contributes to the development of Turkish-specific medical chatbots.
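To illustrate how two of the metrics named above, Elo rating and winning percentage, can be derived from pairwise comparisons of model answers, the following is a minimal sketch. It uses the standard Elo update rule; the starting rating of 1000, the K-factor of 32, the function names, and the placeholder model names and match outcomes are all assumptions made for illustration and are not taken from the study.

```python
from typing import Dict, List, Tuple

Match = Tuple[str, str]  # (winner, loser) of one pairwise comparison


def expected_score(r_a: float, r_b: float) -> float:
    """Standard Elo expected score of a player rated r_a against r_b."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def run_elo(matches: List[Match], start: float = 1000.0,
            k: float = 32.0) -> Dict[str, float]:
    """Process matches in order and return final ratings.

    `start` and `k` are assumed values, not parameters from the study.
    """
    ratings: Dict[str, float] = {}
    for winner, loser in matches:
        ratings.setdefault(winner, start)
        ratings.setdefault(loser, start)
        # The winner gains exactly what the loser gives up.
        delta = k * (1.0 - expected_score(ratings[winner], ratings[loser]))
        ratings[winner] += delta
        ratings[loser] -= delta
    return ratings


def win_percentage(matches: List[Match], model: str) -> float:
    """Share of a model's comparisons that it won, in percent."""
    played = [m for m in matches if model in m]
    if not played:
        return 0.0
    wins = sum(1 for winner, _ in played if winner == model)
    return 100.0 * wins / len(played)


# Hypothetical outcomes between placeholder models (not the study's data).
example_matches: List[Match] = [
    ("model_a", "model_b"),
    ("model_a", "model_c"),
    ("model_b", "model_c"),
]
example_ratings = run_elo(example_matches)
```

Because each update moves the same number of points from loser to winner, the sum of all ratings stays constant, which makes Elo scores comparable across models evaluated in the same pool. Note that the result depends on the order in which matches are processed.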