Cancer Stage Discovery with StyleGAN3, Swin Transformer, and Multimodal LLM
(Turkish title: StyleGAN3, Swin Transformer ve Multimodal LLM ile Kanser Evrelerinin Keşfi)


Dede R., BİLGİN G.

33rd IEEE Conference on Signal Processing and Communications Applications, SIU 2025, İstanbul, Türkiye, 25-28 June 2025 (Full-Text Paper)

  • Publication Type: Conference Paper / Full-Text Paper
  • DOI: 10.1109/siu66497.2025.11111845
  • City of Publication: İstanbul
  • Country of Publication: Türkiye
  • Keywords: Breast cancer staging, Multimodal Large Language Models (MLLM), StyleGAN3, Swin Transformer
  • Affiliated with Yıldız Technical University: Yes

Abstract

Breast cancer staging is crucial for understanding disease progression and identifying new sub-stages. In this study, patient-specific synthetic histopathological images were generated using StyleGAN3 and Swin Transformer, and a Qwen2-VL-based multimodal large language model (MLLM) was fine-tuned to predict cancer stages and discover new ones. The GAN-generated images were labeled only with cancer stage information and used to fine-tune the model for classification. Out-of-distribution (OOD) analysis was applied to the model outputs: logit values were analyzed to compute confidence scores and identify candidates for potential new stages. The results indicate that GAN-based data augmentation and multimodal models enhance the potential for discovering previously undefined cancer stages.
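
The abstract does not say how confidence scores are derived from the logits. As an illustration only, the sketch below assumes the common maximum-softmax-probability score and a fixed threshold; the function names, the threshold of 0.5, and the example logits are all hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def confidence_score(logits):
    """Assumed score: the maximum softmax probability over the known stages."""
    return max(softmax(logits))


def flag_new_stage_candidates(batch_logits, threshold=0.5):
    """Return indices of samples whose confidence falls below the threshold.

    Low confidence on every known stage is treated here as a sign that the
    sample may be out-of-distribution. The threshold is a hypothetical value.
    """
    return [
        i
        for i, logits in enumerate(batch_logits)
        if confidence_score(logits) < threshold
    ]


# Made-up logits over four known stages, for illustration only.
batch = [
    [6.0, 0.5, 0.2, 0.1],  # sharply peaked: confidently assigned to one stage
    [1.1, 1.0, 0.9, 1.0],  # nearly flat: no known stage fits well
]
candidates = flag_new_stage_candidates(batch)
print(candidates)
```

Here the second sample is the only one flagged, because its logits are almost uniform across the known stages. Other scores, such as an energy score computed from the same logits, could be substituted without changing the overall structure.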