Cancer Stage Discovery with StyleGAN3, Swin Transformer, and Multimodal LLM (Turkish title: StyleGAN3, Swin Transformer ve Multimodal LLM ile Kanser Evrelerinin Keşfi)


Dede R., BİLGİN G.

33rd IEEE Conference on Signal Processing and Communications Applications, SIU 2025, İstanbul, Turkey, 25 - 28 June 2025, (Full Text)

  • Publication Type: Conference Paper / Full Text
  • DOI Number: 10.1109/siu66497.2025.11111845
  • City: İstanbul
  • Country: Turkey
  • Keywords: Breast cancer staging, Multimodal Large Language Models (MLLM), StyleGAN3, Swin Transformer
  • Yıldız Technical University Affiliated: Yes

Abstract

Breast cancer staging is crucial for understanding disease progression and identifying new sub-stages. In this study, patient-specific synthetic histopathological images were generated using StyleGAN3 and Swin Transformer, and a Qwen2-VL-based multimodal large language model (MLLM) was fine-tuned to predict cancer stages and discover new ones. The GAN-generated images were labeled only with cancer stage information, and the MLLM was fine-tuned on them for classification. Out-of-Distribution (OOD) analysis was then applied to the model outputs: logit values were analyzed to compute confidence scores and to identify potential new stage candidates. The results indicate that GAN-based data augmentation and multimodal models enhance the potential for discovering previously undefined cancer stages.
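
The abstract does not state which logit-based confidence score was used. As a minimal sketch, assuming the maximum softmax probability (a common OOD baseline, not necessarily the authors' choice), samples whose confidence falls below a cutoff could be flagged as candidates for an undefined stage. The function names, the `threshold` value, and the example logits below are all hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def flag_ood(batch_logits, threshold=0.6):
    """Return (predicted_class, confidence, is_candidate) per sample.

    Confidence is the maximum softmax probability. `threshold` is a
    hypothetical cutoff for illustration, not a value from the study.
    """
    results = []
    for logits in batch_logits:
        probs = softmax(logits)
        conf = max(probs)
        pred = probs.index(conf)
        # Low confidence means no known stage fits well: mark as a candidate.
        results.append((pred, conf, conf < threshold))
    return results

# Made-up logits over four stage classes (illustration only).
batch = [
    [6.0, 0.5, 0.2, 0.1],  # one class dominates: treated as in-distribution
    [1.1, 1.0, 0.9, 1.0],  # nearly flat: flagged as a possible new stage
]
results = flag_ood(batch)
for pred, conf, is_candidate in results:
    print(pred, round(conf, 3), is_candidate)
```

In practice the cutoff would be calibrated on held-out data with known stages, and other logit-based scores could be substituted for the maximum softmax probability without changing the overall flow.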