32nd IEEE Conference on Signal Processing and Communications Applications, SIU 2024, Mersin, Türkiye, 15 - 18 May 2024
This paper presents a performance analysis of vision transformer-based UNet models for semantic and instance segmentation of cell nuclei in colon histology images. The study analyzes the semantic and instance segmentation performance of the TransUNet and Swin-Unet architectures, both of which incorporate vision transformer structures, and compares them with the classical UNet model based on Convolutional Neural Networks (CNNs). The experiments use the Colon Nuclei Identification and Counting (CoNIC) Challenge 2022 dataset, a challenging dataset characterized by high class imbalance. Performance on the semantic segmentation task was evaluated using pixel accuracy, precision, recall, F1-measure, Dice Similarity Coefficient (DSC) and Intersection over Union (IoU), while performance on the instance segmentation task was evaluated using the panoptic quality (PQ) metric. The experimental results on the CoNIC Challenge 2022 dataset show that the vision transformer-based UNet models are weaker than the CNN-based UNet model at extracting spatial detail, and that the classical CNN-based UNet achieves higher cell nuclei segmentation performance than both TransUNet and Swin-Unet.
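To make two of the semantic segmentation metrics referenced above concrete, the following is a minimal sketch (not the authors' evaluation code) of computing the Dice Similarity Coefficient and Intersection over Union for a pair of binary nucleus masks with NumPy; the function names, smoothing term, and toy masks are illustrative assumptions.

# Minimal sketch (assumed, not the paper's evaluation code) of DSC and IoU
# computed on binary segmentation masks.
import numpy as np

def dice_coefficient(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    # Dice Similarity Coefficient: 2*|P ∩ T| / (|P| + |T|)
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return float((2.0 * intersection + eps) / (pred.sum() + target.sum() + eps))

def iou_score(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    # Intersection over Union: |P ∩ T| / |P ∪ T|
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return float((intersection + eps) / (union + eps))

if __name__ == "__main__":
    # Toy example: two partially overlapping 4x4 nucleus masks.
    pred = np.array([[0, 1, 1, 0],
                     [0, 1, 1, 0],
                     [0, 0, 0, 0],
                     [0, 0, 0, 0]])
    target = np.array([[0, 0, 1, 1],
                       [0, 0, 1, 1],
                       [0, 0, 0, 0],
                       [0, 0, 0, 0]])
    print(f"DSC = {dice_coefficient(pred, target):.3f}")  # 0.500
    print(f"IoU = {iou_score(pred, target):.3f}")         # 0.333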