Introducing MOSAIC-SEN2-CC: A Multispectral Dataset and Adaptation Framework for Remote Sensing Change Captioning


Tuzlupinar B., Ozelbas E., AMASYALI M. F., KARACA A. C.

IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2025 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Publication Date: 2025
  • DOI: 10.1109/jstars.2025.3615113
  • Journal Name: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Aerospace Database, Aquatic Science & Fisheries Abstracts (ASFA), Compendex, Geobase, INSPEC, Directory of Open Access Journals, Civil Engineering Abstracts
  • Keywords: change captioning, multispectral change captioning, remote sensing images
  • Yıldız Technical University Affiliated: Yes

Abstract

Remote Sensing Image Change Captioning (RSICC) aims to generate descriptive sentences that characterize the changes between bi-temporal images. Although state-of-the-art methods focus on predicting captions from RGB image pairs, change captioning in multispectral images has not yet been investigated. To address this gap, we created a new dataset, MOSAIC-SEN2-CC, which contains 5,232 pairs of multispectral (MS) images captured by Sentinel-2 satellites over a 12-month period, together with 26,160 change captions. The dataset covers eight categories: Wildfire (WF), Flood (FL), Wetland (WET), Green Field (GF), Glacier (GL), Urban (UR), and Agriculture (AG), along with a No-Change (NO) category. In this paper, we propose a Multispectral Image Change Captioning (MSICC) framework that consists of BigEarthNet Feature Extractor, Feature Enhancement, and Transformer-Based Decoder modules to make effective use of spectral band information. Specifically, state-of-the-art methods such as RSICCformer, Chg2Cap, and PSNet are adapted to work with BigEarthNet models using 10-spectral-band images. Detailed comparisons, including attention visualizations, RGB versus MS trade-offs, change captions, and performance metrics, further demonstrate the framework's effectiveness and its ability to address RSICC challenges. We will make our dataset and codebase publicly available to facilitate future research at https://github.com/ChangeCapsInRS/MOSAIC-SEN2-CC.