33rd IEEE Conference on Signal Processing and Communications Applications, SIU 2025, İstanbul, Türkiye, 25 - 28 June 2025 (Full-Text Paper)
In the remote sensing change captioning task, the change region is detected by comparing images of the same area acquired at different times, and the change is described in natural language. With this motivation, this study adapts BLIP (Bootstrapping Language-Image Pre-training), a general-purpose vision-language model, to the change captioning task in remote sensing. To perform change captioning with BLIP, the model was modified, different parameter settings were tested, and the training process was monitored. Several methods were explored for extracting change features from the image pairs. The models obtained from these experiments were evaluated, the most suitable methods were selected, and meaningful sentences describing the changes between RS images were generated. In this context, results comparable to those of RSICCformer, a study considered a benchmark in the field, were obtained.
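The abstract mentions extracting change features from bitemporal image pairs before caption generation but does not specify the mechanism. A minimal sketch of one common fusion strategy (assumed here for illustration, not confirmed by the paper) is to combine the two per-image embeddings with their element-wise difference, which makes the change signal explicit, and project the result back to the decoder's hidden size. The module and dimension names below are hypothetical.

```python
import torch
import torch.nn as nn

class ChangeFeatureFusion(nn.Module):
    """Hypothetical fusion of bitemporal image embeddings for change
    captioning. Concatenates [before, after, difference] features and
    projects them back to the original hidden size so they can be fed
    to a caption decoder such as BLIP's text decoder."""

    def __init__(self, dim: int = 768):
        super().__init__()
        # project the concatenated triple back to the decoder dimension
        self.proj = nn.Linear(3 * dim, dim)

    def forward(self, feat_t1: torch.Tensor, feat_t2: torch.Tensor) -> torch.Tensor:
        diff = feat_t2 - feat_t1                       # explicit change signal
        fused = torch.cat([feat_t1, feat_t2, diff], dim=-1)
        return self.proj(fused)                        # (batch, tokens, dim)

# usage with dummy ViT-style patch embeddings (197 tokens, 768-dim)
fusion = ChangeFeatureFusion(dim=768)
f1 = torch.randn(2, 197, 768)   # image at time t1
f2 = torch.randn(2, 197, 768)   # image at time t2
out = fusion(f1, f2)
print(tuple(out.shape))  # (2, 197, 768)
```

The subtraction term is what distinguishes this from simple feature concatenation: it gives the decoder a representation in which unchanged regions are close to zero, a design choice used in several change-captioning architectures.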