🔥 Remote Sensing Spatio-Temporal Vision-Language Models: A Comprehensive Survey: [Paper] [Github]
- LEVIR-CC: A Large Dataset for Remote Sensing Image Change Captioning. [Paper] [code]
- LEVIR-MCI: A Large Dataset for exploring multi-task learning for change detection and change captioning. [Paper] [code]
Download Source:
- [🤗 Huggingface]
- [Google Drive]
- [Baidu Pan (code:nq9y)]
The path list in the downloaded folder is as follows:
path to LEVIR_CC_dataset:
    ├─LevirCCcaptions.json
    ├─images
         ├─train
         │  ├─A
         │  ├─B
         ├─val
         │  ├─A
         │  ├─B
         ├─test
         │  ├─A
         │  ├─B
where A contains the pre-phase (earlier) images and B contains the post-phase (later) images.
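The layout above can be walked with a few lines of Python. This is a minimal sketch, not part of the official tooling: the helper name `list_image_pairs` is hypothetical, and it assumes that the two images of a pair share the same file name in `A` and `B`, which should be checked against the downloaded data.

```python
from pathlib import Path


def list_image_pairs(dataset_root, split):
    """Return (pre, post) path pairs for one split ("train", "val" or "test")."""
    split_dir = Path(dataset_root) / "images" / split
    pre_dir, post_dir = split_dir / "A", split_dir / "B"
    pairs = []
    for pre_path in sorted(pre_dir.iterdir()):
        # Assumption: the post-phase image has the same file name as the pre-phase one.
        post_path = post_dir / pre_path.name
        if pre_path.is_file() and post_path.is_file():
            pairs.append((pre_path, post_path))
    return pairs
```

For example, `list_image_pairs("path/to/LEVIR_CC_dataset", "train")` would return the training pairs; files that exist in only one of the two folders are skipped.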
- Dataset mean and variance:
Folder | R_mean | G_mean | B_mean | R_var | G_var | B_var |
---|---|---|---|---|---|---|
A | 0.44152 | 0.43863 | 0.37418 | 0.17623 | 0.16578 | 0.15337 |
B | 0.33992 | 0.33383 | 0.28561 | 0.13035 | 0.12678 | 0.11959 |
ALL | 0.39073 | 0.38623 | 0.32989 | 0.15329 | 0.14628 | 0.13648 |
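
These statistics are typically used to standardize inputs before training. The sketch below is an illustration only: `STATS` and `normalize_pixel` are hypothetical names, pixel values are assumed to be scaled to [0, 1], and the `*_var` columns are taken at face value as variances (if they are in fact standard deviations, divide by the value directly instead of its square root).

```python
import math

# Per-channel (R, G, B) statistics copied from the table above.
STATS = {
    "A":   {"mean": (0.44152, 0.43863, 0.37418), "var": (0.17623, 0.16578, 0.15337)},
    "B":   {"mean": (0.33992, 0.33383, 0.28561), "var": (0.13035, 0.12678, 0.11959)},
    "ALL": {"mean": (0.39073, 0.38623, 0.32989), "var": (0.15329, 0.14628, 0.13648)},
}


def normalize_pixel(rgb, folder="ALL"):
    """Standardize one (R, G, B) pixel in [0, 1] as (x - mean) / sqrt(var)."""
    stats = STATS[folder]
    return tuple(
        (x - m) / math.sqrt(v)  # assumption: the table's "var" is a true variance
        for x, m, v in zip(rgb, stats["mean"], stats["var"])
    )
```

Whether to use the per-folder rows (`A`, `B`) or the combined `ALL` row is a design choice; the combined row keeps both images of a pair on the same scale.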
- The LEVIR-CC dataset contains 10,077 pairs of bi-temporal remote sensing images and 50,385 sentences (five per pair) describing the differences between the two images.
- The LEVIR-MCI dataset contains bi-temporal images together with diverse change detection masks and descriptive sentences. It provides a data foundation for exploring multi-task learning for change detection and change captioning.
- Download link: https://github.com/Chen-Yang-Liu/Change-Agent