¹ Edoardo Daniele Cannas, ² Sriram Baireddy, ¹ Paolo Bestagini
¹ Stefano Tubaro, ² Edward J. Delp
¹ Image and Sound Processing Laboratory, ² Video and Image Processing Laboratory
This is the official code repository for the paper Enhancement Strategies For Copy-Paste Generation & Localization in RGB Satellite Imagery, accepted to the 2023 IEEE International Workshop on Information Forensics and Security (WIFS).
The repository is currently under development, so feel free to open an issue if you encounter any problem.
Example patches: Landsat8 sample, no equalization vs. uniform equalization; Sentinel2A sample, no equalization vs. uniform equalization.
In order to run our code, you need to:
- install conda;
- create the `overhead-norm-strategies` environment using the `environment.yml` file:

```bash
conda env create -f environment.yml
conda activate overhead-norm-strategies
```
You can download the dataset from this link.
The dataset is composed of 2 folders:
- `pristine_images`: contains the raw full resolution products (`pristine_images/full_res_products`) and the `256x256` patches extracted from them (`pristine_images/patches`);
- `spliced_images`: contains the copy-paste images generated from the `pristine_images/patches/test_patches` using the `isplutils/create_spliced_rgb_samples.py` script.
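For reference, this is roughly what a copy-paste (splice) generation step looks like. It is only a minimal sketch of the general idea, not the actual logic of `isplutils/create_spliced_rgb_samples.py`; the `copy_paste_splice` helper, the patch size, and the mask handling are assumptions.

```python
import numpy as np

def copy_paste_splice(donor, host, size=64, rng=None):
    """Paste a random square region from `donor` into `host`.

    Returns the spliced image and the binary localization mask.
    Minimal sketch only; the real generation script may differ.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w, _ = host.shape
    # Random source location in the donor and target location in the host.
    sy, sx = rng.integers(0, h - size), rng.integers(0, w - size)
    ty, tx = rng.integers(0, h - size), rng.integers(0, w - size)

    spliced = host.copy()
    spliced[ty:ty + size, tx:tx + size] = donor[sy:sy + size, sx:sx + size]

    # Ground-truth mask: 1 where the pasted content replaces the host pixels.
    mask = np.zeros((h, w), dtype=np.uint8)
    mask[ty:ty + size, tx:tx + size] = 1
    return spliced, mask
```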
In order to train the model, you first have to divide the dataset into training, validation and test splits.
You can do this by running the `notebook/Training dataset creation.ipynb` notebook.
Please note that these splits and patches are the ones used in the paper, but you can create your own by modifying the notebook.
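If you prefer scripting the splits instead of using the notebook, something along these lines works. This is a minimal sketch: the patch folder, file extension, and 70/15/15 proportions are assumptions, not the configuration used in the paper.

```python
import random
from pathlib import Path

# Hypothetical patch folder and extension; adjust to the actual dataset layout.
patches = sorted(Path("pristine_images/patches").rglob("*.tif"))
random.seed(42)
random.shuffle(patches)

# Assumed 70/15/15 proportions; the paper's notebook defines the actual splits.
n = len(patches)
train = patches[:int(0.7 * n)]
val = patches[int(0.7 * n):int(0.85 * n)]
test = patches[int(0.85 * n):]
print(len(train), len(val), len(test))
```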
If you want to inspect the raw products, a starting point is the Raw satellite products processing notebook.
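As a hedged example of what inspecting a raw product might look like, assuming the products are in a GeoTIFF-like format that rasterio can open (the notebook shows the actual procedure, and the file name below is a placeholder):

```python
import numpy as np
import rasterio

# Hypothetical product path; actual file names depend on the downloaded products.
product_path = "pristine_images/full_res_products/example_product.tif"

with rasterio.open(product_path) as src:
    bands = src.read()  # array of shape (bands, height, width)
    print(src.count, src.width, src.height, src.dtypes[0])

# Quick look: take the first three bands and stretch them to [0, 1] for display.
rgb = bands[:3].astype(np.float32)
rgb = (rgb - rgb.min()) / (rgb.max() - rgb.min() + 1e-8)
```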
All the normalization strategies used in the paper are provided as classes in the `isplutils/data.py` file.
Please note that for the `MinPMax` strategy, we used the `RobustScaler` implementation from `sklearn`.
Statistics are learned from the training set, and then applied to the validation and test sets.
We provide the scalers used in the paper, one for each satellite product, inside the folders of `pristine_images/full_res_products`.
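As a rough illustration of the fit-on-train, apply-everywhere idea behind `MinPMax` (a minimal sketch; the actual class in `isplutils/data.py` may wrap the scaler differently, and the arrays below are stand-ins):

```python
import numpy as np
from sklearn.preprocessing import RobustScaler

# Stand-in pixel arrays of shape (num_pixels, num_bands); real data comes from the patches.
train_pixels = np.random.rand(10000, 3) * 4000
test_pixels = np.random.rand(2000, 3) * 4000

# Fit the scaler on training statistics only...
scaler = RobustScaler().fit(train_pixels)

# ...then apply the same transform to validation and test data.
train_scaled = scaler.transform(train_pixels)
test_scaled = scaler.transform(test_pixels)
```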
The `train_fe.py` script takes care of training the models.
You can find the network definition in the `isplutils/network.py` file.
All the hyperparameters for training are listed in the file.
To replicate the models used in the paper, follow the `train_all.sh` bash script.
The `data/spliced_images` folder contains the two datasets used in the paper (a sketch of both generation strategies follows the list):
- Standard Generated Dataset (SGD): images generated by simply normalizing the dynamics between 0 and 1 using a maximum scaling;
- Histogram Equalized Generated Dataset (HEGD): images generated by equalizing the histogram of the images using a uniform distribution.
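As a hedged illustration of what the two strategies amount to (a minimal sketch only; the images shipped in the dataset were generated by `isplutils/create_spliced_rgb_samples.py`, and the helper names below are assumptions):

```python
import numpy as np
from skimage import exposure

def max_scaling(img):
    """SGD-style dynamics: bring the values into [0, 1] with a maximum scaling."""
    img = img.astype(np.float32)
    return img / (img.max() + 1e-8)

def uniform_equalization(img):
    """HEGD-style dynamics: equalize the histogram towards a uniform distribution."""
    # equalize_hist returns a float image in [0, 1] with an approximately flat histogram.
    return exposure.equalize_hist(img)
```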
Inside each folder, there is a Pandas DataFrame containing info on the images.
Inside the `models` folder, we provide the models presented in the paper (both weights and definitions).
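If the checkpoints are PyTorch state dicts (an assumption, not stated here), loading one would look roughly like this; `FeatureExtractor` and the checkpoint filename are hypothetical placeholders, so check `isplutils/network.py` and the `models` folder for the actual names:

```python
import torch
from isplutils.network import FeatureExtractor  # hypothetical class name

model = FeatureExtractor()  # hypothetical constructor
state = torch.load("models/example_checkpoint.pth", map_location="cpu")
model.load_state_dict(state)
model.eval()
```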
You can replicate our results using the `test_with_AUCs.py` script. Alternatively, you can run the `test_all.sh` bash script.
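For context, a localization AUC of the kind reported in the paper can be computed along these lines (a minimal pixel-level sketch with stand-in data; the actual evaluation lives in `test_with_AUCs.py` and may differ in detail):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Stand-ins for a ground-truth splicing mask and a predicted heatmap.
gt_mask = np.zeros((256, 256), dtype=np.uint8)
gt_mask[64:128, 64:128] = 1
heatmap = np.random.rand(256, 256)

# Pixel-level AUC: flatten the mask and heatmap and score them together.
auc = roc_auc_score(gt_mask.ravel(), heatmap.ravel())
print(f"Pixel-level AUC: {auc:.3f}")
```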
Once you have the results, use the `notebooks/Mean test results plot.ipynb` notebook to plot the results shown in the paper.