PyTorch/FAISS implementation and pretrained models for the ICLR 2023 paper. For details, see Active Image Indexing.
[Webpage] [arXiv] [OpenReview]
Goal: given a query image, possibly a transformed copy of a released image, retrieve the original image in a large-scale database.
Applications: IP protection, de-duplication, moderation, etc.
- The feature extractor that maps images to representation vectors is not completely robust to image transformations.
- For large-scale databases, brute-force search is not possible.

$\rightarrow$ We resort to approximate search with index structures, which is another source of error. This makes the copy detection task very challenging at scale.
Idea: change images before their release to make them more indexing-friendly.
The main code for understanding the activation process is in `activeindex/engine.py`, in the `activate_images` function.
Its three main inputs are:
- the images to be activated (a batch of images of shape 3×H×W),
- the index for which the images need to be activated,
- the model used to extract features.
The algorithm is as follows:
1. Initialize:
   - distortion δ: a small perturbation added to the images to move their features (this is what gets optimized),
   - targets: the points that the features of the activated images should be pushed closer to,
   - heatmaps: activation heatmaps that tell where to add the distortion (textured areas).
2. Optimize:
```
for i in range(iterations):
    a. apply perceptual constraints to δ:       δ -> δ'
    b. add δ' to the original images:           img_o + δ' -> img
    c. extract features from the images:        model(img) -> ft
    d. compute the loss between ft and target:  ||ft - target|| -> L
    e. compute the gradient of L w.r.t. δ:      ∇L(δ) -> ∇L
    f. update δ with ∇L:                        δ - lr * ∇L -> δ
return img_o + δ'
```
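For illustration, here is a minimal PyTorch sketch of this loop. It is not the repo's exact code: the perceptual constraint is reduced to a heatmap-weighted `tanh`, and the names (`imgs_o`, `targets`, `heatmaps`) are placeholders for the actual inputs of `activate_images`.

```python
import torch

def activate(imgs_o, targets, model, heatmaps, iterations=10, lr=1.0):
    # distortion δ, optimized to push the features towards the targets
    delta = torch.zeros_like(imgs_o, requires_grad=True)
    optimizer = torch.optim.Adam([delta], lr=lr)
    for _ in range(iterations):
        # a. perceptual constraint (simplified here): bound δ and
        #    restrict it to textured areas via the heatmaps
        delta_p = heatmaps * torch.tanh(delta)
        # b. add the constrained distortion to the original images
        imgs = imgs_o + delta_p
        # c. extract features
        fts = model(imgs)
        # d. loss: distance between the current features and the targets
        loss = torch.norm(fts - targets, dim=-1).mean()
        # e./f. gradient step on δ
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return (imgs_o + heatmaps * torch.tanh(delta)).detach()
```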
First, clone the repository locally and move inside the folder:
```
git clone https://github.com/facebookresearch/active_indexing.git
cd active_indexing
```
To install the main dependencies, we recommend using conda. PyTorch and FAISS can be installed with:
```
conda install -c pytorch torchvision pytorch==1.11.0 cudatoolkit=11.3
conda install -c conda-forge faiss-gpu==1.7.2
```
Then, install the remaining dependencies with:
```
pip install -r requirements.txt
```
This codebase has been developed with Python 3.8, PyTorch 1.11.0, CUDA 11.3, and FAISS 1.7.2.
Experiments are done on DISC21. It is available for download at https://ai.facebook.com/datasets/disc21-dataset/.
The dataset is composed of:
- 1M training images
- 1M reference images
- 50k query images, 10k of which come from the reference set.
We assume the dataset has been organized as follows:
```
DISC21
├── train
│   ├── T000000.jpg
│   ├── ...
│   └── T999999.jpg
├── references
│   ├── R000000.jpg
│   ├── ...
│   └── R999999.jpg
└── dev_queries_groundtruth.csv
```
We then provide a script that uses the ground-truth file to extract the 10k reference images used as queries in the dev set:
```
python prepare_disc --data_path path/to/DISC21 --output_dir path/to/DISC21
```
This should create new folders in the `output_dir` (note that only symlinks are created; the images are not duplicated):
- `references_10k`: the 10k reference images used as queries in the dev set,
- `references_990k`: the remaining 990k reference images,
- `queries_40k`: 40k additional query images that are not in the reference set (contrary to the paper, we take images from the DISC training set instead of the original images of the query dev set before augmentation, for legal convenience).
We provide links to some models that can be used as feature extractors:

| Name | Trunk | Dimension | TorchScript |
|---|---|---|---|
| sscd_disc_advanced | ResNet-50 | 512 | link |
| sscd_disc_mixup | ResNet-50 | 512 | link |
| sscd_disc_large | ResNeXt101 | 1024 | link |
| dino_r50 | ResNet-50 | 2048 | link |
| dino_vits | ViT-S | 384 | link |
| isc_dt1 | EffNetv2 | 256 | link |
These are standalone TorchScript models that can be used in any PyTorch project without any of the code defining the networks. (We are not the authors of these models; we only provide them for convenience.)
For example, to use the `sscd_disc_advanced` model:
```
mkdir -p models
wget https://dl.fbaipublicfiles.com/sscd-copy-detection/sscd_disc_advanced.torchscript.pt -O models/sscd_disc_advanced.torchscript.pt
```
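Once downloaded, the model loads with `torch.jit.load` like any TorchScript module. A minimal usage sketch, where the 288-pixel resize and the ImageNet normalization statistics are assumptions (check the SSCD repository for the exact preprocessing):

```python
import torch
from PIL import Image
from torchvision import transforms

model = torch.jit.load("models/sscd_disc_advanced.torchscript.pt").eval()

preprocess = transforms.Compose([
    transforms.Resize((288, 288)),   # assumed input size
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],  # ImageNet stats (assumed)
                         std=[0.229, 0.224, 0.225]),
])

img = preprocess(Image.open("image.jpg").convert("RGB")).unsqueeze(0)
with torch.no_grad():
    ft = model(img)   # 512-dimensional descriptor for sscd_disc_advanced
```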
Other links:
- SSCD: https://github.com/facebookresearch/sscd-copy-detection/
- DINO: https://github.com/facebookresearch/dino
- ISC-dt1: https://github.com/lyakaap/ISC21-Descriptor-Track-1st
We provide a simple script to extract features from a given model and a given image folder. The features are extracted from the last layer of the model.
```
python extract_fts --model_name torchscript --model_path path/to/model --data_dir path/to/folder --output_dir path/to/output
```
This will save the following in the `--output_dir` folder:
- `fts.pt`: the features, in a torch file,
- `filenames.txt`: a file containing the list of filenames corresponding to the features.
Images are resized before feature extraction; the size is controlled by the `--resize_size` argument.
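The two output files can be loaded back together, e.g. (a small sketch assuming the layout described above):

```python
import torch

fts = torch.load("path/to/output/fts.pt")            # tensor of shape (N, D)
with open("path/to/output/filenames.txt") as f:
    filenames = f.read().splitlines()                # row i of fts <-> filenames[i]
assert len(filenames) == fts.shape[0]
```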
To make things faster, the rest of the code assumes that the features of the `DISC21/train` and `DISC21/references_990k` image folders are pre-computed and saved in new folders.
To reproduce the results of the paper for `IVF4096,PQ8x8`, use the following command:
```
python -m activeindex.main --model_name torchscript --model_path path/to/model \
    --idx_factory IVF4096,PQ8x8 --idx_dir indexes \
    --fts_training_path path/to/train/fts.pth --fts_reference_path path/to/ref/fts.pth \
    --data_dir path/to/DISC21/references_10k --query_nonmatch_dir path/to/DISC21/queries_40k \
    --active True --output_dir output_active
```
Replace the last line with `--active False --output_dir output_passive --save_imgs False` to run the same experiment with passive (non-activated) images.
This should create:
- `indexes/idx=IVF4096,PQ8x8_quant=L2.index`: the index, created and trained with the features of `fts_training_path`,
- `output_active/` or `output_passive/`: folder containing the results of the experiment,
- `output_active/imgs`: folder containing the activated images (only if `--save_imgs True`),
- `output_active/retr_df.csv`: a csv file containing the results of the retrieval experiment (see below for more details),
- `output_active/icd_df.csv`: a csv file containing the results of the image copy detection experiment (see below for more details).
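For reference, here is roughly what the `IVF4096,PQ8x8` factory string builds in Faiss and how the index file above could be produced; an illustrative sketch with random stand-in features, not the repo's code:

```python
import faiss
import numpy as np

d = 512                                              # feature dimension
index = faiss.index_factory(d, "IVF4096,PQ8x8", faiss.METRIC_L2)

xt = np.random.rand(200_000, d).astype("float32")    # stand-in training features
index.train(xt)                                      # learn IVF centroids + PQ codebooks
index.add(xt)                                        # fill the index

distances, indices = index.search(xt[:5], k=100)     # 100 nearest neighbors per query
faiss.write_index(index, "indexes/idx=IVF4096,PQ8x8_quant=L2.index")
```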
Useful arguments:
| Argument | Default | Description |
|---|---|---|
| `output_dir` | `output/` | Path to the output folder where images and logs will be saved. |
| `idx_dir` | `indexes/` | Path to the folder containing the index files (some of them can take long to create/train, so saving them is useful). |
| `idx_factory` | `IVF4096,PQ8x8` | Index string used to build the index. See the Faiss documentation for more details. |
| `kneighbors` | `100` | Number of neighbors to retrieve when evaluating. |
| `model_name` | `torchscript` | Type of model to use. You can alternatively use Torchvision or Timm models. |
| `model_path` | `None` | Path to the torch file containing the model. |
| `fts_training_path` | `None` | Path to the torch file containing the features of the training set. |
| `fts_reference_path` | `None` | Path to the torch file containing the features of the reference set. |
| `save_imgs` | `True` | Saves the images during the active indexing process. Useful to visualize the images, but slower and takes more disk space. |
| `active` | `True` | If True, uses active indexing. If False, uses passive indexing. |
`retr_df.csv`: retrieval results. For every augmented version of an image it stores:

| batch | image_index | attack | attack_param | retrieved_distances | retrieved_indices |
|---|---|---|---|---|---|
| batch number | image number in the reference set | attack used | attack parameter | distances of the retrieved images (in feature space) | indices of the retrieved images |

as well as the associated retrieval metrics:

| rank | r@1 | r@10 | r@100 | ap |
|---|---|---|---|---|
| rank of the original image among the retrieved images | recall at 1 | recall at 10 | recall at 100 | average precision |
`icd_df.csv`: image copy detection results. For every augmented version of an image it stores:

| batch | image_index | attack | attack_param | retrieved_distances | retrieved_indices |
|---|---|---|---|---|---|
| batch number | image number in the reference set | attack used | attack parameter | distances of the retrieved images (in feature space) | indices of the retrieved images |
To compute the precision-recall curve, you can use the associated code in the notebook `analysis.ipynb`.
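If you just want aggregate numbers without the notebook, the csv can be summarized with pandas; a sketch assuming the columns listed above:

```python
import pandas as pd

df = pd.read_csv("output_active/retr_df.csv")
# mean recall@k and average precision for each transformation of the queries
print(df.groupby(["attack", "attack_param"])[["r@1", "r@10", "r@100", "ap"]].mean())
```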
- The `--model_path` argument should be the same as the one used to extract the features.
- The overlay-onto-screenshot transform (from Augly) used in the paper is the mobile version (Augly's default is web). To change it, locate the file `augly/utils/base_paths.py` (run `pip show augly` to find where the Augly library is installed), then change the line `TEMPLATE_PATH = os.path.join(SCREENSHOT_TEMPLATES_DIR, "web.png")` to `TEMPLATE_PATH = os.path.join(SCREENSHOT_TEMPLATES_DIR, "mobile.png")`.
active_indexing is CC-BY-NC licensed, as found in the LICENSE file.
If you find this repository useful, please consider giving a star β and please cite as:
```
@inproceedings{fernandez2022active,
  title={Active Image Indexing},
  author={Fernandez, Pierre and Douze, Matthijs and Jégou, Hervé and Furon, Teddy},
  booktitle={International Conference on Learning Representations (ICLR)},
  year={2023}
}
```