This is the official implementation of "QueryMatch: A Query-based Contrastive Learning Framework for Weakly Supervised Visual Grounding". In this paper, we propose a novel query-based one-stage framework for weakly supervised visual grounding, namely QueryMatch. Different from previous work, QueryMatch represents candidate objects with a set of query features, which inherently establish accurate one-to-one associations with visual objects. QueryMatch thereby reformulates weakly supervised visual grounding as a query-text matching problem, which can be optimized via query-based contrastive learning. Based on QueryMatch, we further propose an innovative strategy for effective weakly supervised learning, namely Active Query Selection (AQS). In particular, AQS enhances the effectiveness of query-based contrastive learning by actively selecting high-quality query features.
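The query-text matching idea above can be sketched with a minimal contrastive (InfoNCE-style) loss. This is an illustrative sketch, not the repository's API: the function names `query_text_contrastive_loss` and `select_queries`, the score-based selection criterion, and the temperature value are all assumptions for exposition.

```python
import torch
import torch.nn.functional as F

def query_text_contrastive_loss(query_feats, text_feats, temperature=0.07):
    # query_feats: (B, d) -- one selected query feature per image
    # text_feats:  (B, d) -- one text embedding per expression
    # Matched pairs (the diagonal) are positives; all other pairs in
    # the batch serve as negatives (standard InfoNCE formulation).
    q = F.normalize(query_feats, dim=-1)
    t = F.normalize(text_feats, dim=-1)
    logits = q @ t.T / temperature            # (B, B) similarity matrix
    targets = torch.arange(q.size(0))
    # symmetric loss over both matching directions
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.T, targets))

def select_queries(query_feats, query_scores, k=1):
    # Toy stand-in for Active Query Selection: keep the top-k
    # highest-scoring queries per image. The actual AQS criterion in
    # the paper is more involved; this only illustrates the interface.
    idx = query_scores.topk(k, dim=-1).indices          # (B, k)
    batch = torch.arange(query_feats.size(0)).unsqueeze(-1)
    return query_feats[batch, idx]                      # (B, k, d)
```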
- Clone this repo
git clone https://github.com/TensorThinker/QueryMatch.git
cd QueryMatch
- Create a conda virtual environment and activate it
conda create -n querymatch python=3.8 -y
conda activate querymatch
- Install PyTorch following the official installation instructions
# CUDA 11.7
pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2
- Install Detectron2 following the official installation instructions
git clone https://github.com/facebookresearch/detectron2.git
python -m pip install -e detectron2
- Install apex following the official installation guide
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-build-isolation --no-cache-dir ./
- Compile the DCN layer:
cd utils_querymatch/DCN
./make.sh
cd mask2former
pip install -r requirements.txt
cd ./modeling/pixel_decoder/ops
sh make.sh
wget https://github.com/explosion/spacy-models/releases/download/en_vectors_web_lg-2.1.0/en_vectors_web_lg-2.1.0.tar.gz -O en_vectors_web_lg-2.1.0.tar.gz
pip install en_vectors_web_lg-2.1.0.tar.gz
pip install albumentations
pip install Pillow==9.5.0
pip install tensorboardX
- Download the images and generate the annotations following SimREC.
- Download the pretrained weights of Mask2Former from OneDrive.
- The project structure should look like the following:
| -- QueryMatch
    | -- data
        | -- anns
            | -- refcoco.json
            | -- refcoco+.json
            | -- refcocog.json
        | -- images
            | -- train2014
                | -- COCO_train2014_000000000072.jpg
                | -- ...
    | -- config_querymatch
    | -- configs
    | -- datasets
    | -- datasets_querymatch
    | -- DCNv2_latest
    | -- detectron2
    | -- mask2former
    | -- models_querymatch
    | -- ...
- NOTE: our Mask2Former is trained on COCO's training images, excluding those that appear in the validation and test sets of RefCOCO, RefCOCO+, and RefCOCOg.
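The exclusion described in the note can be sketched as below. This is a hedged sketch, not the repository's actual preprocessing code: the annotation schema (a list of dicts with `split` and `iid` keys) is an assumption about common RefCOCO JSON dumps and may differ from the files produced by the SimREC step.

```python
import json

def excluded_image_ids(ann_paths, splits=("val", "testA", "testB", "test")):
    # Collect the COCO image ids referenced by any val/test expression,
    # so those images can be dropped from the Mask2Former training set.
    # ASSUMPTION: each annotation file is a JSON list of dicts carrying
    # a 'split' name and an 'iid' (COCO image id) -- adapt to the
    # actual schema of your annotation files.
    ids = set()
    for path in ann_paths:
        with open(path) as f:
            for item in json.load(f):
                if item.get("split") in splits:
                    ids.add(item["iid"])
    return ids
```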
python train_querymatch.py --config ./config_querymatch/[DATASET_NAME].yaml --config-file ./configs/coco/instance-segmentation/swin/maskformer2_swin_base_384_bs16_50ep.yaml --eval-only MODEL.WEIGHTS [PATH_TO_MASK2FORMER_WEIGHT]
python test_querymatch.py --config ./config_querymatch/[DATASET_NAME].yaml --eval-weights [PATH_TO_CHECKPOINT_FILE] --config-file ./configs/coco/instance-segmentation/swin/maskformer2_swin_base_384_bs16_50ep.yaml --eval-only MODEL.WEIGHTS [PATH_TO_MASK2FORMER_WEIGHT]
| Method | RefCOCO val | RefCOCO testA | RefCOCO testB | RefCOCO+ val | RefCOCO+ testA | RefCOCO+ testB | RefCOCOg val-g |
| --- | --- | --- | --- | --- | --- | --- | --- |
| QueryMatch | 59.10 | 59.08 | 58.82 | 39.87 | 41.44 | 37.22 | 43.06 |
| Method | RefCOCO val | RefCOCO testA | RefCOCO testB | RefCOCO+ val | RefCOCO+ testA | RefCOCO+ testB | RefCOCOg val-g |
| --- | --- | --- | --- | --- | --- | --- | --- |
| QueryMatch | 66.02 | 66.00 | 65.48 | 44.76 | 46.72 | 41.50 | 48.47 |
- GPU: RTX 4090(24GB)
- CPU: 32 vCPU Intel(R) Xeon(R) Platinum 8352V CPU @ 2.10GHz
- CUDA 11.7
- torch 2.0.1
This project is compatible with multiple CUDA versions, including but not limited to CUDA 11.3. While the relative performance trends remain consistent across different hardware environments, please note that the specific numerical results may vary slightly.
@inproceedings{chen2024querymatch,
title={QueryMatch: A Query-based Contrastive Learning Framework for Weakly Supervised Visual Grounding},
author={Chen, Shengxin and Luo, Gen and Zhou, Yiyi and Sun, Xiaoshuai and Jiang, Guannan and Ji, Rongrong},
booktitle={Proceedings of the 32nd ACM International Conference on Multimedia},
pages={4177--4186},
year={2024}
}
Thanks a lot for the nicely organized code from the following repos: