Spatial-Temporal Transformer for Dynamic Scene Graph Generation

PyTorch implementation of our paper Spatial-Temporal Transformer for Dynamic Scene Graph Generation, accepted at ICCV 2021. We propose a Transformer-based model, STTran, to generate dynamic scene graphs for a given video. STTran can detect the visual relationships in each frame.

The introduction video is available now: https://youtu.be/gKpnRU8btLg

About the code

We run the code on a single RTX 2080ti for both training and testing. We borrowed some code from Yang's repository and Zellers' repository.

Requirements

python=3.6
pytorch=1.1
scipy=1.1.0
cython
dill
easydict
h5py
opencv
pandas
tqdm
yaml
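
One quick way to confirm that the packages listed above are importable is a short check like the one below. This is a minimal sketch and not part of the repository: the helper name `find_missing` is hypothetical, and the mapping from package names to import names (for example `pytorch` to `torch` and `opencv` to `cv2`) is an assumption based on how these packages are usually imported. It does not check version numbers.

```python
import importlib.util

# Import names assumed for the packages listed above (not taken from the repository).
REQUIRED_MODULES = {
    "pytorch": "torch",
    "scipy": "scipy",
    "cython": "Cython",
    "dill": "dill",
    "easydict": "easydict",
    "h5py": "h5py",
    "opencv": "cv2",
    "pandas": "pandas",
    "tqdm": "tqdm",
    "yaml": "yaml",
}


def find_missing(modules):
    """Return the package names whose import module cannot be located."""
    missing = []
    for package, module in modules.items():
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(module) is None:
            missing.append(package)
    return missing


missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All listed packages are importable.")
```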

Usage

We use python=3.6, pytorch=1.1 and torchvision=0.3 in our code. First, clone the repository:

git clone https://github.com/yrcong/STTran.git

We borrow some compiled code for bounding-box operations. Build it in place:

cd lib/draw_rectangles
python setup.py build_ext --inplace
cd ..
cd fpn/box_intersections_cpu
python setup.py build_ext --inplace
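
The two build steps above can also be scripted. The following is a rough sketch, assuming the commands are run from the repository root so that the two extension directories resolve to `lib/draw_rectangles` and `lib/fpn/box_intersections_cpu`. The helper names `build_steps` and `build_all` are hypothetical and do not exist in the repository.

```python
import subprocess
import sys
from pathlib import Path

# Extension directories relative to the repository root (assumed from the commands above).
EXTENSION_DIRS = ["lib/draw_rectangles", "lib/fpn/box_intersections_cpu"]


def build_steps(repo_root):
    """Return (working_directory, command) pairs, one per extension build."""
    command = [sys.executable, "setup.py", "build_ext", "--inplace"]
    return [(Path(repo_root) / rel, list(command)) for rel in EXTENSION_DIRS]


def build_all(repo_root):
    """Run each build in its own directory and stop at the first failure."""
    for cwd, command in build_steps(repo_root):
        subprocess.run(command, cwd=str(cwd), check=True)
```

Calling `build_all(".")` from the repository root would run both builds in order. `build_steps` is kept separate so the resolved paths can be inspected before anything is executed.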

For the object detector, please follow the compilation instructions at https://github.com/jwyang/faster-rcnn.pytorch. We provide a pretrained FasterRCNN model for Action Genome. Please download it here and put it in:

fasterRCNN/models/faster_rcnn_ag.pth

Dataset

We use the Action Genome dataset to train and evaluate our method. Please process the downloaded dataset with the Toolkit. The dataset directories should look like:

|-- action_genome
    |-- annotations   #gt annotations
    |-- frames        #sampled frames
    |-- videos        #original videos
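
Before training, it can help to confirm that the dataset root matches the layout above. The sketch below checks only for the three subdirectories shown; the function name `missing_subdirs` is hypothetical, and the check says nothing about whether the contents were processed correctly.

```python
from pathlib import Path

# Subdirectories expected under the dataset root, per the layout above.
EXPECTED_SUBDIRS = ("annotations", "frames", "videos")


def missing_subdirs(data_path):
    """Return the expected subdirectories that are absent under data_path."""
    root = Path(data_path)
    return [name for name in EXPECTED_SUBDIRS if not (root / name).is_dir()]
```

For example, `missing_subdirs("/path/to/action_genome")` returns an empty list when all three directories are present, and the names of the missing ones otherwise.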

In the experiments for SGCLS/SGDET, we only keep bounding boxes whose short edges are larger than 16 pixels. Please download the file object_bbox_and_relationship_filtersmall.pkl and put it in the dataloader directory.
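
The short-edge rule can be illustrated with a few lines of Python. This is only a sketch of the criterion as stated, not the code used to produce the provided file: it assumes boxes are given as `(x1, y1, x2, y2)` corner coordinates with width `x2 - x1` and height `y2 - y1`, and the function names are hypothetical.

```python
def short_edge(box):
    """Length of the shorter side of a box given as (x1, y1, x2, y2) (assumed format)."""
    x1, y1, x2, y2 = box
    return min(x2 - x1, y2 - y1)


def filter_small_boxes(boxes, min_short_edge=16):
    """Keep only boxes whose shorter side is strictly larger than the threshold."""
    return [box for box in boxes if short_edge(box) > min_short_edge]


boxes = [(0, 0, 100, 50), (10, 10, 20, 200), (5, 5, 21, 21), (0, 0, 40, 17)]
print(filter_small_boxes(boxes))  # [(0, 0, 100, 50), (0, 0, 40, 17)]
```

The second box is dropped because its width is 10 pixels, and the third because its short edge is exactly 16 pixels, which is not larger than the threshold.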

Train

You can train STTran with train.py. We trained the model on an RTX 2080ti:

For PredCLS:

python train.py -mode predcls -datasize large -data_path $DATAPATH

For SGCLS:

python train.py -mode sgcls -datasize large -data_path $DATAPATH

For SGDET:

python train.py -mode sgdet -datasize large -data_path $DATAPATH
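
To run the three settings one after another, the commands above can be generated from a small helper. This is a sketch under the assumption that train.py accepts exactly the flags shown above; the function name `train_command` and the placeholder path `/path/to/action_genome` are hypothetical.

```python
import shlex

# The three settings used in the commands above.
MODES = ("predcls", "sgcls", "sgdet")


def train_command(mode, data_path, datasize="large"):
    """Build the argument list for one train.py run, mirroring the commands above."""
    if mode not in MODES:
        raise ValueError("unknown mode: {}".format(mode))
    return ["python", "train.py", "-mode", mode,
            "-datasize", datasize, "-data_path", data_path]


for mode in MODES:
    # Only print each command here; pass the list to subprocess.run(..., check=True) to execute it.
    command = train_command(mode, "/path/to/action_genome")
    print(" ".join(shlex.quote(part) for part in command))
```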

Evaluation

You can evaluate STTran with test.py.

For PredCLS (trained Model):

python test.py -mode predcls -datasize large -data_path $DATAPATH -model_path $MODELPATH

For SGCLS (trained Model):

python test.py -mode sgcls -datasize large -data_path $DATAPATH -model_path $MODELPATH

For SGDET (trained Model):

python test.py -mode sgdet -datasize large -data_path $DATAPATH -model_path $MODELPATH

Citation

If our work is helpful for your research, please cite our publication:

@inproceedings{cong2021spatial,
  title={Spatial-Temporal Transformer for Dynamic Scene Graph Generation},
  author={Cong, Yuren and Liao, Wentong and Ackermann, Hanno and Rosenhahn, Bodo and Yang, Michael Ying},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={16372--16382},
  year={2021}
}

Help

If you have any questions or ideas about the code or the paper, please comment on GitHub or send us an email. We will reply as soon as possible.
