This repo supports training and evaluation of the Ego4D-NLQ dataset.
- Follow INSTALL.nd for installing necessary dependencies and compiling the code.Torch version recommand >=1.8.0
- GroundNLQ leverage the extracted egocentric InterVideo and EgoVLP features and CLIP textual token features, please refer to GroundNLQ.
- Download the object data, including files. Will release this data soon.
- ./libs/core: Parameter default configuration module.
- ./configs: Parameter file.
- ./ego4d_data: the annotation data.
- ./tools: Scripts for running,train_ego4d_finetune_head_.sh are for finetune,while train_ego4d_ are for scratch.
- ./libs/datasets: Data loader and IO module.
- ./libs/modeling: Our main model with all its building blocks.
- ./libs/utils: Utility functions for training, inference, and postprocessing.
We adopt distributed data parallel DDP and fault-tolerant distributed training with torchrun.
Training can be launched by running the following command:
bash <tools/*,scripts> CONFIG_FILE EXP_ID CUDA_DEVICE_ID
where CONFIG_FILE
is the config file for model/dataset hyperparameter initialization,
EXP_ID
is the model output directory name defined by yourself, CUDA_DEVICE_ID
is cuda device id.
The checkpoints and other experiment log files will be written into <output_folder>/OUTPUT_PATH
, output_folder is defined in the config file.
Training can be launched by running the following command:
bash tools/train ego4d_finetune_head_onegpu.sh CONFIG_FILE RESUME_PATH OUTPUT_PATH CUDA_DEVICE_ID
where RESUME_PATH
is the path of the pretrained model weights.
The config file is the same as scratch.
Once the model is trained, you can use the following commands for inference:
python eval_nlq.py CONFIG_FILE CHECKPOINT_PATH -gpu CUDA_DEVICE_ID <--save>
where CHECKPOINT_PATH
is the path to the saved checkpoint,save
is for controling the output .
- The results (Recall@K at IoU = 0.3 or 0.5) on the val. set should be similar to the performance of the below table reported in the main report.
Method | Dataset | R@1 IoU=0.3 | R@1 IoU=0.5 | R@5 IoU=0.3 | R@5 IoU=0.5 |
---|---|---|---|---|---|
ObjectNLQ | NLQ | 28.43 | 19.95 | 56.06 | 42.09 |
ObjectNLQ | GoalStep | 28.34 | 24.08 | 57.03 | 50.39 |
We conduct post-model prediction ensemble to enhance performance for leaderboard submission. The actual command used in the experiments is
python ensemble.py
or
python ensemble_more.py
If you are using our code, please consider citing our paper.
@article{feng2024objectnlq,
title={ObjectNLQ@ Ego4D Episodic Memory Challenge 2024},
author={Feng, Yisen and Zhang, Haoyu and Xie, Yuquan and Li, Zaijing and Liu, Meng and Nie, Liqiang},
journal={arXiv preprint arXiv:2406.15778},
year={2024}
}
This code is inspired by GroundNLQ. We use the same video and text feature as GroundNLQ. We thank the authors for their awesome open-source contributions.