- The training-free RETAKE is the first to jointly model temporal and knowledge redundancy for long video understanding, enabling 4x longer video sequences with less than a 1% performance drop.
- We propose DPSelect, a novel keyframe selection method that reduces low-level temporal redundancy, and PivotKV, a novel KV cache compression method that reduces high-level knowledge redundancy in long videos (a toy sketch of the two ideas follows).
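As a rough intuition only, the sketch below illustrates the two kinds of redundancy being reduced: frame-level (keep visually novel frames) and token-level (keep KV entries that attract the most attention). This is not the DPSelect or PivotKV algorithm from the paper, and none of these function names exist in this repo; it is a toy stand-in under those assumptions.

```python
import numpy as np

def select_keyframes(frame_feats: np.ndarray, num_keep: int) -> np.ndarray:
    """Toy temporal-redundancy reduction (crude stand-in for DPSelect):
    keep the frames whose features differ most from the previous frame."""
    normed = frame_feats / np.linalg.norm(frame_feats, axis=1, keepdims=True)
    sims = (normed[1:] * normed[:-1]).sum(axis=1)    # cosine sim of consecutive frames
    novelty = np.concatenate(([1.0], 1.0 - sims))    # first frame is always novel
    return np.sort(np.argsort(novelty)[-num_keep:])  # kept frame indices, in order

def compress_kv_cache(keys, values, attn_scores, num_keep: int):
    """Toy knowledge-redundancy reduction (crude stand-in for PivotKV):
    retain the KV pairs that received the highest aggregate attention."""
    keep = np.sort(np.argsort(attn_scores)[-num_keep:])
    return keys[keep], values[keep]

# Tiny usage example with random features: 64 frames, 16-dim embeddings
feats = np.random.randn(64, 16)
print(select_keyframes(feats, num_keep=8))
```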
```bash
# For GPU users
conda env create -f environment.yaml
# For NPU users
# conda env create -f environment_npu.yaml
apt-get install ffmpeg  # NOTE: the quick demo does not require ffmpeg
```
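Optionally, run a quick sanity check inside the freshly created environment. The package list below is an assumption based on a typical Qwen2-VL setup, not something this repo documents:

```python
# Confirm the key dependencies resolved. Run inside the new conda environment.
import torch
import transformers

print("torch", torch.__version__, "| cuda available:", torch.cuda.is_available())
print("transformers", transformers.__version__)  # NPU users will see cuda=False above
```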
- Step 1: Change `hf_qwen2vl7b_path` in `./demo.py` to your local path to `Qwen2-VL-7B-Instruct`. Note that NPU users also need to change `config_path` to `'configs/retake_demo_npu.yaml'` (see the excerpt after Step 2).
- Step 2: Run the demo:

```bash
python demo.py
```
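For reference, the edited variables might look like the following. The checkpoint path is a placeholder, and the exact variable layout inside `demo.py` may differ:

```python
# demo.py (excerpt) -- how the variables might look after Step 1's edits
hf_qwen2vl7b_path = '/path/to/Qwen2-VL-7B-Instruct'  # your local checkpoint directory

# NPU users only: point config_path at the NPU demo config
config_path = 'configs/retake_demo_npu.yaml'
```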
```bash
# Evaluate on Video-MME, MLVU, and LVBench, respectively
bash scripts/infer_eval_retake.sh ${YOUR_PATH_TO_Qwen2-VL-7B-Instruct} configs/retake_videomme.yaml 8
bash scripts/infer_eval_retake.sh ${YOUR_PATH_TO_Qwen2-VL-7B-Instruct} configs/retake_mlvu.yaml 8
bash scripts/infer_eval_retake.sh ${YOUR_PATH_TO_Qwen2-VL-7B-Instruct} configs/retake_lvbench.yaml 8
```
The scripts above perform inference and evaluation in one pass; results can be found in `./results`.