This repo contains a Vision Transformer video model based on the VideoMAE V2 paper and code, as well as examples for compiling the model using TensorRT and running inference using the built engine.
It accompanies a blog post describing an issue with compiling this model using TensorRT. To get a working engine, find the `Uncomment and change this to use the desired attention module` line in the model code and switch to one of the working Attention layers.
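As a point of reference, the variants such a comment switches between usually differ only in how the core scaled dot-product attention step is written. The sketch below is illustrative only; the actual class names and module layout are in this repo's model file:

```python
# Illustrative attention variants of the kind the comment refers to; the real
# classes in this repo may be named and structured differently.
import torch.nn as nn
import torch.nn.functional as F

class ExplicitAttention(nn.Module):
    """Attention written out as matmul + softmax."""
    def __init__(self, dim: int, num_heads: int = 6):
        super().__init__()
        self.num_heads = num_heads
        self.scale = (dim // num_heads) ** -0.5
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):                       # x: (B, N, C)
        B, N, C = x.shape
        qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, C // self.num_heads)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)    # each: (B, heads, N, head_dim)
        attn = ((q @ k.transpose(-2, -1)) * self.scale).softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(B, N, C)
        return self.proj(out)

class SDPAAttention(ExplicitAttention):
    """Same layer, but delegating the core op to F.scaled_dot_product_attention."""
    def forward(self, x):
        B, N, C = x.shape
        qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, C // self.num_heads)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)
        out = F.scaled_dot_product_attention(q, k, v)
        return self.proj(out.transpose(1, 2).reshape(B, N, C))
```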
Download the distilled checkpoint by running:
```bash
wget https://huggingface.co/OpenGVLab/VideoMAE2/resolve/main/distill/vit_s_k710_dl_from_giant.pth
```
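If you want a quick sanity check of the downloaded file, it can be opened with `torch.load`; the key layout assumed below (a `"module"`/`"model"` wrapper or a flat state dict) is a guess, not something this repo guarantees:

```python
import torch

# Peek at the downloaded checkpoint; the wrapper-key handling is an assumption.
ckpt = torch.load("vit_s_k710_dl_from_giant.pth", map_location="cpu")
state_dict = ckpt.get("module", ckpt.get("model", ckpt)) if isinstance(ckpt, dict) else ckpt
print(f"{len(state_dict)} entries, e.g. {list(state_dict)[:3]}")
```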
After downloading the weights, you can run inference on the included video:

```bash
uv run python main.py infer
```

This should print something like:
```
making tea: 0.81
setting table: 0.01
opening door: 0.01
```
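The scores read as class probabilities over the action labels (710 of them, going by the `k710` checkpoint name). A top-3 printout like the one above is typically produced along these lines; this is a generic sketch, not the code from `main.py`:

```python
import torch

# Generic sketch of turning logits into a top-3 printout; main.py's own
# post-processing and label names may differ.
logits = torch.randn(1, 710)                  # one clip, 710 action classes
labels = [f"class_{i}" for i in range(710)]   # stand-in for the real label list
probs = logits.softmax(dim=-1)[0]
top = probs.topk(3)
for p, idx in zip(top.values.tolist(), top.indices.tolist()):
    print(f"{labels[idx]}: {p:.2f}")
```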
Note: be sure to use an Attention layer that works with TensorRT.
```bash
# Use faster settings for torch inference (half precision, torch.compile, ..)
uv run python main.py infer --fast
```
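In isolation, the ingredients behind `--fast` (half precision and `torch.compile`) look roughly like this; the tiny model below is a stand-in, not the code from `main.py`:

```python
import torch
import torch.nn as nn

# Stand-in model to illustrate the --fast ingredients: fp16 weights/inputs on
# GPU plus torch.compile; main.py applies the same ideas to the real video ViT.
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.half if device == "cuda" else torch.float32
model = nn.Sequential(nn.Linear(768, 768), nn.GELU(), nn.Linear(768, 710))
model = model.to(device=device, dtype=dtype).eval()
model = torch.compile(model)                  # JIT-compile and fuse the forward
x = torch.randn(1, 768, device=device, dtype=dtype)
with torch.inference_mode():
    print(model(x).shape)                     # torch.Size([1, 710])
```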
```bash
# Export the model to an ONNX file
uv run python main.py export_onnx
# and run inference using ONNX runtime
uv run python main.py infer_onnx
```
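For orientation, this kind of export usually comes down to a single `torch.onnx.export` call on a dummy clip, after which the ONNX file can be run with ONNX Runtime. Everything below (the stand-in module, the `(1, 3, 16, 224, 224)` input shape, the tensor names) is an assumption, not necessarily what `export_onnx` uses:

```python
import torch
import torch.nn as nn
import onnxruntime as ort

# Generic ONNX export + ONNX Runtime check with a stand-in module.
class TinyVideoNet(nn.Module):
    def forward(self, x):                 # x: (B, C, T, H, W)
        return x.mean(dim=(2, 3, 4))      # placeholder for the real ViT forward

model = TinyVideoNet().eval()
dummy = torch.randn(1, 3, 16, 224, 224)
torch.onnx.export(model, dummy, "model.onnx",
                  input_names=["video"], output_names=["logits"],
                  opset_version=17)

sess = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
print(sess.run(None, {"video": dummy.numpy()})[0].shape)
```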
```bash
# Build a TensorRT engine from the ONNX
uv run python build_trt.py
# and run inference using the built engine
uv run python main.py infer_trt
```
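At its core, a `build_trt.py`-style script parses the ONNX file and asks the TensorRT builder for a serialized engine. The sketch below shows that generic flow with the TensorRT Python API; the actual script's configuration (precision flags, optimization profiles, file names) may differ:

```python
import tensorrt as trt

# Generic ONNX -> TensorRT engine build.
logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
# EXPLICIT_BATCH is the default (and deprecated as a flag) on TensorRT 10+,
# but passing it keeps this snippet working on 8.x as well.
network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise SystemExit("ONNX parse failed")

config = builder.create_builder_config()
config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 30)  # 1 GiB workspace
with open("model.engine", "wb") as f:
    f.write(builder.build_serialized_network(network, config))
```

Running the engine afterwards (what `infer_trt` does) amounts to deserializing it with a `trt.Runtime`, creating an execution context, and shuttling input/output buffers to and from the GPU.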
Checking with different TensorRT versions can be done using Docker and NVIDIA's PyTorch images:
```bash
$ docker run --gpus all --rm -it -v $(pwd):/code -w /code nvcr.io/nvidia/pytorch:24.12-py3 bash
root@cd60802e9604:/code# pip install "onnxruntime>=1.17.1" "pyav<14.0.0" "timm>=1.0.12"
root@cd60802e9604:/code# python ./main.py export_onnx && python ./build_trt.py && python ./main.py infer_trt
```
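To confirm which TensorRT build a given image ships (useful when comparing versions), a quick check inside the container:

```python
import tensorrt

print(tensorrt.__version__)  # TensorRT version bundled with the chosen NGC image
```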