Running VTimeLLM Inference Offline

Please follow the instructions below to run VTimeLLM inference on your local GPU machine.

Note: Our demo requires approximately 18 GB of GPU memory.

Clone the VTimeLLM repository

conda create --name=vtimellm python=3.10
conda activate vtimellm

git clone https://github.com/huangb23/VTimeLLM.git
cd VTimeLLM
pip install -r requirements.txt
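Before moving on, you can confirm that your GPU actually offers the roughly 18 GB of memory mentioned above. A minimal check, assuming the PyTorch installed via requirements.txt has CUDA support:

import torch
# The demo needs roughly 18 GB of GPU memory; report what GPU 0 provides.
assert torch.cuda.is_available(), "No CUDA-capable GPU detected"
total_gb = torch.cuda.get_device_properties(0).total_memory / 1024**3
print(f"GPU 0: {torch.cuda.get_device_name(0)}, {total_gb:.1f} GB total")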

Download weights

Run the inference code

python -m vtimellm.inference --model_base <path to the Vicuna v1.5 weights> 
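If you prefer to launch inference from a Python script (for example, to batch several runs), a thin wrapper around the same command works. This is only a sketch: the model base path is a placeholder, and any additional flags that vtimellm.inference accepts (checkpoint or video paths, for instance) are not listed here.

import subprocess
MODEL_BASE = "/path/to/vicuna-7b-v1.5"  # placeholder: your Vicuna v1.5 weights
# Invokes the same entry point as the command above; append any further
# flags that vtimellm.inference accepts for your use case.
subprocess.run(
    ["python", "-m", "vtimellm.inference", "--model_base", MODEL_BASE],
    check=True,
)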

Alternatively, you can conduct multi-turn conversations in a Jupyter notebook. There, too, you need to set 'args.model_base' to the path of the Vicuna v1.5 weights.
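In the notebook, the setup might look like the following sketch, which simply mirrors the --model_base flag with an argparse Namespace; the actual model-loading and chat cells live in the repository's notebook and are not reproduced here.

from argparse import Namespace
# Hypothetical setup cell: only mirrors the --model_base flag used above.
args = Namespace(model_base="/path/to/vicuna-7b-v1.5")  # placeholder path
# ...then continue with the notebook's remaining cells, which load the
# VTimeLLM checkpoints and run the multi-turn conversation.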

If you want to run the VTimeLLM-ChatGLM version, please refer to the code in inference_for_glm.ipynb.

Run the Gradio demo

We provide an offline Gradio demo, which you can launch as follows:

cd vtimellm
python demo_gradio.py --model_base <path to the Vicuna v1.5 weights>
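Once the demo is running, you can check that the web interface is reachable. Gradio serves on http://127.0.0.1:7860 by default; if demo_gradio.py binds a different host or port (an assumption not verified against the script), use the URL printed in its startup log instead.

import urllib.request
# Assumes the default Gradio address; use the URL printed by demo_gradio.py if it differs.
with urllib.request.urlopen("http://127.0.0.1:7860", timeout=5) as resp:
    print("Gradio demo reachable, HTTP status:", resp.status)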