A prototype implementation of VOCAL-UDF, a self-enhancing video data management system that empowers users to flexibly issue and answer compositional queries, even when the modules necessary to answer those queries are unavailable. See the technical report for more details.
The project uses conda to manage dependencies. To install conda, follow the instructions here.
```shell
# Clone the repository
git clone https://github.com/uwdb/VOCAL-UDF.git
cd VOCAL-UDF

# Create a conda environment (named vocal-udf) and install dependencies
conda env create -f environment.yml --name vocal-udf
conda activate vocal-udf
python -m pip install -e .
```
To use OpenAI models, follow the instructions here to create and export an API key.
```shell
# Export the API key as an environment variable
export OPENAI_API_KEY="your_api_key_here"
```
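Since every LLM call depends on this variable, it can help to fail fast when the key is missing. A minimal check (the helper below is our own illustration, not part of VOCAL-UDF):

```python
import os

def require_api_key(name: str = "OPENAI_API_KEY") -> str:
    """Return the named API key from the environment, or raise a clear error."""
    key = os.environ.get(name)
    if not key:
        raise RuntimeError(f"{name} is not set; run `export {name}=...` first.")
    return key

if __name__ == "__main__":
    print("Found key of length", len(require_api_key()))
```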
- Download the CLEVRER dataset from here. Place the videos in `data/clevrer/`.
- Extract the frames from the videos using the following command. This will create a `video_frames` directory in `data/clevrer/`.
```shell
cd data/clevrer
python extract_frames.py
```
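The repo's `extract_frames.py` defines the actual frame layout; for intuition, dumping frames from a video with OpenCV generally looks like the sketch below (the output naming here is illustrative, not the script's real convention):

```python
from pathlib import Path

def frame_filename(video_stem: str, index: int) -> str:
    """Zero-padded frame name, e.g. video_00042.png (naming is illustrative)."""
    return f"{video_stem}_{index:05d}.png"

def dump_frames(video_path: str, out_dir: str) -> int:
    """Decode every frame of a video and write it as a PNG; return the frame count."""
    import cv2  # opencv-python; imported lazily so frame_filename stays dependency-free

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    count = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        cv2.imwrite(str(out / frame_filename(Path(video_path).stem, count)), frame)
        count += 1
    cap.release()
    return count
```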
- Prepare the database. Download the processed annotations from here and place them in `duckdb_dir/`.
- Create relations and load data into the database.
```shell
cd duckdb_dir
python load_clevrer.py
```
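`load_clevrer.py` defines the actual schema; conceptually, creating a relation and bulk-loading annotations in DuckDB looks like this sketch (the table and column names are illustrative, not the repo's schema):

```python
def load_sql(table: str, columns: dict, csv_path: str) -> list:
    """Build CREATE TABLE + COPY statements for bulk-loading a CSV into DuckDB."""
    cols = ", ".join(f"{name} {dtype}" for name, dtype in columns.items())
    return [
        f"CREATE TABLE IF NOT EXISTS {table} ({cols});",
        f"COPY {table} FROM '{csv_path}' (HEADER, DELIMITER ',');",
    ]

if __name__ == "__main__":
    import duckdb  # lazy import; `pip install duckdb` if missing

    con = duckdb.connect("annotations.duckdb")
    # Illustrative schema -- the real relations are defined in load_clevrer.py.
    stmts = load_sql(
        "objects", {"vid": "INTEGER", "fid": "INTEGER", "oname": "VARCHAR"}, "objects.csv"
    )
    for stmt in stmts:
        con.execute(stmt)
```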
- Extract the features from the frames using the following command. This will create a `features/clevrer_three_clips` directory in `duckdb_dir`.
```shell
cd featurestore
# Extract attribute features (about 16GB)
python extract_clevrer.py --method "attribute"
# Extract relationship features (about 113GB)
python extract_clevrer.py --method "relationship"
```
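The relationship features alone take roughly 113GB, so it is worth verifying free disk space before starting. A small helper (our own, not part of the repo):

```python
import shutil

def has_space(path: str, required_gb: float) -> bool:
    """Return True if the filesystem containing `path` has at least required_gb free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1024**3

if __name__ == "__main__":
    # 16GB for attribute features + 113GB for relationship features
    if not has_space(".", 16 + 113):
        raise SystemExit("Not enough free disk space for CLEVRER features.")
```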
- Obtain the CityFlow-NL dataset from here (i.e., 2023 Track 2; for more information, see here). Place the videos in `data/cityflow/`. Next, run `python extract_vdo_frms.py` to extract the frames from the videos. The file structure should look like this:
```
data/cityflow/data/
├── extract_vdo_frms.py
├── test-queries.json
├── train-tracks.json
├── test-tracks.json
├── train/
│   ├── S01/
│   │   ├── c001/
│   │   │   ├── calibration.txt
│   │   │   ├── det/
│   │   │   ├── gt/
│   │   │   ├── img1/
│   │   │   ├── mtsc/
│   │   │   ├── roi.jpg
│   │   │   ├── segm/
│   │   │   └── vdo.avi
│   │   ├── c002/...
│   │   ├── ...
│   ├── S03/...
│   ├── S04/...
└── validation/...
```
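A quick sanity check that the top-level files from the tree above are in place can save a failed run later. This helper is our own illustration, not a repo script:

```python
from pathlib import Path

# Top-level files expected under data/cityflow/data/ (from the tree above)
EXPECTED = [
    "extract_vdo_frms.py",
    "test-queries.json",
    "train-tracks.json",
    "test-tracks.json",
]

def missing_files(root: str, expected=EXPECTED) -> list:
    """Return the expected top-level files that are absent under `root`."""
    base = Path(root)
    return [name for name in expected if not (base / name).exists()]

if __name__ == "__main__":
    gaps = missing_files("data/cityflow/data")
    if gaps:
        raise SystemExit(f"Missing CityFlow files: {gaps}")
```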
- Prepare the database. Download the processed annotations from here and place them in `duckdb_dir/`.
- Create relations and load data into the database.
```shell
cd duckdb_dir
python load_cityflow.py
```
- Extract the features from the frames using the following command. This will create a `features/cityflow_three_clips` directory in `duckdb_dir`.
```shell
cd featurestore
# Extract attribute features
python extract_cityflow.py --method "attribute"
# Extract relationship features
python extract_cityflow.py --method "relationship"
```
- Download the Charades dataset (scaled to 480p) from here. Place the videos in `data/charades/`.
- Download Action Genome annotations from here. Place the annotations in `data/charades/`.
- Extract the frames from the videos using the following command. This will create a `frames` directory in `data/charades/`.
```shell
cd data/charades
python dump_frames.py
```
- Prepare the database. Download the processed annotations from here and place them in `duckdb_dir/`.
- Create relations and load data into the database.
```shell
cd duckdb_dir
python load_charades.py
```
- Extract the features from the frames using the following command. This will create a `features/charades_five_clips` directory in `duckdb_dir`.
```shell
cd featurestore
# Extract relationship features (about 15GB; takes around 2 hours).
# Charades has no attribute features.
python extract_charades.py --include_text_features
```
We provide an example of how to use VOCAL-UDF to process a query with three missing UDFs on the CLEVRER dataset.
- Generate UDFs
```shell
python experiments/async_main.py \
    --num_missing_udfs 3 \
    --run_id 0 \
    --query_id 0 \
    --dataset "clevrer" \
    --query_filename "3_new_udfs_labels" \
    --budget 20 \
    --n_selection_samples 500 \
    --num_interpretations 10 \
    --allow_kwargs_in_udf \
    --program_with_pixels \
    --num_parameter_search 5 \
    --num_workers 8 \
    --save_labeled_data \
    --n_train_distill 100 \
    --selection_strategy "both" \
    --llm_method "gpt" \
    --is_async \
    --openai_model_name "gpt-4o"
```
- Execute query with new UDFs
```shell
python experiments/run_query_executor.py \
    --num_missing_udfs 3 \
    --run_id 0 \
    --query_id 0 \
    --dataset "clevrer" \
    --query_filename "3_new_udfs_labels" \
    --budget 20 \
    --n_selection_samples 500 \
    --num_interpretations 10 \
    --allow_kwargs_in_udf \
    --program_with_pixels \
    --num_parameter_search 5 \
    --num_workers 8 \
    --n_train_distill 100 \
    --selection_strategy "both" \
    --pred_batch_size 4096 \
    --dali_batch_size 1 \
    --llm_method "gpt"
```
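To run the generate-then-execute pipeline over several queries, the two commands can be scripted. The sketch below is our own (the query ids are placeholders, and only a subset of the flags is shown; the remaining flags from the examples above would be appended the same way):

```python
import subprocess
import sys

# Flags shared by both steps (abbreviated; see the full examples above)
BASE = [
    "--num_missing_udfs", "3",
    "--dataset", "clevrer",
    "--query_filename", "3_new_udfs_labels",
    "--budget", "20",
]

def commands(query_ids, run_id=0):
    """Yield the generate-then-execute command pair for each query id (sketch)."""
    for qid in query_ids:
        common = BASE + ["--run_id", str(run_id), "--query_id", str(qid)]
        yield [sys.executable, "experiments/async_main.py", *common]
        yield [sys.executable, "experiments/run_query_executor.py", *common]

if __name__ == "__main__":
    for cmd in commands(range(3)):
        subprocess.run(cmd, check=True)
```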
The experiment scripts are located in the `scripts/experiments` directory.