The main model is TGTSF_torch; other versions are deprecated.
We update the paper on arXiv from time to time. Stay updated here: https://arxiv.org/abs/2405.13522
- We have uploaded the toy dataset together with its generation scripts, which you can use to create your own dataset. The pre-embedding scripts are also included; please run pre-embedding before training.
- The weather-captioned dataset is uploaded, including 10 years of time series, all pre-embedding files for the captions, and hashtables for indexing the embeddings.
- We package the pre-embeddings as tarballs and store them on GitHub with git-lfs. You may need to:
  - Install git-lfs with
    ```
    sudo apt-get install git-lfs
    ```
    or
    ```
    brew install git-lfs
    ```
  - Run
    ```
    git lfs install
    ```
    in the repository
  - Run
    ```
    git lfs pull
    ```
    to download the pre-embedding files.
  - Unzip the tarball with
    ```
    tar -xvf embeddings.tar
    ```
- We split the embeddings for weather-large into several parts due to GitHub's 2GB file size limit. You need to merge them with
  ```
  cat openai_caption_emb_large_part_*.tar > openai_caption_emb_large.tar
  ```
  and then untar it.
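Putting the steps above together, the whole fetch-and-extract sequence looks roughly like this (a sketch, assuming you run it from the repository root; pick the install line for your OS):

```shell
# Install git-lfs (choose the line for your OS)
sudo apt-get install git-lfs    # Debian/Ubuntu
# brew install git-lfs          # macOS

# Enable the LFS hooks and pull the tarballs tracked in this repository
git lfs install
git lfs pull

# Unpack the embeddings
tar -xvf embeddings.tar

# For weather-large, merge the split parts before untarring
cat openai_caption_emb_large_part_*.tar > openai_caption_emb_large.tar
tar -xvf openai_caption_emb_large.tar
```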
- We also provide all the scripts used to generate the dataset, including raw data collection, captioning, pre-embedding, and indexing, as a separate repository. You can find it here: Weather Captioned Dataset
⚠ If you have trouble downloading the pre-embedding files with git-lfs, we also provide Google Drive links for them. Click Here. You can use `gdown` to download the files from Google Drive.
Run the scripts in the ./scripts folder to train the model.
Use visualize.ipynb to visualize the results. We may upload our trained checkpoints later.