Paper | Project Page | Code
Check out our Large Driving Model Series!
Doe-1: Closed-Loop Autonomous Driving with Large World Model
Wenzhao Zheng*$\dagger$, Zetian Xia*, Yuanhui Huang, Sicheng Zuo, Jie Zhou, Jiwen Lu
* Equal contribution
Doe-1 is the first closed-loop autonomous driving model for unified perception, prediction, and planning.
- [2024/12/13] Evaluation code released.
- [2024/12/13] Paper released on arXiv.
- [2024/12/13] Demo released.
Doe-1 is a unified model to accomplish visual-question answering, future prediction, and motion planning.
We formulate autonomous driving as a unified next-token generation problem and use observation, description, and action tokens to represent each scene. Without additional fine-tuning, Doe-1 accomplishes various tasks by using different input prompts, including visual question-answering, controlled image generation, and end-to-end motion planning.
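As a rough illustration of this formulation, a scene can be pictured as a sequence of interleaved observation, description, and action tokens, where the task is selected by which segments are supplied as the prompt. The sketch below is a toy and not the Doe-1 implementation: the `Frame` type, the `build_prompt` helper, the task names, and the mapping from task to prompt segments are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Frame:
    """One time step of a scene (hypothetical container, token ids are plain ints)."""
    observation: List[int]  # image tokens
    description: List[int]  # text tokens
    action: List[int]       # action tokens


# Assumed mapping: which segments of the current frame are given as the prompt,
# and which segment the model is asked to generate next.
TASK_LAYOUT = {
    "qa": (("observation",), "description"),
    "planning": (("observation", "description"), "action"),
    "generation": (("observation", "description", "action"), "next_observation"),
}


def build_prompt(history: List[Frame], current: Frame, task: str) -> Tuple[List[int], str]:
    """Flatten past frames plus the given part of the current frame into one prompt."""
    given, target = TASK_LAYOUT[task]
    tokens: List[int] = []
    for frame in history:
        # Past frames contribute all three segments, in a fixed order.
        tokens += frame.observation + frame.description + frame.action
    for name in given:
        tokens += getattr(current, name)
    return tokens, target
```

The same flattened sequence serves every task; only the cut point between prompt and generated segment changes, which is what lets a single next-token model cover all of them in this toy picture.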
We explore a new closed-loop autonomous driving paradigm that combines an end-to-end model with a world model to construct a closed loop.
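The closed loop can be sketched as a rollout in which each planned action is fed back to predict the next observation. This is a minimal sketch under stated assumptions: `describe`, `plan`, and `predict` are hypothetical stand-in callables (in Doe-1 a single model generates all token types), and the function name `closed_loop_rollout` is not from the repository.

```python
from typing import Any, Callable, List, Tuple


def closed_loop_rollout(
    observation: Any,
    describe: Callable[[Any], Any],      # observation -> description (hypothetical)
    plan: Callable[[Any, Any], Any],     # (observation, description) -> action (hypothetical)
    predict: Callable[[Any, Any], Any],  # (observation, action) -> next observation (hypothetical)
    steps: int,
) -> List[Tuple[Any, Any, Any]]:
    """Roll the loop forward, feeding each predicted observation back in."""
    trajectory = []
    for _ in range(steps):
        description = describe(observation)
        action = plan(observation, description)
        trajectory.append((observation, description, action))
        # The world-model step closes the loop: the action shapes the next input.
        observation = predict(observation, action)
    return trajectory


if __name__ == "__main__":
    # Toy stand-ins: the "observation" is an integer position, the action a unit step.
    rollout = closed_loop_rollout(
        0,
        describe=lambda obs: f"at {obs}",
        plan=lambda obs, desc: 1,
        predict=lambda obs, act: obs + act,
        steps=3,
    )
    print(rollout)
```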
- Download the nuScenes V1.0 full dataset HERE.
- Download the annotations data_nusc from OmniDrive and unzip it.
- Download the VQVAE weights from HERE and put them in the following directory, as described HERE:
Doe/
- model/
    - lumina_mgpt/
        - ckpts/
            - chameleon/
                - tokenizer/
                    - text_tokenizer.json
                    - vqgan.yaml
                    - vqgan.ckpt
    - xllmx/
    - ...
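Before running inference, it may help to confirm that the three tokenizer files are in place. The snippet below is a small sketch, assuming the files sit under the nested path `model/lumina_mgpt/ckpts/chameleon/tokenizer/` inside the repository root; the helper name `missing_tokenizer_files` is hypothetical and not part of the repository.

```python
from pathlib import Path
from typing import List

# File names taken from the directory listing above.
EXPECTED_FILES = ("text_tokenizer.json", "vqgan.yaml", "vqgan.ckpt")


def missing_tokenizer_files(repo_root: str) -> List[str]:
    """Return the expected tokenizer files that are absent under repo_root."""
    # Assumed nesting: model/lumina_mgpt/ckpts/chameleon/tokenizer/
    tokenizer_dir = (
        Path(repo_root) / "model" / "lumina_mgpt" / "ckpts" / "chameleon" / "tokenizer"
    )
    return [name for name in EXPECTED_FILES if not (tokenizer_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_tokenizer_files("Doe")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All tokenizer files found.")
```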
- Generate the conversation data for inference and set the max length:
# max length: 1 for qa, 5 for planning
python dataset/gen_data.py \
--info_path path/to/infos_var.pkl \
--qa_path path/to/OmniDriveDataset \
--nusc_path path/to/nuscenes \
--save_path path/to/save/outputs \
--max_length 1
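Since the max length depends on the task (1 for QA, 5 for planning), one option is to build the command from a small lookup instead of editing the value by hand. The following is a sketch only: the flags are the ones shown in the command above, while the `gen_data_command` helper and the task keys `"qa"` and `"planning"` used for the lookup are assumptions.

```python
import shlex
from typing import List

# Max length per task, as stated in the comment of the command above.
MAX_LENGTH = {"qa": 1, "planning": 5}


def gen_data_command(
    task: str, info_path: str, qa_path: str, nusc_path: str, save_path: str
) -> List[str]:
    """Build the argv list for dataset/gen_data.py for the given task."""
    return [
        "python", "dataset/gen_data.py",
        "--info_path", info_path,
        "--qa_path", qa_path,
        "--nusc_path", nusc_path,
        "--save_path", save_path,
        "--max_length", str(MAX_LENGTH[task]),
    ]


if __name__ == "__main__":
    cmd = gen_data_command(
        "planning",
        "path/to/infos_var.pkl",
        "path/to/OmniDriveDataset",
        "path/to/nuscenes",
        "path/to/save/outputs",
    )
    # Print a shell-ready string; pass `cmd` to subprocess.run to execute it instead.
    print(shlex.join(cmd))
```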
- Inference with a model ckpt:
# set split and id for multi gpus
CUDA_VISIBLE_DEVICES=0 python inference/eval.py \
--anno_path path/to/val_infos.pkl \
--nusc_path path/to/nuscenes \
--save_path path/to/save/output \
--model_path path/to/model/ckpt \
--data_path path/to/generated/data.json \
--task qa
Our code is based on the excellent work Lumina-mGPT.
If you find this project helpful, please consider citing the following paper:
@article{doe,
title={Doe-1: Closed-Loop Autonomous Driving with Large World Model},
author={Zheng, Wenzhao and Xia, Zetian and Huang, Yuanhui and Zuo, Sicheng and Zhou, Jie and Lu, Jiwen},
journal={arXiv preprint arXiv:2412.09627},
year={2024}
}