Doe-1: Closed-Loop Autonomous Driving with Large World Model

Paper | Project Page | Code

Check out our Large Driving Model Series!

Wenzhao Zheng* $\dagger$, Zetian Xia*, Yuanhui Huang, Sicheng Zuo, Jie Zhou, Jiwen Lu

* Equal contribution $\dagger$ Project leader

Doe-1 is the first closed-loop autonomous driving model for unified perception, prediction, and planning.

News

[2024/12/13] Evaluation code released.
[2024/12/13] Paper released on arXiv.
[2024/12/13] Demo released.

Demo

Doe-1 is a unified model that performs visual question answering, future prediction, and motion planning.

Overview

We formulate autonomous driving as a unified next-token generation problem and use observation, description, and action tokens to represent each scene. Without additional fine-tuning, Doe-1 accomplishes various tasks by using different input prompts, including visual question answering, controlled image generation, and end-to-end motion planning.
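The prompt-based task switching described above can be sketched as follows. This is a minimal illustration, not code from the repository: the `SceneTokens` container, the `flatten` and `build_prompt` helpers, the task labels, the integer token ids, and the observation, description, action ordering within a scene are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SceneTokens:
    """Token ids for one scene (hypothetical container, not from the repository)."""
    observation: List[int]
    description: List[int]
    action: List[int]


def flatten(scenes: List[SceneTokens]) -> List[int]:
    """Concatenate scenes in the assumed order: observation, description, action."""
    tokens: List[int] = []
    for scene in scenes:
        tokens += scene.observation + scene.description + scene.action
    return tokens


def build_prompt(history: List[SceneTokens], current: SceneTokens, task: str) -> List[int]:
    """Return the token prefix that a next-token model would continue for a task."""
    prefix = flatten(history)
    if task == "qa":
        # Prompt ends after the observation; the model continues with description tokens.
        return prefix + current.observation
    if task == "planning":
        # Prompt ends after the description; the model continues with action tokens.
        return prefix + current.observation + current.description
    if task == "generation":
        # Prompt ends after the action; the model continues with the next observation.
        return prefix + flatten([current])
    raise ValueError(f"unknown task: {task}")
```

In this sketch the same sequence layout serves every task; only the point at which the prompt stops changes what the model is asked to generate next.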

Closed-Loop Autonomous Driving

We explore a new autonomous driving paradigm that combines an end-to-end model with a world model to form a closed loop.
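A rough sketch of what such a loop looks like in code is given below. It is an illustration under stated assumptions rather than the Doe-1 implementation: `closed_loop_rollout`, `toy_plan`, and `toy_predict_next` are hypothetical names, observations are stand-in lists of integers, and actions are stand-in 2D displacements.

```python
from typing import Callable, List, Tuple

Observation = List[int]        # placeholder for image tokens (assumption)
Action = Tuple[float, float]   # placeholder for a planned displacement (assumption)


def closed_loop_rollout(
    first_obs: Observation,
    plan: Callable[[Observation], Action],
    predict_next: Callable[[Observation, Action], Observation],
    steps: int,
) -> List[Tuple[Observation, Action]]:
    """Alternate planning and prediction, feeding each prediction back as input."""
    history: List[Tuple[Observation, Action]] = []
    obs = first_obs
    for _ in range(steps):
        action = plan(obs)                 # end-to-end model: observation -> action
        history.append((obs, action))
        obs = predict_next(obs, action)    # world model: (observation, action) -> next observation
    return history


def toy_plan(obs: Observation) -> Action:
    """Toy planner that always drives straight ahead."""
    return (1.0, 0.0)


def toy_predict_next(obs: Observation, action: Action) -> Observation:
    """Toy world model that shifts every token by one."""
    return [token + 1 for token in obs]
```

Calling `closed_loop_rollout([0], toy_plan, toy_predict_next, steps=3)` runs three plan-then-predict iterations without any ground-truth future observation, which is the property that distinguishes a closed loop from open-loop evaluation.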

Visualizations

Closed-Loop Autonomous Driving

Action-Conditioned Video Generation

Getting Started

Data Preparation

Download the nuScenes V1.0 full dataset from HERE.
Download the data_nusc annotations from OmniDrive and unzip them.
Download the VQVAE weights from HERE and place them in the following directory, following the layout described HERE:

Doe/
- model/
    - lumina_mgpt/
        - ckpts/
            - chameleon/
                - tokenizer/
                    - text_tokenizer.json
                    - vqgan.yaml
                    - vqgan.ckpt
    - xllmx/
- ...
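Before running inference, it can help to confirm that the tokenizer files landed in the right place. The check below is a small sketch based only on the tree shown above; `tokenizer_dir` and `missing_tokenizer_files` are hypothetical helper names, and the repository root path is whatever you pass in.

```python
from pathlib import Path
from typing import List

# File names taken from the directory tree above.
EXPECTED_FILES = ("text_tokenizer.json", "vqgan.yaml", "vqgan.ckpt")


def tokenizer_dir(repo_root: str) -> Path:
    """Return the tokenizer directory under the given repository root."""
    return Path(repo_root) / "model" / "lumina_mgpt" / "ckpts" / "chameleon" / "tokenizer"


def missing_tokenizer_files(repo_root: str) -> List[str]:
    """List the expected tokenizer files that are not present on disk."""
    directory = tokenizer_dir(repo_root)
    return [name for name in EXPECTED_FILES if not (directory / name).is_file()]
```

For example, `missing_tokenizer_files("Doe")` returns an empty list when all three files are in place.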

Inference

Generate the conversation data for inference and set the max length:

# max length: 1 for qa, 5 for planning
python dataset/gen_data.py \
--info_path path/to/infos_var.pkl \
--qa_path path/to/OmniDriveDataset \
--nusc_path path/to/nuscenes \
--save_path path/to/save/outputs \
--max_length 1
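If you generate data for both settings, a small wrapper can keep the max length consistent with the comment above (1 for QA, 5 for planning). This is only a sketch: `gen_data_command` and the `"qa"` / `"planning"` dictionary keys are labels invented for this helper, while the script path and flags are copied from the command above.

```python
from typing import Dict, List

# Max length per setting, as stated in the command comment above.
MAX_LENGTH_BY_SETTING: Dict[str, int] = {"qa": 1, "planning": 5}


def gen_data_command(
    setting: str, info_path: str, qa_path: str, nusc_path: str, save_path: str
) -> List[str]:
    """Build the argument list for dataset/gen_data.py with the matching max length."""
    return [
        "python", "dataset/gen_data.py",
        "--info_path", info_path,
        "--qa_path", qa_path,
        "--nusc_path", nusc_path,
        "--save_path", save_path,
        "--max_length", str(MAX_LENGTH_BY_SETTING[setting]),
    ]
```

The returned list can be passed to `subprocess.run(..., check=True)` from the repository root.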

Run inference with a model checkpoint:

# set split and id for multi gpus
CUDA_VISIBLE_DEVICES=0 python inference/eval.py \
--anno_path path/to/val_infos.pkl \
--nusc_path path/to/nuscenes \
--save_path path/to/save/output \
--model_path path/to/model/ckpt \
--data_path path/to/generated/data.json \
--task qa
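The same invocation can be assembled from Python, which makes it harder to misspell the `CUDA_VISIBLE_DEVICES` environment variable (a misspelled name is silently ignored, so the GPU selection would have no effect). The helper below is a sketch: `eval_invocation` is a hypothetical name, and only the flags shown in the command above are used. The split and id options mentioned in the comment are left out because their exact flag names are not given here.

```python
import os
from typing import Dict, List, Tuple


def eval_invocation(
    gpu_id: int,
    anno_path: str,
    nusc_path: str,
    save_path: str,
    model_path: str,
    data_path: str,
    task: str = "qa",
) -> Tuple[List[str], Dict[str, str]]:
    """Return (argv, env) for inference/eval.py pinned to a single GPU."""
    # Copy the current environment and restrict the visible GPUs.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu_id))
    argv = [
        "python", "inference/eval.py",
        "--anno_path", anno_path,
        "--nusc_path", nusc_path,
        "--save_path", save_path,
        "--model_path", model_path,
        "--data_path", data_path,
        "--task", task,
    ]
    return argv, env
```

To launch it, call `subprocess.run(argv, env=env, check=True)` from the repository root.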

Related Projects

Our code is based on the excellent work Lumina-mGPT.

Citation

If you find this project helpful, please consider citing the following paper:

@article{doe,
    title={Doe-1: Closed-Loop Autonomous Driving with Large World Model},
    author={Zheng, Wenzhao and Xia, Zetian and Huang, Yuanhui and Zuo, Sicheng and Zhou, Jie and Lu, Jiwen},
    journal={arXiv preprint arXiv:2412.09627},
    year={2024}
}
