Siyuan Huang*, Zan Wang*, Puhao Li, Baoxiong Jia, Tengyu Liu, Yixin Zhu, Wei Liang, Song-Chun Zhu
This repository is the official implementation of paper "Diffusion-based Generation, Optimization, and Planning in 3D Scenes".
arXiv | Project | HuggingFace Demo | Checkpoints
We introduce SceneDiffuser, a conditional generative model for 3D scene understanding. SceneDiffuser provides a unified model for solving scene-conditioned generation, optimization, and planning. In contrast to prior works, SceneDiffuser is intrinsically scene-aware, physics-based, and goal-oriented. With an iterative sampling strategy, SceneDiffuser jointly formulates the scene-aware generation, physics-based optimization, and goal-oriented planning via a diffusion-based denoising process in a fully differentiable fashion. Such a design alleviates the discrepancies among different modules and the posterior collapse of previous scene-conditioned generative models. We evaluate SceneDiffuser with various 3D scene understanding tasks, including human pose and motion generation, dexterous grasp generation, path planning for 3D navigation, and motion planning for robot arms. The results show significant improvements compared with previous models, demonstrating the tremendous potential of SceneDiffuser for the broad community of 3D scene understanding.
- Create a new `conda` environment and activate it

  ```bash
  conda create -n 3d python=3.8
  conda activate 3d
  ```

- Install dependent libraries with `pip`

  ```bash
  pip install -r pre-requirements.txt
  pip install -r requirements.txt
  ```
- We use `pytorch1.11` and `cuda11.3`; modify `pre-requirements.txt` to install other versions of `pytorch`.
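For example, a hypothetical sketch of switching to a different build (the exact lines in `pre-requirements.txt` are not shown here, and valid version/CUDA pairings should be checked against the official PyTorch install matrix):

```bash
# Hypothetical: use PyTorch 1.12.1 with CUDA 11.6 instead of 1.11/11.3.
# Edit the corresponding lines in pre-requirements.txt, or install manually:
pip install torch==1.12.1+cu116 torchvision==0.13.1+cu116 \
    --extra-index-url https://download.pytorch.org/whl/cu116
```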
You can use our pre-processed data or process the data yourself following the instructions. You also need to download some officially released data assets that are not pre-processed; see the instructions. Remember to modify the path configurations to point at your own data:
- `scene_model.pretrained_weights` in `model/*.yaml` for the path of the pre-trained scene encoder (if you use a pre-trained scene encoder)
- `dataset.*_dir`/`dataset.*_path` configurations in `task/*.yaml` for the paths of data assets
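As a concrete sketch (the key names and target file below are illustrative, not the repository's confirmed config keys; inspect the YAML files first), you could locate and rewrite the paths like this:

```bash
# List the path-like keys that need to point at your local data
grep -rn "pretrained_weights\|_dir\|_path" model/ task/

# Hypothetical edit: rewrite one dataset root in a task config with sed
# (replace the key name and file with the ones grep actually reports)
sed -i 's|data_dir: .*|data_dir: /path/to/your/data|' task/pose_gen.yaml
```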
Download our pre-trained models and unzip them into a folder, e.g., `./outputs/`.
| task | checkpoints | desc |
| --- | --- | --- |
| Pose Generation | 2022-11-09_11-22-52_PoseGen_ddm4_lr1e-4_ep100 | |
| Motion Generation | 2022-11-09_12-54-50_MotionGen_ddm_T200_lr1e-4_ep300 | w/o start position |
| Motion Generation | 2022-11-09_14-28-12_MotionGen_ddm_T200_lr1e-4_ep300_obser | w/ start position |
| Path Planning | 2022-11-25_20-57-28_Path_ddm4_LR1e-4_E100_REL | |
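For example (the archive file name below is an assumption; use whatever file name the download actually provides), placing a checkpoint under `./outputs/` could look like:

```bash
mkdir -p outputs
# Hypothetical archive name; substitute the file you downloaded
unzip 2022-11-09_11-22-52_PoseGen_ddm4_lr1e-4_ep100.zip -d outputs/
```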
- Train with a single GPU

  ```bash
  bash scripts/pose_gen/train.sh ${EXP_NAME}
  ```

- Train with 4 GPUs (modify `scripts/pose_gen/train_ddm.sh` to specify the visible GPUs; see the sketch below)

  ```bash
  bash scripts/pose_gen/train_ddm.sh ${EXP_NAME}
  ```
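How the visible GPUs are specified inside `train_ddm.sh` is repository-specific; a common pattern (shown here as an assumption, not the script's confirmed contents) is to set `CUDA_VISIBLE_DEVICES` before launching:

```bash
# Hypothetical sketch: restrict the run to GPUs 0-3 before the distributed launch
export CUDA_VISIBLE_DEVICES=0,1,2,3
bash scripts/pose_gen/train_ddm.sh ${EXP_NAME}
```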
```bash
bash scripts/pose_gen/test.sh ${CKPT} [OPT]
# e.g., bash scripts/pose_gen/test.sh ./outputs/2022-11-09_11-22-52_PoseGen_ddm4_lr1e-4_ep100/ OPT
```

`[OPT]` is optional for optimization-guided sampling.
```bash
bash scripts/pose_gen/sample.sh ${CKPT} [OPT]
# e.g., bash scripts/pose_gen/sample.sh ./outputs/2022-11-09_11-22-52_PoseGen_ddm4_lr1e-4_ep100/ OPT
```

`[OPT]` is optional for optimization-guided sampling.
The default configuration is motion generation without observation. To explore motion generation with a start observation, change `task.has_observation` to `true` in all the scripts in the folder `./scripts/motion_gen/`.
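If the scripts pass this flag as a command-line override (an assumption about how `./scripts/motion_gen/*.sh` are written; verify the exact spelling in the scripts first), the change can be applied in one pass:

```bash
# Hypothetical sketch: flip task.has_observation from false to true in every script
sed -i 's/task.has_observation=false/task.has_observation=true/g' scripts/motion_gen/*.sh
```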
- Train with a single GPU

  ```bash
  bash scripts/motion_gen/train.sh ${EXP_NAME}
  ```

- Train with 4 GPUs (modify `scripts/motion_gen/train_ddm.sh` to specify the visible GPUs)

  ```bash
  bash scripts/motion_gen/train_ddm.sh ${EXP_NAME}
  ```
```bash
bash scripts/motion_gen/test.sh ${CKPT} [OPT]
# e.g., bash scripts/motion_gen/test.sh ./outputs/2022-11-09_12-54-50_MotionGen_ddm_T200_lr1e-4_ep300/ OPT
```

`[OPT]` is optional for optimization-guided sampling.
```bash
bash scripts/motion_gen/sample.sh ${CKPT} [OPT]
# e.g., bash scripts/motion_gen/sample.sh ./outputs/2022-11-09_12-54-50_MotionGen_ddm_T200_lr1e-4_ep300/ OPT
```

`[OPT]` is optional for optimization-guided sampling.
Coming soon.
- Train with a single GPU

  ```bash
  bash scripts/path_planning/train.sh ${EXP_NAME}
  ```

- Train with 4 GPUs (modify `scripts/path_planning/train_ddm.sh` to specify the visible GPUs)

  ```bash
  bash scripts/path_planning/train_ddm.sh ${EXP_NAME}
  ```
```bash
bash scripts/path_planning/plan.sh ${CKPT}
```
```bash
bash scripts/path_planning/sample.sh ${CKPT} [OPT] [PLA]
# e.g., bash scripts/path_planning/sample.sh ./outputs/2022-11-25_20-57-28_Path_ddm4_LR1e-4_E100_REL/ OPT PLA
```

- The program generates trajectories from the given start position and scene, and renders the results into images. (These are not planning results; the diffuser is used only to generate diverse trajectories.)

`[OPT]` is optional for optimization-guided sampling. `[PLA]` is optional for planner-guided sampling.
Coming soon.
If you find our project useful, please consider citing us:
```bibtex
@article{huang2023diffusion,
  title={Diffusion-based Generation, Optimization, and Planning in 3D Scenes},
  author={Huang, Siyuan and Wang, Zan and Li, Puhao and Jia, Baoxiong and Liu, Tengyu and Zhu, Yixin and Liang, Wei and Zhu, Song-Chun},
  journal={arXiv preprint arXiv:2301.06015},
  year={2023}
}
```
Some code is borrowed from stable-diffusion, PSI-release, Pointnet2.ScanNet, point-transformer, and diffuser.