Haoyu Wu
@article{wu2025geometryforcing,
title={Geometry Forcing: Marrying Video Diffusion and 3D Representation for Consistent World Modeling},
author={Wu, Haoyu and Wu, Diankun and He, Tianyu and Guo, Junliang and Ye, Yang and Duan, Yueqi and Bian, Jiang},
journal={arXiv preprint arXiv:2507.07982},
year={2025}
}
Geometry Forcing (GF) Overview.
(a) Our proposed GF paradigm enhances video diffusion models by aligning with geometric features from VGGT~\citep{wang2025vggt}.
(b) Compared to DFoT~\citep{dfot}, our method generates more temporally and geometrically consistent videos.
(c) While baseline features fail to reconstruct meaningful 3D geometry, GF-learned features enable accurate 3D reconstruction.
- [2025/9/24] We released the code and checkpoint.
- [2025/9/22] Geometry Forcing was accepted to the NeurIPS 2025 NextVid Workshop as an Oral!
- [2025/7/10] We released the paper and the project.
conda create -n geometryforcing python=3.10 -y
conda activate geometryforcing
pip install -r requirements.txt
We use Weights & Biases for logging. Sign up if you don't have an account, and set wandb.entity in config.yaml to your user or organization name.
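As a quick sanity check before launching a run, the sketch below prints the entity line found in a config file. It assumes config.yaml stores the setting as an entity: key on its own line (for example nested under a wandb: block), which this README implies but does not show. The helper name show_wandb_entity is hypothetical and not part of the repository.

```shell
# Hypothetical helper (not part of the repository): print any "entity:" line
# in a YAML config so you can confirm it holds your W&B user or organization.
# Assumes the key is written as "entity: <name>" on its own line.
show_wandb_entity() {
  local config="${1:-config.yaml}"
  if [ ! -f "$config" ]; then
    echo "missing config: $config" >&2
    return 1
  fi
  # -n prefixes each match with its line number; exits non-zero if no match.
  grep -nE '^[[:space:]]*entity:' "$config"
}
```

Calling show_wandb_entity with no argument reads config.yaml in the current directory. A non-zero exit status means either the file is missing or no entity line was found.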
- Download the pretrained checkpoint from Hugging Face:
bash scripts/hf_download_checkpoints.sh
- Download the pretrained checkpoint from ModelScope:
bash scripts/ms_download_checkpoints.sh
- Download and process the RealEstate10k dataset into data/real-estate-10k
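Because the evaluation and training scripts expect the processed dataset to be in place, a small pre-flight check can catch a missing or empty directory early. The sketch below only tests that the directory exists and contains something. It makes no assumption about the internal layout of the processed data, which this README does not specify. The function name check_dataset_dir is hypothetical.

```shell
# Hypothetical pre-flight check (not part of the repository): confirm the
# dataset directory exists and is non-empty before evaluation or training.
check_dataset_dir() {
  local dir="${1:-data/real-estate-10k}"
  if [ ! -d "$dir" ]; then
    echo "dataset directory not found: $dir" >&2
    return 1
  fi
  # ls -A lists all entries except "." and ".."; empty output means empty dir.
  if [ -z "$(ls -A "$dir")" ]; then
    echo "dataset directory is empty: $dir" >&2
    return 1
  fi
  echo "found dataset at $dir"
}
```

One way to use it is to chain it in front of a script, for example check_dataset_dir && bash scripts/eval_geometry_forcing.sh, so the run stops before any model loading if the data is absent.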
bash scripts/eval_geometry_forcing.sh
bash scripts/eval_geometry_forcing_rotation.sh
bash scripts/train_geometry_forcing.sh