Official implementation of "SCIPaD: Incorporating Spatial Clues into Unsupervised Pose-Depth Joint Learning" (T-IV 2024).
An illustration of our proposed SCIPaD framework. Compared with the traditional PoseNet architecture, it comprises three main parts: (1) a confidence-aware feature flow estimator, (2) a spatial clue aggregator, and (3) a hierarchical positional embedding injector.

Create a conda environment:
```bash
conda create -n scipad python==3.9
conda activate scipad
```
Install PyTorch, torchvision, and CUDA:

```bash
conda install pytorch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 pytorch-cuda=11.7 -c pytorch -c nvidia
```
Install the dependencies:

```bash
pip install -r requirements.txt
```
Note that we ran our experiments with PyTorch 2.0.1, CUDA 11.7, Python 3.9, and Ubuntu 20.04. It is acceptable to use a newer version of PyTorch, but the metrics might differ slightly (~0.01%) from those reported in the paper.
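If you want to confirm that the installed versions match those above, the following optional check (not part of the original instructions, assuming the `scipad` environment is active) prints them:

```bash
# Print the installed PyTorch/torchvision versions and whether CUDA is available
python -c "import torch, torchvision; print(torch.__version__, torchvision.__version__, torch.cuda.is_available())"
```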
You can download the pretrained model weights from Google Drive or Baidu Cloud Drive:
Methods | WxH | abs rel | sq rel | RMSE | RMSE log | δ < 1.25 | δ < 1.25² | δ < 1.25³ |
---|---|---|---|---|---|---|---|---|
KITTI Raw | 640x192 | 0.090 | 0.650 | 4.056 | 0.166 | 0.918 | 0.970 | 0.985 |
Methods | WxH | Seq09 t_err (%) | Seq09 r_err (°/100m) | Seq09 ATE (m) | Seq10 t_err (%) | Seq10 r_err (°/100m) | Seq10 ATE (m) |
---|---|---|---|---|---|---|---|
KITTI Odom | 640x192 | 7.43 | 2.46 | 26.15 | 9.82 | 3.87 | 15.51 |
Create a `./checkpoints/` folder and place the pretrained models inside it.
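For example (the `KITTI` and `KITTI_Odom` subfolder names below are the ones referenced by the evaluation commands later in this README):

```bash
mkdir -p checkpoints
# Expected layout after downloading the weights:
# checkpoints/
# ├── KITTI/        # weights used for depth evaluation on KITTI Raw
# └── KITTI_Odom/   # weights used for pose evaluation on KITTI Odometry
```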
- KITTI Raw Dataset: You can download the KITTI raw dataset and convert the PNG images to JPEG following the instructions in Monodepth2 (a Pillow-based alternative to the `convert`/`parallel` one-liner is sketched after the dataset layouts below):

```bash
wget -i splits/kitti_archives_to_download.txt -P kitti_data/
cd kitti_data
unzip "*.zip"
cd ..
find kitti_data/ -name '*.png' | parallel 'convert -quality 92 -sampling-factor 2x2,1x1,1x1 {.}.png {.}.jpg && rm {}'
```
We also need the pre-computed segmentation images provided by TriDepth for training (they are not needed for evaluation). Download them from here and organize the dataset as follows:

```
kitti_raw
├── 2011_09_26
│   ├── 2011_09_26_drive_0001_sync
│   ├── ...
│   ├── calib_cam_to_cam.txt
│   ├── calib_imu_to_velo.txt
│   └── calib_velo_to_cam.txt
├── ...
├── 2011_10_03
│   ├── ...
└── segmentation
    ├── 2011_09_26
    ├── ...
    └── 2011_10_03
```
- KITTI Odometry Dataset: Download the KITTI odometry dataset (color, 65 GB) and the ground truth poses, and organize the dataset as follows:

```
kitti_odom
├── poses
│   ├── 00.txt
│   ├── ...
│   └── 10.txt
└── sequences
    ├── 00
    ├── ...
    └── 21
```
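The PNG-to-JPEG conversion above requires ImageMagick and GNU Parallel. If those tools are unavailable, the sketch below (our own alternative, assuming Pillow is installed, e.g. via `pip install Pillow`) performs the same in-place conversion:

```bash
# Convert every PNG under kitti_data/ to a quality-92 JPEG and delete the original PNG
python - <<'EOF'
import glob, os
from PIL import Image

for png in glob.glob('kitti_data/**/*.png', recursive=True):
    Image.open(png).convert('RGB').save(png[:-4] + '.jpg', quality=92)
    os.remove(png)
EOF
```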
On KITTI Raw:

```bash
python train.py --config configs/kitti_raw.yaml
```

On KITTI Odometry:

```bash
python train.py --config configs/kitti_odom.yaml
```
The yacs configuration library is used in this project. You can customize the configuration structure in `./utils/config/defaults.py` and the configuration values in `./configs/*.yaml`.
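Judging from the evaluation commands later in this README, configuration values can also be overridden on the command line as trailing yacs-style `KEY VALUE` pairs. A hypothetical example (replace `SOME.KEY` with a real key from `./utils/config/defaults.py`):

```bash
# Hypothetical override: trailing KEY VALUE pairs take precedence over the YAML file
python train.py --config configs/kitti_raw.yaml SOME.KEY value
```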
First, download the KITTI ground truth and improved ground truth from here, and put them into `./split/eigen` and `./split/eigen_benchmark`, respectively. You can also obtain them following the instructions provided in Monodepth2.
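If you prefer to generate the ground-truth files yourself, Monodepth2 provides an export script for this; to the best of our recollection the invocation looks like the following, but please check the Monodepth2 README for the exact script name and flags:

```bash
# Run inside the Monodepth2 repository: export ground-truth depth for the Eigen split
python export_gt_depth.py --data_path kitti_data --split eigen
# ...and for the improved (eigen_benchmark) ground truth
python export_gt_depth.py --data_path kitti_data --split eigen_benchmark
```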
- Depth estimation results using the KITTI Eigen split:

```bash
python evaluate_depth.py --config configs/kitti_raw.yaml load_weights_folder checkpoints/KITTI eval.batch_size 1
```
- Depth estimation results using the improved KITTI ground truth (Eigen benchmark split):
```bash
python evaluate_depth.py --config configs/kitti_raw.yaml \
    load_weights_folder checkpoints/KITTI \
    eval.batch_size 1 \
    eval.split eigen_benchmark
```
- Visual odometry evaluation using Seqs. 09-10:

```bash
python evaluate_pose.py --config configs/kitti_odom.yaml \
    load_weights_folder checkpoints/KITTI_Odom \
    eval.split odom_09
python evaluate_pose.py --config configs/kitti_odom.yaml \
    load_weights_folder checkpoints/KITTI_Odom \
    eval.split odom_10
```
- To compute the evaluation results and visualize the trajectory, run:

```bash
python ./utils/kitti_odom_eval/eval_odom.py --result=checkpoints/KITTI_Odom/ --align='7dof'
```
You can refer to `./eval.sh` for more information.
If you find our work useful in your research, please consider citing our paper:
```bibtex
@article{feng2024scipad,
  title={SCIPaD: Incorporating Spatial Clues into Unsupervised Pose-Depth Joint Learning},
  author={Feng, Yi and Guo, Zizhan and Chen, Qijun and Fan, Rui},
  journal={IEEE Transactions on Intelligent Vehicles},
  year={2024},
  publisher={IEEE}
}
```