This is the PyTorch implementation for inference and training of the future prediction bird's-eye view network as described in:
**FIERY: Future Instance Prediction in Bird's-Eye View from Surround Monocular Cameras**

Anthony Hu, Zak Murez, Nikhil Mohan, Sofía Dudas, Jeffrey Hawke, Vijay Badrinarayanan, Roberto Cipolla and Alex Kendall
Multimodal future predictions by our bird’s-eye view network.
Top two rows: RGB camera inputs. The predicted future trajectories and segmentations are projected to the ground plane in the images.
Bottom row: future instance prediction in bird’s-eye view in a 100m×100m capture size around the ego-vehicle, which is indicated by a black rectangle in the center.
If you find our work useful, please consider citing:
```
@inproceedings{fiery2021,
  title = {{FIERY}: Future Instance Prediction in Bird's-Eye View from Surround Monocular Cameras},
  author = {Anthony Hu and Zak Murez and Nikhil Mohan and Sofía Dudas and
            Jeffrey Hawke and Vijay Badrinarayanan and Roberto Cipolla and Alex Kendall},
  booktitle = {Proceedings of the International Conference on Computer Vision ({ICCV})},
  year = {2021}
}
```
- Create the conda environment by running `conda env create`.
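A minimal setup sketch (the environment name `fiery` is an assumption; check the repository's `environment.yml` for the actual name):

```bash
# Create the conda environment from the repository's environment.yml,
# then activate it. The environment name "fiery" is an assumption;
# use the name declared in environment.yml.
conda env create
conda activate fiery
```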
To run predictions locally:
- Download pre-trained weights.
- Run `python visualise.py --checkpoint ${CHECKPOINT_PATH}`. This will render predictions from the network and save them to an `output_vis` folder.
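For example (a sketch with a hypothetical checkpoint path; substitute the path of the downloaded weights):

```bash
# Render predictions from a downloaded checkpoint (the path below is a placeholder)
# and list the rendered outputs written to the output_vis folder.
python visualise.py --checkpoint ~/checkpoints/fiery.ckpt
ls output_vis
```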
To evaluate the model on NuScenes:
- Download the NuScenes dataset. For detailed instructions, see DATASET.md.
- Download pre-trained weights.
- Run `python evaluate.py --checkpoint ${CHECKPOINT_PATH} --dataroot ${NUSCENES_DATAROOT}`.
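A concrete invocation (both paths are hypothetical placeholders; point them at your own files):

```bash
# Evaluate a downloaded checkpoint against a local NuScenes installation.
# Both paths below are placeholders.
python evaluate.py \
  --checkpoint ~/checkpoints/fiery.ckpt \
  --dataroot /data/nuscenes
```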
All the configs are in the `fiery/configs` folder.
Config and weights | Dataset | Past context | Future horizon | BEV size | IoU | VPQ
--- | --- | --- | --- | --- | --- | ---
`baseline.yml` | NuScenes | 1.0s | 2.0s | 100m×100m (50cm res.) | 36.7 | 29.9
`lyft/baseline.yml` | Lyft | 0.8s | 2.0s | 100m×100m (50cm res.) | 36.3 | 29.2
`literature/static_pon_setting.yml` | NuScenes | 0.0s | 0.0s | 100m×50m (25cm res.) | 37.7 | -
`literature/pon_setting.yml` | NuScenes | 1.0s | 0.0s | 100m×50m (25cm res.) | 39.9 | -
`literature/static_lss_setting.yml` | NuScenes | 0.0s | 0.0s | 100m×100m (50cm res.) | 35.8 | -
`literature/lift_splat_setting.yml` | NuScenes | 1.0s | 0.0s | 100m×100m (50cm res.) | 38.2 | -
`literature/fishing_setting.yml` | NuScenes | 1.0s | 2.0s | 32.0m×19.2m (10cm res.) | 57.6 | -
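Each of these configs can presumably be passed to the training script in the same way as the baseline (see the training instructions below); a sketch, assuming the same `--config` override pattern applies to every file in `fiery/configs`:

```bash
# Train with one of the literature settings instead of the baseline config.
# Assumes train.py accepts any config from fiery/configs via --config,
# as it does for baseline.yml.
python train.py --config fiery/configs/literature/lift_splat_setting.yml \
  DATASET.DATAROOT ${NUSCENES_DATAROOT}
```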
To train the model from scratch on NuScenes:
- Download the NuScenes dataset. For detailed instructions, see DATASET.md.
- Run `python train.py --config fiery/configs/baseline.yml DATASET.DATAROOT ${NUSCENES_DATAROOT}`.
This will train the model on 4 GPUs, each with a batch of size 3. To train on a single GPU, add the flag `GPUS 1`, and to change the batch size use the flag `BATCHSIZE ${DESIRED_BATCHSIZE}`.
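For example, to train on a single GPU with a smaller batch size (a sketch combining the flags described above; the batch size of 2 is an arbitrary example value):

```bash
# Single-GPU training with a reduced batch size, using the override flags
# described above. DATASET.DATAROOT must point at your NuScenes installation.
python train.py --config fiery/configs/baseline.yml \
  DATASET.DATAROOT ${NUSCENES_DATAROOT} GPUS 1 BATCHSIZE 2
```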
Big thanks to Giulio D'Ippolito (@gdippolito) for the technical help on the GPU servers, to Piotr Sokólski (@pyetras) for the panoptic metric implementation, and to Hannes Liik (@hannesliik) for the awesome future trajectory visualisation on the ground plane.