Torch implementation for learning a mapping from input images that contain dynamic objects in a city environment, such as vehicles and pedestrians, to output images which are static and suitable for localization and mapping.
Empty Cities: a Dynamic-Object-Invariant Space for Visual SLAM
Berta Bescos, Cesar Cadena, Jose Neira
- Linux or OSX
- NVIDIA GPU + CUDA CuDNN (CPU mode and CUDA without CuDNN may work with minimal modification, but untested)
- Install torch and dependencies from https://github.com/torch/distro
- Install torch packages nngraph and display
luarocks install nngraph
luarocks install https://raw.githubusercontent.com/szym/display/master/display-scm-0.rockspec
- Clone this repo:
git clone https://github.com/BertaBescos/EmptyCities_SLAM.git
cd EmptyCities_SLAM
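Before moving on, it can help to confirm that the command-line tools used in the steps above are available. The following is a minimal sketch, assuming a bash shell; the function name missing_tools is hypothetical and not part of this repository.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check; not part of the repository.
# Prints the names of the given tools that are not found on PATH.
missing_tools() {
  local tool missing=""
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
  done
  echo "${missing# }"
}

# git, th (Torch) and luarocks are the commands used in the steps above.
missing="$(missing_tools git th luarocks)"
if [ -n "$missing" ]; then
  echo "Missing tools: $missing"
else
  echo "All required tools found."
fi
```

This only checks that the executables exist; it does not verify that the nngraph and display packages are installed or that CUDA is working.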
Pre-trained models can be downloaded from the folder checkpoints in this link. You will find a README.md file inside this folder. Place the checkpoints folder inside the project.
- We encourage you to keep your data in a folder of your choice, /path/to/data/, with three subfolders: train, test and val. The following command runs our model on all the images inside the test folder and saves the results in ./results/. Images in the test folder should be RGB images of any size.
DATA_ROOT=/path/to/data/ name=my_name th test.lua
- If you prefer to feed the dynamic/static binary masks, you should concatenate them to the RGB images. We provide a Python script for this at https://github.com/bertabescos/EmptyCities.
DATA_ROOT=/path/to/data/ name=my_name mask=1 th test.lua
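The folder layout described above can be prepared with a few shell commands. This is a minimal sketch, assuming a bash shell; the helper names prepare_data_root and count_test_images are hypothetical and do not exist in this repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create the train/test/val subfolders under a data root.
prepare_data_root() {
  local root="$1"
  mkdir -p "$root/train" "$root/test" "$root/val"
}

# Hypothetical helper: count the files placed directly in the test subfolder.
count_test_images() {
  local root="$1"
  find "$root/test" -maxdepth 1 -type f | wc -l | tr -d ' '
}

# Example usage (paths are placeholders):
#   prepare_data_root /path/to/data/
#   cp my_images/*.png /path/to/data/test/
#   echo "images to process: $(count_test_images /path/to/data/)"
#   DATA_ROOT=/path/to/data/ name=my_name th test.lua
```

Checking the image count before calling test.lua is just a convenience for catching an empty or misplaced test folder early.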
- The simplest case trains only with synthetic CARLA data, with G(x,m) and D(x,y,m,n). The subfolder /path/to/synth/data/train/ should contain the concatenated (RGB | GT | Mask) images. The masks used also come from the simulator, so the semantic segmentation model is not used.
DATA_ROOT=/path/to/synth/data/ name=my_name th train.lua
- If you want to use the ORB-features-based loss, set lossDetector, lossOrientation and lossDescriptor to 1 in the command line.
DATA_ROOT=/path/to/synth/data/ name=my_name lossDetector=1 lossOrientation=1 lossDescriptor=1 th train.lua
- If you want to fine-tune your trained model with real-world data, set NSYNTH_DATA_ROOT to the path of that dataset.
DATA_ROOT=/path/to/synth/data/ NSYNTH_DATA_ROOT=/path/to/non/synth/data/ continue_train=1 name=my_name th train.lua
- (Optionally) start the display server to view results as the model trains (see Display UI for more details):
th -ldisplay.start 8000 0.0.0.0
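The synthetic training and the subsequent fine-tuning with real-world data can be chained in one small script. The sketch below assumes a bash shell and only prints the two commands by default; the function names and the RUN switch are hypothetical, while the commands themselves are the ones listed above.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper; by default it only prints the commands (dry run).
# Set RUN=1 to execute them (requires th and train.lua in the current folder).

# $1: synthetic data root, $2: experiment name
train_cmd() {
  echo "DATA_ROOT=$1 name=$2 th train.lua"
}

# $1: synthetic data root, $2: real-world data root, $3: experiment name
finetune_cmd() {
  echo "DATA_ROOT=$1 NSYNTH_DATA_ROOT=$2 continue_train=1 name=$3 th train.lua"
}

run_or_print() {
  if [ "${RUN:-0}" = "1" ]; then
    eval "$1"
  else
    echo "$1"
  fi
}

# Stage 1: synthetic data only. Stage 2: continue training with real-world data.
run_or_print "$(train_cmd /path/to/synth/data/ my_name)"
run_or_print "$(finetune_cmd /path/to/synth/data/ /path/to/non/synth/data/ my_name)"
```

Both stages use the same name so that the second one continues from the checkpoints written by the first. Paths containing spaces are not handled by this sketch.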
Models are saved by default to ./checkpoints/base_512x512 (can be changed by passing checkpoint_dir=your_dir and name=your_name in options.lua). See options.lua for additional training options.
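To locate the saved models from a script, the output folder can be derived from the two options mentioned above. This is a minimal sketch that assumes the folder is simply checkpoint_dir joined with name, as the default path suggests; the function checkpoint_path is hypothetical, and options.lua remains the authoritative source for the actual behaviour.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the expected checkpoint folder.
# Assumption: models are written to <checkpoint_dir>/<name>.
# Defaults mirror the default path stated above.
checkpoint_path() {
  local dir="${1:-./checkpoints}"
  local name="${2:-base_512x512}"
  echo "${dir%/}/$name"
}

# Example usage:
#   ls "$(checkpoint_path)"                    # default location
#   ls "$(checkpoint_path your_dir your_name)" # custom location
```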
Our synthetic dataset has been generated with CARLA 0.8.2 and is available in the zipped folder CARLA_dataset in this link. Information on how this dataset was generated can be found here.
If you use this code for your research, please cite our papers:
@article{bescos2019empty,
title={Empty Cities: a Dynamic-Object-Invariant Space for Visual SLAM},
author={Bescos, Berta and Cadena, Cesar and Neira, José},
journal={arXiv},
year={2019}
}
@article{bescos2018empty,
title={Empty Cities: Image Inpainting for a Dynamic-Object-Invariant Space},
author={Bescos, Berta and Neira, José and Siegwart, Roland and Cadena, Cesar},
journal={International Conference on Robotics and Automation (ICRA)},
year={2018}
}
Our code is heavily inspired by pix2pix, DCGAN and Context-Encoder.