Reference implementation of the publication 3D Point Cloud Compression with Recurrent Neural Network and Image Compression Methods, published at the 33rd IEEE Intelligent Vehicles Symposium (IV 2022) in Aachen, Germany.
This repository implements an RNN-based compression framework for range images in TensorFlow 2.9.0. Furthermore, it provides preprocessing and inference nodes for ROS Noetic in order to perform the compression task on a /points2 sensor data stream.
3D Point Cloud Compression with Recurrent Neural Network and Image Compression Methods
Till Beemelmanns, Yuchen Tao, Bastian Lampe, Lennart Reiher, Raphael van Kempen, Timo Woopen, and Lutz Eckstein
Institute for Automotive Engineering (ika), RWTH Aachen University
Abstract — Storing and transmitting LiDAR point cloud data is essential for many AV applications, such as training data collection, remote control, cloud services or SLAM. However, due to the sparsity and unordered structure of the data, it is difficult to compress point cloud data to a low volume. Transforming the raw point cloud data into a dense 2D matrix structure is a promising way for applying compression algorithms. We propose a new lossless and calibrated 3D-to-2D transformation which allows compression algorithms to efficiently exploit spatial correlations within the 2D representation. To compress the structured representation, we use common image compression methods and also a self-supervised deep compression approach using a recurrent neural network. We also rearrange the LiDAR’s intensity measurements to a dense 2D representation and propose a new metric to evaluate the compression performance of the intensity. Compared to approaches that are based on generic octree point cloud compression or based on raw point cloud data compression, our approach achieves the best quantitative and visual performance.
- Approach
- Range Image Compression
- Model Training
- Download of dataset, model weights and evaluation frames
- Inference & Evaluation Node
- Authors of this Repository
- Cite
- Acknowledgement
This learning framework serves the purpose of compressing range images projected from point clouds captured by a Velodyne LiDAR sensor. The network architectures are based on the work proposed by Toderici et al. and are implemented in this repository using TensorFlow 2.9.0.
The following architectures are implemented:
- Additive LSTM - Architecture that we used for the results in our paper.
- Additive LSTM Demo - Lightweight version of the Additive LSTM in order to test your setup
- Additive GRU - Uses Gated Recurrent Units instead of an LSTM cell. We achieved slightly worse results compared to the LSTM variant.
- Oneshot LSTM - Does not use the additive reconstruction path in the network, as shown below.
The differences between the additive and the oneshot framework are visualized below.
The additive framework adds the previous progressive reconstruction to the current reconstruction, which is computed directly from the bitstream by the decoder RNN.
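As a rough illustration, the two strategies can be sketched in a few lines of Python. The functions below are a simplified sketch and not the repository's actual model code; `encoder`, `binarizer`, and `decoder` stand for the respective RNN sub-networks.

```python
# Simplified sketch of the two progressive reconstruction strategies
# (illustration only, not the repository's layer configuration).

def additive_reconstruction(patch, encoder, binarizer, decoder, num_iters):
    """Each iteration decodes a correction that is added to the running reconstruction."""
    reconstruction = 0.0
    residual = patch
    for _ in range(num_iters):
        bits = binarizer(encoder(residual))              # one chunk of the bitstream
        reconstruction = reconstruction + decoder(bits)  # additive reconstruction path
        residual = patch - reconstruction                # encode what is still missing
    return reconstruction

def oneshot_reconstruction(patch, encoder, binarizer, decoder, num_iters):
    """Each iteration decodes a full reconstruction directly from the bitstream."""
    reconstruction = None
    residual = patch
    for _ in range(num_iters):
        bits = binarizer(encoder(residual))
        reconstruction = decoder(bits)                   # no additive path
        residual = patch - reconstruction
    return reconstruction
```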
The range representations are originally (32, 1812) single-channel 16-bit images. During training, they are randomly cropped online into (32, 32) image patches. The label is the input image patch itself. Note that during validation, the random crop is performed offline to keep the validation set constant.

Input: 16-bit image of shape (32, 1812, 1), which is randomly cropped to shape (32, 32, 1) during training.
Output: reconstructed 16-bit image patch of shape (32, 32, 1).
(Figure: example input, label, and prediction image patches.)
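For orientation, the online random cropping during training could look roughly like the following tf.data-style sketch; the repository's actual input pipeline (file decoding, normalization, shuffling) may differ.

```python
import tensorflow as tf

def random_crop_patch(range_image):
    """Randomly crop a (32, 32, 1) patch; the label is the patch itself."""
    patch = tf.image.random_crop(range_image, size=(32, 32, 1))
    return patch, patch  # (input, label) for self-supervised reconstruction

# range_image tensors of shape (32, 1812, 1) are assumed to come from the dataset:
# train_ds = train_ds.map(random_crop_patch).batch(batch_size)
```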
During inference, the model can be applied to range images of any width and height divisible by 32.
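If a range image does not fulfill this constraint, it can be padded first. The helper below is a small sketch (not part of the repository) that pads a range image so that both dimensions become divisible by 32.

```python
import numpy as np

def pad_to_multiple_of_32(range_image):
    """Pad a (H, W) range image so that H and W are divisible by 32."""
    h, w = range_image.shape[:2]
    pad_h = (-h) % 32
    pad_w = (-w) % 32
    return np.pad(range_image, ((0, pad_h), (0, pad_w)), mode="constant")

# Example: a (32, 1812) range image is padded to (32, 1824).
```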
A demo sample dataset can be found in range_image_compression/demo_samples in order to quickly test your environment.
The implementation is based on TensorFlow 2.9.0. All necessary dependencies can be installed with:
# /range_image_compression >$
pip install -r requirements.txt
The architecture Oneshot LSTM uses GDN layers proposed by Ballé et al., which are provided by TensorFlow Compression. The package can be installed with:
# /range_image_compression >$
pip3 install tensorflow-compression==2.9.0
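To verify that the package is available, a GDN layer can be instantiated directly. The snippet below is only a minimal usage sketch and is not taken from the repository's model code.

```python
import tensorflow as tf
import tensorflow_compression as tfc

# Minimal usage sketch: a strided convolution followed by a GDN nonlinearity.
x = tf.random.uniform((1, 32, 32, 1))
y = tf.keras.layers.Conv2D(32, 3, strides=2, padding="same")(x)
y = tfc.GDN()(y)   # generalized divisive normalization (Ballé et al.)
print(y.shape)     # (1, 16, 16, 32)
```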
A training with the lightweight demo network using the sample dataset can be started with the following command:
# /range_image_compression >$
python3 ./train.py \
--train_data_dir="demo_samples/training" \
--val_data_dir="demo_samples/validation" \
--train_output_dir="output" \
--model=additive_lstm_demo
If you downloaded the whole dataset (see below), adapt the values for --train_data_dir and --val_data_dir accordingly.
We also provide a Docker environment to train the model. First build the Docker image and then start the training:
# /docker >$
./docker_build.sh
./docker_train.sh
Parameters for the training can be set in the configuration files located in the directory range_image_compression/configs. It is recommended to use a powerful multi-GPU setup for efficient training.
The following cosine learning rate scheduler is used:
Parameter | Description | Example
---|---|---
`init_learning_rate` | initial learning rate | 2e-3
`min_learning_rate` | minimum learning rate | 1e-7
`max_learning_rate_epoch` | epoch where the learning rate begins to decrease | 0
`min_learning_rate_epoch` | epoch of the minimum learning rate | 3000
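The exact scheduler is defined in the training code; as an illustration of how these four parameters interact, a cosine decay between max_learning_rate_epoch and min_learning_rate_epoch could be implemented as follows (sketch, not the repository's implementation).

```python
import math

def cosine_learning_rate(epoch,
                         init_learning_rate=2e-3,
                         min_learning_rate=1e-7,
                         max_learning_rate_epoch=0,
                         min_learning_rate_epoch=3000):
    """Cosine decay from init_learning_rate down to min_learning_rate."""
    if epoch <= max_learning_rate_epoch:
        return init_learning_rate
    if epoch >= min_learning_rate_epoch:
        return min_learning_rate
    progress = ((epoch - max_learning_rate_epoch) /
                (min_learning_rate_epoch - max_learning_rate_epoch))
    return (min_learning_rate +
            0.5 * (init_learning_rate - min_learning_rate) * (1.0 + math.cos(math.pi * progress)))

# e.g. used via tf.keras.callbacks.LearningRateScheduler(cosine_learning_rate)
```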
Default parameters for the network architectures
Parameter | Description | Example
---|---|---
`bottleneck` | bottleneck size of the architecture | 32
`num_iters` | maximum number of iterations | 32
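Together, `bottleneck` and `num_iters` determine the bitrate of the progressive code. Assuming a Toderici-style encoder with a spatial downsampling factor of 16 (an assumption about this repository's layer configuration), each iteration emits `bottleneck` bits per 16x16 block of input pixels:

```python
# Back-of-the-envelope bitrate, assuming a spatial downsampling factor of 16
# (Toderici-style encoder); the actual factor depends on the chosen architecture.
bottleneck = 32
num_iters = 32
downsampling = 16

bits_per_pixel_per_iter = bottleneck / downsampling**2    # 32 / 256 = 0.125 bpp
max_bits_per_pixel = num_iters * bits_per_pixel_per_iter  # 4.0 bpp after all iterations
print(bits_per_pixel_per_iter, max_bits_per_pixel)
```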
Model Architecture | Filename | Batch Size | Validation MAE | Evaluation SNNRMSE iter=32
---|---|---|---|---
Additive LSTM | `additive_lstm_32b_32iter.hdf5` | 32 | 2.6e-04 | 0.03473
Additive LSTM Slim | `additive_lstm_128b_32iter_slim.hdf5` | 128 | 2.3e-04 | 0.03636
Additive LSTM Demo | `additive_lstm_128b_32iter_demo.hdf5` | 128 | 2.9e-04 | 0.09762
Oneshot LSTM | `oneshot_lstm_b128_32iter.hdf5` | 128 | 2.9e-04 | 0.05137
Additive GRU | Will be uploaded soon | TBD | TBD | TBD
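SNNRMSE denotes the symmetric nearest-neighbour RMSE between the original and the reconstructed point cloud. The authoritative implementation lives in the evaluation node; the sketch below only illustrates one plausible way to compute such a metric with a KD-tree, and the exact symmetrization used in the paper may differ.

```python
import numpy as np
from scipy.spatial import cKDTree

def snn_rmse(original, reconstructed):
    """Symmetric nearest-neighbour RMSE between two (N, 3) point clouds (sketch)."""
    d_orig_to_rec, _ = cKDTree(reconstructed).query(original)    # NN distance for each original point
    d_rec_to_orig, _ = cKDTree(original).query(reconstructed)    # NN distance for each reconstructed point
    mse = 0.5 * (np.mean(d_orig_to_rec ** 2) + np.mean(d_rec_to_orig ** 2))
    return float(np.sqrt(mse))
```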
The dataset to train the range image compression framework can be retrieved from https://rwth-aachen.sciebo.de/s/MLJ4UaDJuVkYtna. There you should be able to download the following files:
- `pointcloud_compression.zip` (1.8 GB): ZIP file which contains three directories:
  - `train`: 30813 range images for training
  - `val`: 6162 cropped 32 x 32 range images for validation
  - `test`: 1217 range images for testing
- `evaluation_frames.bag` (17.6 MB): ROS bag which contains the Velodyne packet data used to evaluate the model. Frames in this bag file were used neither for training nor for validation.
- `additive_lstm_32b_32iter.hdf5` (271 MB): Trained model with a bottleneck size of 32
- `additive_lstm_128b_32iter_slim.hdf5` (67.9 MB): Trained model with a bottleneck size of 32
- `oneshot_lstm_b128_32iter.hdf5` (68.1 MB): Trained model with a bottleneck size of 32
- `additive_lstm_256b_32iter_demo.hdf5` (14.3 MB): Trained model with a bottleneck size of 32
The inference and evaluation are conducted with ROS. The inference and preprocessing nodes can be found under catkin_ws/src. The idea is to use recorded Velodyne packet data stored in a .bag file. This bag file is then played with rosbag play, and the preprocessing and inference nodes are applied to the raw sensor data stream. In order to run the inference and evaluation, you will need to execute the following steps.
To run the evaluation code, we need to clone the Velodyne driver from the official repository. The driver is integrated as a submodule at catkin_ws/src/velodyne. You can initialize the submodule with:
git submodule init
git submodule update
We provide a Docker image which contains ROS Noetic and TensorFlow in order to execute the inference and evaluation nodes. You can pull this image from Docker Hub with the command:

docker pull tillbeemelmanns/pointcloud_compression:noetic
Download the model weights as an .hdf5 file from the download link and copy this file to catkin_ws/models. Then, in the configuration file, insert the correct path for the parameter weights_path (e.g. /catkin_ws/models/additive_lstm_32b_32iter.hdf5). Note that we will later mount the directory catkin_ws into the Docker container at /catkin_ws. Hence, the path has to start with /catkin_ws.
Perform the same procedure with the bag file: copy it into catkin_ws/rosbags. The filename should be evaluation_frames.bag. This bag file will be used in this launch file.
Start the docker container using the script
# /docker >$
./docker_eval.sh
Calling this script for the first time will just start the container. Then open a second terminal and execute the ./docker_eval.sh script again in order to open a bash shell inside the container.
# /docker >$
./docker_eval.sh
Run the following commands inside the container to build and launch the compression framework.
# /catkin_ws >$
catkin build
source devel/setup.bash
roslaunch pointcloud_to_rangeimage compression.launch
You should see the following RViz window, which visualizes the reconstructed point cloud.
If you have a capable GPU, change line 44 in docker_eval.sh and define your GPU ID with the flag -g:

$DIR/run-ros.sh -g 0 $@
Till Beemelmanns and Yuchen Tao
Mail: till.beemelmanns (at) rwth-aachen.de
Mail: yuchen.tao (at) rwth-aachen.de
@INPROCEEDINGS{9827270,
author={Beemelmanns, Till and Tao, Yuchen and Lampe, Bastian and Reiher, Lennart and Kempen, Raphael van and Woopen, Timo and Eckstein, Lutz},
booktitle={2022 IEEE Intelligent Vehicles Symposium (IV)},
title={3D Point Cloud Compression with Recurrent Neural Network and Image Compression Methods},
year={2022},
volume={},
number={},
pages={345-351},
doi={10.1109/IV51971.2022.9827270}}
This research is accomplished within the project "UNICARagil" (FKZ 16EMO0284K). We acknowledge the financial support for the project by the Federal Ministry of Education and Research of Germany (BMBF).