This is a collection for mapless robot navigation using RGB images as visual input. It contains the test
environment and motion planners, aiming to realize all three levels of mapless navigation:
1. memorizing efficiently;
2. from memorizing to reasoning;
3. more powerful reasoning.
The experiment data is in the ./materials/record folder.
I built the environment as a benchmark for testing the algorithms.
- Diverse complexity.
- Gym-style Interface.
- Support ROS.
Quickstart example code for using this benchmark:
import numpy as np
import env
maze0 = env.GazeboMaze(maze_id=0, continuous=True)
observation = maze0.reset()
done = False
while not done:
    # Stochastic strategy: sample a random velocity command at every step
    action = dict()
    action['linear_vel'] = np.random.uniform(0, 1)
    action['angular_vel'] = np.random.uniform(-1, 1)
    observation, done, reward = maze0.execute(action)
    print(action, reward)
The designed VAE structure is shown in the lower left figure. Train it in maze1 and maze2. The kl_tolerance is set to 0.5 (we stop optimizing the KL loss term once it falls below a certain level, rather than letting it go to near zero) and the latent dimension is 32, so the floor is 0.5 × 32 = 16 and the total loss is trained as close as possible to 16.
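The KL tolerance described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the repository's training code: the function name `kl_loss_with_tolerance` and the argument names `mu` and `logvar` are hypothetical, and the posterior is assumed to be a diagonal Gaussian compared against a standard normal prior.

```python
import numpy as np

def kl_loss_with_tolerance(mu, logvar, kl_tolerance=0.5):
    """Mean KL(q(z|x) || N(0, I)) per sample, floored at kl_tolerance * latent_dim.

    mu, logvar: arrays of shape (batch, latent_dim). Names are hypothetical.
    """
    latent_dim = mu.shape[1]
    # Closed-form KL divergence of a diagonal Gaussian from the standard normal,
    # summed over the latent dimensions.
    kl = -0.5 * np.sum(1.0 + logvar - np.square(mu) - np.exp(logvar), axis=1)
    # Below the floor the term is constant, so it would contribute no gradient.
    kl = np.maximum(kl, kl_tolerance * latent_dim)
    return float(np.mean(kl))

# A posterior equal to the prior has zero KL, so the floor 0.5 * 32 = 16 applies.
mu = np.zeros((4, 32))
logvar = np.zeros((4, 32))
print(kl_loss_with_tolerance(mu, logvar))  # 16.0
```

With a latent dimension of 32 and a tolerance of 0.5, the KL term can never drop below 16, which matches the value quoted above.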
The following results were obtained in maze3 to verify the ability to generalize.
VAE-based planner & Baseline network structure
- Performance comparison

| Environment | Benchmark SPL | Proposed SPL |
|-------------|---------------|--------------|
| maze1       | 0.702         | 0.703        |
| maze2       | 0.611         | 0.626        |
That is, the proposed motion planner not only has much better sample efficiency but also achieves better performance. In fact, the shortest paths in both mazes were found by the proposed motion planner (26 timesteps in maze1 and 29 timesteps in maze2, with acceleration in simulation).
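The README does not define SPL. Assuming it refers to Success weighted by Path Length as commonly used for navigation agents, each episode scores its success flag times the ratio of the shortest path length to the longer of the shortest and the actual path length, and the scores are averaged. The sketch below uses made-up episodes; the function name `spl` and all numbers are hypothetical and are not taken from the experiment data.

```python
def spl(successes, shortest_lengths, path_lengths):
    """Average of success * shortest / max(actual, shortest) over episodes."""
    total = 0.0
    for s, l, p in zip(successes, shortest_lengths, path_lengths):
        # A failed episode (s == 0) contributes nothing; a successful episode
        # is discounted by how much longer than optimal the path was.
        total += s * float(l) / max(p, l)
    return total / len(successes)

# Three hypothetical episodes: an optimal success, a detour, and a failure.
print(spl([1, 1, 0], [10.0, 10.0, 10.0], [10.0, 20.0, 15.0]))  # 0.5
```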
Stacked LSTM
network structure
tensorflow: 1.5.0
OS: Ubuntu 16.04
Python: 2.7
OpenCV: 3
ROS: Kinetic
Gazebo: 7
# install tensorflow-gpu after cudnn and cuda are installed
pip install tensorflow-gpu==1.5.0
# or use the CPU-only tensorflow package if there is no NVIDIA GPU; it also works
pip install tensorflow==1.5.0
# install OpenCV:
# install ROS:
# install Gazebo
sudo apt-get install gazebo7 libgazebo7-dev
# install an old version of tensorforce that supports python2 from source
sudo apt-get install ros-kinetic-gazebo-ros-pkgs ros-kinetic-gazebo-ros-control
sudo apt-get install ros-kinetic-turtlebot-*
sudo apt-get remove ros-kinetic-turtlebot-description
sudo apt-get install ros-kinetic-kobuki-description
# change to catkin_ws/src
git clone
cd ..
source ./devel/setup.bash
# you can change the configuration in
cd src/navbot/rl_nav/scripts
# run the proposed model for memorizing
# run the proposed model for reasoning
The default environment is maze1; you need to change maze_id in nav_gazebo.launch and if you want to change the environment.
To execute to generate data, you need to comment out the goal-related code in nav_gazebo.launch and
maze1 and maze2 are sped up 10 times for training. If you want to speed up other environments, just change
<max_step_size>0.001</max_step_size> <real_time_factor>1</real_time_factor>
to
<max_step_size>0.01</max_step_size> <!-- <real_time_factor>1</real_time_factor> -->
in the environment file in worlds.
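If several world files need the same change, the edit could be scripted. The following is a minimal sketch using Python's standard XML parser; the helper name `speed_up_world` and the sample world string are hypothetical, and it assumes both tags sit directly inside a `<physics>` element. It removes `<real_time_factor>` instead of commenting it out, which an XML parser treats the same way.

```python
import xml.etree.ElementTree as ET

def speed_up_world(sdf_text, step_size="0.01"):
    """Return the world XML with a larger physics step (hypothetical helper)."""
    root = ET.fromstring(sdf_text)
    for physics in root.iter("physics"):
        step = physics.find("max_step_size")
        if step is not None:
            step.text = step_size
        rtf = physics.find("real_time_factor")
        if rtf is not None:
            # Same effect as commenting the element out by hand.
            physics.remove(rtf)
    return ET.tostring(root).decode("utf-8")

# Made-up minimal world used only to exercise the helper.
world = (
    '<sdf version="1.4"><world name="default"><physics type="ode">'
    "<max_step_size>0.001</max_step_size>"
    "<real_time_factor>1</real_time_factor>"
    "</physics></world></sdf>"
)
patched = speed_up_world(world)
print(patched)
```

To patch a real file, read its contents into a string first and write the returned string back; keep a backup, since comments in the original file are not preserved by this parser.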
To reproduce the result, please change the related parameters in according to config.txt.
PPO is not a deterministic policy gradient algorithm; the action at every timestep is sampled from the policy distribution. This sampling can be seen as "noise" and is useful for exploration and generalization. If you want to use the best strategy after the model is trained, just set 'deterministic = True' in and the performance will be improved.
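The difference between sampled and deterministic actions can be illustrated as follows. This is a sketch that assumes a diagonal Gaussian policy over the two velocity commands; the function `select_action` and its arguments are hypothetical and do not reproduce how the RL library used here implements the flag.

```python
import numpy as np

def select_action(mean, std, deterministic=False, rng=None):
    """Pick an action from a diagonal Gaussian policy (hypothetical helper)."""
    mean = np.asarray(mean, dtype=float)
    if deterministic:
        # Use the mode of the distribution: no exploration noise.
        return mean
    rng = np.random if rng is None else rng
    # Sample around the mean; the spread acts as exploration "noise".
    return mean + np.asarray(std, dtype=float) * rng.standard_normal(mean.shape)

rng = np.random.RandomState(0)
mean, std = [0.5, -0.2], [0.1, 0.3]  # made-up linear and angular velocity stats
print(select_action(mean, std, rng=rng))            # differs from run to run without a seed
print(select_action(mean, std, deterministic=True))  # always the mean
```

During training the sampled variant keeps the agent exploring; at evaluation time the deterministic variant removes that noise.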
If you find this work helpful in your research, please cite the following papers:
- Using RGB Image as Visual Input for Mapless Robot Navigation
- Learning to Navigate in Indoor Environments: from Memorizing to Reasoning
