Dealing with uncertainty: balancing exploration and exploitation in deep recurrent reinforcement learning

Description

This repo contains an implementation of a Double Dueling Deep Recurrent Q-Network (DRQN) that can be combined with several exploration strategies: deterministic epsilon-greedy, adaptive epsilon-greedy (VDBE and BMC) [1], softmax, max-Boltzmann exploration, and VDBE-softmax. It also supports an error masking strategy [2], [4].
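As a rough illustration of three of the strategies named above (epsilon-greedy, softmax, and max-Boltzmann), here is a minimal sketch that uses only the Python standard library. The function names, the `temperature` parameter, and the `rng` argument are assumptions made for this example; they are not taken from DRQN_classes.py, whose implementation may differ.

```python
import math
import random


def greedy_action(q_values):
    """Index of the largest Q-value (first one on ties)."""
    return max(range(len(q_values)), key=q_values.__getitem__)


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a uniformly random action, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return greedy_action(q_values)


def softmax_probs(q_values, temperature=1.0):
    """Boltzmann distribution over actions; subtracting the max avoids overflow."""
    q_max = max(q_values)
    exps = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_action(q_values, temperature=1.0, rng=random):
    """Sample an action with probability proportional to exp(Q / temperature)."""
    probs = softmax_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


def max_boltzmann(q_values, epsilon, temperature=1.0, rng=random):
    """Like epsilon-greedy, but exploratory steps sample from the softmax."""
    if rng.random() < epsilon:
        return softmax_action(q_values, temperature, rng)
    return greedy_action(q_values)
```

With `epsilon=0.0` both `epsilon_greedy` and `max_boltzmann` reduce to greedy selection. The adaptive variants (VDBE, BMC) would replace the fixed `epsilon` with a value updated during training, which is not shown here.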

Code Structure:

./AirsimEnv/: folder containing the two environments (AirsimEnv.py and AirsimEnv_9actions.py); the former has five steering angles and the latter nine. This folder also contains:
- DRQN_classes.py: defines the agent, experience replay, exploration strategies, neural network, and the connection with AirSim NH
- bayesian.py: support code for BMC epsilon-greedy
- final_reward_points.csv: support file for reward calculation (required by the environment scripts)
DRQN_airsim_training.py: main training script; contains the training loop and requires all the files listed above
DRQN_evaluation.py: evaluates the model on training and test subsets; each subset is defined by a different set of starting points
A new implementation in TensorFlow 2.x is now available. The previous version contains the implementation of all exploration strategies, while the new code contains the updated neural network.
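To make the experience replay and error masking components more concrete, the sketch below shows one common way to do this for recurrent Q-networks, in the spirit of [2]: whole episodes are stored, fixed-length traces are sampled from them, and the loss on the first half of each trace is masked out because the recurrent state there has had little history to build on. The class name `EpisodeReplayBuffer`, the function `error_mask`, and all parameters are hypothetical and do not come from DRQN_classes.py.

```python
import random
from collections import deque


class EpisodeReplayBuffer:
    """Stores whole episodes and samples fixed-length traces from them."""

    def __init__(self, capacity=1000):
        # When the buffer is full, the oldest episode is dropped first.
        self.episodes = deque(maxlen=capacity)

    def add(self, episode):
        # episode: list of (state, action, reward, next_state, done) tuples
        self.episodes.append(list(episode))

    def sample(self, batch_size, trace_length, rng=random):
        # Only episodes long enough to contain a full trace are eligible.
        eligible = [ep for ep in self.episodes if len(ep) >= trace_length]
        if not eligible:
            raise ValueError("no stored episode is long enough for this trace length")
        traces = []
        for _ in range(batch_size):
            ep = rng.choice(eligible)  # episodes are drawn with replacement
            start = rng.randint(0, len(ep) - trace_length)
            traces.append(ep[start:start + trace_length])
        return traces


def error_mask(trace_length):
    """Weights per time step: 0 for the first half of a trace, 1 for the rest."""
    half = trace_length // 2
    return [0.0] * half + [1.0] * (trace_length - half)
```

In a training step, the per-step TD errors of each sampled trace would be multiplied element-wise by `error_mask(trace_length)` before averaging, so only the later steps contribute to the gradient.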

Prerequisites

Python 3.7.6
TensorFlow 2.5.0
Tornado 4.5.3
OpenCV 4.5.2.54
OpenAI Gym 0.18.3
AirSim 1.5.0

Hardware

2 Tesla M60 GPUs with 8 GB of memory

References

[1] Gimelfarb, M., S. Sanner, and C.-G. Lee, 2020: ε-BMC: A Bayesian Ensemble Approach to Epsilon-Greedy Exploration in Model-Free Reinforcement Learning. CoRR

[2] Juliani, A., 2016: Simple Reinforcement Learning with Tensorflow Part 6: Partial Observability and Deep Recurrent Q-Networks. URL: https://github.com/awjuliani/DeepRL-Agents

[3] Riboni, A., A. Candelieri, and M. Borrotti, 2021: Deep Autonomous Agents comparison for Self-Driving Cars. Proceedings of The 7th International Conference on Machine Learning, Optimization and Big Data - LOD

[4] Welcome to AirSim. URL: https://microsoft.github.io/AirSim/

How to cite

Zangirolami, V. and M. Borrotti, 2024: Dealing with uncertainty: balancing exploration and exploitation in deep recurrent reinforcement learning. In: Knowledge-Based Systems 293.

Acknowledgements

I acknowledge the Data Science Lab of the Department of Economics, Management and Statistics (DEMS) of the University of Milan-Bicocca for providing a virtual machine.

DEMO

DRQN-bmc.mp4

