Inspired by the Deep Reinforcement Learning Nanodegree course material.
- Switch backends between TensorFlow and PyTorch.
- Explore new environments via YAML config files (no coding required).
- Supported Algorithms
  - Expected Sarsa
  - Deep Q Learning
  - Dueling Deep Q Learning
  - Double Deep Q Learning
  - Prioritized Experience Replay (with Importance Sampling)
  - CNN variation of Deep Q Network and Dueling Deep Q Network
- Supported Environments
- Pre-trained checkpoints for a few environments.
brew install python3 swig && \
brew install opencv3 --with-python && \
pip3 install --upgrade pip setuptools wheel
sudo apt-get install swig python3 python3-pip && \
sudo pip3 install --upgrade pip setuptools wheel && \
sudo pip3 install virtualenv
virtualenv -p python3 .venv && \
source .venv/bin/activate && \
pip install -r requirements.txt
source .venv/bin/activate
python3 train.py -c configs/cartpole.yaml
python3 watch.py -c configs/cartpole.yaml
By default, the PyTorch backend is used if it is available in the runtime. To switch backends, set the RL_BACKEND environment variable to 'tf' (TensorFlow) or 'torch' (PyTorch):
export RL_BACKEND='tf'
export RL_BACKEND='torch'
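The selection rule described above can be sketched in Python as follows. This is only an illustration, not the project's actual code: `resolve_backend` is a hypothetical helper, and falling back to TensorFlow when PyTorch is not importable is an assumption.

```python
import importlib.util
import os


def resolve_backend(requested, torch_available):
    """Return 'torch' or 'tf' for a given RL_BACKEND value (hypothetical helper)."""
    # An explicit, recognized choice always wins.
    if requested in ("tf", "torch"):
        return requested
    # No explicit choice: prefer PyTorch when it is importable.
    # Assumption: TensorFlow is the fallback otherwise.
    return "torch" if torch_available else "tf"


if __name__ == "__main__":
    # find_spec returns None when the top-level package is not installed.
    torch_available = importlib.util.find_spec("torch") is not None
    print(resolve_backend(os.environ.get("RL_BACKEND"), torch_available))
```

Keeping the decision in a pure function makes it easy to check without either framework installed.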
- Type: UnityML
- Goal: The agent must learn to collect as many yellow bananas as possible while avoiding blue bananas.
- Reward:
  - +1 for collecting a yellow banana.
  - -1 for collecting a blue banana.
- State Space: 37 dimensions, containing the agent's velocity and ray-based perception of objects around the agent's forward direction.
- Action Space:
  - 0 - move forward.
  - 1 - move backward.
  - 2 - turn left.
  - 3 - turn right.
- Solved when: The agent gets an average score of +13 over 100 consecutive episodes.
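The solved criterion above amounts to a rolling average over the last 100 episode scores. A minimal sketch of that check, assuming per-episode scores are available as a plain list (`first_solved_episode` is a hypothetical helper, not part of this repository):

```python
from collections import deque

SOLVED_SCORE = 13.0  # target average score from the description above
WINDOW = 100         # number of consecutive episodes to average over


def first_solved_episode(scores, target=SOLVED_SCORE, window=WINDOW):
    """Return the 1-based episode at which the rolling mean first reaches target, else None."""
    recent = deque(maxlen=window)  # keeps only the most recent `window` scores
    for episode, score in enumerate(scores, start=1):
        recent.append(score)
        # Only judge once a full window of episodes has been collected.
        if len(recent) == window and sum(recent) / window >= target:
            return episode
    return None
```

For example, 50 episodes scoring 0 followed by episodes scoring 14 would count as solved once 93 of the last 100 episodes scored 14.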
- Download the environment from one of the links below into the checked-out directory. Select only the environment that matches your operating system:
  - Linux: click here
  - MacOS: click here
  - Windows (32-bit): click here
  - Windows (64-bit): click here
- Edit configs/banana.yaml and change the filename field according to your environment:
  - Linux: Banana_Linux/Banana.x86_64
  - MacOS: Banana.app
  - Windows (32-bit): Banana_Windows_x86/Banana.exe
  - Windows (64-bit): Banana_Windows_x86_64/Banana.exe
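If you would rather not look up the right value by hand, the filename values listed above could be selected from the running platform. The sketch below is an assumption-laden illustration: `banana_filename` and `BANANA_FILENAMES` are hypothetical names, and the bitness check reflects the Python interpreter rather than the operating system itself.

```python
import platform
import sys

# Paths copied from the list above, keyed by (platform.system() name, is 64-bit).
BANANA_FILENAMES = {
    ("Linux", True): "Banana_Linux/Banana.x86_64",
    ("Darwin", True): "Banana.app",  # platform.system() reports macOS as "Darwin"
    ("Windows", False): "Banana_Windows_x86/Banana.exe",
    ("Windows", True): "Banana_Windows_x86_64/Banana.exe",
}


def banana_filename(system, is_64bit):
    """Return the filename value for configs/banana.yaml (hypothetical helper)."""
    try:
        return BANANA_FILENAMES[(system, is_64bit)]
    except KeyError:
        raise ValueError(f"No Banana build listed for {system} (64-bit: {is_64bit})")


if __name__ == "__main__":
    # sys.maxsize exceeds 2**32 only on a 64-bit Python interpreter.
    print(banana_filename(platform.system(), sys.maxsize > 2**32))
```

The printed value can then be pasted into the filename field of configs/banana.yaml.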
python3 train.py -c configs/banana.yaml
python3 watch.py -c configs/banana.yaml