This repo is based on the Continuous Control project from the Udacity Deep Reinforcement Learning Nanodegree.
- Environment: A double-jointed arm with one end fixed and the other end, the hand (small blue sphere), moving freely, while a target location (big green sphere) moves randomly around it.
- Goal: Move the hand part of the arm towards the target location (so that it turns opaque green) and keep it there.
- Reward: Each agent gets a reward of +0.1 every step when the hand is in the target location.
- State space: 33 variables corresponding to position, rotation, velocity, and angular velocities of the two arms.
- Action space: 4 continuous values in the range (-1, 1), corresponding to the torque applied to the two joints.
- Agents: This version of the environment runs 20 simultaneous agents, which is very helpful for algorithms like PPO, A3C and D4PG.
- Solved: When the score, averaged over all 20 agents, is at least 30 over 100 consecutive episodes.
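
The "Solved" criterion above can be sketched in a few lines of Python. This is only an illustration of one reading of the rule (average each episode's score over the 20 agents, then average those values over the last 100 episodes); the names `SolvedTracker`, `add_episode` and `clip_action` are hypothetical and are not taken from `train.py`.

```python
from collections import deque
from statistics import fmean

NUM_AGENTS = 20      # simultaneous agents in this version of the environment
ACTION_SIZE = 4      # continuous torque values per agent
TARGET_SCORE = 30.0  # average score required to call the task solved
WINDOW = 100         # number of consecutive episodes to average over


def clip_action(action, low=-1.0, high=1.0):
    """Clamp every entry of one agent's action vector to [low, high]."""
    return [max(low, min(high, value)) for value in action]


class SolvedTracker:
    """Hypothetical helper that tracks the rolling average episode score."""

    def __init__(self, target=TARGET_SCORE, window=WINDOW):
        self.target = target
        self.window = window
        # Keeps only the most recent `window` episode scores.
        self.scores = deque(maxlen=window)

    def add_episode(self, agent_scores):
        """Record one episode; the episode score is the mean over all agents."""
        episode_score = fmean(agent_scores)
        self.scores.append(episode_score)
        return episode_score

    @property
    def solved(self):
        """True once a full window of episodes averages at least the target."""
        return len(self.scores) == self.window and fmean(self.scores) >= self.target


if __name__ == "__main__":
    tracker = SolvedTracker()
    # Assumed per-agent scores, used only to exercise the tracker.
    for _ in range(WINDOW):
        tracker.add_episode([31.0] * NUM_AGENTS)
    print("solved:", tracker.solved)
    print("clipped:", clip_action([1.7, -0.2, -3.0, 0.5]))
```

Using a bounded `deque` means the oldest episode drops out automatically, so the check always covers the most recent 100 episodes only.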
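Install the system dependencies on macOS:
brew install python3 swig && \
brew install opencv3 --with-python && \
pip3 install --upgrade pip setuptools wheel
Or on Debian/Ubuntu:
sudo apt-get install swig python3 python3-venv
Then create a virtual environment and install the Python dependencies:
python3 -m venv .venv && \
source .venv/bin/activate && \
pip install -r requirements.txt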
Download the "Reacher" environment build for your operating system and copy it into the env
directory.
source .venv/bin/activate
python3 train.py
The required checkpoints are already available in the checkpoints/ directory.
python3 test.py