This repository contains the code for my bachelor thesis on multi-task reinforcement learning. It incorporates the idea from AMT into IMPALA to increase the sample efficiency of training.
- IMPALA with PopArt has been implemented.
- An additional self-attention mechanism has been added to each subnetwork.
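The per-subnetwork self-attention can be illustrated with a minimal scaled dot-product sketch. This is NumPy for illustration only; the repository's actual implementation uses TensorFlow/Sonnet, and the shapes and parameter names below are assumptions, not the thesis code:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention over one subnetwork's features.

    x: [seq_len, d_model] input features (hypothetical shape).
    w_q, w_k, w_v: [d_model, d_k] projection matrices (hypothetical names).
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])   # [seq_len, seq_len]
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ v                        # attended features, [seq_len, d_k]

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))
w_q, w_k, w_v = (rng.normal(size=(8, 4)) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)
print(out.shape)  # (5, 4)
```

The attention weights let each timestep's features be re-weighted by their relevance to every other timestep before the subnetwork's output is computed.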
- TensorFlow 1.13.0
- DeepMind Sonnet
- Atari
There is a Dockerfile which serves as a reference for the pre-requisites and commands needed to run the code.
```shell
python atari_experiment.py --num_actors=10 --batch_size=5 \
    --entropy_cost=0.01 --learning_rate=0.0006 \
    --total_environment_frames=2000000000
```
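The `--entropy_cost` flag weights an entropy bonus in the policy loss, which discourages the policy from collapsing to a deterministic one too early. A sketch of how such a bonus is typically computed in A3C/IMPALA-style losses (NumPy for illustration; the exact loss in this repository may differ):

```python
import numpy as np

def entropy_bonus(logits, entropy_cost=0.01):
    """Entropy regularization term of the policy loss (illustrative).

    logits: [batch, num_actions] unnormalized action scores.
    Returns the term added to the loss; minimizing it maximizes entropy.
    """
    probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)
    entropy = -(probs * np.log(probs)).sum(axis=-1)  # per-example entropy
    return -entropy_cost * entropy.mean()

uniform = np.zeros((2, 4))               # uniform policy: maximal entropy
peaked = np.array([[10., 0., 0., 0.]])   # near-deterministic policy
print(entropy_bonus(uniform))  # ≈ -0.01 * ln(4) ≈ -0.0139
print(entropy_bonus(peaked))   # close to 0
```

A larger `--entropy_cost` keeps exploration higher for longer at the price of a less greedy policy.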
Use a terminal multiplexer (e.g. tmux) to execute the following commands.
```shell
python atari_experiment.py --job_name=learner --task=0 --num_actors=30 \
    --level_name=BreakoutNoFrameskip-v4 --batch_size=32 --entropy_cost=0.01 \
    --learning_rate=0.0006 \
    --total_environment_frames=2000000000
```
```shell
for i in $(seq 0 29); do
  python atari_experiment.py --job_name=actor --task=$i \
      --num_actors=30 &
done
wait
```
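The learner/actor split launched above follows a producer–consumer pattern: actors generate trajectories and the learner consumes them in batches. A minimal sketch of that pattern using Python threads and a queue (illustration only; the real code uses TensorFlow's distributed runtime, and all names below are assumptions):

```python
import queue
import threading

def actor(task_id, q, num_steps=5):
    # Each actor produces trajectories and ships them to the learner.
    for step in range(num_steps):
        trajectory = {"task": task_id, "step": step}  # placeholder rollout
        q.put(trajectory)
    q.put(None)  # sentinel: this actor is done

def learner(q, num_actors):
    # The learner consumes trajectories until every actor has finished.
    finished, consumed = 0, 0
    while finished < num_actors:
        item = q.get()
        if item is None:
            finished += 1
        else:
            consumed += 1  # in IMPALA, a V-trace update would happen here
    return consumed

q = queue.Queue()
actors = [threading.Thread(target=actor, args=(i, q)) for i in range(3)]
for t in actors:
    t.start()
print(learner(q, num_actors=3))  # 15 (3 actors x 5 trajectories)
for t in actors:
    t.join()
```

In the real setup the 30 actor processes run environments and send rollouts to the learner task, which performs the gradient updates; `--task=$i` tells each process which slot in the cluster it occupies.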
Test it across 10 episodes using:
```shell
python atari_experiment.py --mode=test --level_name=BreakoutNoFrameskip-v4 \
    --test_num_episodes=10
```