Implementing Reinforcement Learning from A to Z using Keras only.
```shell
sh atari_breakout_run.sh --double=True --dueling=True
sh atari_breakout_run.sh -h
```

```
usage: main.py [-h] [--e E] [--double D] [--dueling B]

Some hyperparameters

optional arguments:
  -h, --help   show this help message and exit
  --e E        Total episodes
  --double D   Enable Double DQN
  --dueling B  Enable Dueling DQN
```
| Technique | Problem | How to solve it |
|---|---|---|
| DQN | Non-stationary targets make learning unstable | Fixed Q-targets |
| DQN | Correlation between consecutive samples biases the weight updates | Replay memory |
| Double DQN | The max estimator causes over-estimation of Q-values | Double estimators (online net selects the action, target net evaluates it) |
| Dueling DQN | Some states have inherently low value regardless of the action taken | Q(s, a) = V(s) + A(s, a) |
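The two ideas in the last rows can be sketched in a few lines of NumPy. This is a minimal illustration, not the repo's actual code: the tensor names (`q_online_next`, `q_target_next`, `v`, `a`) and the batch of random values are assumptions for demonstration. The Double DQN target uses the online network to *select* the next action and the target network to *evaluate* it; the dueling head recombines the value and advantage streams with the usual mean-advantage baseline so V and A stay identifiable.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Double DQN target (hypothetical batch of 4 transitions, 3 actions) ---
q_online_next = rng.normal(size=(4, 3))  # online net's Q-values at next state
q_target_next = rng.normal(size=(4, 3))  # target net's Q-values at next state
rewards = np.array([1.0, 0.0, 0.0, 1.0])
dones = np.array([0.0, 0.0, 1.0, 0.0])   # 1.0 marks a terminal transition
gamma = 0.99

best_actions = q_online_next.argmax(axis=1)            # selection: online net
evaluated = q_target_next[np.arange(4), best_actions]  # evaluation: target net
targets = rewards + gamma * (1.0 - dones) * evaluated  # TD targets per sample

# --- Dueling aggregation: Q(s,a) = V(s) + A(s,a) - mean_a A(s,a) ---
v = rng.normal(size=(4, 1))  # state-value stream output
a = rng.normal(size=(4, 3))  # advantage stream output
q = v + a - a.mean(axis=1, keepdims=True)
```

Subtracting the mean advantage forces the advantages to sum to zero per state, so the average of the Q-values recovers V(s) exactly; without that term, V and A could shift by any constant in opposite directions.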