Batch_D3PG

PyTorch implementation of batch D3PG, which uses n-step Bellman updates and parallel experience sampling. Details of the algorithm can be found in this paper:

Barth-Maron, Gabriel, et al. "Distributed distributional deterministic policy gradients." arXiv preprint arXiv:1804.08617 (2018).
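To make the n-step Bellman update concrete, here is a minimal sketch of how an n-step target can be computed from a short reward sequence and a bootstrapped critic value. It is an illustration only, not the code in this repository: the function name `n_step_target`, its parameters, and the default discount are assumptions chosen for the example.

```python
from typing import Sequence


def n_step_target(
    rewards: Sequence[float],
    bootstrap_value: float,
    gamma: float = 0.99,  # assumed discount factor, not taken from this repo
    done: bool = False,
) -> float:
    """Compute r_0 + gamma*r_1 + ... + gamma^(n-1)*r_(n-1) + gamma^n * V.

    `bootstrap_value` stands for the target critic's estimate at the state
    reached after the n rewards; it is dropped if the episode terminated.
    """
    target = 0.0 if done else bootstrap_value
    # Fold the rewards in from the last step back to the first.
    for reward in reversed(rewards):
        target = reward + gamma * target
    return target


# Three rewards of 1.0, a bootstrap value of 10.0, and gamma = 0.5.
print(n_step_target([1.0, 1.0, 1.0], 10.0, gamma=0.5))
```

With n = 1 this reduces to the usual one-step target `r + gamma * V`. Larger n propagates reward information faster, at the cost of relying more on off-policy data.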

Requirements

pytorch 1.4.0
tensorboard
numpy
tqdm
gym
baselines
pybullet (optional)

Setup

You can use the provided requirements.txt file to install the necessary dependencies.

$ pip install -r requirements.txt

Training D3PG agents

For example, to train a D3PG agent on the PyBullet Ant locomotion task using 12 processes, run:

$ python train.py --task-id=AntBulletEnv-v0 --num-processes=12 --num-env-steps=5000000
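The three flags in the command above could be parsed along the following lines. This is a hypothetical sketch using the standard `argparse` module, not the contents of arguments.py. Only the flag names come from the command; the defaults, help strings, and the `build_parser` helper are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the flags shown in the training command (hypothetical)."""
    parser = argparse.ArgumentParser(description="Sketch of D3PG training flags")
    # Defaults below are placeholders for illustration only.
    parser.add_argument("--task-id", type=str, default="AntBulletEnv-v0",
                        help="Gym environment id to train on")
    parser.add_argument("--num-processes", type=int, default=1,
                        help="number of parallel sampling processes")
    parser.add_argument("--num-env-steps", type=int, default=1_000_000,
                        help="total number of environment steps")
    return parser


# Parse the same arguments as the example command, passed as a list
# so the sketch runs without touching sys.argv.
args = build_parser().parse_args(
    ["--task-id=AntBulletEnv-v0", "--num-processes=12", "--num-env-steps=5000000"]
)
print(args.task_id, args.num_processes, args.num_env_steps)
```

Note that `argparse` converts dashes in flag names to underscores, so `--num-env-steps` is read back as `args.num_env_steps`.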

You can also monitor training and tune hyperparameters using TensorBoard:

$ tensorboard --logdir=log



fengredrum/batch_d3pg
