Reinforcement learning #995

Merged
merged 62 commits into from
Jun 11, 2019
Commits
a6652b0
work for tl2 tf2
quantumiracle May 16, 2019
08eca73
td3 added
quantumiracle May 18, 2019
4b104a4
sac added
quantumiracle May 18, 2019
149f0b9
add tutorial_wrappers
Officium May 18, 2019
d23bae6
add readme
quantumiracle May 19, 2019
e4d75f5
Merge branch 'reinforcement-learning' of https://github.com/tensorlay…
quantumiracle May 19, 2019
6a3950f
readme
quantumiracle May 20, 2019
782b71f
readme
quantumiracle May 20, 2019
54a7f16
bug
quantumiracle May 21, 2019
28565b0
Update README.md
quantumiracle May 21, 2019
120ddef
add double, dueling, noisy dqn
Officium May 21, 2019
63a1f93
Update README.md
quantumiracle May 21, 2019
38470ac
Update tutorial_sac.py
quantumiracle May 21, 2019
8d6afdb
add prioritized replay
May 22, 2019
974ff61
add c51
May 23, 2019
eb816ea
add retrace tutorial and update doc in preplay
Officium May 24, 2019
54a6c59
Update README.md
initial-h May 27, 2019
f0f3446
Update PG, DDPG, PPO, DPPO, TRPO
Tokarev-TT-33 May 27, 2019
7e332c4
Update README.md
initial-h May 27, 2019
5168140
readme
quantumiracle May 30, 2019
a4f3e4f
readme
quantumiracle May 31, 2019
3e4f01c
Add files via upload
Tokarev-TT-33 Jun 4, 2019
7b29928
change readme
quantumiracle Jun 4, 2019
2a89c43
Update README.md
zsdonghao Jun 4, 2019
fde26fb
Update README.md
zsdonghao Jun 4, 2019
4622397
Update README.md
zsdonghao Jun 4, 2019
9b1109c
Update README.md
zsdonghao Jun 4, 2019
1473289
Update README.md
zsdonghao Jun 4, 2019
bc26093
Update README.md
zsdonghao Jun 4, 2019
c392f97
Update README.md
initial-h Jun 4, 2019
e6c4532
remove model
quantumiracle Jun 4, 2019
45a64f4
Merge branch 'reinforcement-learning' of https://github.com/tensorlay…
quantumiracle Jun 4, 2019
bb9f919
add baselines
quantumiracle Jun 4, 2019
6a3b21c
readme
quantumiracle Jun 4, 2019
4335b0d
Update tutorial_TRPO.py
Tokarev-TT-33 Jun 6, 2019
e52e1e3
chang format
quantumiracle Jun 9, 2019
f6011db
tutorial format
quantumiracle Jun 9, 2019
485c8e2
tutorial format
quantumiracle Jun 9, 2019
cacb4a3
tutorial format 1
quantumiracle Jun 9, 2019
81c22db
readme
quantumiracle Jun 9, 2019
980990c
readme
quantumiracle Jun 9, 2019
7b8fc63
format 2
quantumiracle Jun 9, 2019
0e24c30
format 2
quantumiracle Jun 9, 2019
8e9400a
manual set random.seed in SAC and TD3, change format of tutorial C51
Officium Jun 9, 2019
a0d0c18
format ac a3c q-learning dqn sac td3
quantumiracle Jun 9, 2019
e88bc10
Merge branch 'reinforcement-learning' of https://github.com/tensorlay…
quantumiracle Jun 9, 2019
662412e
change format of DQN_variants
Officium Jun 10, 2019
f19b19d
change format of C51 and DQN_variants to follow the tutorial_format
Officium Jun 10, 2019
8b68349
format pg ddpo ppo dppo trpo
Tokarev-TT-33 Jun 11, 2019
eff0928
update authors
Tokarev-TT-33 Jun 11, 2019
a9ac6d6
update authors
Tokarev-TT-33 Jun 11, 2019
77d4a73
update formats of C51, DQN_variants, Retrace and PER, update README
Officium Jun 11, 2019
c00af6d
make format
Officium Jun 11, 2019
edaf61e
update annocation
Officium Jun 11, 2019
dc19fc5
modify readme
quantumiracle Jun 11, 2019
1cc15d3
readme
quantumiracle Jun 11, 2019
d6116de
readme
quantumiracle Jun 11, 2019
64da954
readme
quantumiracle Jun 11, 2019
560dbb8
Merge branch 'master' into reinforcement-learning
zsdonghao Jun 11, 2019
c838553
changelog
quantumiracle Jun 11, 2019
c579374
changelog
quantumiracle Jun 11, 2019
4793c66
yapf
quantumiracle Jun 11, 2019
16 changes: 16 additions & 0 deletions CHANGELOG.md
@@ -90,11 +90,16 @@ To release a new version, please update the changelog as followed:
### Changed
- Add version_info in model.config. (PR #992)
- Replace tf.nn.func with tf.nn.func.\_\_name\_\_ in model config.
- Add Reinforcement learning tutorials. (PR #995)

### Fixed

### Contributors
- @warshallrho:
- @quantumiracle: #995
- @Tokarev-TT-33: #995
- @initial-h: #995
- @Officium: #995

## [2.0.2] - 2019-6-5

@@ -124,6 +129,8 @@ A maintain release.
- Layer
- `InstanceNorm`, `InstanceNorm1d`, `InstanceNorm2d`, `InstanceNorm3d` (PR #963)

* Reinforcement learning tutorials. (PR #995)

### Changed
- remove `tl.layers.initialize_global_variables(sess)` (PR #931)
- update `tutorial_generate_text.py`, `tutorial_ptb_lstm.py`. remove `tutorial_ptb_lstm_state_is_tuple.py` (PR #958)
@@ -144,8 +151,15 @@ A maintain release.
- @warshallrho: #PR966
- @zsdonghao: #931
- @yd-yin: #963
- @Tokarev-TT-33: #995
- @initial-h: #995
- @quantumiracle: #995
- @Officium: #995
- @1FengL: #958
- @dvklopfenstein: #971


## [2.0.0] - 2019-05-04
@@ -371,8 +385,10 @@ To many PR for this update, please check [here](https://github.com/tensorlayer/t
- AtrousDeConv2dLayer added (PR #662)
- Fix bugs of using `tf.layers` in CNN (PR #686)
- Optimizer:

- AMSGrad Optimizer added based on `On the Convergence of Adam and Beyond (ICLR 2018)` (PR #636)
- Setup:

- Creation of installation flaggs `all`, `all_cpu`, and `all_gpu` (PR #660)
- Test:
- `test_utils_predict.py` added to reproduce and fix issue #288 (PR #566)
10 changes: 10 additions & 0 deletions examples/README.md
@@ -1 +1,11 @@
<br/>

<a href="https://join.slack.com/t/tensorlayer/shared_invite/enQtMjUyMjczMzU2Njg4LWI0MWU0MDFkOWY2YjQ4YjVhMzI5M2VlZmE4YTNhNGY1NjZhMzUwMmQ2MTc0YWRjMjQzMjdjMTg2MWQ2ZWJhYzc" target="\_blank">
<div align="center">
<img src="../img/join_slack.png" width="40%"/>
</div>
</a>

<br/>

# [Click Me](https://github.com/tensorlayer/awesome-tensorlayer)
8 changes: 3 additions & 5 deletions examples/basic_tutorials/tutorial_cifar10_cnn_static.py
@@ -5,11 +5,10 @@
import time

import numpy as np
import tensorflow as tf

import tensorflow as tf
import tensorlayer as tl
from tensorlayer.layers import (BatchNorm, Conv2d, Dense, Flatten, Input,
LocalResponseNorm, MaxPool2d)
from tensorlayer.layers import (BatchNorm, Conv2d, Dense, Flatten, Input, LocalResponseNorm, MaxPool2d)
from tensorlayer.models import Model

# enable debug logging
@@ -74,15 +73,14 @@ def get_model_batchnorm(inputs_shape):
# get the network
net = get_model([None, 24, 24, 3])


# training settings
batch_size = 128
n_epoch = 50000
learning_rate = 0.0001
print_freq = 5
n_step_epoch = int(len(y_train) / batch_size)
n_step = n_epoch * n_step_epoch
shuffle_buffer_size = 128 # 100
shuffle_buffer_size = 128 # 100
# init_learning_rate = 0.1
# learning_rate_decay_factor = 0.1
# num_epoch_decay = 350
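The training settings in this hunk derive the total number of optimisation steps from the dataset size. The arithmetic can be sketched as below, assuming the standard CIFAR-10 training split of 50,000 examples (an assumption; the diff only shows `len(y_train)`). The helper `steps_per_epoch` is hypothetical and not part of the tutorial.

```python
def steps_per_epoch(n_examples, batch_size):
    # Same result as int(n_examples / batch_size) for positive integers:
    # a trailing partial batch is not counted.
    return n_examples // batch_size


n_examples = 50000  # assumption: size of the CIFAR-10 training split
batch_size = 128    # value from the tutorial
n_epoch = 50000     # value from the tutorial

n_step_epoch = steps_per_epoch(n_examples, batch_size)
n_step = n_epoch * n_step_epoch

print(n_step_epoch, n_step)
```

Under these assumptions each epoch runs 390 full batches, so the last 80 examples of an unshuffled pass would not be counted in `n_step_epoch`.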
16 changes: 9 additions & 7 deletions examples/basic_tutorials/tutorial_mnist_mlp_dynamic.py
@@ -1,8 +1,8 @@
import time

import numpy as np
import tensorflow as tf

import tensorflow as tf
import tensorlayer as tl
from tensorlayer.layers import Dense, Dropout, Input
from tensorlayer.models import Model
@@ -13,18 +13,19 @@
## prepare MNIST data
X_train, y_train, X_val, y_val, X_test, y_test = tl.files.load_mnist_dataset(shape=(-1, 784))


## define the network
class CustomModel(Model):

def __init__(self):
super(CustomModel, self).__init__()

self.dropout1 = Dropout(keep=0.8)#(self.innet)
self.dense1 = Dense(n_units=800, act=tf.nn.relu, in_channels=784)#(self.dropout1)
self.dropout2 = Dropout(keep=0.8)#(self.dense1)
self.dense2 = Dense(n_units=800, act=tf.nn.relu, in_channels=800)#(self.dropout2)
self.dropout3 = Dropout(keep=0.8)#(self.dense2)
self.dense3 = Dense(n_units=10, act=tf.nn.relu, in_channels=800)#(self.dropout3)
self.dropout1 = Dropout(keep=0.8) #(self.innet)
self.dense1 = Dense(n_units=800, act=tf.nn.relu, in_channels=784) #(self.dropout1)
self.dropout2 = Dropout(keep=0.8) #(self.dense1)
self.dense2 = Dense(n_units=800, act=tf.nn.relu, in_channels=800) #(self.dropout2)
self.dropout3 = Dropout(keep=0.8) #(self.dense2)
self.dense3 = Dense(n_units=10, act=tf.nn.relu, in_channels=800) #(self.dropout3)

def forward(self, x, foo=None):
z = self.dropout1(x)
@@ -37,6 +38,7 @@ def forward(self, x, foo=None):
out = tf.nn.relu(out)
return out


MLP = CustomModel()

## start training
8 changes: 5 additions & 3 deletions examples/basic_tutorials/tutorial_mnist_mlp_dynamic_2.py
@@ -1,8 +1,8 @@
import time

import numpy as np
import tensorflow as tf

import tensorflow as tf
import tensorlayer as tl
from tensorlayer.layers import Dense, Dropout, Input, LayerList
from tensorlayer.models import Model
@@ -13,13 +13,14 @@
## prepare MNIST data
X_train, y_train, X_val, y_val, X_test, y_test = tl.files.load_mnist_dataset(shape=(-1, 784))


## define the network
class CustomModelHidden(Model):

def __init__(self):
super(CustomModelHidden, self).__init__()

self.dropout1 = Dropout(keep=0.8)#(self.innet)
self.dropout1 = Dropout(keep=0.8) #(self.innet)

self.seq = LayerList(
[
@@ -29,14 +30,15 @@ def __init__(self):
]
)

self.dropout3 = Dropout(keep=0.8)#(self.seq)
self.dropout3 = Dropout(keep=0.8) #(self.seq)

def forward(self, x):
z = self.dropout1(x)
z = self.seq(z)
z = self.dropout3(z)
return z


class CustomModelOut(Model):

def __init__(self):
12 changes: 8 additions & 4 deletions examples/basic_tutorials/tutorial_mnist_mlp_static.py
@@ -1,8 +1,8 @@
import time

import numpy as np
import tensorflow as tf

import tensorflow as tf
import tensorlayer as tl
from tensorlayer.layers import Dense, Dropout, Input
from tensorlayer.models import Model
@@ -21,14 +21,18 @@
def get_model(inputs_shape):
ni = Input(inputs_shape)
nn = Dropout(keep=0.8)(ni)
nn = Dense(n_units=800, act=tf.nn.relu)(nn) # in_channels is optional in this case as it can be inferred by the previous layer
nn = Dense(n_units=800,
act=tf.nn.relu)(nn) # in_channels is optional in this case as it can be inferred by the previous layer
nn = Dropout(keep=0.8)(nn)
nn = Dense(n_units=800, act=tf.nn.relu)(nn) # in_channels is optional in this case as it can be inferred by the previous layer
nn = Dense(n_units=800,
act=tf.nn.relu)(nn) # in_channels is optional in this case as it can be inferred by the previous layer
nn = Dropout(keep=0.8)(nn)
nn = Dense(n_units=10, act=tf.nn.relu)(nn) # in_channels is optional in this case as it can be inferred by the previous layer
nn = Dense(n_units=10,
act=tf.nn.relu)(nn) # in_channels is optional in this case as it can be inferred by the previous layer
M = Model(inputs=ni, outputs=nn, name="mlp")
return M


MLP = get_model([None, 784])
import pprint
pprint.pprint(MLP.config)
6 changes: 4 additions & 2 deletions examples/basic_tutorials/tutorial_mnist_mlp_static_2.py
@@ -1,8 +1,8 @@
import time

import numpy as np
import tensorflow as tf

import tensorflow as tf
import tensorlayer as tl
from tensorlayer.layers import Dense, Dropout, Input
from tensorlayer.models import Model
@@ -13,12 +13,12 @@
## prepare MNIST data
X_train, y_train, X_val, y_val, X_test, y_test = tl.files.load_mnist_dataset(shape=(-1, 784))


## define the network
# the softmax is implemented internally in tl.cost.cross_entropy(y, y_) to
# speed up computation, so we use identity here.
# see tf.nn.sparse_softmax_cross_entropy_with_logits()


def hidden_model(inputs_shape):
ni = Input(inputs_shape)
nn = Dropout(keep=0.8)(ni)
@@ -28,6 +28,7 @@ def hidden_model(inputs_shape):

return Model(inputs=ni, outputs=nn, name="mlp_hidden")


def get_model(inputs_shape, hmodel):
hidden = hmodel.as_layer()
ni = Input(inputs_shape)
@@ -37,6 +38,7 @@ def get_model(inputs_shape, hmodel):

return Model(inputs=ni, outputs=nn, name="mlp")


MLP_hidden = hidden_model([None, 784])
MLP = get_model([None, 784], MLP_hidden)
# MLP.print_layers()
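The comment in this file notes that the softmax is applied inside the loss, so the final layer outputs raw logits. Below is a minimal pure-Python sketch of why the two formulations agree. It is not TensorLayer's or TensorFlow's implementation, and all function names are hypothetical.

```python
import math


def softmax(logits):
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy_from_probs(probs, label):
    # Two-step route: the network applies softmax, the loss takes -log p[label].
    return -math.log(probs[label])


def cross_entropy_from_logits(logits, label):
    # Fused route: log-sum-exp of the logits minus the logit of the true class.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum_exp - logits[label]


logits = [2.0, -1.0, 0.5]
label = 0

two_step = cross_entropy_from_probs(softmax(logits), label)
fused = cross_entropy_from_logits(logits, label)
print(two_step, fused)
```

Both routes give the same loss up to floating-point error. The fused form avoids materialising the probabilities and taking their logarithm, which is the usual motivation for keeping the output layer linear.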
4 changes: 2 additions & 2 deletions examples/basic_tutorials/tutorial_mnist_siamese.py
@@ -14,8 +14,8 @@
import time

import numpy as np
import tensorflow as tf

import tensorflow as tf
import tensorlayer as tl
from tensorlayer.layers import Dense, Dropout, Flatten, Input
from tensorlayer.models import Model
@@ -33,7 +33,7 @@ def contrastive_loss(label, feature1, feature2):


def compute_accuracy(label, feature1, feature2):
eucd = tf.sqrt(tf.reduce_sum((feature1 - feature2) ** 2, axis=1))
eucd = tf.sqrt(tf.reduce_sum((feature1 - feature2)**2, axis=1))
pred = tf.cast(eucd < 0.5, label.dtype)
return tf.reduce_mean(tf.cast(tf.equal(pred, label), tf.float32))
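The `compute_accuracy` function above predicts a pair as matching when the Euclidean distance between its two embeddings falls below 0.5, then averages agreement with the labels. A minimal pure-Python sketch of the same rule follows. The helper name `pair_accuracy` and the sample data are hypothetical, and the sketch assumes label 1 means "same class".

```python
import math


def pair_accuracy(labels, features1, features2, threshold=0.5):
    correct = 0
    for label, f1, f2 in zip(labels, features1, features2):
        # Euclidean distance between the two embeddings of one pair.
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(f1, f2)))
        # Predict 1 (match) when the embeddings are closer than the threshold.
        pred = 1 if dist < threshold else 0
        correct += int(pred == label)
    return correct / len(labels)


labels = [1, 0, 1]
features1 = [[0.0, 0.0], [0.0, 0.0], [1.0, 1.0]]
features2 = [[0.1, 0.1], [1.0, 1.0], [2.0, 2.0]]

acc = pair_accuracy(labels, features1, features2)
print(acc)
```

In this toy data the third pair is labelled as a match but its embeddings are about 1.41 apart, so it counts as an error and the accuracy is 2/3.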

3 changes: 2 additions & 1 deletion examples/basic_tutorials/tutorial_mnist_simple.py
@@ -1,9 +1,10 @@
#! /usr/bin/python
# -*- coding: utf-8 -*-

import numpy as np

import tensorflow as tf
import tensorlayer as tl
import numpy as np

tl.logging.set_verbosity(tl.logging.DEBUG)

1 change: 0 additions & 1 deletion examples/data_process/data/__init__.py
@@ -1,4 +1,3 @@
from __future__ import absolute_import

# from . import imagenet_classes
from . import *
4 changes: 2 additions & 2 deletions examples/data_process/tutorial_fast_affine_transform.py
@@ -8,10 +8,10 @@
import multiprocessing
import time

import cv2
import numpy as np
import tensorflow as tf

import cv2
import tensorflow as tf
import tensorlayer as tl

# tl.logging.set_verbosity(tl.logging.DEBUG)
7 changes: 4 additions & 3 deletions examples/data_process/tutorial_tf_dataset_voc.py
@@ -13,8 +13,8 @@
import time

import numpy as np
import tensorflow as tf

import tensorflow as tf
import tensorlayer as tl

# tf.logging.set_verbosity(tf.logging.DEBUG)
@@ -108,5 +108,6 @@ def _map_fn(filename, annotation):
## save all images
for i in range(len(im)):
print(ann[i][1])
tl.vis.draw_boxes_and_labels_to_image(im[i] * 255, ann[i][0], ann[i][1], [], classes,
True, save_name='_bbox_vis_%d.png' % i)
tl.vis.draw_boxes_and_labels_to_image(
im[i] * 255, ann[i][0], ann[i][1], [], classes, True, save_name='_bbox_vis_%d.png' % i
)
6 changes: 2 additions & 4 deletions examples/data_process/tutorial_tfrecord.py
@@ -22,9 +22,9 @@
import os

import numpy as np
import tensorflow as tf
from PIL import Image

import tensorflow as tf
import tensorlayer as tl

## Save data ==================================================================
@@ -79,8 +79,7 @@ def read_and_decode(filename):
raw_dataset = tf.data.TFRecordDataset([filename]).shuffle(1000).batch(4)
for serialized_example in raw_dataset:
features = tf.io.parse_example(
serialized_example,
features={
serialized_example, features={
'label': tf.io.FixedLenFeature([], tf.int64),
'img_raw': tf.io.FixedLenFeature([], tf.string),
}
@@ -97,4 +96,3 @@ def read_and_decode(filename):
print("img_batch : %s" % img_batch.shape)
print("label_batch : %s" % label_batch.shape)
tl.visualize.images2d(img_batch, second=1, saveable=False, name='batch', dtype=None, fig_idx=2020121)

3 changes: 2 additions & 1 deletion examples/data_process/tutorial_tfrecord2.py
@@ -14,10 +14,10 @@
import os

import numpy as np

# import matplotlib
# matplotlib.use('GTK')
import tensorflow as tf

import tensorlayer as tl

# Download data, and convert to TFRecord format, see ```tutorial_tfrecord.py```
@@ -77,6 +77,7 @@ def read_and_decode(filename):
label_batch = tf.cast(features['label'], tf.int32)
yield img_batch, label_batch


img_batch, label_batch = next(read_and_decode("train.tfrecords"))
print("img_batch : %s" % img_batch.shape)
print("label_batch : %s" % label_batch.shape)
2 changes: 1 addition & 1 deletion examples/data_process/tutorial_tfrecord3.py
@@ -19,9 +19,9 @@
import os

import numpy as np
import tensorflow as tf
from PIL import Image

import tensorflow as tf
import tensorlayer as tl


1 change: 0 additions & 1 deletion examples/database/dispatch_tasks.py
@@ -6,7 +6,6 @@
import time

import tensorflow as tf

import tensorlayer as tl

tl.logging.set_verbosity(tl.logging.DEBUG)
1 change: 0 additions & 1 deletion examples/database/task_script.py
@@ -1,7 +1,6 @@
"""Sample task script."""

import tensorflow as tf

import tensorlayer as tl

tf.logging.set_verbosity(tf.logging.DEBUG)