clarify updating state #224

rocknamx8 · 2020-09-21T09:18:13Z

I have marked all applicable categories:
- exception-raising fix
- algorithm implementation fix
- documentation modification
- new feature
If applicable, I have mentioned the relevant/related issue(s)

Less important but also useful:

I have visited the source website
I have searched through the issue tracker for duplicates

I have mentioned version numbers, operating system and environment, where applicable:

import tianshou, torch, sys
print(tianshou.__version__, torch.__version__, sys.version, sys.platform)

Adding an indicator (i.e. `self.learning`) of whether the policy is currently learning makes it easy to distinguish the states of the policy.
It also makes the meaning of `self.training` unambiguous during the training stage.
Related issue: #211
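
To illustrate the idea, here is a minimal sketch of how such a flag could separate "collecting training data" from "running an update". It is not the code in this PR: `SketchPolicy`, its `eps` parameter, and the list-based Q-values are hypothetical stand-ins, and only the attribute names `training` and `learning` come from the description above.

```python
import random


class SketchPolicy:
    """Hypothetical stand-in for a policy; not tianshou's BasePolicy."""

    def __init__(self, eps=0.1):
        self.training = True   # train/eval mode flag, as discussed above
        self.learning = False  # True only while update() is running
        self.eps = eps         # assumed epsilon-greedy exploration rate

    def forward(self, q_values):
        # Index of the largest Q-value (greedy action).
        greedy = max(range(len(q_values)), key=q_values.__getitem__)
        # Explore only while collecting training data, never inside update().
        if self.training and not self.learning and random.random() < self.eps:
            return random.randrange(len(q_values))
        return greedy

    def update(self, batch):
        self.learning = True
        try:
            # forward() calls made here see learning=True and stay greedy.
            return [self.forward(q) for q in batch]
        finally:
            # Always restore the flag, even if the update raises.
            self.learning = False
```

With this split, `training` can stay `True` for the whole training stage, and the flag set inside `update()` is what suppresses exploration there.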

Others:

fix a bug in DDQN: target_q could not be sampled from np.random.rand
fix a bug in DQN atari net: it should add a ReLU before the last layer
fix a bug in collector timing
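
For context on the DDQN item, the following is a sketch of the standard Double DQN target for a single transition, assuming the intent is that the action used in the target is chosen greedily rather than by random sampling. The function name and its list-based arguments are hypothetical and are not taken from tianshou.

```python
def double_dqn_target(reward, done, gamma, q_online_next, q_target_next):
    """Standard Double DQN target for one transition (illustrative only)."""
    # Select the next action greedily with the online network's Q-values.
    best = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    # Evaluate that action with the target network's Q-values;
    # terminal transitions do not bootstrap.
    bootstrap = 0.0 if done else gamma * q_target_next[best]
    return reward + bootstrap
```

If the action were drawn at random here, the target would no longer estimate the value of the greedy policy.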

docs/tutorials/concepts.rst

rocknamx8

The action used in `_target_q` of DQN may still be controversial.

tianshou/policy/base.py

Add an indicator(i.e. `self.learning`) of learning will be convenient for distinguishing state of policy.
Meanwhile, the state of `self.training` will be undisputed in the training stage.
Related issue: thu-ml#211

Others:
- fix a bug in DDQN: target_q could not be sampled from np.random.rand
- fix a bug in DQN atari net: it should add a ReLU before the last layer
- fix a bug in collector timing

Co-authored-by: n+e <463003665@qq.com>

rocknamx8 added 2 commits September 21, 2020 17:01

add description of self.learning

b6bdedc

add self.learning for indicating learning state.

d6886fb

Trinkle23897 linked an issue Sep 21, 2020 that may be closed by this pull request

Why setting policy.train() before collect samples in offline_trainer? #203

Closed

8 tasks

Trinkle23897 added 2 commits September 21, 2020 17:19

Merge branch 'master' into add_learning_state

8f6b0d6

polish

1274369

Trinkle23897 changed the title ~~Add learning state~~ clarify learning state Sep 21, 2020

Trinkle23897 added 2 commits September 21, 2020 17:43

fix a bug in timing

93d3bbb

add table

958ec27

Trinkle23897 linked an issue Sep 21, 2020 that may be closed by this pull request

Potential bug caused by calling policy.eval() before collecting experience #211

Closed

8 tasks

Trinkle23897 reviewed Sep 21, 2020

docs/tutorials/concepts.rst (outdated)

Trinkle23897 added 2 commits September 21, 2020 18:27

remove exploration argument in forward

dd9a3d6

first try dqn

cfe5bd6

rocknamx8 commented Sep 21, 2020

fix doc

aa546c3

Trinkle23897 requested a review from duburcqa September 21, 2020 12:32

duburcqa reviewed Sep 21, 2020

tianshou/policy/base.py (outdated)

collecting state

f2d1bcc

Trinkle23897 changed the title ~~clarify learning state~~ clarify updating state Sep 22, 2020

Trinkle23897 added 2 commits September 22, 2020 09:35

updating state

df8ccf3

fix a bug in dqn

a048736

Trinkle23897 requested a review from duburcqa September 22, 2020 08:16

duburcqa approved these changes Sep 22, 2020

Trinkle23897 merged commit bf39b9e into thu-ml:master Sep 22, 2020
