0.3.0
Since the codebase has changed substantially since v0.2.0, we are releasing this as version 0.3.0.
API Change
- add `policy.updating` and clarify the collecting state and updating state in training (#224)
- change `train_fn(epoch)` to `train_fn(epoch, env_step)` and `test_fn(epoch)` to `test_fn(epoch, env_step)` (#229)
- remove outdated APIs: `collector.sample`, `collector.render`, `collector.seed`, `VectorEnv` (#210)
Bug Fix
- fix a bug in DDQN: `target_q` could not be sampled from `np.random.rand` (#224)
- fix a bug in the DQN Atari net: it should add a ReLU before the last layer (#224)
- fix a bug in collector timing (#224)
- fix a bug in the converter of `Batch`: deepcopy a `Batch` in `to_numpy` and `to_torch` (#213)
- ensure `buffer.rew` has type `float` (#229)