There is no runs found (tensorboard) #405

Closed · IDayday opened this issue Aug 2, 2021 · 2 comments
Labels: question (Further information is requested)

IDayday commented Aug 2, 2021

  • I have marked all applicable categories:
    • exception-raising bug
    • RL algorithm bug
    • documentation request (i.e. "X is missing from the documentation.")
    • new feature request
  • I have visited the source website
  • I have searched through the issue tracker for duplicates
  • I have mentioned version numbers, operating system and environment, where applicable:
    import tianshou, torch, numpy, sys
    print(tianshou.__version__, torch.__version__, numpy.__version__, sys.version, sys.platform)

I followed the DQN tutorial in the docs. The code runs well, but I can't open the logs. I checked the folder "./log/dqn" and found an event file named "events.out.tfevents.1627916187.DESKTOP-NLRRDA0.49756.0", but it is only 40 bytes. When I opened TensorBoard to view the logs, it said "There are not any runs in the log folder."
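For reference, the logger setup in the DQN tutorial looks roughly like this (a sketch; the class is called TensorboardLogger in recent tianshou versions, while older 0.4.x releases call it BasicLogger):

# Sketch of the tutorial's logger setup (class name varies by tianshou version).
from torch.utils.tensorboard import SummaryWriter
from tianshou.utils import TensorboardLogger

writer = SummaryWriter('./log/dqn')
logger = TensorboardLogger(writer)
# pass logger=logger to the trainer, then view with: tensorboard --logdir ./log/dqn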

BTW, I want to know how to change the stop_fn. The docs use

stop_fn=lambda mean_rewards: mean_rewards >= env.spec.reward_threshold

I want to change the reward_threshold, but I can't find such a parameter, so I just use

stop_fn=lambda mean_rewards: mean_rewards >= 500

It seems to work (it runs more epochs), but the feedback looks wrong. One line of the output says:

Epoch #10: 10001it [00:06, 1518.47it/s, env_step=100000, len=200, loss=0.204, n/ep=0, n/st=16, rew=200.00]
Epoch #10: test_reward: 199.020000 ± 3.078896, best_reward: 200.000000 ± 0.000000 in #8

The reward can't be more than 200.

Trinkle23897 (Collaborator) commented Aug 2, 2021

I want to change the reward_threshold, but I can't find such a parameter, so I just use

You can just avoid passing stop_fn to the trainer, or set stop_fn=None.
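A sketch of what that looks like with the stop criterion removed (argument names follow the 0.4.x offpolicy_trainer API used in the DQN tutorial; adjust to your version):

# policy, train_collector, test_collector and logger come from the tutorial setup.
# Omitting stop_fn (or passing stop_fn=None) makes training run all max_epoch epochs.
from tianshou.trainer import offpolicy_trainer

result = offpolicy_trainer(
    policy, train_collector, test_collector,
    max_epoch=10, step_per_epoch=10000, step_per_collect=10,
    update_per_step=0.1, episode_per_test=100, batch_size=64,
    stop_fn=None,   # or simply leave the argument out
    logger=logger,
)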

The reward can't be more than 200.

This is true, but if one of the resulting reward sequences is [0, 200, 200, 200, ..., 200] with length 100, the mean and std are:

In [1]: import numpy as np

In [2]: a=np.array([0] + [200] * 99)

In [3]: a.mean(), a.std()
Out[3]: (198.0, 19.8997487421324)

BTW, the maximum reward of CartPole-v0 is 200 and that of CartPole-v1 is 500. This is because of gym's TimeLimit wrapper (https://github.com/openai/gym/blob/334491803859eaa5a845f5f5def5b14c108fd3a9/gym/envs/__init__.py#L56), and the reward is always 1 per step.
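You can check the registered episode-length cap and reward threshold directly from gym's registry (a quick sketch using the classic gym API):

import gym

# Episode length cap (enforced by TimeLimit) and registered reward threshold.
for name in ('CartPole-v0', 'CartPole-v1'):
    spec = gym.spec(name)
    print(name, spec.max_episode_steps, spec.reward_threshold)
# CartPole-v0 200 195.0
# CartPole-v1 500 475.0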

Syzygianinfern0 commented Aug 2, 2021

Trinkle23897 added the question label Aug 2, 2021