There is no runs found (tensorboard) #405

Closed · IDayday opened this issue Aug 2, 2021 · 2 comments
Labels: question (Further information is requested)

IDayday commented Aug 2, 2021

  • I have marked all applicable categories:
    • exception-raising bug
    • RL algorithm bug
    • documentation request (i.e. "X is missing from the documentation.")
    • new feature request
  • I have visited the source website
  • I have searched through the issue tracker for duplicates
  • I have mentioned version numbers, operating system and environment, where applicable:
    import tianshou, torch, numpy, sys
    print(tianshou.__version__, torch.__version__, numpy.__version__, sys.version, sys.platform)

I followed the DQN tutorial in the docs. The code runs well, but I can't open the logs. I checked the folder "./log/dqn" and found an event file named "events.out.tfevents.1627916187.DESKTOP-NLRRDA0.49756.0", but it is only 40 bytes. When I opened TensorBoard to view the logs, it said "There are not any runs in the log folder."
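For reference, the logger setup in the DQN tutorial looks roughly like this (a sketch; the class is called TensorboardLogger in recent tianshou versions, while older 0.4.x releases call it BasicLogger):

# Sketch of the tutorial's logger setup (class name varies by tianshou version).
from torch.utils.tensorboard import SummaryWriter
from tianshou.utils import TensorboardLogger

writer = SummaryWriter('./log/dqn')
logger = TensorboardLogger(writer)
# pass logger=logger to the trainer, then view with: tensorboard --logdir ./log/dqn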

BTW, I want to know how to change the stop_fn. The docs use

stop_fn=lambda mean_rewards: mean_rewards >= env.spec.reward_threshold

I want to change the reward_threshold, but I can't find such a parameter, so I just use

stop_fn=lambda mean_rewards: mean_rewards >= 500

It seems to work (it runs more epochs), but the feedback looks wrong. One line of the output says:

Epoch #10: 10001it [00:06, 1518.47it/s, env_step=100000, len=200, loss=0.204, n/ep=0, n/st=16, rew=200.00]
Epoch #10: test_reward: 199.020000 ± 3.078896, best_reward: 200.000000 ± 0.000000 in #8

The reward can't be more than 200.

Trinkle23897 (Collaborator) commented Aug 2, 2021

I want to change the reward_threshold, but I can't find such a parameter, so I just use

You can just avoid passing stop_fn to the trainer, or set stop_fn=None.
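A sketch of what that looks like with the stop criterion removed (argument names follow the 0.4.x offpolicy_trainer API used in the DQN tutorial; adjust to your version):

# policy, train_collector, test_collector and logger come from the tutorial setup.
# Omitting stop_fn (or passing stop_fn=None) makes training run all max_epoch epochs.
from tianshou.trainer import offpolicy_trainer

result = offpolicy_trainer(
    policy, train_collector, test_collector,
    max_epoch=10, step_per_epoch=10000, step_per_collect=10,
    update_per_step=0.1, episode_per_test=100, batch_size=64,
    stop_fn=None,   # or simply leave the argument out
    logger=logger,
)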

The reward can't be more than 200.

This is true, but if one of the resulting reward sequences is [0, 200, 200, 200, ..., 200] with length 100, the mean and std are:

In [1]: import numpy as np

In [2]: a=np.array([0] + [200] * 99)

In [3]: a.mean(), a.std()
Out[3]: (198.0, 19.8997487421324)

BTW, the maximum reward of CartPole-v0 is 200 and that of CartPole-v1 is 500. This is because of gym's TimeLimit wrapper (https://github.com/openai/gym/blob/334491803859eaa5a845f5f5def5b14c108fd3a9/gym/envs/__init__.py#L56), and the reward is always 1 per step.
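You can check the registered episode-length cap and reward threshold directly from gym's registry (a quick sketch using the classic gym API):

import gym

# Episode length cap (enforced by TimeLimit) and registered reward threshold.
for name in ('CartPole-v0', 'CartPole-v1'):
    spec = gym.spec(name)
    print(name, spec.max_episode_steps, spec.reward_threshold)
# CartPole-v0 200 195.0
# CartPole-v1 500 475.0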

Syzygianinfern0 commented Aug 2, 2021

Trinkle23897 added the question label Aug 2, 2021