
Learning from scratch without using pre-trained model #15

Closed
EnnaSachdeva opened this issue Dec 3, 2019 · 4 comments


EnnaSachdeva commented Dec 3, 2019

I tried running test.py (PPO.py) from scratch on the LunarLander-v2 environment, without using the pre-trained model, but it does not seem to learn even after 15,000 episodes; the episodic returns are still negative at that point. How many episodes did it take to get the trained model?

EnnaSachdeva changed the title from "Learning from scratch" to "Learning from scratch without using pre-trained model" on Dec 3, 2019

nikhilbarhate99 (Owner) commented:

Hey, have you tried training it multiple times? Or did you change the hyperparameters?
With the current hyperparameters I have been able to train it within about 1500 episodes on average (although it sometimes gets stuck in a local maximum).
Also, I recently added two commits to address some issues mentioned in #10 and #8, and I have not tested the algorithm since. Could you please try the earlier version and let me know?
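
For reference, the hyperparameters in question are the constants defined near the top of the training script. A rough sketch of that block is below; the names follow common PPO conventions and the values are illustrative only, not necessarily this repo's exact defaults:

```python
# Illustrative PPO hyperparameters for LunarLander-v2.
# Names and values are examples only, not necessarily this repo's defaults.
env_name = "LunarLander-v2"
max_episodes = 50000        # hard cap; training usually succeeds much earlier
max_timesteps = 300         # step limit per episode
update_timestep = 2000      # run a PPO update after this many collected steps
lr = 0.002                  # Adam learning rate
gamma = 0.99                # discount factor
K_epochs = 4                # optimization epochs per PPO update
eps_clip = 0.2              # PPO clipping range
```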


EnnaSachdeva commented Dec 3, 2019

I am running test.py and PPO.py from the master branch (I assume all the recent changes are pushed there), and I ran the code as-is; I only commented out the "load_state_dict" line, with no changes to the hyperparameters. These are some of the rewards I am getting:

Episode: 14994 Reward: -51
Episode: 14995 Reward: -188
Episode: 14996 Reward: -214
Episode: 14997 Reward: -403
Episode: 14998 Reward: -169
Episode: 14999 Reward: -64
Episode: 15000 Reward: -252

Also, I am using this version of the code with a small grid-world environment, and it does not seem to learn there either.

nikhilbarhate99 (Owner) commented:

Ahh, I see. The test.py file is NOT for training; it is a utility file to load and run pre-trained policies. Please run the PPO.py file for training.

Also, I ran some tests on the LunarLander environment just now and it seems to train just fine.
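
To make the distinction concrete, here is a minimal sketch of the two loops. It assumes a PPO agent object with select_action() and update() methods; the method names, buffer call, and checkpoint path are illustrative, not necessarily this repo's exact API:

```python
import gym
import torch

def evaluate(agent, env, n_episodes=3, checkpoint="PPO_LunarLander-v2.pth"):
    # test.py-style loop: load trained weights and roll the policy out.
    # There are no gradient updates here, so nothing is ever learned,
    # with or without the load_state_dict line.
    agent.policy.load_state_dict(torch.load(checkpoint))
    for _ in range(n_episodes):
        state, done = env.reset(), False
        while not done:
            action = agent.select_action(state)
            state, reward, done, _ = env.step(action)

def train(agent, env, max_episodes=1500, update_every=2000):
    # PPO.py-style loop: collect experience and periodically call
    # agent.update(), which is the step that actually improves the policy.
    timestep = 0
    for _ in range(max_episodes):
        state, done = env.reset(), False
        while not done:
            timestep += 1
            action = agent.select_action(state)
            state, reward, done, _ = env.step(action)
            agent.store(reward, done)   # hypothetical: buffer reward/done for the next update
            if timestep % update_every == 0:
                agent.update()
```

Commenting out load_state_dict in the evaluation-style loop just runs an untrained policy; it does not turn the script into a training loop.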


EnnaSachdeva commented Dec 3, 2019

Ohh, my bad.
For my custom environment I was using only PPO.py (with the obvious hyperparameter changes, roughly the kind of adjustments sketched below), and it does not seem to work there.
Anyway, thanks!
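
As a rough guide, the adjustments mentioned above usually amount to something like the following. "MyGridWorld-v0" is a placeholder for a custom Gym-registered environment (not part of this repo), and the values are illustrative only:

```python
import gym

# Placeholder: assumes a custom grid-world environment has been
# registered with gym under this id; it is not part of the repo.
env = gym.make("MyGridWorld-v0")

# The policy network's input/output sizes must match the new environment.
state_dim = env.observation_space.shape[0]
action_dim = env.action_space.n

# PPO only improves the policy when an update runs; with very short
# grid-world episodes, keeping a LunarLander-sized update interval of a
# few thousand steps means updates (and visible learning) happen rarely.
update_timestep = 200   # illustrative: tune to a few dozen episodes' worth of steps
```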
