
Is it normal that running train_line.py renders samples from one mode only? #173

Closed
saleml opened this issue Mar 21, 2024 · 4 comments

saleml (Collaborator) commented Mar 21, 2024

No description provided.

josephdviviano (Collaborator) commented
I likely need to change the default options to enable off policy exploration
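The thread doesn't show which option controls this, but off-policy exploration in a GFlowNet sampler typically means mixing the learned forward policy with uniform random actions so that trajectories can reach modes the policy currently ignores. Below is a minimal, hypothetical sketch of that epsilon-mixing idea (the function name and signature are illustrative, not the torchgfn API):

```python
import random


def sample_action(policy_probs, epsilon=0.1, rng=random):
    """Sample an action, mixing the learned policy with uniform noise.

    With probability `epsilon`, pick a uniformly random action
    (off-policy exploration); otherwise sample from `policy_probs`.
    With epsilon=0 this collapses to pure on-policy sampling, which
    can leave some modes unvisited early in training.
    """
    n = len(policy_probs)
    if rng.random() < epsilon:
        return rng.randrange(n)
    # Sample from the categorical distribution given by policy_probs.
    r, acc = rng.random(), 0.0
    for action, p in enumerate(policy_probs):
        acc += p
        if r < acc:
            return action
    return n - 1  # guard against floating-point rounding


# A policy that has collapsed onto action 0 still explores with epsilon > 0:
rng = random.Random(0)
counts = [0, 0, 0]
for _ in range(3000):
    counts[sample_action([1.0, 0.0, 0.0], epsilon=0.5, rng=rng)] += 1
```

Here all three actions get visited despite the degenerate policy, which is the behavior the default options would need to enable.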

josephdviviano (Collaborator) commented
[Screenshot 2024-03-31 at 12:33:33]

Hmm, on my machine, the training isn't perfect (the default options undertrain the policy) but I definitely sample from both modes.

Were your experiments off of master, or could this be related to the changes we made re: off-policy training in the other open PR (#174)?

saleml (Collaborator, Author) commented Apr 2, 2024

I'm investigating the issue. Indeed, the behavior differs between master and fix_off_policy.

One thing worth noting: with the original number of trajectories (1.28e6), the samples do come from one mode only, but they are more accurate.


Edit 1: When running the code slightly longer on the fix_off_policy branch (3e6 trajectories), I obtain a figure similar to the one obtained with master.

saleml (Collaborator, Author) commented Apr 2, 2024

In fix_off_policy, I had forgotten a keyword argument. The problem is fixed in 89c72b5.

@saleml saleml closed this as completed Apr 2, 2024