Update OpenAI Lander example #252
Conversation
score += reward
env.render()
- if done:
+ if terminated:
I would use the truncated flag, since it seems to behave more like the old `done`. According to the docstring, truncated means:
truncated (bool): whether a truncation condition outside the scope of the MDP is satisfied.
Typically a timelimit, but could also be used to indicate agent physically going out of bounds.
Can be used to end the episode prematurely before a `terminal state` is reached.
It looks like the correct check here would have been `if terminated or truncated:`.
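For reference, a minimal sketch of the loop under the newer step API (Gymnasium / gym >= 0.26), where `env.step` returns separate `terminated` and `truncated` flags instead of a single `done`. The random policy and episode logic here are placeholders for illustration, not the example's actual code:

```python
# Minimal sketch, assuming the Gymnasium 5-tuple step API.
import gymnasium as gym

env = gym.make("LunarLander-v2")
observation, info = env.reset()

score = 0.0
while True:
    action = env.action_space.sample()  # placeholder policy, illustration only
    observation, reward, terminated, truncated, info = env.step(action)
    score += reward
    # The old `if done:` check maps to both new flags, as suggested above.
    if terminated or truncated:
        break

env.close()
```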
data.append(np.hstack((observation, action, reward)))

- if done:
+ if terminated:
See the comment on line 223; same topic.
step = 0
data = []
while 1:
    step += 1
    if step < 200 and random.random() < 0.2:
        action = env.action_space.sample()
    else:
-       output = net.activate(observation)
+       output = net.activate(observation_init_vals)
Isn't this wrong? Shouldn't you have named this `observation`? As written, it just feeds the initial observation through the loop every time, and the observation never changes. Same issue below!
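To make the point concrete, here is a hedged sketch of the loop the reviewer describes, with `observation` refreshed from `env.step` on every iteration rather than reusing the value from `env.reset()`. `DummyNet` is a stand-in for the NEAT network in the example and is an assumption, not part of the PR:

```python
# Sketch only: assumes the Gymnasium 5-tuple step API and a stand-in network.
import random

import gymnasium as gym
import numpy as np


class DummyNet:
    def activate(self, inputs):
        # Random scores over the 4 discrete LunarLander actions (stand-in for NEAT net).
        return np.random.random(4)


env = gym.make("LunarLander-v2")
net = DummyNet()

observation, info = env.reset()
step = 0
data = []
while True:
    step += 1
    if step < 200 and random.random() < 0.2:
        action = env.action_space.sample()
    else:
        # Use the current observation, not the value returned by reset().
        output = net.activate(observation)
        action = int(np.argmax(output))
    observation, reward, terminated, truncated, info = env.step(action)
    data.append(np.hstack((observation, action, reward)))
    if terminated or truncated:
        break
env.close()
```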
I think you are right. I created another PR; maybe have a look at it and feel free to comment if you find something.

Follow-up: #274
This PR updates the OpenAI Lander example. It addresses changes made in the upstream lander code to make this example work again.