Question regarding imagination process #272

anthony0727 · 2024-04-25T06:44:01Z

I can't fully understand the imagination process
in https://github.com/Eclectic-Sheep/sheeprl/blob/main/notebooks/dreamer_v3_imagination.ipynb.

Starting from the "context" saved at the beginning of imagination:

    if i == initial_steps - imagination_steps - 1:
        stochastic_state = player.stochastic_state.clone()
        recurrent_state = player.recurrent_state.clone()

the imagination is performed

        # imagination step
        imagined_stochastic_state, recurrent_state = world_model.rssm.imagination(
            stochastic_state, recurrent_state, actions
        )

But doesn't stochastic_state_{t-1} have to be fed into world_model.rssm.imagination to output stochastic_state_{t}?
I.e.:

        # imagination step
        stochastic_state, recurrent_state = world_model.rssm.imagination(
            stochastic_state, recurrent_state, actions
        )

I'd really appreciate it if you could help me understand this process!


belerico · 2024-04-26T16:51:10Z

Hi @anthony0727, the notebook was designed like this. Suppose that we want our agent to play for 200 steps and to imagine for 45 steps (the defaults in the notebook). Our final objective is to compare how the imagination differs from the real behaviour: in our example we want to compare the last 45 steps. So:

1. First the agent plays for initial_steps while saving everything in the rb_initial buffer.
2. At step initial_steps - imagination_steps - 1 we save the recurrent and stochastic states: these will be used as the starting point for the imagination.
3. Then we imagine for imagination_steps. During this phase one can choose either to imagine the actions or to reuse the ones already played. Either way, the actions are used to compute the next stochastic and recurrent states from the world model, which are then passed to the decoder to reconstruct the image. At the same time we reconstruct the image from the stochastic and recurrent states obtained during real play, so that we can also compare against the reconstruction of the frames actually played.
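The index arithmetic in the steps above can be sketched as follows. This is only an illustration, not notebook code: it assumes the play loop runs as `for i in range(initial_steps)` (an assumption), and `snapshot_index` and `compared_steps` are hypothetical names introduced here.

```python
# Defaults mentioned above: play for 200 steps, imagine for 45.
initial_steps = 200
imagination_steps = 45

# Step at which the recurrent and stochastic states are saved
# (mirrors the condition `i == initial_steps - imagination_steps - 1`).
snapshot_index = initial_steps - imagination_steps - 1

# Assumption: the loop index i runs over range(initial_steps), so the real
# steps that follow the snapshot are the ones compared with the imagination.
compared_steps = list(range(snapshot_index + 1, initial_steps))

print(snapshot_index, compared_steps[0], compared_steps[-1], len(compared_steps))
```

Under that assumption, the number of real steps after the snapshot equals imagination_steps, which is why the saved states line up with "the last 45 steps".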

Is it clearer now?

anthony0727 · 2024-04-26T20:21:05Z

Thanks for the reply!

But my question is:

why isn't the "next" stochastic state used to compute the "next next" stochastic state, as shown in behavior learning?

sheeprl/sheeprl/algos/dreamer_v3/dreamer_v3.py

Line 236 in 2bae379

 imagined_prior, recurrent_state = world_model.rssm.imagination(imagined_prior, recurrent_state, actions) 

vs

        # imagination step
        imagined_stochastic_state, recurrent_state = world_model.rssm.imagination(
            stochastic_state, recurrent_state, actions
        )
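To make the difference between the two calls concrete, here is a minimal toy sketch. `toy_imagination` is a hypothetical scalar stand-in for `world_model.rssm.imagination` (it is not sheeprl code and its dynamics are made up); it only keeps the same argument and return order. The `feed_back` flag switches between overwriting `stochastic_state` with the imagined one and leaving it at the saved value.

```python
def toy_imagination(stochastic, recurrent, action):
    """Hypothetical stand-in for world_model.rssm.imagination (toy scalars)."""
    # Made-up dynamics: the recurrent state depends on the previous
    # stochastic state and the action; the new stochastic state depends
    # on the new recurrent state.
    next_recurrent = 0.5 * recurrent + stochastic + action
    next_stochastic = 0.1 * next_recurrent
    return next_stochastic, next_recurrent


def rollout(stochastic, recurrent, actions, feed_back):
    """Imagine one step per action and collect the imagined stochastic states."""
    trajectory = []
    for action in actions:
        imagined, recurrent = toy_imagination(stochastic, recurrent, action)
        if feed_back:
            # Behavior-learning style: the imagined state becomes the next input.
            stochastic = imagined
        # Otherwise `stochastic` stays at the saved starting value forever.
        trajectory.append(imagined)
    return trajectory


actions = [1.0, 1.0, 1.0]
stale = rollout(1.0, 0.0, actions, feed_back=False)
chained = rollout(1.0, 0.0, actions, feed_back=True)

print("without feedback:", stale)
print("with feedback:   ", chained)
```

In this toy setting the first imagined step is identical in both variants, and the trajectories diverge from the second step on, because only the chained variant conditions step t on the stochastic state imagined at step t-1.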

belerico · 2024-04-27T16:08:16Z

You're absolutely right! Thank you both @anthony0727 and @michele-milesi for spotting that! I thought I was going blind!
I'll fix it right now.

belerico · 2024-04-27T16:11:10Z

I opened a new branch here: can you try it out?

anthony0727 · 2024-04-29T13:27:56Z

Yup I think the fix is correct! We can close this issue!

belerico closed this as completed Apr 29, 2024

anthony0727 mentioned this issue May 12, 2024

dv3 Imagination notebook re-visited #285

Closed
