An easy-to-understand and easy-to-use implementation of the deterministic world model presented in the paper "Model-Based Reinforcement Learning for Atari", in contrast to the official implementation. It can be dropped into your own experiments for Atari or any other environment with an image-based state space. Currently it implements the most basic deterministic model presented in the paper.
The code is written in TensorFlow, so install that if you don't have it already. To run the model on your own environment, change the `make_env`/`_thunk` function in `utils.py`. The remaining network and experiment parameters can be changed in `config.py`.
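As a rough sketch of what such a change looks like, the snippet below shows the usual thunk pattern: a zero-argument function that builds a fresh environment, called once per parallel worker. `DummyEnv` here is a hypothetical stand-in for your own image-based environment, not part of this repo.

```python
class DummyEnv:
    """Hypothetical stand-in for an image-based environment.

    Replace this with your own environment (e.g. a wrapped Atari env).
    """

    def reset(self):
        # Return a fake 84x84 grayscale observation.
        return [[0.0] * 84 for _ in range(84)]


def make_env():
    def _thunk():
        # Swap DummyEnv() for the environment you actually want to model.
        return DummyEnv()
    return _thunk


# Vectorized wrappers call each thunk to get independent environments,
# one per parallel worker.
thunks = [make_env() for _ in range(4)]
envs = [t() for t in thunks]
```

The indirection through `_thunk` matters because each worker process must construct its own environment instance rather than share one.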
Unlike the original code, we take only a single frame as input; this was done to generate rollouts and trajectories efficiently. The world model is trained on observations from multiple agents exploring the same environment differently. By default the agents act randomly, but a policy can be used to generate the data by modifying the `generate_data` function in `world_model.py`. The number of agents can be changed via `n_envs` in `config.py`. The next observation can then be predicted by calling `EnvModel.imagine()`.
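To illustrate the idea behind `EnvModel.imagine()`, here is a minimal toy sketch, not the repo's network: a deterministic model maps the current observation and an action to a predicted next observation. The call signature shown (observation plus action) is an assumption for illustration.

```python
class ToyEnvModel:
    """Toy deterministic world model standing in for the real EnvModel.

    The real model runs a learned network; here next_obs is produced by
    a fixed rule so the prediction step is easy to see.
    """

    def imagine(self, obs, action):
        # Deterministic "next frame": add the action value to every pixel.
        return [[px + action for px in row] for row in obs]


model = ToyEnvModel()
obs = [[0, 1], [2, 3]]  # fake 2x2 observation
next_obs = model.imagine(obs, action=1)
# next_obs == [[1, 2], [3, 4]]
```

Because the model is deterministic, the same observation and action always yield the same imagined next observation, which is what makes cheap multi-step rollouts possible.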
Original Tensor2Tensor code
I2A Code
I just saved around 3 days of your time. T2T's code-base is huuuuuge, and very hard to modify. Thank me! ~ and wipe my tears.