Some Considerations on Learning to Explore via Meta-Reinforcement Learning

Bradly C. Stadie, Ge Yang, Rein Houthooft, Xi Chen, Yan Duan, Yuhuai Wu, Pieter Abbeel, Ilya Sutskever

Abstract

We interpret meta-reinforcement learning as the problem of learning how to quickly find a good sampling distribution in a new environment.

This interpretation leads to the development of two new meta-reinforcement learning algorithms: E-MAML and E-RL2.

Results are presented on a new environment we call ‘Krazy World’, a difficult high-dimensional gridworld designed to highlight the importance of correctly differentiating through sampling distributions in meta-reinforcement learning.

Further results are presented on a set of maze environments.

We show that E-MAML and E-RL2 deliver better performance than baseline algorithms on both tasks.
