- Learning to Reinforcement Learn (2016) Jane X Wang, Z Kurth-Nelson, D Tirumala, H Soyer, JZ Leibo, R Munos, C Blundell, D Kumaran, M Botvinick. [arXiv] (recurrent meta-RL)
- RL^2: Fast Reinforcement Learning via Slow Reinforcement Learning (2016) Yan Duan, John Schulman, Xi Chen, Peter Bartlett, Ilya Sutskever, Pieter Abbeel. [arXiv] Algorithm: RL^2.
- A Simple Neural Attentive Meta-Learner (2017) Nikhil Mishra, Mostafa Rohaninejad, Xi Chen, Pieter Abbeel. [arXiv] Algorithm: SNAIL. (soft attention)
- Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks (2017) Chelsea Finn, Pieter Abbeel, Sergey Levine. [arXiv] [GitHub] Algorithm: MAML. (gradient-based meta-RL)
- ProMP: Proximal Meta-Policy Search (2018) Jonas Rothfuss, Dennis Lee, Ignasi Clavera, Tamim Asfour, Pieter Abbeel. [arXiv] [GitHub] Algorithm: ProMP. (gradient-based meta-RL)
- Meta-Reinforcement Learning of Structured Exploration Strategies (2018) Abhishek Gupta, Russell Mendonca, Yuxuan Liu, Pieter Abbeel, Sergey Levine. [arXiv] [GitHub] Algorithm: MAESN. (gradient-based meta-RL, exploration with latent variables)
- Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables (2019) Kate Rakelly, Aurick Zhou, Deirdre Quillen, Chelsea Finn, Sergey Levine. [arXiv] [GitHub] Algorithm: PEARL. (off-policy meta-RL with posterior sampling for exploration)
- VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning (2019) Luisa Zintgraf, Kyriacos Shiarlis, Maximilian Igl, Sebastian Schulze, Yarin Gal, Katja Hofmann, Shimon Whiteson. [arXiv] [GitHub] Algorithm: variBAD. (PEARL + update the latent state every timestep)
- Generalizing Skills with Semi-Supervised Reinforcement Learning (2017) Chelsea Finn, Tianhe Yu, Justin Fu, Pieter Abbeel, Sergey Levine. [arXiv] [GitHub]
- Learning Latent Plans from Play (2019) Corey Lynch, Mohi Khansari, Ted Xiao, Vikash Kumar, Jonathan Tompson, Sergey Levine, Pierre Sermanet. [arXiv]
- Deep Variational Reinforcement Learning for POMDPs (2018) Maximilian Igl, Luisa Zintgraf, Tuan Anh Le, Frank Wood, Shimon Whiteson. [arXiv] [GitHub] Algorithm: DVRL. (variational inference for POMDPs)
- Some Considerations on Learning to Explore via Meta-Reinforcement Learning (2018) Bradly C. Stadie, Ge Yang, Rein Houthooft, Xi Chen, Yan Duan, Yuhuai Wu, Pieter Abbeel, Ilya Sutskever. [arXiv] [GitHub] Algorithm: E-MAML & E-RL^2. (treat the adaptation step as part of the unknown dynamics of the environment)
- Learning to Explore via Meta-Policy Gradient (2018) Tianbing Xu, Qiang Liu, Liang Zhao, Jian Peng. [arXiv] (learn the exploration policy in single-task algorithms such as DDPG)
- Guided Meta-Policy Search (2019) Russell Mendonca, Abhishek Gupta, Rosen Kralev, Pieter Abbeel, Sergey Levine, Chelsea Finn. [arXiv] [GitHub]
- End-to-End Robotic Reinforcement Learning without Reward Engineering (2019) Avi Singh, Larry Yang, Kristian Hartikainen, Chelsea Finn, Sergey Levine. [arXiv] [GitHub]
- Task-Agnostic Dynamics Priors for Deep Reinforcement Learning (2019) Yilun Du, Karthik Narasimhan. [arXiv] [GitHub] Algorithm: SpatialNet.
- Meta Reinforcement Learning with Task Embedding and Shared Policy (2019) Lin Lan, Zhenguo Li, Xiaohong Guan, Pinghui Wang. [arXiv]
- Adaptive Guidance and Integrated Navigation with Reinforcement Meta-Learning (2019) Brian Gaudet, Richard Linares, Roberto Furfaro. [arXiv]
- Learning Latent State Representation for Speeding Up Exploration (2019) Giulia Vezzani, Abhishek Gupta, Lorenzo Natale, Pieter Abbeel. [arXiv]
- Beyond Exponentially Discounted Sum: Automatic Learning of Return Function (2019) Yufei Wang, Qiwei Ye, Tie-Yan Liu. [arXiv]
- Learning Efficient and Effective Exploration Policies with Counterfactual Meta Policy (2019) Ruihan Yang, Qiwei Ye, Tie-Yan Liu. [arXiv]
- NoRML: No-Reward Meta Learning (2019) Yuxiang Yang, Ken Caluwaerts, Atil Iscen, Jie Tan, Chelsea Finn. [arXiv] [GitHub] Algorithm: NoRML. (MAML + environment dynamics)
- Learning to Adapt in Dynamic, Real-World Environments through Meta-Reinforcement Learning (2018) Anusha Nagabandi, Ignasi Clavera, Simin Liu, Ronald S. Fearing, Pieter Abbeel, Sergey Levine, Chelsea Finn. [arXiv] [GitHub]
- Few-Shot Goal Inference for Visuomotor Learning and Planning (2018) Annie Xie, Avi Singh, Sergey Levine, Chelsea Finn. [arXiv] [GitHub]
- One-Shot High-Fidelity Imitation: Training Large-Scale Deep Nets with RL (2018) Tom Le Paine, Sergio Gomez Colmenarejo, Ziyu Wang, Scott Reed, Yusuf Aytar, Tobias Pfaff, Matt W. Hoffman, Gabriel Barth-Maron, Serkan Cabi, David Budden, Nando de Freitas. [arXiv] Algorithm: MetaMimic.
- Watch, Try, Learn: Meta-Learning from Demonstrations and Reward (2019) Allan Zhou, Eric Jang, Daniel Kappler, Alex Herzog, Mohi Khansari, Paul Wohlhart, Yunfei Bai, Mrinal Kalakrishnan, Sergey Levine, Chelsea Finn. [arXiv] [GitHub] (demonstration + trial-and-error)
- Unsupervised Meta-Learning for Reinforcement Learning (2018) Abhishek Gupta, Benjamin Eysenbach, Chelsea Finn, Sergey Levine. [arXiv]
- Skew-Fit: State-Covering Self-Supervised Reinforcement Learning (2019) Vitchyr H. Pong, Murtaza Dalal, Steven Lin, Ashvin Nair, Shikhar Bahl, Sergey Levine. [arXiv] Algorithm: Skew-Fit. (maximize entropy)
- Gradient Episodic Memory for Continual Learning (2017) David Lopez-Paz, Marc'Aurelio Ranzato. [arXiv]
- Deep Online Learning via Meta-Learning (2019) Anusha Nagabandi, Chelsea Finn, Sergey Levine. [arXiv]
- Stanford CS330: Multi-Task and Meta-Learning Chelsea Finn.
- Meta Reinforcement Learning Michaël Trazzi.
- Meta Reinforcement Learning Lilian Weng.
Contributions to this repo are welcome.