Upside-Down Reinforcement Learning

Landing a Spaceship using Upside-Down Reinforcement Learning (a.k.a. ⅂ꓤ)

This research is based on the paper "Training Agents using Upside-Down Reinforcement Learning", submitted on 5 Dec 2019 by Rupesh Kumar Srivastava, Pranav Shyam, Filipe Mutz, Wojciech Jaśkowski and Jürgen Schmidhuber.

See the project research and my implementation, which solves the OpenAI Gym LunarLander-v2 environment.

I also wrote a Medium article that tries to demystify this algorithm by explaining the paper in my own words.

Abstract

Traditional Reinforcement Learning (RL) algorithms either predict rewards with value functions or maximize them using policy search. We study an alternative: Upside-Down Reinforcement Learning (Upside-Down RL or UDRL), that solves RL problems primarily using supervised learning techniques. Many of its main principles are outlined in a companion report. Here we present the first concrete implementation of UDRL and demonstrate its feasibility on certain episodic learning problems. Experimental results show that its performance can be surprisingly competitive with, and even exceed that of traditional baseline algorithms developed over decades of research.
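To make the "supervised learning" framing more concrete, here is a minimal sketch of how one stored episode could be turned into supervised training pairs. The input is a state plus a command (desired return, desired horizon), and the target is the action that was actually taken. The names `Step`, `example_at` and `sample_example` are hypothetical and are not taken from this repository's notebooks. The sketch also assumes the segment always runs to the end of the episode, which is one common choice rather than the only possible one.

```python
import random
from typing import List, Tuple

# One step of a stored episode: (state, action, reward).
# Hypothetical layout, chosen only for this illustration.
Step = Tuple[List[float], int, float]
Example = Tuple[Tuple[List[float], float, int], int]


def example_at(episode: List[Step], t1: int) -> Example:
    """Build one (input, target) pair starting at step t1.

    Assumption: the segment runs from t1 to the end of the episode,
    so the command is (return actually achieved from t1, steps remaining).
    """
    t2 = len(episode)
    state, action, _ = episode[t1]
    desired_return = sum(reward for _, _, reward in episode[t1:t2])
    desired_horizon = t2 - t1
    # The behavior function would be trained to predict `action`
    # from (state, desired_return, desired_horizon).
    return (state, desired_return, desired_horizon), action


def sample_example(episode: List[Step], rng: random.Random) -> Example:
    """Pick a random start step and build its training pair."""
    return example_at(episode, rng.randrange(len(episode)))


# Toy episode with three steps and rewards 1, 2, 3.
episode = [([0.0], 1, 1.0), ([0.1], 0, 2.0), ([0.2], 1, 3.0)]
inputs, target = sample_example(episode, random.Random(0))
```

Each pair can then be fed to any ordinary classifier: the agent learns which action led to a given return within a given horizon, and at evaluation time it is asked for the return it should achieve.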

