RL Engine

An RL engine for solving reinforcement learning problems, with Monte Carlo, SARSA, and Q-learning included.

Each step in an OpenAI Gym environment returns a 4-tuple of the form (observation, reward, done, info). We use the observation, reward, and done flag to generate episodes via an off-policy algorithm.
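To make the episode-generation step concrete, here is a minimal sketch rather than this repository's actual code. `CorridorEnv` is a hypothetical stand-in that mimics the classic Gym `reset()` / `step()` interface so the example runs without Gym installed. `generate_episode` and `random_policy` are also illustrative names, and storing each step as an (observation, action, reward) triple is an assumption.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment with a Gym-style 4-tuple step API."""

    def __init__(self, length=5):
        self.length = length
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        # Action 1 moves right; any other action moves left (clamped at 0).
        self.pos = max(0, self.pos + (1 if action == 1 else -1))
        done = self.pos == self.length - 1
        reward = 1.0 if done else 0.0
        return self.pos, reward, done, {}


def generate_episode(env, behavior_policy, max_steps=1000):
    """Roll out one episode, recording (observation, action, reward) per step."""
    episode = []
    obs = env.reset()
    for _ in range(max_steps):
        action = behavior_policy(obs)
        next_obs, reward, done, _info = env.step(action)  # info is ignored
        episode.append((obs, action, reward))
        obs = next_obs
        if done:
            break
    return episode


def random_policy(obs):
    """Behavior policy that picks left or right uniformly at random."""
    return random.choice([0, 1])


random.seed(0)
episode = generate_episode(CorridorEnv(), random_policy)
```

With a real Gym environment that follows the same 4-tuple convention, the environment object could be passed in place of `CorridorEnv()`; only the behavior policy would need to match its action space.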

We then feed those episodes to a target policy, which is improved using them. Simple idea.
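One way to picture the improvement step is a Q-learning sweep over recorded episodes, followed by reading off a greedy target policy. This is only a sketch under assumptions, not necessarily how the engine does it: the function names, the (obs, action, reward, next_obs, done) transition format, the hand-written episode, and the values of `alpha` and `gamma` are all illustrative.

```python
from collections import defaultdict


def q_learning_update(Q, episode, n_actions, alpha=0.5, gamma=0.9):
    """Apply one Q-learning sweep over a recorded episode.

    episode: list of (obs, action, reward, next_obs, done) transitions.
    """
    for obs, action, reward, next_obs, done in episode:
        # Bootstrap from the best next action, regardless of what the
        # behavior policy actually did next (this is what makes it off-policy).
        best_next = 0.0 if done else max(Q[next_obs][a] for a in range(n_actions))
        target = reward + gamma * best_next
        Q[obs][action] += alpha * (target - Q[obs][action])
    return Q


def greedy_policy(Q, n_actions):
    """Target policy: pick the highest-valued action in every seen state."""
    return {s: max(range(n_actions), key=lambda a: Q[s][a]) for s in Q}


# Hypothetical two-action problem; the episode below is hand-written data.
Q = defaultdict(lambda: [0.0, 0.0])
episode = [
    (0, 1, 0.0, 1, False),
    (1, 0, 0.0, 0, False),
    (0, 1, 0.0, 1, False),
    (1, 1, 1.0, 2, True),
]

for _ in range(50):
    q_learning_update(Q, episode, n_actions=2)

policy = greedy_policy(Q, n_actions=2)
```

Q-learning is used here because it needs no importance-sampling correction; an off-policy Monte Carlo update would instead weight full returns by the ratio of target to behavior policy probabilities.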

Implementation details are covered in the blog post about Monte Carlo and the blog post about TD methods.

Try running the Python files in examples. They show how to use the MC model.