Fibration/reinforcement-learning-intro

reinforcement-learning-intro

A study of Sutton and Barto's Reinforcement Learning: An Introduction.

Tabular Methods

Bandits!

Figure: Performance of bandit algorithms

Gradient bandits and greedy bandits with optimistic initial value estimates generally perform best. Gradient bandits take longer to converge but can reach higher performance.
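For readers who want to see the two methods side by side, here is a minimal sketch. It is not this repository's code: the function names, the Gaussian reward model (unit variance around each arm's true mean), and the default hyperparameters (`q_init`, `step_size`, `alpha`) are all assumptions chosen for illustration.

```python
import math
import random


def optimistic_greedy(true_means, steps, q_init=5.0, step_size=0.1, seed=0):
    """Greedy selection with optimistic initial estimates (hypothetical helper).

    Returns the average reward over `steps` pulls.
    """
    rng = random.Random(seed)
    k = len(true_means)
    q = [q_init] * k  # optimistic start: every arm looks better than it is
    total = 0.0
    for _ in range(steps):
        a = max(range(k), key=lambda i: q[i])  # greedy; ties go to the lowest index
        r = rng.gauss(true_means[a], 1.0)      # assumed reward model: N(mean, 1)
        q[a] += step_size * (r - q[a])         # constant step-size update
        total += r
    return total / steps


def gradient_bandit(true_means, steps, alpha=0.1, seed=0):
    """Gradient bandit with an average-reward baseline (hypothetical helper).

    Returns the average reward over `steps` pulls.
    """
    rng = random.Random(seed)
    k = len(true_means)
    h = [0.0] * k    # action preferences
    baseline = 0.0   # running mean of rewards seen before the current step
    total = 0.0
    for t in range(1, steps + 1):
        m = max(h)   # subtract the max for a numerically stable softmax
        exps = [math.exp(x - m) for x in h]
        z = sum(exps)
        pi = [e / z for e in exps]
        a = rng.choices(range(k), weights=pi)[0]
        r = rng.gauss(true_means[a], 1.0)
        for i in range(k):
            indicator = 1.0 if i == a else 0.0
            h[i] += alpha * (r - baseline) * (indicator - pi[i])
        baseline += (r - baseline) / t  # fold the current reward into the mean
        total += r
    return total / steps


if __name__ == "__main__":
    means = [0.0, 1.0, 2.0]  # made-up 3-armed testbed
    print("optimistic greedy:", optimistic_greedy(means, 2000))
    print("gradient bandit:  ", gradient_bandit(means, 2000))
```

A single run on a three-armed testbed like this says little on its own. Comparing the methods meaningfully would need averaging over many randomly generated bandit problems and a sweep over each method's hyperparameter.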
