This repo shows how to implement first-visit Monte Carlo for both prediction and control using the Blackjack environment from OpenAI Gym. The implementation is based on the algorithms described in Reinforcement Learning: An Introduction by Sutton and Barto, and on the following repositories:
- https://github.com/dennybritz/reinforcement-learning/blob/master/MC/MC%20Prediction%20Solution.ipynb
- https://github.com/udacity/deep-reinforcement-learning

You can read the full explanation of the algorithms in the accompanying Medium article: https://towardsdatascience.com/learning-to-win-blackjack-with-monte-carlo-methods-61c90a52d53e
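
For orientation, here is a minimal sketch of first-visit Monte Carlo prediction. It is not the code from this repo: the function name `first_visit_mc_prediction` and the episode format (a list of `(state, reward)` pairs, where `reward` is what was received after leaving `state`) are assumptions made for illustration. It uses only the standard library, so episodes must be collected separately, for example by running a policy in the Blackjack environment.

```python
from collections import defaultdict


def first_visit_mc_prediction(episodes, gamma=1.0):
    """Estimate V(s) as the average return following the first visit to s.

    `episodes` is an iterable of episodes; each episode is a list of
    (state, reward) pairs in time order (a hypothetical format chosen
    for this sketch).
    """
    returns_sum = defaultdict(float)
    returns_count = defaultdict(int)

    for episode in episodes:
        G = 0.0
        first_visit_return = {}
        # Walk the episode backwards, accumulating the discounted return.
        for state, reward in reversed(episode):
            G = gamma * G + reward
            # Overwriting while moving backwards leaves the return
            # from the earliest (first) visit to each state.
            first_visit_return[state] = G

        # Each state contributes at most one return per episode.
        for state, ret in first_visit_return.items():
            returns_sum[state] += ret
            returns_count[state] += 1

    return {s: returns_sum[s] / returns_count[s] for s in returns_sum}


# Toy usage with abstract states; in Blackjack a state would typically be
# a tuple such as (player_sum, dealer_card, usable_ace).
toy_episodes = [
    [("A", 0), ("B", 0), ("A", 1)],  # "A" is visited twice
    [("A", 0), ("B", -1)],
]
V = first_visit_mc_prediction(toy_episodes, gamma=0.5)
```

In the first toy episode, state `"A"` appears twice, but only the return following its first occurrence is counted. An every-visit variant would average both.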