Deep Reinforcement Learning Nanodegree Program

This repository contains most of my project submissions and exercise answers for the Deep Reinforcement Learning Nanodegree Program.

temporal_difference/ contains implementations of the SARSA, SARSAMAX and Expected-SARSA algorithms to solve Sutton's cliff-walking environment.
taxi/ solves the 'Taxi-v3' environment. The agent obtains a best average reward of 8.83, putting it 6th on the leaderboard.
lunar_lander/ solves the 'LunarLander-v2' environment in 600 episodes (6th on the leaderboard).
navigation/ is my submission for Udacity's modified version of the Banana Unity ML environment. The agent solves the game in ~450 episodes.
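
For readers unfamiliar with the three algorithms named under temporal_difference/, the sketch below shows how their update targets differ. It is a minimal illustration, not the code in this repository: the function names (`epsilon_greedy_probs`, `td_target`, `td_update`) and the parameter values are assumptions chosen for the example, and an epsilon-greedy policy is assumed for the Expected-SARSA expectation.

```python
import numpy as np


def epsilon_greedy_probs(q_row, epsilon):
    """Action probabilities of an epsilon-greedy policy over one row of Q-values."""
    n_actions = len(q_row)
    probs = np.full(n_actions, epsilon / n_actions)
    probs[int(np.argmax(q_row))] += 1.0 - epsilon
    return probs


def td_target(method, reward, q_next, next_action, gamma, epsilon):
    """Return the bootstrapped target for a non-terminal transition.

    q_next is the row of Q-values for the next state (hypothetical layout).
    """
    if method == "sarsa":
        # On-policy: use the action actually selected in the next state.
        bootstrap = q_next[next_action]
    elif method == "sarsamax":
        # Q-learning: use the greedy action in the next state.
        bootstrap = np.max(q_next)
    elif method == "expected_sarsa":
        # Expectation of Q over the epsilon-greedy policy in the next state.
        bootstrap = np.dot(epsilon_greedy_probs(q_next, epsilon), q_next)
    else:
        raise ValueError(f"unknown method: {method}")
    return reward + gamma * bootstrap


def td_update(q_value, target, alpha):
    """Move the current estimate a fraction alpha toward the target."""
    return q_value + alpha * (target - q_value)


# Example transition: reward -1 (as in cliff walking), four actions in the next state.
q_next = np.array([1.0, 3.0, 2.0, 0.0])
for method in ("sarsa", "sarsamax", "expected_sarsa"):
    target = td_target(method, reward=-1.0, q_next=q_next,
                       next_action=2, gamma=1.0, epsilon=0.4)
    print(method, target, td_update(0.0, target, alpha=0.5))
```

The three methods share the same update rule and differ only in how they bootstrap from the next state. For a terminal transition the bootstrap term would be dropped, which this sketch does not handle.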
