Reinforcement learning (RL) courses offered by the University of Alberta on Coursera
(Recommended: make a copy of each assignment notebook and work in the copy, leaving the original unmodified.)
[1.1] Multi-Armed Bandits: Exploration & Exploitation - Go to assignment
[1.2] DP: Bellman Equation - Go to assignment
[2.1] Cliff Walking Environment & TD Agent - Go to assignment
[2.2] TD: Q-Learning & Expected Sarsa - Go to assignment
[2.3] Planning: Dyna-Q & Dyna-Q+ - Go to assignment
[3.1] VFA: Semi-gradient TD(0) with State Aggregation - Go to assignment
[3.2] VFA: Semi-gradient TD with NN - Go to assignment
[3.3] Function Approximation and Control - Go to assignment
[3.4] Average Reward Softmax Actor-Critic - Go to assignment