Welcome! This course is jointly taught by UC Berkeley and the Tsinghua-Berkeley Shenzhen Institute (TBSI).
- Prof. Scott Moura (UC Berkeley) <smoura [at] berkeley.edu>
- Co-Instructor Saehong Park (UC Berkeley) <sspark [at] berkeley.edu>
- TA Xinyi Zhou (TBSI) <zxyyx48 [at] 163.com>
China Time | California Time |
---|---|
July 7, 8, 9, 10 (Tu-F) | July 6, 7, 8, 9 (M-Th) |
July 14, 15, 16, 17 (Tu-F) | July 13, 14, 15, 16 (M-Th) |
all at 08:30-10:05 China Time | all at 5:30pm PT - 7:05pm PT |
Day | Topic | Speaker | Pre-recorded Lecture | Slides / Notes | Real-time Lecture Recordings |
---|---|---|---|---|---|
1 | 1a. Introduction - Course Org | Scott Moura | Zoom Recording PW: 1e*OV@Re | LEC1a Slides | Recording Link PW: 9L%JePa= |
1 | 1b. Introduction – History of RL | Scott Moura | Zoom Recording PW: 1k.E69^o | LEC1a Slides | |
1 | 1c. Optimal Control Intro | Scott Moura | Zoom Recording PW: 2B&=2@*@ | | |
2 | 2a. Dynamic Programming | Scott Moura | Zoom Recording PW: 3F*1rg%? | LEC2a Notes | Recording Link PW: 8Q?#51=J |
2 | 2b. Case Study: Linear Quadratic Regulator (LQR) | Scott Moura | Zoom Recording PW: 5Y#4=58& | LEC2b Notes | |
3 | 3a. Policy Evaluation & Policy Improvement | Scott Moura | Zoom Recording PW: 9N@%H4&@ | LEC3a Notes | Recording Link PW: 1A@@0G63 |
3 | 3b. Policy Iteration Algorithm | Scott Moura | Zoom Recording PW: 6y+!+6#9 | LEC3b Notes | |
3 | 3c. Case Study: LQR | Scott Moura | Zoom Recording PW: 6D@YkC&= | LEC3c Notes | |
4 | 4a. Approximate DP: TD Error & Value Function Approx. | Scott Moura | Zoom Recording PW: 6v&78$We | LEC4a Notes | Recording Link PW: 4t=#ye7T |
4 | 4b. Case Study: LQR | Scott Moura | Zoom Recording PW: 1O^fh.8+ | LEC4b Notes | Installation Recording PW: 2s+83!eQ |
4 | 4c. Online RL with ADP | Scott Moura | Zoom Recording PW: 0q=.4378 | LEC4c Notes | |
5 | 5a. Actor-Critic Method | Scott Moura | Zoom Recording PW: 2y!@@#$7 | LEC5a Notes | Recording Link PW: 1Z^6B28+ |
5 | 5b. Case Study: Offshore Wind | Scott Moura | | LEC5b Notes | |
6 | 6a. Markov Decision Process | Saehong Park | Zoom Recording PW: 5L=*%&2i | LEC6 Notes | Recording Link PW: 4L*=91?@ |
6 | 6b. Q-Learning | Saehong Park | Zoom Recording PW: 3K!+fj^V | | |
7 | 7a. Policy Optimization | Saehong Park | Zoom Recording PW: 0W$fa0$M | LEC7a Notes | Recording Link PW: 9j++=3$5 |
7 | 7b. Policy Gradient | Saehong Park | Zoom Recording PW: 2N++5&I3 | LEC7b Notes | |
7 | 7c. Policy Gradient | Saehong Park | Zoom Recording PW: 3j%n80** | LEC7c Notes | |
8 | 8a. Actor Critic | Saehong Park | Zoom Recording PW: 2F!WI9$8 | LEC8a Notes | Recording Link PW: 0W$+=9P* |
8 | 8b. Actor Critic | Saehong Park | Zoom Recording PW: 9r$HH%59 | LEC8b Notes | |
8 | 8c. RL for Energy Systems: Battery Fast-charging | Saehong Park | Zoom Recording PW: 9r$HH%59 | Slides | |
- Optimal Control
- Dynamic Programming
- Principle of Optimality & Value Functions
- Case Study: Linear Quadratic Regulator (LQR)
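For the LQR case study, the dynamic programming recursion specializes to the backward Riccati equation. Below is a minimal numerical sketch for an illustrative double-integrator system (the matrices are assumptions for demonstration, not taken from the course notes):

```python
import numpy as np

# Illustrative discrete-time double integrator: x_{k+1} = A x_k + B u_k,
# stage cost x'Qx + u'Ru, terminal cost x'Qf x (here Qf = Q).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])
N = 50  # horizon length

# Backward Riccati recursion from dynamic programming:
#   K_k = (R + B' P_{k+1} B)^{-1} B' P_{k+1} A
#   P_k = Q + A' P_{k+1} A - A' P_{k+1} B K_k
P = Q.copy()  # P_N = Qf
for k in range(N):
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    P = Q + A.T @ P @ A - A.T @ P @ B @ K

# For a long horizon, K approaches the stationary LQR gain, and the
# closed-loop dynamics A - B K should be stable (eigenvalues inside
# the unit circle).
eigs = np.linalg.eigvals(A - B @ K)
```

Running the recursion longer simply drives `P` and `K` closer to the stationary solution of the discrete algebraic Riccati equation.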
- Principle of Optimality & Value Functions
- Policy Evaluation & Policy Improvement
- Policy Iteration Algorithm & Variants
- Case Study: LQR
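The policy evaluation / policy improvement loop above can be sketched on a tiny tabular MDP. The transition probabilities and rewards below are toy numbers chosen for illustration, not from the course notes:

```python
import numpy as np

# Hypothetical 2-state, 2-action MDP:
# P[a][s, s'] = transition probability, r[s, a] = expected reward.
P = np.array([
    [[0.9, 0.1], [0.2, 0.8]],   # action 0
    [[0.5, 0.5], [0.1, 0.9]],   # action 1
])
r = np.array([[1.0, 0.0],
              [0.0, 2.0]])
gamma = 0.9
n_states, n_actions = 2, 2

policy = np.zeros(n_states, dtype=int)  # start with action 0 everywhere
for _ in range(20):
    # Policy evaluation: solve (I - gamma * P_pi) v = r_pi exactly.
    P_pi = np.array([P[policy[s], s] for s in range(n_states)])
    r_pi = np.array([r[s, policy[s]] for s in range(n_states)])
    v = np.linalg.solve(np.eye(n_states) - gamma * P_pi, r_pi)
    # Policy improvement: greedy one-step lookahead on Q(s, a).
    q = r + gamma * np.array([[P[a, s] @ v for a in range(n_actions)]
                              for s in range(n_states)])
    new_policy = q.argmax(axis=1)
    if np.array_equal(new_policy, policy):
        break  # policy is stable, hence optimal
    policy = new_policy
```

Because evaluation is exact here, the loop terminates in a handful of iterations with a policy that is greedy with respect to its own value function.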
- Approximate Dynamic Programming (ADP)
- Temporal Difference (TD) Error
- Value Function Approximation
- Case Study: LQR
- Online RL with ADP
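The TD error and value function approximation ideas can be sketched with semi-gradient TD(0) and linear features on the classic 5-state random walk (the environment setup here is illustrative; one-hot features make it the tabular special case):

```python
import numpy as np

rng = np.random.default_rng(0)

# 5-state random walk: start in the middle, step left/right uniformly,
# terminate off either end; reward +1 only when exiting to the right.
n_states = 5
phi = np.eye(n_states)   # one-hot features -> V(s) = w @ phi[s]
w = np.zeros(n_states)
alpha, gamma = 0.1, 1.0

for episode in range(2000):
    s = 2
    while True:
        s_next = s + rng.choice([-1, 1])
        if s_next < 0:            # left terminal, reward 0
            r, v_next, done = 0.0, 0.0, True
        elif s_next >= n_states:  # right terminal, reward +1
            r, v_next, done = 1.0, 0.0, True
        else:
            r, v_next, done = 0.0, w @ phi[s_next], False
        # TD error: delta = r + gamma * V(s') - V(s)
        delta = r + gamma * v_next - w @ phi[s]
        w += alpha * delta * phi[s]   # semi-gradient update
        if done:
            break
        s = s_next

# True values for this walk are V(s) = (s + 1) / 6.
```

With a constant step size the estimates fluctuate around the true values; a decaying step size would converge exactly.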
- Actor-Critic Method
- Case Study: Offshore Wind
- Q-Learning
- Q-learning algorithm
- Advanced Q-learning algorithms, e.g., Deep Q-Networks (DQN)
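Tabular Q-learning, the basis for DQN, can be sketched on a toy deterministic chain (the environment and hyperparameters below are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 4-state chain: action 1 moves right, action 0 moves left
# (saturating at the ends); reaching the last state pays +1 and resets.
n_states, n_actions = 4, 2
gamma, alpha, eps = 0.9, 0.5, 0.3
Q = np.zeros((n_states, n_actions))

s = 0
for step in range(20000):
    # Epsilon-greedy behavior policy.
    a = rng.integers(n_actions) if rng.random() < eps else int(Q[s].argmax())
    s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    r = 1.0 if s_next == n_states - 1 else 0.0
    # Off-policy update toward the greedy target max_a' Q(s', a').
    # The terminal state is never updated, so bootstrapping from it
    # contributes zero, as intended.
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
    s = 0 if s_next == n_states - 1 else s_next

greedy = Q.argmax(axis=1)  # should move right in every non-terminal state
```

Since the chain is deterministic, the Q-values converge to their exact fixed point, e.g. Q(s=2, right) = 1 and Q(s=0, right) = 0.9².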
- Policy Gradient
- Policy Optimization
- Vanilla policy gradient (REINFORCE)
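Vanilla policy gradient (REINFORCE) can be sketched on a toy two-armed bandit with a softmax policy; one-step episodes make the return equal to the immediate reward. The reward means and step sizes are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

theta = np.zeros(2)           # policy parameters: one logit per action
means = np.array([0.2, 0.8])  # illustrative mean rewards
alpha = 0.1
baseline = 0.0                # running-average baseline reduces variance

for episode in range(3000):
    # Softmax policy pi(a) = exp(theta_a) / sum_b exp(theta_b).
    probs = np.exp(theta - theta.max())
    probs /= probs.sum()
    a = rng.choice(2, p=probs)
    G = means[a] + 0.1 * rng.standard_normal()   # noisy return
    # grad log pi(a | theta) for softmax: one_hot(a) - probs.
    grad_log = -probs
    grad_log[a] += 1.0
    # REINFORCE update: theta += alpha * (G - b) * grad log pi(a).
    theta += alpha * (G - baseline) * grad_log
    baseline += 0.05 * (G - baseline)

probs = np.exp(theta - theta.max())
probs /= probs.sum()
```

After training, the policy should place most of its probability on the better arm; the baseline does not bias the gradient, it only reduces its variance.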
- Actor-Critic using Policy Gradient
- Actor-Critic using Policy Gradient
- Advanced actor-critic algorithms, e.g., Deep Deterministic Policy Gradient (DDPG)
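A one-step actor-critic combines the two previous ideas: a softmax actor updated by the policy gradient, with a TD(0) critic whose TD error serves as the advantage estimate. The toy chain environment and step sizes below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy 4-state chain: action 1 moves right, action 0 moves left;
# reaching the last state pays +1 and resets to state 0.
n_states, n_actions = 4, 2
gamma = 0.9
theta = np.zeros((n_states, n_actions))   # actor logits
V = np.zeros(n_states)                    # critic (tabular)
alpha_actor, alpha_critic = 0.1, 0.2

s = 0
for step in range(20000):
    logits = theta[s] - theta[s].max()
    probs = np.exp(logits)
    probs /= probs.sum()
    a = rng.choice(n_actions, p=probs)
    s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    done = s_next == n_states - 1
    r = 1.0 if done else 0.0
    # Critic TD error doubles as the actor's advantage signal.
    delta = r + (0.0 if done else gamma * V[s_next]) - V[s]
    V[s] += alpha_critic * delta
    grad_log = -probs
    grad_log[a] += 1.0
    theta[s] += alpha_actor * delta * grad_log
    s = 0 if done else s_next
```

After training, the actor should prefer moving right in every non-terminal state, and the critic's values should increase toward the goal.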
- RL for energy systems
- Case Study: Battery Fast-charging
- 2020 Lecture Notes [Updated 2020-7-16]
- 2019 Lecture Notes
- TensorFlow review [Updated 2020-7-12]
- Homework [Updated 2020-7-16]