Update report #448
Conversation
Hi @pilgrimygy, could you update the format a bit so the blog renders correctly?
For citations, use the … For figures, use …
This is my mistake. I will update it as soon as possible.
I'll merge this first to help you sync your progress. We can address the GPU-related issue later.
Oops... I merged the wrong PR
This reverts commit e05ed4e.
- [Chapter13 Short Corridor.jl](/blog/notebooks_for_reinforcement_learning_an_introduction/Chapter13_Short_Corridor.jl)
- [Phase 1 Technical Report of Enriching Offline Reinforcement Learning Algorithms in ReinforcementLearning.jl](/blog/offline_reinforcement_learning_algorithm_phase1)
Move this line to the top of this paragraph.
- `Project Information`
- `Project Schedule`
- `Future Plan`
Change this line into plain text to avoid the formatting issue
"authors": [ | ||
"author":"Guoyu Yang", | ||
"authorURL":"https://github.com/pilgrimygy" | ||
"affiliation":"", |
Add a link to your university here.
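The quoted front-matter hunk above appears to have lost its commas and object braces in the diff view. A well-formed version of that fragment might look like the following sketch; the surrounding braces and field order are assumptions, and the empty `affiliation` is where the reviewer suggests adding the university link:

```json
"authors": [
    {
        "author": "Guoyu Yang",
        "authorURL": "https://github.com/pilgrimygy",
        "affiliation": ""
    }
]
```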
This technical report is the first evaluation report of the project "Enriching Offline Reinforcement Learning Algorithms in ReinforcementLearning.jl" in OSPP. It includes three components: project information, project schedule, and future plan.
## Project Information
- Project name: Enriching Offline Reinforcement Learning Algorithms in ReinforcementLearning.jl
- Scheme Description: Recent advances in offline reinforcement learning make it possible to turn reinforcement learning into a data-driven discipline, such that many effective methods from the supervised learning field can be applied. Until now, the only offline method provided in ReinforcementLearning.jl is behavior cloning. We'd like to have more algorithms added, like BCQ and CQL. We expect to implement at least three to four modern offline RL algorithms.
Add links to those concepts like BCQ, CQL.
##### Variational Auto-Encoder (VAE)
In offline reinforcement learning tasks, a VAE is often used to learn from datasets to approximate the behavior policy.
VAE\dcite{DBLP:journals/corr/KingmaW13} ([link](https://github.com/pilgrimygy/ReinforcementLearning.jl/blob/framework/src/ReinforcementLearningCore/src/policies/q_based_policies/learners/approximators/neural_network_approximator.jl)) consists of two neural networks: `encoder` and `decoder`.
Move the link to the first reference in the above line?
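As a rough illustration of the encoder/decoder structure described in the VAE paragraph above, here is a minimal NumPy sketch of a VAE forward pass with the reparameterization trick. All layer sizes, weights, and function names are hypothetical and not taken from ReinforcementLearning.jl (whose actual implementation is in Julia):

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(in_dim, out_dim):
    """Random weight/bias pair standing in for a trained layer (illustrative only)."""
    return rng.normal(scale=0.1, size=(in_dim, out_dim)), np.zeros(out_dim)

# Hypothetical sizes: 4-d state-action input, 2-d latent code.
W_enc, b_enc = dense(4, 8)
W_mu, b_mu = dense(8, 2)
W_lv, b_lv = dense(8, 2)
W_d1, b_d1 = dense(2, 8)
W_d2, b_d2 = dense(8, 4)

def encoder(x):
    # Map the input to the mean and log-variance of a Gaussian latent code.
    h = np.tanh(x @ W_enc + b_enc)
    return h @ W_mu + b_mu, h @ W_lv + b_lv

def decoder(z):
    # Map a latent sample back to a reconstruction of the input.
    h = np.tanh(z @ W_d1 + b_d1)
    return h @ W_d2 + b_d2

def vae_forward(x):
    mu, logvar = encoder(x)
    eps = rng.normal(size=mu.shape)
    z = mu + np.exp(0.5 * logvar) * eps  # reparameterization trick
    return decoder(z), mu, logvar

x = rng.normal(size=(5, 4))              # batch of 5 transitions
recon, mu, logvar = vae_forward(x)
print(recon.shape)  # (5, 4)
```

In a behavior-cloning setting, the decoder would be trained so that sampled latents reproduce actions seen in the offline dataset.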
PR Checklist