Update report #448

Merged 7 commits on Aug 15, 2021

pilgrimygy
Member

PR Checklist

  • Update NEWS.md?

findmyway commented Aug 13, 2021

For citations, use the `\dcite{dayan2009dopamine}` format to cite entries in the bib file.

For figures, use this format: `\dfig{body;2021-02-20_17_41_54-draft.pptx_-_PowerPoint.png; A general workflow between policy and environment.}`

You may refer to https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/blob/master/docs/homepage/blog/an_introduction_to_reinforcement_learning_jl_design_implementations_thoughts/index.md for an example.

@pilgrimygy
Member Author

This is my mistake. I will update it as soon as possible.

@findmyway
Member

I'll merge this first to help you sync your progress. We can address the GPU-related issue later.

@findmyway findmyway merged commit e05ed4e into JuliaReinforcementLearning:master Aug 15, 2021
@findmyway
Member

Oops...

I merged the wrong PR

findmyway added a commit that referenced this pull request Aug 15, 2021
@findmyway findmyway mentioned this pull request Aug 15, 2021
findmyway added a commit that referenced this pull request Aug 15, 2021
- [Chapter13 Short Corridor.jl](/blog/notebooks_for_reinforcement_learning_an_introduction/Chapter13_Short_Corridor.jl)

- [Phase 1 Technical Report of Enriching Offline Reinforcement Learning Algorithms in ReinforcementLearning.jl](/blog/offline_reinforcement_learning_algorithm_phase1)
Member

Move this line to the top of this paragraph.

Comment on lines +4 to +6
- `Project Information`
- `Project Schedule`
- `Future Plan`
Member

Change these lines to plain text to avoid the formatting issue.

"authors": [
"author":"Guoyu Yang",
"authorURL":"https://github.com/pilgrimygy"
"affiliation":"",
Member

Add a link to your university here.

This technical report is the first evaluation report of Project "Enriching Offline Reinforcement Learning Algorithms in ReinforcementLearning.jl" in OSPP. It includes three components: project information, project schedule, future plan.
## Project Information
- Project name: Enriching Offline Reinforcement Learning Algorithms in ReinforcementLearning.jl
- Scheme Description: Recent advances in offline reinforcement learning make it possible to turn reinforcement learning into a data-driven discipline, such that many effective methods from the supervised learning field could be applied. Until now, the only offline method provided in ReinforcementLearning.jl is behavior cloning. We'd like to have more algorithms added like BCQ, CQL. It is expected to implement at least three to four modern offline RL algorithms.
Member

Add links to concepts such as BCQ and CQL.

##### Variational Auto-Encoder (VAE)
In offline reinforcement learning tasks, VAE is often used to learn from datasets to approximate behavior policy.

VAE\dcite{DBLP:journals/corr/KingmaW13} ([link](https://github.com/pilgrimygy/ReinforcementLearning.jl/blob/framework/src/ReinforcementLearningCore/src/policies/q_based_policies/learners/approximators/neural_network_approximator.jl)) consists of two neural networks: `encoder` and `decoder`.
Member

Move the link to the first reference in the above line?
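
For context on the VAE paragraph quoted above, the usual encoder/decoder objective can be sketched as follows. This is the textbook formulation from the cited VAE paper, not a description of the code in the linked file. Conditioning on a state $s$ and an action $a$ is an assumption about how a VAE would approximate a behavior policy, and the symbols $\phi$, $\theta$, $\mu_\phi$, $\sigma_\phi$ and the latent dimension $d$ are illustrative names that do not come from the report.

```latex
\begin{align}
  % Encoder: maps a (state, action) pair to a Gaussian over the latent z
  q_\phi(z \mid s, a) &= \mathcal{N}\!\big(\mu_\phi(s, a),\ \operatorname{diag}(\sigma_\phi^2(s, a))\big) \\
  % Reparameterization: sample z so that gradients can flow through mu and sigma
  z &= \mu_\phi(s, a) + \sigma_\phi(s, a) \odot \epsilon, \qquad \epsilon \sim \mathcal{N}(0, I) \\
  % Decoder: reconstructs the action from the state and the latent
  \hat{a} &\sim p_\theta(a \mid s, z) \\
  % Objective (evidence lower bound), maximized over theta and phi
  \mathcal{L}(\theta, \phi) &= \mathbb{E}_{q_\phi(z \mid s, a)}\big[\log p_\theta(a \mid s, z)\big]
      - D_{\mathrm{KL}}\big(q_\phi(z \mid s, a)\ \|\ \mathcal{N}(0, I)\big) \\
  % Closed form of the KL term for a diagonal Gaussian encoder
  D_{\mathrm{KL}} &= -\tfrac{1}{2} \sum_{j=1}^{d} \big(1 + \log \sigma_j^2 - \mu_j^2 - \sigma_j^2\big)
\end{align}
```

Under these assumptions, sampling $z \sim \mathcal{N}(0, I)$ at decision time and passing it through the decoder with the current state yields actions resembling those in the dataset.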
