Does VIN naturally work with reinforcement learning? #11
Hi,
The VIN is a mapping from an observation to a probability distribution over actions, so it can be used directly as a policy representation in either supervised learning or RL algorithms. For policy-gradient-type algorithms this is immediate. For Q-learning, you can think of the output of the VIN as approximating a Q value.
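To make both readings concrete, here is a minimal sketch (PyTorch is assumed here; it is not necessarily what this repository uses) of a VIN output used first as policy logits for a REINFORCE-style loss and then as Q-values for a DQN-style TD loss. The `vin` and `target_vin` callables are hypothetical stand-ins for a network mapping a batch of observations to one score per action; the function names and hyperparameters are illustrative, not taken from the paper or this repo.

```python
import torch
import torch.nn.functional as F

def policy_gradient_loss(vin, obs, actions, returns):
    # Read the VIN outputs as logits of a stochastic policy pi(a|s)
    # and apply the REINFORCE objective: -E[log pi(a_t|s_t) * R_t].
    logits = vin(obs)                                   # (batch, num_actions)
    log_probs = F.log_softmax(logits, dim=-1)
    chosen = log_probs.gather(1, actions.unsqueeze(1)).squeeze(1)
    return -(chosen * returns).mean()

def q_learning_loss(vin, target_vin, obs, actions, rewards,
                    next_obs, done, gamma=0.99):
    # Alternatively, read the same outputs as Q(s, a) and regress them
    # onto a standard one-step TD target, DQN-style.
    q = vin(obs).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        next_v = target_vin(next_obs).max(dim=1).values
        target = rewards + gamma * (1.0 - done) * next_v
    return F.smooth_l1_loss(q, target)
```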
Indeed, the VIN formulation is most suitable for problems where the underlying planning computation can be represented as a finite (and small) MDP. Many problems have this property - see for example the continuous control domain in the paper, where, although the problem was continuous, the essential planning computation could be done on a grid. However, there are many problems where this does not hold. For example, in many Atari games the planning problem is not naturally represented on a grid/graph (at least not in a trivial way). Extending the idea of deep networks that perform a planning computation to such domains is still an active research area. Recent papers in this direction include Value Prediction Networks by Oh et al. and Imagination-Augmented Agents from DeepMind.
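For reference, the grid-based planning computation referred to above is the value-iteration module of the paper: K rounds of a convolution that produces per-action Q maps over the grid, followed by a max over the action channel. Below is a rough sketch of that recurrence; the layer sizes, the number of iterations, and the PyTorch framing are placeholders, not the exact configuration used in this repository.

```python
import torch

class VIModule(torch.nn.Module):
    # Sketch of the VI module: V_0 = 0, then repeat K times
    #   Q = conv([R; V]),  V = max_a Q
    # over an m x m reward map R.
    def __init__(self, num_actions=8, k=20):
        super().__init__()
        self.k = k
        # One output channel per action; the 2 input channels are [R, V].
        self.q_conv = torch.nn.Conv2d(2, num_actions, kernel_size=3,
                                      padding=1, bias=False)

    def forward(self, r):                        # r: (batch, 1, m, m) reward map
        v = torch.zeros_like(r)                  # initial value map
        q = self.q_conv(torch.cat([r, v], dim=1))
        for _ in range(self.k):                  # K steps of value iteration
            v, _ = q.max(dim=1, keepdim=True)    # V(s) = max_a Q(s, a)
            q = self.q_conv(torch.cat([r, v], dim=1))
        return q, v
```

The full VIN then attends to the Q values at the agent's current grid position and feeds them, together with the observation features, to a reactive policy that produces the action scores discussed above.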
Aviv
…On Fri, Nov 24, 2017 at 7:15 PM, xinleipan wrote:
From my reading of the paper, the examples in the main paper were mainly aimed at supervised learning (imitation learning), though there are some examples using reinforcement learning. So the question is: does VIN naturally work with RL? In addition, almost all examples involve extracting some high-level grid-world representation of the state space; it is not clear how this model could be applied to a more realistic domain where representing all states may be infeasible.