Does VIN naturally work with reinforcement learning? #11
Hi,
The VIN is a mapping from an observation to a probability distribution over actions, so it can be used directly as a policy representation in either supervised learning or RL algorithms. For policy-gradient-type algorithms this is immediate. For Q-learning, you can think of the output of the VIN as approximating a Q value.
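To make both readings concrete, here is a minimal sketch (PyTorch is assumed here; it is not necessarily what this repository uses) of a VIN output used first as policy logits for a REINFORCE-style loss and then as Q-values for a DQN-style TD loss. The `vin` and `target_vin` callables are hypothetical stand-ins for a network mapping a batch of observations to one score per action; the function names and hyperparameters are illustrative, not taken from the paper or this repo.

```python
import torch
import torch.nn.functional as F

def policy_gradient_loss(vin, obs, actions, returns):
    # Read the VIN outputs as logits of a stochastic policy pi(a|s)
    # and apply the REINFORCE objective: -E[log pi(a_t|s_t) * R_t].
    logits = vin(obs)                                   # (batch, num_actions)
    log_probs = F.log_softmax(logits, dim=-1)
    chosen = log_probs.gather(1, actions.unsqueeze(1)).squeeze(1)
    return -(chosen * returns).mean()

def q_learning_loss(vin, target_vin, obs, actions, rewards,
                    next_obs, done, gamma=0.99):
    # Alternatively, read the same outputs as Q(s, a) and regress them
    # onto a standard one-step TD target, DQN-style.
    q = vin(obs).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        next_v = target_vin(next_obs).max(dim=1).values
        target = rewards + gamma * (1.0 - done) * next_v
    return F.smooth_l1_loss(q, target)
```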
Indeed, the VIN formulation is most suitable for problems where the underlying planning computation can be represented as a finite (and small) MDP. Many problems have this property - see for example the continuous control domain in the paper, where, although the problem was continuous, the essential planning computation could be done on a grid. However, there are many problems where this does not hold. For example, in many Atari games the planning problem is not naturally represented on a grid/graph (at least not in a trivial way). Extending the idea of deep networks that perform a planning computation to such domains is still an active research area. Recent papers in this direction include Value Prediction Networks by Oh et al. and Imagination-Augmented Agents from DeepMind.
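For reference, the grid-based planning computation referred to above is the value-iteration module of the paper: K rounds of a convolution that produces per-action Q maps over the grid, followed by a max over the action channel. Below is a rough sketch of that recurrence; the layer sizes, the number of iterations, and the PyTorch framing are placeholders, not the exact configuration used in this repository.

```python
import torch

class VIModule(torch.nn.Module):
    # Sketch of the VI module: V_0 = 0, then repeat K times
    #   Q = conv([R; V]),  V = max_a Q
    # over an m x m reward map R.
    def __init__(self, num_actions=8, k=20):
        super().__init__()
        self.k = k
        # One output channel per action; the 2 input channels are [R, V].
        self.q_conv = torch.nn.Conv2d(2, num_actions, kernel_size=3,
                                      padding=1, bias=False)

    def forward(self, r):                        # r: (batch, 1, m, m) reward map
        v = torch.zeros_like(r)                  # initial value map
        q = self.q_conv(torch.cat([r, v], dim=1))
        for _ in range(self.k):                  # K steps of value iteration
            v, _ = q.max(dim=1, keepdim=True)    # V(s) = max_a Q(s, a)
            q = self.q_conv(torch.cat([r, v], dim=1))
        return q, v
```

The full VIN then attends to the Q values at the agent's current grid position and feeds them, together with the observation features, to a reactive policy that produces the action scores discussed above.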
Aviv
…On Fri, Nov 24, 2017 at 7:15 PM, xinleipan wrote:
From my reading of the paper, the examples in the main paper were mainly aimed at supervised learning (imitation learning), though there are some examples using reinforcement learning. So the question is: does VIN naturally work with RL? In addition, almost all examples involve extracting some high-level grid-world representation of the state space; it is not clear how this model could be applied to a more realistic domain where representing all states may be infeasible.