
Reinforcement Learning Examples #284

Closed · wants to merge 20 commits

Conversation


@ozanarkancan ozanarkancan commented Mar 12, 2018

Hi,

I am starting this pull request for reinforcement learning examples. I have implemented the following:

  • Value Iteration, Policy Evaluation, Policy Iteration
  • REINFORCE for both discrete and continuous action spaces

My TODO list

  • DQN
  • Actor Critic
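
For readers unfamiliar with the tabular methods listed above, a minimal value-iteration sketch in current Julia syntax might look like the following. This is purely illustrative and not the PR's actual code; it assumes the MDP transition model P[s][a] is given as a list of (probability, next_state, reward) tuples, as in gym's FrozenLake.

```julia
# Illustrative tabular value iteration (not the PR's code).
# P[s][a] :: Vector of (prob, next_state, reward) tuples; nS states, nA actions.
function value_iteration(P, nS, nA; gamma=0.99, tol=1e-8)
    V = zeros(nS)
    while true
        delta = 0.0
        for s in 1:nS
            # Bellman optimality backup: best expected one-step return.
            q = [sum(p * (r + gamma * V[sn]) for (p, sn, r) in P[s][a]) for a in 1:nA]
            v = maximum(q)
            delta = max(delta, abs(v - V[s]))
            V[s] = v
        end
        delta < tol && break
    end
    # Extract the greedy policy with respect to the converged V.
    policy = [argmax([sum(p * (r + gamma * V[sn]) for (p, sn, r) in P[s][a])
                      for a in 1:nA]) for s in 1:nS]
    return V, policy
end
```

Policy iteration differs only in alternating a full policy-evaluation sweep with a greedy improvement step instead of taking the max inside every backup.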

@ozanarkancan changed the title from "Rl" to "Reinforcement Learning Examples" on Mar 12, 2018
push!(rewards, reward[1])
push!(actions, action[1])

if o["render"]

o["render"] && render(env) for conciseness?

render(env)
end

if done

done && break
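
For context, the rewards collected in the rollout loop above are typically converted into discounted returns before computing the REINFORCE gradient. A minimal sketch (illustrative only; gamma is an assumed discount parameter, not taken from the PR):

```julia
# Backward pass turning per-step rewards into discounted returns G_t,
# as used by REINFORCE (illustrative sketch, not the PR's code).
function discounted_returns(rewards; gamma=0.99)
    G = zeros(length(rewards))
    running = 0.0
    for t in length(rewards):-1:1
        running = rewards[t] + gamma * running
        G[t] = running
    end
    return G
end
```

Each G[t] then weights the log-probability gradient of the action taken at step t.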

return f1 * x + f2 * abs.(x)
end

function predict_linear(w, ob)

predict_probs or probs or predict instead of predict_linear?


function sample_action(μ; σ=1.0)
μ = convert(Array{Float32}, μ)
a = μ + randn() * σ

randn(size(μ)) instead of randn()


also use .+ and .*
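
Combining the two review suggestions (per-dimension noise via randn(size(μ)) and broadcasting with .+ and .*), the sampler might look like this sketch:

```julia
# Gaussian policy sampler incorporating the review suggestions:
# draw one noise value per action dimension and broadcast the arithmetic.
function sample_action(μ; σ=1.0)
    μ = convert(Array{Float32}, μ)
    return μ .+ randn(size(μ)) .* σ
end
```

With a scalar randn() as originally written, every action dimension would share the same noise sample, which under-explores in multi-dimensional action spaces.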

@CarloLucibello (Collaborator)

The package Gym should be submitted to METADATA before merging this.

@CarloLucibello (Collaborator) commented Mar 12, 2018

I coded some actor-critic examples here: https://github.com/CarloLucibello/DeepRLexamples.jl. Feel free to reuse any of that code if you find it useful.

@ereday (Collaborator) commented Mar 18, 2018

There is a typo in all examples (like this and this). I guess pkb should be pkg.

@ozanarkancan (Collaborator, Author)
fixed

@ereday (Collaborator) commented Mar 18, 2018

Now actor_critic_discrete.jl complains about Gym initialization:

ERROR: LoadError: InitError: PyError (ccall(@pysym(:PyImport_ImportModule), PyPtr, (Cstring,), name)
Stacktrace:
 [1] pyerr_check at /home/erenay/.julia/v0.6/PyCall/src/exception.jl:56 [inlined]
 [2] pyerr_check at /home/erenay/.julia/v0.6/PyCall/src/exception.jl:61 [inlined]
 [3] macro expansion at /home/erenay/.julia/v0.6/PyCall/src/exception.jl:81 [inlined]
 [4] pyimport(::String) at /home/erenay/.julia/v0.6/PyCall/src/PyCall.jl:374
 [5] __init__() at /home/erenay/.julia/v0.6/Gym/src/Gym.jl:10
 [6] _include_from_serialized(::String) at ./loading.jl:157
 [7] _require_from_serialized(::Int64, ::Symbol, ::String, ::Bool) at ./loading.jl:200
 [8] _require(::Symbol) at ./loading.jl:498
 [9] require(::Symbol) at ./loading.jl:405
 [10] include_from_node1(::String) at ./loading.jl:576
 [11] include(::String) at ./sysimg.jl:14
 [12] process_options(::Base.JLOptions) at ./client.jl:305
 [13] _start() at ./client.jl:371
during initialization of module Gym
while loading /home/erenay/.julia/v0.6/Knet/examples/actor_critic_discrete.jl, in expression starting on line 22

@ereday (Collaborator) commented Mar 18, 2018

It works now. @denizyuret, the examples work; this can be merged.

@denizyuret (Owner)

  • Notebooks get too big with plots, in Knet I am committing versions of all notebooks with text outputs but without plots (2.5M -> 28K). I did a fresh copy into master as we don't want the 2M files in git history either. (Another alternative is to save with plots under its own package in KnetML). Please continue development under current master.
  • In the dp notebook GYM_ENVS should be in quotes -- fixed.
  • In the dp notebook I get a "WARNING: special characters "#{}()[]<>|&*?~;" should now be quoted in commands" during Pkg.build("Gym"), don't know if it hurts anything.
  • In the dp notebook env = GymEnv("FrozenLake-v0") gives error 'No module named scipy.stats', is there any way scipy can be automatically installed during Pkg.build? Please test with fresh python without preinstalled packages.
  • In reinforce I get the following in the last box: 'WARN: gym.spaces.Box autodetected dtype as <type 'numpy.float32'>. Please provide explicit dtype.', is this something we can fix?
  • I haven't been able to run dqn yet, it hangs at DQN.main("--help"), maybe I am not giving it enough memory?

@ozanarkancan (Collaborator, Author)

  • I guess the "WARNING: special characters "#{}()[]<>|&*?~;" ..." message is related to a Julia issue.
  • scipy is listed in gym's requirements, but somehow it is not installed automatically. I will investigate.
  • The 'WARN: gym.spaces.Box autodetected dtype as <type 'numpy.float32'>' warning comes from gym itself, even when it is used from Python.
  • The last one might be a notebook issue; julia dqn.jl --printinfo should work.

@denizyuret (Owner)

denizyuret commented Mar 18, 2018 via email

denizyuret added a commit that referenced this pull request Mar 26, 2018
4 participants