Use of ProgressMeter.jl and added a `reset_callback` to DPWSolver #69

mossr · 2020-08-11T08:37:13Z

Since MCTS is an "anytime" algorithm, it's nice to know how long it will take :)
Therefore, I added support for using @showprogress from ProgressMeter.jl within action_info for the DPWPlanner. Note the progress meter is off by default.

I also added a reset_callback(mdp, s) option for the DPWSolver so you can specify a callback function to reset your MDP to a given state. I used these changes for a recent DASC 2020 paper.

zsunberg · 2020-08-11T18:44:33Z

@mossr , thanks for the contribution!

Do you know if the progress meter addition could have any performance consequences? There is a very wide variety of ways this package is used - some people use it to plan for minutes or hours; some use it to plan for 0.01 seconds. I am a bit concerned that if it is used for 0.01 seconds, the progress meter might take a significant portion of that. I would be more comfortable using code where we will be sure that julia will optimize out all progress meter code.

(In the long run, it would be better to have a package that is designed for different use cases than just running POMDPs.jl simulations like this one 😄)

I am not sure I understand the need for reset_callback. Is it because you have a simulator where the state is not actually separate from the mdp?

(In the long run, the solution to this would be to have a better package that uses an interface that does not assume that the mdp model and state are completely separate objects, like CommonRLInterface.jl 😄. I think with all the julia experience we have acquired, we could make a much much better MCTS package that is easier for people to use and extend for many different use cases. If you happen to be interested in working on this, let me know and we can zoom about it! I have wanted to work on it for a long time, but have had to focus on other things.)

findmyway · 2020-08-14T03:02:02Z

> I think with all the julia experience we have acquired, we could make a much much better MCTS package that is easier for people to use and extend for many different use cases

Count me in! 😉

findmyway · 2020-08-14T03:10:09Z

src/dpw.jl

@@ -15,9 +15,10 @@ POMDPs.action(p::DPWPlanner, s) = first(action_info(p, s))
 """
 Construct an MCTSDPW tree and choose the best action. Also output some information.
 """
-function POMDPModelTools.action_info(p::DPWPlanner, s; tree_in_info=false)
+function POMDPModelTools.action_info(p::DPWPlanner, s; tree_in_info=false, show_progress=false)


I'd suggest using a Union{Progress, Nothing} here, so that @zsunberg will have no concerns about performance 😁
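For readers following along, here is a minimal sketch of the pattern being suggested, assuming ProgressMeter.jl is installed. The function `run_iterations` is a hypothetical stand-in for the planner's simulation loop, not the MCTS.jl code; only `Progress` and `next!` come from ProgressMeter.

```julia
using ProgressMeter  # assumption: ProgressMeter.jl is available in the environment

# Hypothetical stand-in for the planner's simulation loop (not the MCTS.jl code).
function run_iterations(n::Int; show_progress::Bool=false)
    # `progress` is either a Progress meter or `nothing`.
    progress = show_progress ? Progress(n) : nothing
    total = 0
    for i in 1:n
        total += i                  # placeholder for one tree-search iteration
        if progress !== nothing     # cheap check on a small Union type
            next!(progress)
        end
    end
    return total
end

run_iterations(100)                      # no meter is created or updated
run_iterations(100; show_progress=true)  # updates a progress meter each iteration
```

With `show_progress=false` the loop body only pays for a `nothing` comparison, which is the property the reviewers are after.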

mossr · 2020-08-17T22:04:17Z

@zsunberg Great point regarding performance. I modified my branch based on the suggestion from @findmyway to use the Progress type instead of @showprogress. This will optimize the loop to skip any additional computation when show_progress=false (which is the default). I attempted a solution using Requires.jl, but found an issue with that package that I filed here (if you're curious): JuliaPackaging/Requires.jl#88 (tl;dr POMDPSimulators loads ProgressMeter which triggers MCTS to think that it has direct access to ProgressMeter).

Regarding reset_callback, your explanation is correct. We use this feature for adaptive stress testing (POMDPStressTesting.jl) when the simulator state is not truly separate from the MDP state.
I updated the description of reset_callback and comments around where it's used to clarify this.
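To make the use case concrete, here is a small sketch of a model whose simulator state is not separate from the MDP object. Everything in it (`SimMDP`, `reset_sim!`, `rollout!`) is hypothetical and only mirrors the `(mdp, s)` callback signature described above; it is not code from MCTS.jl or POMDPStressTesting.jl.

```julia
# Hypothetical simulator-backed MDP: the simulator's state lives inside the model.
mutable struct SimMDP
    sim_position::Int   # internal simulator state, mutated during rollouts
end

# Hypothetical callback with the (mdp, s) signature described in this PR.
reset_sim!(mdp::SimMDP, s::Int) = (mdp.sim_position = s; nothing)

# Stand-in for one search rollout: reset to the root state, then simulate.
function rollout!(mdp::SimMDP, s::Int, reset_callback; depth::Int=3)
    reset_callback(mdp, s)
    for _ in 1:depth
        mdp.sim_position += 1   # placeholder for stepping the simulator
    end
    return mdp.sim_position
end

mdp = SimMDP(0)
first_result  = rollout!(mdp, 10, reset_sim!)
second_result = rollout!(mdp, 10, reset_sim!)  # matches first_result thanks to the reset
```

Without the reset, the second rollout would start from wherever the first one left the simulator.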

I am interested in a general MCTS package, but—like you—don't have the time at the moment. Maybe in 6-12 months time we can reconvene about this! 😄

codecov · 2020-08-17T23:03:12Z

Codecov Report

Merging #69 into master will increase coverage by 0.20%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master      #69      +/-   ##
==========================================
+ Coverage   85.30%   85.51%   +0.20%     
==========================================
  Files          11       11              
  Lines         415      421       +6     
==========================================
+ Hits          354      360       +6     
  Misses         61       61

| Impacted Files | Coverage Δ | |
|---|---|---|
| src/MCTS.jl | `100.00% <ø> (ø)` | |
| src/dpw.jl | `96.19% <100.00%> (+0.19%)` | ⬆️ |
| src/dpw_types.jl | `83.87% <100.00%> (+0.53%)` | ⬆️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1434f7e...a3e9c2b. Read the comment docs.

zsunberg · 2020-08-24T16:30:43Z

Sorry for not responding for a while. I need to request a few more changes:

1. For consistency, we should have the progress option available in the solver constructor.
2. Do we need to call finish! on the progress meter after the loop in case it terminates early because of time?
3. I don't think this code branch

   MCTS.jl/src/dpw.jl, lines 97 to 99 in c006ef2:

       if sol.reset_callback !== nothing
           sol.reset_callback(dpw.mdp, s) # Optional: used to reset/reinitialize MDP to a given state.
       end

   can be optimized out because the compiler does not know what type reset_callback is. Can you either verify that it is optimized out or make reset_callback inferrable (probably by adding it to DPWPlanner)? (Or if you have another argument about why it is ok, I am happy to hear it.)
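The question about finish! can be illustrated with a short sketch, assuming ProgressMeter.jl is installed. `timed_iterations` is a hypothetical stand-in for a time-limited search loop, not the planner itself; `Progress`, `next!`, and `finish!` are the ProgressMeter calls.

```julia
using ProgressMeter  # assumption: ProgressMeter.jl is available in the environment

# Hypothetical time-limited loop: stops after `max_time` seconds or `n` iterations.
function timed_iterations(n::Int, max_time::Float64)
    progress = Progress(n)
    start = time()
    completed = 0
    for i in 1:n
        if time() - start >= max_time
            finish!(progress)   # close out the meter instead of leaving it partway
            break
        end
        completed += 1          # placeholder for one tree-search iteration
        next!(progress)
    end
    return completed
end
```

If the loop runs all `n` iterations, the last `next!` completes the meter on its own, so the explicit call only matters on the early-exit path.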

mossr · 2020-09-04T07:17:25Z

Hey @zsunberg, to counter your delay I delayed 🙃

Regarding your comments (which were all warranted):

1. Agreed regarding show_progress: I moved the option to the solver.
2. Done. I now call finish! inside the timeout conditional right before we break.
3. This one was fun to prove! I changed the type of reset_callback from Any to Function with a default of (mdp, s)->false. Beautifully, Julia optimizes out that call as shown in my test example below.

mutable struct Obj value end # Some random structure

obj = Obj(0) # With an instantiation

# Function under test, which includes an inner call to `callback`
function perf_test(obj, callback::Function)
    callback(obj)
    return (10,20)
end

First, it's cleaner to show the @code_typed output for the two cases: 1) when the callback input function is our default (i.e. blindly returns false) and 2) when it actually performs some operation on its input:

julia> @code_typed perf_test(obj, o->false)
CodeInfo(
1 ─     return (10, 20)
) => Tuple{Int64,Int64}

julia> @code_typed perf_test(obj, o->o.value+=1)
CodeInfo(
1 ─     invoke callback(_2::Obj)::Any
└──     return (10, 20)
) => Tuple{Int64,Int64}

Next, to compare using @code_native (the machine code that is ultimately run, and thus harder to follow at first), I first created another function called perf_test_empty that simply does not call the callback internally:

function perf_test_empty(obj, callback::Function)
    return (10,20)
end

Running both functions, each with the callback that blindly returns false:

julia> @code_native perf_test(obj, o->false)
        .text
; ┌ @ REPL[6]:2 within `perf_test'
        pushq   %rbp
        movq    %rsp, %rbp
; │ @ REPL[6]:3 within `perf_test'
        movabsq $736989064, %rax        # imm = 0x2BED8F88
        vmovups (%rax), %xmm0
        vmovups %xmm0, (%rcx)
        movq    %rcx, %rax
        popq    %rbp
        retq
        nopl    (%rax,%rax)
; └

julia> @code_native perf_test_empty(obj, o->false)
        .text
; ┌ @ REPL[7]:2 within `perf_test_empty'
        pushq   %rbp
        movq    %rsp, %rbp
        movabsq $736990256, %rax        # imm = 0x2BED9430
        vmovups (%rax), %xmm0
        vmovups %xmm0, (%rcx)
        movq    %rcx, %rax
        popq    %rbp
        retq
        nopl    (%rax,%rax)
; └

We can also show what perf_test and perf_test_empty look like when they both take in the callback that modifies its inputs:

julia> @code_native perf_test(obj, o->o.value+=1)
        .text
; ┌ @ REPL[6]:2 within `perf_test'
        pushq   %rbp
        movq    %rsp, %rbp
        pushq   %rsi
        subq    $40, %rsp
        movq    %rcx, %rsi
        movq    %rdx, -16(%rbp)
        movabsq $"japi1_#27_17172", %rax
        leaq    -16(%rbp), %rdx
        movl    $811572056, %ecx        # imm = 0x305F9B58
        movl    $1, %r8d
        callq   *%rax
; │ @ REPL[6]:3 within `perf_test'
        movabsq $736989448, %rax        # imm = 0x2BED9108
        vmovups (%rax), %xmm0
        vmovups %xmm0, (%rsi)
        movq    %rsi, %rax
        addq    $40, %rsp
        popq    %rsi
        popq    %rbp
        retq
        nopw    (%rax,%rax)
; └

julia> @code_native perf_test_empty(obj, o->o.value+=1)
        .text
; ┌ @ REPL[7]:2 within `perf_test_empty'
        pushq   %rbp
        movq    %rsp, %rbp
        movabsq $736990128, %rax        # imm = 0x2BED93B0
        vmovups (%rax), %xmm0
        vmovups %xmm0, (%rcx)
        movq    %rcx, %rax
        popq    %rbp
        retq
        nopl    (%rax,%rax)
; └

Therefore, I now just call p.solver.reset_callback(p.mdp, s) without wrapping the !== nothing conditional and Julia should optimize it out! Note, I did the same exercise with the !== nothing conditional and—as you suspected—Julia did not optimize that branch out. So I'm pretty happy with the above (and committed) solution.

mossr · 2020-09-04T10:34:13Z

Note I relaxed the Colors.jl lower bound version requirement due to conflicts between Flux v0.10 and MCTS v0.4.3 (without issue)

zsunberg · 2020-09-07T21:02:46Z

Ok, we are close... Thanks for being thorough! Unfortunately, I don't think the reset_callback call will be optimized out in MCTS. It was optimized out in the example you provided because Julia compiles a specialized version of perf_test for the concrete argument types, so essentially you have a function barrier there. This won't work in MCTS because the solver type does not contain information about the concrete type of reset_callback.

I went ahead and fixed it by putting it into the planner object with a type parameter. (I am not sure if this is the absolute best pattern, but it is what I have been using and I am just maintaining consistency)
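A rough sketch of the difference, using hypothetical types (`LooseSolver`, `TypedPlanner`) rather than the actual MCTS.jl definitions: a field declared as the abstract type `Function` hides the callback's concrete type, while a type parameter makes it part of the struct's type.

```julia
# Field typed as the abstract `Function`: the concrete callback type is not
# visible from the struct's type. (Hypothetical, not the real DPWSolver.)
struct LooseSolver
    reset_callback::Function
end

# Callback type captured as a type parameter, so it is part of the struct's type.
# (Hypothetical, not the real DPWPlanner.)
struct TypedPlanner{F}
    reset_callback::F
end

noop(mdp, s) = false

loose = LooseSolver(noop)
typed = TypedPlanner(noop)

# The abstract field gives the compiler nothing concrete to specialize on...
loose_is_concrete = isconcretetype(fieldtype(typeof(loose), :reset_callback))
# ...while the parametric field has exactly the type `typeof(noop)`.
typed_is_concrete = isconcretetype(fieldtype(typeof(typed), :reset_callback))

# Code that reads the field can only specialize on the callback in the typed case.
run_reset(holder, mdp, s) = holder.reset_callback(mdp, s)
```

Here `loose_is_concrete` is `false` and `typed_is_concrete` is `true`, which is why only the parametric version lets the compiler see through a call such as `run_reset(typed, mdp, s)`.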

I also added Colors 0.12 back into the compatibility in addition to 0.11. I don't think there is any reason to limit it to just 0.11.

If the tests pass, I will merge this.

mossr · 2020-09-07T21:25:18Z

Interesting and good to know—thanks for going ahead and fixing it!

zsunberg · 2020-09-07T21:54:48Z

Ok, @mossr is there anything else to add before we register a new version?

mossr · 2020-09-07T22:03:23Z

@zsunberg Nothing code-wise, but I thought we could address the GitHub linguist issue I opened beforehand: #71

mossr added 5 commits February 14, 2020 14:27

Added DPWSolver parameter 'enable_state_pw' for explicit enabling of …

c669087

…state space progressive widening (defaults to true, i.e. previous behavior)

Added 'reset_callback' option to DPWSolver

56080c4

ProgressMeter: added @showprogress around DPW simulate

863a70a

Added show_progress=false kwarg to DWP action_info

2cc8cf7

Merge remote-tracking branch 'upstream/master'

dd4f21f

findmyway reviewed Aug 14, 2020

mossr added 3 commits August 17, 2020 14:23

Removed use of isnothing for Julia v1.0 compatibility

98db9f5

Ensure no performance degradation when not using ProgressMeter

0a8ef3c

Clarify use of reset_callback

9f810ab

Merge remote-tracking branch 'upstream/master'

3b48768

mossr added 2 commits August 17, 2020 16:26

Added DWP test for ProgressMeter and reset_callback

e204118

Removed left-over progress dt

c006ef2

mossr added 2 commits September 3, 2020 23:43

Optimized out reset_callback, moved show_progress to solver, and call…

50673ba

…ed finish! on progress meter after timeout

Fixed show_progress test

bb6c6cb

Relax Colors lower bound version requirement

844f055

put reset_callback into planner

a3e9c2b

zsunberg merged commit 6782345 into JuliaPOMDP:master Sep 7, 2020
