Performance of input independent functions #14

Closed
raminammour opened this issue Mar 24, 2020 · 8 comments

@raminammour

Hello,

First, thanks for the package; it came in very handy in code that I needed. However, I noticed a performance hit that I couldn't understand:

using FunctionWrappers
fwrap=FunctionWrappers.FunctionWrapper{Float64,Tuple{Float64,Float64}}
Fvec=fwrap[(i,j)->0.5;(i,j)->i-i+0.5]

using BenchmarkTools
function foo1(r1,r2,Fvec)
    Fvec[1].(r1,r2)
end
function foo2(r1,r2,Fvec)
    Fvec[2].(r1,r2)
end

r1,r2=rand(10),rand(1,11)
@btime foo1($r1,r2,$Fvec)
@btime foo2($r1,$r2,$Fvec)

  2.813 μs (221 allocations: 4.42 KiB)
  671.712 ns (1 allocation: 1008 bytes)

As you can see, the two functions return the same value; the second one is only written to look as if it depends on its inputs, yet the performance is very different.

Is this a bug, or am I missing something?

Cheers!

@yuyichao
Owner

foo1($r1,r2,$Fvec)

Missing $

Also, the second function will be intrinsically slower, though that shouldn't matter for allocation.
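
For context on the `$` remark: BenchmarkTools treats names that are not interpolated as non-constant globals, so the cost of looking them up can leak into the measurement. A minimal sketch, where `total` and `data` are hypothetical names used only for illustration and are not part of the original report:

```julia
using BenchmarkTools

# Hypothetical helper and data, not from the original report.
total(v) = sum(v)
data = rand(10)          # non-constant global

@btime total(data)       # `data` is looked up as an untyped global on every evaluation
@btime total($data)      # `$` splices the value in, so only `total` itself is timed
```

Timings are omitted on purpose, since they depend on the machine and the Julia version.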

@raminammour
Author

Adding $ changes nothing:

@btime foo1($r1,$r2,$Fvec)
@btime foo2($r1,$r2,$Fvec)
  2.738 μs (221 allocations: 4.42 KiB)
  655.563 ns (1 allocation: 1008 bytes)

I don't understand why the issue was closed. foo2 is faster than foo1, not slower as would be expected, and foo1 inexplicably allocates more.

@yuyichao
Owner

Adding $ changes nothing:

In that case it seems to be a Base cfunction codegen issue.

@raminammour
Author

So should I file an issue against Julia Base?

@yuyichao
Owner

Yes, the cfunction code generation appears to be inefficient and non-repeatable.

julia> using BenchmarkTools

julia> f3(x, y) = x - x + 0.5
f3 (generic function with 1 method)

julia> g3(f, x, y) = f3(x, y)
g3 (generic function with 1 method)

julia> p3 = @cfunction(g3, Float64, (Ref{typeof(f3)}, Float64, Float64))
Ptr{Nothing} @0x00007f8405b31650

julia> @benchmark ccall($p3, Float64, (Ref{typeof(f3)}, Float64, Float64), f3, 1.0, 2.0)
BenchmarkTools.Trial:
  memory estimate:  48 bytes
  allocs estimate:  3
  --------------
  minimum time:     66.488 ns (0.00% GC)
  median time:      68.980 ns (0.00% GC)
  mean time:        74.425 ns (0.45% GC)
  maximum time:     510.097 ns (73.69% GC)
  --------------
  samples:          10000
  evals/sample:     977

julia> p3 = @cfunction(g3, Float64, (Ref{typeof(f3)}, Float64, Float64))
Ptr{Nothing} @0x00007f8405b42ec0

julia> @benchmark ccall($p3, Float64, (Ref{typeof(f3)}, Float64, Float64), f3, 1.0, 2.0)
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     4.527 ns (0.00% GC)
  median time:      4.554 ns (0.00% GC)
  mean time:        4.965 ns (0.00% GC)
  maximum time:     44.207 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1000

@yuyichao
Owner

(Note that the case I posted above may or may not be the direct cause of the issue, but it is standalone and reliable enough to reproduce the issue on 1.3.0.)

@raminammour
Author

Thanks for the MWE! One last question :)
Do you have any hint about why the slowdown/allocation happens in the example I posted above for the function that does not depend on its inputs, while it doesn't for the other function?

@yuyichao
Owner

I don't know what condition triggers it. From minimal testing, it appears to be sensitive to seemingly random factors, including simply running it a second time, as shown above.
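
One way to look for the allocation without BenchmarkTools is to wrap the ccall in a function and measure it with Base's `@allocated`. This is only a sketch that reuses `f3` and `g3` from the example earlier in the thread; `call_many` is a hypothetical helper, and since the behaviour was reported as non-repeatable, the number it prints may differ between sessions and Julia versions.

```julia
f3(x, y) = x - x + 0.5
g3(f, x, y) = f3(x, y)

# Hypothetical helper: call the C-callable pointer n times and sum the results.
function call_many(p::Ptr{Cvoid}, n::Int)
    s = 0.0
    for _ in 1:n
        s += ccall(p, Float64, (Ref{typeof(f3)}, Float64, Float64), f3, 1.0, 2.0)
    end
    return s
end

p3 = @cfunction(g3, Float64, (Ref{typeof(f3)}, Float64, Float64))
call_many(p3, 1)                          # warm up so compilation is not measured
bytes = @allocated call_many(p3, 1000)    # bytes allocated across 1000 calls
println("allocated bytes: ", bytes)
```

Recreating `p3` with a second `@cfunction` call and measuring again mirrors the two runs shown in the thread.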
