Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inference/dispatch penalty of Union return types #23338

Closed
quinnj opened this issue Aug 18, 2017 · 6 comments
Closed

Inference/dispatch penalty of Union return types #23338

quinnj opened this issue Aug 18, 2017 · 6 comments
Assignees
Labels
compiler:inference Type inference missing data Base.missing and related functionality performance Must go faster

Comments

@quinnj
Copy link
Member

quinnj commented Aug 18, 2017

I'm seeing a 2x slowdown for Union return types vs. doing a manual type check:

julia> f(::Void) = Int8(0)
f (generic function with 3 methods)

julia> f(x) = x
f (generic function with 4 methods)

julia> function f(v::Vector{Union{Void, Int8}}, n)
         s = Int8(0)
         @inbounds for i = 1:n
           t = Base.arrayref(v, i)
           s += f(t)
         end
         return s
       end
f (generic function with 4 methods)

julia> function f2(v::Vector{Union{Void, Int8}}, n)
         s = Int8(0)
         @inbounds for i = 1:n
           t = Base.arrayref(v, i)
           s += t isa Void ? f(t) : f(t)
         end
         return s
       end
f2 (generic function with 1 method)

julia> using BenchmarkTools

julia> n = 100
100

julia> V = Vector{Union{Void, Int8}}(n);

julia> b = @benchmarkable f($(V), $n)
Benchmark(evals=1, seconds=5.0, samples=10000)

julia> tune!(b)
Benchmark(evals=362, seconds=5.0, samples=10000)

julia> run(b)
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     260.655 ns (0.00% GC)
  median time:      276.033 ns (0.00% GC)
  mean time:        293.677 ns (0.00% GC)
  maximum time:     854.251 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     362

julia> V = Vector{Union{Void, Int8}}(n);

julia> b2 = @benchmarkable f2($(V), $n)
Benchmark(evals=1, seconds=5.0, samples=10000)

julia> tune!(b2)
Benchmark(evals=944, seconds=5.0, samples=10000)

julia> run(b2)
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     96.913 ns (0.00% GC)
  median time:      105.480 ns (0.00% GC)
  mean time:        116.619 ns (0.00% GC)
  maximum time:     505.381 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     944

julia> @code_warntype f(V, n)
Variables:
  #self#::#f
  v::Array{Union{Void, Int8},1}
  n::Int64
  s::Int8
  #temp#@_5::Int64
  i::Int64
  t::Union{Void, Int8}
  y::Int8
  back::Int64
  #temp#@_10::Int8

Body:
  begin
      $(Expr(:inbounds, false))
      # meta: location sysimg.jl Type 103
      # meta: location int.jl convert 459
      # meta: location int.jl checked_trunc_sint 437
      y::Int8 = (Base.trunc_int)(Int8, 0)::Int8
      #= line 438 =#
      back::Int64 = (Base.sext_int)(Int64, y::Int8)::Int64
      #= line 439 =#
      SSAValue(2) = (0 === back::Int64)::Bool
      unless SSAValue(2) goto 12
      goto 14
      12:
      $(Expr(:invoke, MethodInstance for throw_inexacterror(::Symbol, ::Type{Int8}, ::Int64), :(Base.throw_inexacterror), :(:trunc), Int8, 0))::Union{}
      14:
      # meta: pop location
      # meta: pop location
      # meta: pop location
      $(Expr(:inbounds, :pop))
      s::Int8 = y::Int8
      #= line 3 =#
      $(Expr(:inbounds, true))
      $(Expr(:inbounds, false))
      # meta: location range.jl colon 5
      # meta: pop location
      $(Expr(:inbounds, :pop))
      SSAValue(4) = (Base.select_value)((Base.sle_int)(1, n::Int64)::Bool, n::Int64, (Base.sub_int)(1, 1)::Int64)::Int64
      #temp#@_5::Int64 = 1
      28:
      unless (Base.not_int)((#temp#@_5::Int64 === (Base.add_int)(SSAValue(4), 1)::Int64)::Bool)::Bool goto 53
      SSAValue(5) = #temp#@_5::Int64
      SSAValue(6) = (Base.add_int)(#temp#@_5::Int64, 1)::Int64
      i::Int64 = SSAValue(5)
      #temp#@_5::Int64 = SSAValue(6)
      #= line 4 =#
      t::Union{Void, Int8} = (Base.arrayref)(v::Array{Union{Void, Int8},1}, i::Int64)::Union{Void, Int8}
      #= line 5 =#
      SSAValue(3) = Main.f
      unless (t::Union{Void, Int8} isa Void)::Bool goto 41
      #temp#@_10::Int8 = $(Expr(:invoke, MethodInstance for f(::Void), SSAValue(3), :(t)))::Int8
      goto 49
      41:
      unless (t::Union{Void, Int8} isa Int8)::Bool goto 45
      #temp#@_10::Int8 = $(Expr(:invoke, MethodInstance for f(::Int8), SSAValue(3), :(t)))::Int8
      goto 49
      45:
      goto 47
      47:
      (Base.error)("fatal error in type inference (type bound)")::Any
      49:
      s::Int8 = (Base.add_int)(s::Int8, #temp#@_10::Int8)::Int8
      51:
      goto 28
      53:
      $(Expr(:inbounds, :pop))
      #= line 7 =#
      return s::Int8
  end::Int8

julia> @code_warntype f2(V, n)
Variables:
  #self#::#f2
  v::Array{Union{Void, Int8},1}
  n::Int64
  s::Int8
  #temp#@_5::Int64
  i::Int64
  t::Union{Void, Int8}
  #temp#@_8::Int8
  y@_9::Int8
  back@_10::Int64
  y@_11::Int8
  back@_12::Int64

Body:
  begin
      $(Expr(:inbounds, false))
      # meta: location sysimg.jl Type 103
      # meta: location int.jl convert 459
      # meta: location int.jl checked_trunc_sint 437
      y@_9::Int8 = (Base.trunc_int)(Int8, 0)::Int8
      #= line 438 =#
      back@_10::Int64 = (Base.sext_int)(Int64, y@_9::Int8)::Int64
      #= line 439 =#
      SSAValue(2) = (0 === back@_10::Int64)::Bool
      unless SSAValue(2) goto 12
      goto 14
      12:
      $(Expr(:invoke, MethodInstance for throw_inexacterror(::Symbol, ::Type{Int8}, ::Int64), :(Base.throw_inexacterror), :(:trunc), Int8, 0))::Union{}
      14:
      # meta: pop location
      # meta: pop location
      # meta: pop location
      $(Expr(:inbounds, :pop))
      s::Int8 = y@_9::Int8
      #= line 3 =#
      $(Expr(:inbounds, true))
      $(Expr(:inbounds, false))
      # meta: location range.jl colon 5
      # meta: pop location
      $(Expr(:inbounds, :pop))
      SSAValue(4) = (Base.select_value)((Base.sle_int)(1, n::Int64)::Bool, n::Int64, (Base.sub_int)(1, 1)::Int64)::Int64
      #temp#@_5::Int64 = 1
      28:
      unless (Base.not_int)((#temp#@_5::Int64 === (Base.add_int)(SSAValue(4), 1)::Int64)::Bool)::Bool goto 68
      SSAValue(5) = #temp#@_5::Int64
      SSAValue(6) = (Base.add_int)(#temp#@_5::Int64, 1)::Int64
      i::Int64 = SSAValue(5)
      #temp#@_5::Int64 = SSAValue(6)
      #= line 4 =#
      t::Union{Void, Int8} = (Base.arrayref)(v::Array{Union{Void, Int8},1}, i::Int64)::Union{Void, Int8}
      #= line 5 =#
      unless (t::Union{Void, Int8} isa Main.Void)::Bool goto 62
      $(Expr(:inbounds, false))
      # meta: location REPL[90] f 1
      $(Expr(:inbounds, false))
      # meta: location sysimg.jl Type 103
      # meta: location int.jl convert 459
      # meta: location int.jl checked_trunc_sint 437
      y@_11::Int8 = (Base.trunc_int)(Int8, 0)::Int8
      #= line 438 =#
      back@_12::Int64 = (Base.sext_int)(Int64, y@_11::Int8)::Int64
      #= line 439 =#
      SSAValue(3) = (0 === back@_12::Int64)::Bool
      unless SSAValue(3) goto 51
      goto 53
      51:
      $(Expr(:invoke, MethodInstance for throw_inexacterror(::Symbol, ::Type{Int8}, ::Int64), :(Base.throw_inexacterror), :(:trunc), Int8, 0))::Union{}
      53:
      # meta: pop location
      # meta: pop location
      # meta: pop location
      $(Expr(:inbounds, :pop))
      # meta: pop location
      $(Expr(:inbounds, :pop))
      #temp#@_8::Int8 = y@_11::Int8
      goto 64
      62:
      #temp#@_8::Int8 = t::Int8
      64:
      s::Int8 = (Base.add_int)(s::Int8, #temp#@_8::Int8)::Int8
      66:
      goto 28
      68:
      $(Expr(:inbounds, :pop))
      #= line 7 =#
      return s::Int8
  end::Int8
@kshyatt kshyatt added compiler:inference Type inference performance Must go faster labels Aug 20, 2017
@vtjnash vtjnash self-assigned this Sep 5, 2017
@vtjnash
Copy link
Member

vtjnash commented Sep 5, 2017

Just waiting for linear IR to do this

@quinnj
Copy link
Member Author

quinnj commented Dec 11, 2017

Friendly bump now that linear IR has been merged. Is there more to do here or should linearization fix this itself?

@quinnj quinnj added the triage This should be discussed on a triage call label Dec 21, 2017
@StefanKarpinski
Copy link
Member

@quinnj, I'm not sure you know what the "triage" label is for.

@StefanKarpinski StefanKarpinski removed the triage This should be discussed on a triage call label Dec 21, 2017
@quinnj
Copy link
Member Author

quinnj commented Dec 21, 2017

Hey, I qualified that people should feel free to remove if they want; I just wanted to say that this plus precompile are two of the biggest performance problems I know of in the package ecosystem on current master.

@StefanKarpinski
Copy link
Member

Definitely a high priority, but not a blocker.

@quinnj quinnj mentioned this issue Jan 13, 2018
3 tasks
Keno added a commit that referenced this issue Apr 25, 2018
This adds union splitting optimizations to the inliner.
This works a bit differently from the old optimizer, which
did union splitting on the bail out path. Instead, we try
to run the full inliner on any union split case that's a
dispatch tuple (and record what to do in that case). Doing
that allows an effective resolution of #23338, though this
PR doesn't quite do that yet, since some further (minor) fixes
are needed.
Keno added a commit that referenced this issue Apr 27, 2018
This adds union splitting optimizations to the inliner.
This works a bit differently from the old optimizer, which
did union splitting on the bail out path. Instead, we try
to run the full inliner on any union split case that's a
dispatch tuple (and record what to do in that case). Doing
that allows an effective resolution of #23338, though this
PR doesn't quite do that yet, since some further (minor) fixes
are needed.
Keno added a commit that referenced this issue Apr 27, 2018
This adds union splitting optimizations to the inliner.
This works a bit differently from the old optimizer, which
did union splitting on the bail out path. Instead, we try
to run the full inliner on any union split case that's a
dispatch tuple (and record what to do in that case). Doing
that allows an effective resolution of #23338, though this
PR doesn't quite do that yet, since some further (minor) fixes
are needed.
Keno added a commit that referenced this issue Apr 27, 2018
This adds union splitting optimizations to the inliner.
This works a bit differently from the old optimizer, which
did union splitting on the bail out path. Instead, we try
to run the full inliner on any union split case that's a
dispatch tuple (and record what to do in that case). Doing
that allows an effective resolution of #23338, though this
PR doesn't quite do that yet, since some further (minor) fixes
are needed.
Keno added a commit that referenced this issue Apr 27, 2018
This adds union splitting optimizations to the inliner.
This works a bit differently from the old optimizer, which
did union splitting on the bail out path. Instead, we try
to run the full inliner on any union split case that's a
dispatch tuple (and record what to do in that case). Doing
that allows an effective resolution of #23338, though this
PR doesn't quite do that yet, since some further (minor) fixes
are needed.
Keno added a commit that referenced this issue May 1, 2018
This adds union splitting optimizations to the inliner.
This works a bit differently from the old optimizer, which
did union splitting on the bail out path. Instead, we try
to run the full inliner on any union split case that's a
dispatch tuple (and record what to do in that case). Doing
that allows an effective resolution of #23338, though this
PR doesn't quite do that yet, since some further (minor) fixes
are needed.
Keno added a commit that referenced this issue May 1, 2018
This adds union splitting optimizations to the inliner.
This works a bit differently from the old optimizer, which
did union splitting on the bail out path. Instead, we try
to run the full inliner on any union split case that's a
dispatch tuple (and record what to do in that case). Doing
that allows an effective resolution of #23338, though this
PR doesn't quite do that yet, since some further (minor) fixes
are needed.
@Keno
Copy link
Member

Keno commented May 14, 2018

Fixed by new optimizer.

@Keno Keno closed this as completed May 14, 2018
@nalimilan nalimilan added the missing data Base.missing and related functionality label May 28, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler:inference Type inference missing data Base.missing and related functionality performance Must go faster
Projects
None yet
Development

No branches or pull requests

6 participants