Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

range function not eliding with constant arguments #35022

Closed
MasonProtter opened this issue Mar 5, 2020 · 8 comments
Closed

range function not eliding with constant arguments #35022

MasonProtter opened this issue Mar 5, 2020 · 8 comments

Comments

@MasonProtter
Copy link
Contributor

foo(N) = range(-1, 1, length=N)
bar()  = foo(1000)
julia> @btime bar();
  118.334 ns (0 allocations: 0 bytes)

I've tried this on the master branch and 1.4.2-rc2. I'd expect this function call to be completely elided, but it's not which can induce a pretty hefty performance cost sometimes in hot loops.

julia> @code_typed foo(1000)
CodeInfo(
1 ── %1  = Base.sle_int(1, 1)::Bool
└───       goto #3 if not %1
2 ── %3  = Base.sle_int(1, 0)::Bool
└───       goto #4
3 ──       nothing::Nothing
4 ┄─ %6  = φ (#2 => %3, #3 => false)::Bool
└───       goto #6 if not %6
5 ──       invoke Base.getindex(()::Tuple, 1::Int64)::Union{}
└───       $(Expr(:unreachable))::Union{}
6 ┄─       goto #7
7 ──       goto #8
8 ──       goto #9
9 ──       goto #10
10%14 = invoke Base._linspace(Float64::Type{Float64}, -1::Int64, 1::Int64, _2::Int64, 1::Int64)::StepRangeLen{Float64,Base.TwicePrecision{Float64},Base.TwicePrecision{Float64}}
└───       goto #12
11$(Expr(:meta, :nkw, 2))
12 ┄       goto #13
13return %14
) => StepRangeLen{Float64,Base.TwicePrecision{Float64},Base.TwicePrecision{Float64}}

I wonder if there's anything we can do to make this infer more easily.

@haampie on Slack tracked the problem down to function Base._linspace(::Type{T}, start_n::Integer, stop_n::Integer, len::Integer, den::Integer) where T<:IEEEFloat which he speculated is overwhelming the optimizer with its very complex logic.

@haampie
Copy link
Contributor

haampie commented Mar 5, 2020

I don't think this will be optimized easily, many things are not inlined. Your foo does not inline Base._linspace(Float64, -1, 1, 1000, 1), which in turn does not inline

invoke Base._linspace1(%3::Type{Float64}, %6::Float64, %9::Float64, _5::Int64)::StepRangeLen{Float64,Base.TwicePrecision{Float64},Base.TwicePrecision{Float64}}
invoke Base.steprangelen_hp(%14::Type{Float64}, %15::Tuple{Int64,Int64}, %16::Tuple{Int64,Int64}, 0::Int64, _5::Int64, 1::Int64)::Core.Compiler.PartialStruct(StepRangeLen{Float64,Base.TwicePrecision{Float64},Base.TwicePrecision{Float64}}, Any[Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64, Core.Compiler.Const(1, false)])
invoke Base.steprangelen_hp(%66::Type{Float64}, %61::Tuple{Int128,Int128}, %65::Tuple{Int128,Int128}, %105::Int64, _5::Int64, %47::Int64)::StepRangeLen{Float64,Base.TwicePrecision{Float64},Base.TwicePrecision{Float64}}

then steprangelen_hp does not inline

invoke Base.splitprec(Float64::Type{Float64}, %1::Int128)::Tuple{Float64,Float64}
invoke Base.:/(%9::Base.TwicePrecision{Float64}, %2::Int128)::Base.TwicePrecision{Float64}

etc etc. It's just very complex code. Maybe @timholy can comment on whether it could be simplified to make it easier on the compiler.

@mbauman
Copy link
Member

mbauman commented Mar 5, 2020

What are you doing with the range inside your loop? Why would you expect this to get elided?

Edit: Oh, I think you're after constant propagation to move the constructor to compile-time?

@KristofferC
Copy link
Member

I don't think we need a separate issue for every function / constructor that doesn't constant propagate.

@haampie
Copy link
Contributor

haampie commented Mar 5, 2020

Maybe it's not so much about constant propagation, but about the complexity of the range constructor?

julia> function f()
         total = 0.0
         for x = range(-1, 1, length=5), y = range(-1, 1, length=5), z = range(-1, 1, length=5)
           total += x + y + z
         end
         total
       end

julia> function g()
         total = 0.0
         xs = range(-1, 1, length=5)
         ys = range(-1, 1, length=5)
         zs = range(-1, 1, length=5)
         for x = xs, y = ys, z = zs
           total += x + y + z
         end
         total
       end

julia> @btime f()
  10.060 μs (0 allocations: 0 bytes)
0.0

julia> @btime g()
  1.917 μs (0 allocations: 0 bytes)
0.0

Probably if the range function was more trivial f and g would compile the same.

(it's very niche)

@mbauman
Copy link
Member

mbauman commented Mar 5, 2020

Ah, the words you're after there are LICM (loop-invariant code motion) or hoisting.

@KristofferC
Copy link
Member

KristofferC commented Mar 5, 2020

Ref #26770.

@MasonProtter
Copy link
Contributor Author

MasonProtter commented Mar 5, 2020

Ah, the words you're after there are LICM (loop-invariant code motion) or hoisting.

I guess the thing that catches me off guard is that those are all literals and StepRangeLen isn't a mutable type or anything so I would have hoped this could just be moved to compile time, but it doesn't happen. I don't see why anything here needs to happen at runtime at all, so it's less about LICM than it is about compile time evaluation.

I don't think we need a separate issue for every function / constructor that doesn't constant propagate.

Fair enough, feel free to close if this is inappropriate.

@timholy
Copy link
Member

timholy commented Mar 5, 2020

It's complicated because some people really want exact arithmetic. Use LinRange if you want a simple floating-point range. EDIT: demo

julia> collect(-0.1:0.1:0.3)
5-element Array{Float64,1}:
 -0.1
  0.0
  0.1
  0.2
  0.3

julia> collect(LinRange(-0.1, 0.3, 5))
5-element Array{Float64,1}:
 -0.1                   
 -1.3877787807814457e-17
  0.09999999999999999   
  0.19999999999999998   
  0.3                   

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants