JET does not know that concatenated vectors results in a vector? #413

Open
Krastanov opened this issue Nov 13, 2022 · 4 comments

Comments

@Krastanov

Here are two false positives that I have trouble understanding:

using JET

function f(idx)
    N = 4
    indices_flat = [idx...;]
    Int[i for i=1:N if i ∉ indices_flat]
end

first false positive:

@assert f([[1,2],[3]]) == [4]
@report_call f([[1,2],[3]])
═════ 2 possible errors found ═════
┌ @ REPL[2]:4 collect(Int, Base.Generator(identity, Base.Filter(#3, 1 : N)))
│┌ @ array.jl:642 Base._collect(T, itr, Base.IteratorSize(itr))
││┌ @ array.jl:644 Base._array_for(T, isz, Base._similar_shape(itr, isz))
│││┌ @ array.jl:674 Base._similar_shape(itr, isz)
││││┌ @ array.jl:659 axes(itr)
│││││┌ @ abstractarray.jl:98 size(A)
││││││ no matching method found `size(::Base.HasLength)`: size(A::Base.HasLength)
│││││└───────────────────────
││││┌ @ array.jl:658 length(itr)
│││││ no matching method found `length(::Base.HasLength)`: length(itr::Base.HasLength)
││││└────────────────

second false positive:

@assert f([[],[]]) == [1,2,3,4]
@report_call f([[],[]])
═════ 1 possible error found ═════
┌ @ REPL[2]:4 collect(Int, Base.Generator(identity, Base.Filter(#3, 1 : N)))
│┌ @ array.jl:642 Base._collect(T, itr, Base.IteratorSize(itr))
││┌ @ array.jl:648  = iterate(itr)
│││┌ @ generator.jl:44 y = iterate(tuple(g.iter), s...)
││││┌ @ iterators.jl:510 goto %11 if not f.flt(y[1])
│││││ non-boolean `Missing` found in boolean context (1/2 union split): goto %11 if not (f::Base.Iterators.Filter{var"#3#4"{Vector{Any}}, UnitRange{Int64}}).flt::var"#3#4"{Vector{Any}}((y::Tuple{Int64, Int64})[1]::Int64)::Union{Missing, Bool}
││││└────────────────────
julia> versioninfo()
Julia Version 1.9.0-DEV.1657
Commit 35d12890aba (2022-10-25 07:47 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 8 × Intel(R) Core(TM) i7-10510U CPU @ 1.80GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, skylake)
  Threads: 4 on 8 virtual cores

(ttt) pkg> st
Status `/tmp/ttt/Project.toml`
  [c3a54625] JET v0.6.14
@aviatesk
Owner

It looks like the first report arises because the Base.Generator loses precise type information (see ::Base.Generator{_A, typeof(identity)} where _A in the second line of the abstract stack trace):

julia> callf(f, args...) = f(args...)
callf (generic function with 1 method)

julia> @report_call annotate_types=true callf(f, [[1,2],[3]])
═════ 2 possible errors found ═════
┌ @ REPL[6]:1 f::typeof(f)(args::Tuple{Vector{Vector{Int64}}}...)
│┌ @ REPL[2]:4 collect(Int, Base.Generator(identity, Base.Filter(#3, 1 : N::UnitRange{Int64})::Base.Iterators.Filter{_A, UnitRange{Int64}} where _A)::Base.Generator{_A, typeof(identity)} where _A)
││┌ @ array.jl:642 Base._collect(T::Type{Int64}, itr::Base.Generator{_A, typeof(identity)} where _A, Base.IteratorSize(itr::Base.Generator{_A, typeof(identity)} where _A)::Any)
│││┌ @ array.jl:644 Base._array_for(T::Type{Int64}, isz::Union{Base.HasLength, Base.HasShape}, Base._similar_shape(itr::Base.Generator{_A, typeof(identity)} where _A, isz::Union{Base.HasLength, Base.HasShape})::Any)
││││┌ @ array.jl:674 Base._similar_shape(itr::Base.HasLength, isz::Any)
│││││┌ @ array.jl:659 axes(itr::Base.HasLength)
││││││┌ @ abstractarray.jl:98 size(A::Base.HasLength)
│││││││ no matching method found `size(::Base.HasLength)`: size(A::Base.HasLength)
││││││└───────────────────────
│││││┌ @ array.jl:658 length(itr::Base.HasLength)
││││││ no matching method found `length(::Base.HasLength)`: length(itr::Base.HasLength)
│││││└────────────────

The second example isn't a false positive, though. Since the vector is typed as Vector{Any}, from the point of view of type inference it may contain a missing element, in which case i ∉ indices_flat evaluates to missing, so the non-boolean condition can actually occur:

julia> @report_call annotate_types=true callf(f, [[],[]])
═════ 1 possible error found ═════
┌ @ REPL[6]:1 f::typeof(f)(args::Tuple{Vector{Vector{Any}}}...)
│┌ @ REPL[2]:4 collect(Int, Base.Generator(identity, Base.Filter(#3, 1 : N::UnitRange{Int64})::Base.Iterators.Filter{var"#3#4"{Vector{Any}}, UnitRange{Int64}})::Base.Generator{Base.Iterators.Filter{var"#3#4"{Vector{Any}}, UnitRange{Int64}}, typeof(identity)})
││┌ @ array.jl:642 Base._collect(T::Type{Int64}, itr::Base.Generator{Base.Iterators.Filter{var"#3#4"{Vector{Any}}, UnitRange{Int64}}, typeof(identity)}, Base.IteratorSize(itr::Base.Generator{Base.Iterators.Filter{var"#3#4"{Vector{Any}}, UnitRange{Int64}}, typeof(identity)})::Base.SizeUnknown)
│││┌ @ array.jl:648  = iterate(itr::Base.Generator{Base.Iterators.Filter{var"#3#4"{Vector{Any}}, UnitRange{Int64}}, typeof(identity)})
││││┌ @ generator.jl:44 y = iterate(tuple((g::Base.Generator{Base.Iterators.Filter{var"#3#4"{Vector{Any}}, UnitRange{Int64}}, typeof(identity)}).iter::Base.Iterators.Filter{var"#3#4"{Vector{Any}}, UnitRange{Int64}})::Tuple{Base.Iterators.Filter{var"#3#4"{Vector{Any}}, UnitRange{Int64}}}, s::Tuple{}...)
│││││┌ @ iterators.jl:514 goto %12 if not (f::Base.Iterators.Filter{var"#3#4"{Vector{Any}}, UnitRange{Int64}}).flt::var"#3#4"{Vector{Any}}((y::Tuple{Int64, Int64})[1]::Int64)::Union{Missing, Bool}
││││││ non-boolean `Missing` found in boolean context (1/2 union split): goto %12 if not (f::Base.Iterators.Filter{var"#3#4"{Vector{Any}}, UnitRange{Int64}}).flt::var"#3#4"{Vector{Any}}((y::Tuple{Int64, Int64})[1]::Int64)::Union{Missing, Bool}
│││││└────────────────────


julia> callf(f, [Any[missing],Any[missing]])
ERROR: TypeError: non-boolean (Missing) used in boolean context
Stacktrace:
 [1] iterate
   @ ./iterators.jl:514 [inlined]
 [2] iterate
   @ ./generator.jl:44 [inlined]
 [3] _collect(#unused#::Type{Int64}, itr::Base.Generator{Base.Iterators.Filter{var"#3#4"{Vector{Any}}, UnitRange{Int64}}, typeof(identity)}, isz::Base.SizeUnknown)
   @ Base ./array.jl:648
 [4] collect
   @ ./array.jl:642 [inlined]
 [5] f(idx::Vector{Vector{Any}})
   @ Main ./REPL[2]:4
 [6] callf(f::Function, args::Vector{Vector{Any}})
   @ Main ./REPL[6]:1
 [7] top-level scope
   @ REPL[9]:1
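
The Missing branch in the report above follows from three-valued logic in membership tests. The sketch below illustrates this, together with one possible way to give inference a concrete element type. The name f_typed and the convert call are assumptions for illustration and are not from this thread; whether the JET report disappears on a given Julia/JET version is not verified here.

```julia
# Membership tests follow three-valued logic: comparing against `missing`
# yields `missing`, so `∉` on a Vector{Any} may return `Missing` instead of `Bool`.
r = 1 ∉ Any[missing]
@assert r === missing

# With a concretely typed Vector{Int}, the result is always a `Bool`.
@assert (1 ∉ Int[2, 3]) === true

# Hypothetical variant of `f` (an assumption, not the author's code):
# converting the concatenation to Vector{Int} gives it a concrete element type
# even when the input is a Vector{Vector{Any}}.
function f_typed(idx)
    N = 4
    indices_flat = convert(Vector{Int}, vcat(idx...))
    Int[i for i = 1:N if i ∉ indices_flat]
end

@assert f_typed([[1, 2], [3]]) == [4]
@assert f_typed([[], []]) == [1, 2, 3, 4]
```

The conversion throws if an input element cannot be converted to Int (for example missing), which moves the failure to the point where the data is flattened.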

@Krastanov
Author

Probably a silly question: why was the callf indirection necessary? I do not think I understand this trick.

Is it feasible to help Generator not lose this type information? Should I be filing an issue with julialang/julia?

@aviatesk
Owner

> why was the callf indirection necessary?

Ah, I just wanted to check whether @report_call generates reasonable type information for the input function call, e.g. f::typeof(f)(args::Tuple{Vector{Vector{Int64}}}...) (which we can get with the annotate_types option enabled).

> Is it feasible to help Generator not lose this type information? Should I be filing an issue with julialang/julia?

Yes, this seems to be a general type inference issue within Julia Base. It looks like we can't infer a precise return type because the splatted vcat call is inferred as a Union, which makes the closure type abstract:

julia> @code_typed f([[1,2],[3]])
CodeInfo(
1 ─ %1 = Core._apply_iterate(Base.iterate, Base.vcat, idx)::Union{Vector{Any}, Vector{Int64}}
│   %2 = Core.typeof(%1)::Union{Type{Vector{Any}}, Type{Vector{Int64}}}
│   %3 = Core.apply_type(Main.:(var"#3#4"), %2)::Type{var"#3#4"{_A}} where _A
│   %4 = %new(%3, %1)::var"#3#4"
│   %5 = Base.Filter(%4, $(QuoteNode(1:4)))::Base.Iterators.Filter{_A, UnitRange{Int64}} where _A
│   %6 = Base.Generator(Base.identity, %5)::Base.Generator{_A, typeof(identity)} where _A
│   %7 = Base.collect(Main.Int, %6)::AbstractArray
└──      return %7
) => AbstractArray
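
The imprecise type of %1 can also be inspected in isolation. The following is a minimal sketch. The helper name flatten_splat is hypothetical, and the printed type depends on the Julia version: on the version used in this thread it would presumably show the Union seen above, while newer versions may infer Vector{Int64}.

```julia
# Hypothetical helper that isolates the splatted concatenation used in `f`.
flatten_splat(idx) = [idx...;]

# Ask the compiler which return type it infers for a Vector{Vector{Int}} argument.
rt = only(Base.return_types(flatten_splat, (Vector{Vector{Int}},)))
println(rt)

# Whatever is inferred must at least include the actual runtime type.
@assert flatten_splat([[1, 2], [3]]) isa Vector{Int}
@assert Vector{Int} <: rt
```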

@aviatesk
Owner

Should be fixed by: JuliaLang/julia#47628
