optimizer: fully support inlining of union-split, partially constant-prop' callsite #43347

aviatesk · 2021-12-06T13:05:05Z

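Makes full use of constant propagation by addressing this TODO (https://github.com/JuliaLang/julia/blob/00734c5fd045316a00d287ca2c0ec1a2eef6e4d1/base/compiler/ssair/inlining.jl#L1212).
Here is the performance improvement on the benchmark from #43287: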

julia> using BenchmarkTools

julia> X = rand(ComplexF32, 64, 64);

julia> dst = reinterpret(reshape, Float32, X);

julia> src = copy(dst);

julia> @btime copyto!($dst, $src);
  50.819 μs (1 allocation: 32 bytes) # v1.6.4
  41.081 μs (0 allocations: 0 bytes) # this commit

fixes #43287
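
For readers unfamiliar with the terminology, here is a minimal sketch of the kind of call site the title refers to. It is an illustration only, not the reproducer from #43287 or a test from this PR: `pick` and `caller` are hypothetical names, and whether the call is actually inlined depends on the Julia version in use.

```julia
using InteractiveUtils  # provides `code_typed` outside the REPL

# Hypothetical helper whose result depends on a constant `mode` argument.
pick(x, mode::Symbol) = mode === :double ? 2x : x

# `x` is inferred as Union{Int, Float64}, so the call to `pick` can be
# union-split into one branch per concrete type. The second argument is a
# compile-time constant, so each branch is a candidate for constant propagation.
function caller(flag::Bool)
    x = flag ? 1 : 1.0
    return pick(x, :double)
end

caller(true)   # takes the Int branch
caller(false)  # takes the Float64 branch

# Inspect the optimized IR to see how the call site was handled.
code_typed(caller, (Bool,))
```

Comparing the `code_typed` output before and after a compiler change is one way to check whether both branches of such a call site were inlined.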

aviatesk · 2021-12-06T13:08:07Z

@nanosoldier runbenchmarks("broadcast" || "sparse" || "array" || "union" || "string" || "tuple", vs=":master")

oscardssmith · 2021-12-06T18:56:59Z

Does this need TTFP benchmarking or is it good to merge?

KristofferC · 2021-12-07T12:44:13Z

This needs a manual backport to the 1.7 backport branch (#43297). It would be good if that could happen fairly quickly, @aviatesk

aviatesk · 2021-12-07T12:53:46Z

I'm happy to do the backporting, but I would also like to run nanosoldier before merging to make sure this commit doesn't introduce another performance regression, as I did in #42841.

aviatesk · 2021-12-07T12:55:03Z

> Does this need TTFP benchmarking or is it good to merge?

Instead of TTFP, I'd propose we switch to using JuliaCI/BaseBenchmarks.jl#288 to track latency regressions.

aviatesk · 2021-12-17T06:49:59Z

@nanosoldier runbenchmarks("broadcast" || "sparse" || "array" || "union" || "string" || "tuple", vs=":master")

johnnychen94 · 2021-12-19T23:50:25Z

bump; will this be included in Julia 1.7.1?

aviatesk · 2022-01-05T06:57:57Z

@nanosoldier runbenchmarks("broadcast" || "sparse" || "array" || "union" || "string" || "tuple", vs=":master")

nanosoldier · 2022-01-05T10:58:37Z

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

aviatesk · 2022-01-05T11:16:33Z

The benchmark results look good. I'm also running another benchmark on the MIT cluster now, and would like to get this merged once I confirm there are no obvious regressions in that result too.

aviatesk · 2022-01-05T14:02:34Z

@nanosoldier runbenchmarks("linalg", vs=":master")

aviatesk · 2022-01-05T14:11:02Z

On amdci2, I got

!("scalar")

Benchmark Report

Job Properties

Commits: https://github.com/JuliaLang/julia@a79e40d61d4d0861c8fcbf15709588fe18ee8f74 vs https://github.com/JuliaLang/julia@85a6990a9c1d49dd5aeaffeb4b38f881dc120823

Comparison Diff: link

Triggered By: link

Tag Predicate: !scalar

Results

Note: If Chrome is your browser, I strongly recommend installing the Wide GitHub
extension, which makes the result table easier to read.

Below is a table of this job's results, obtained by running the benchmarks found in
JuliaCI/BaseBenchmarks.jl. The values
listed in the ID column have the structure [parent_group, child_group, ..., key],
and can be used to index into the BaseBenchmarks suite to retrieve the corresponding
benchmarks.
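
To make the ID structure concrete, here is a minimal sketch that walks a nested container one key at a time. The plain `Dict` and the placeholder string are stand-ins invented for illustration; the real BaseBenchmarks suite is not loaded here, and `lookup` is a hypothetical helper.

```julia
# Stand-in for a benchmark suite: nested dictionaries keyed by group name,
# with a tuple as the final benchmark key.
suite = Dict(
    "array" => Dict(
        "equality" => Dict(
            ("==", "BitArray") => "placeholder benchmark",
        ),
    ),
)

# An ID in the form [parent_group, child_group, ..., key], as in the table.
id = ["array", "equality", ("==", "BitArray")]

# Descend through the nesting, indexing with each element of the ID in turn.
lookup(suite, id) = foldl((node, key) -> node[key], id; init = suite)

lookup(suite, id)
```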

The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.

A ratio greater than 1.0 denotes a possible regression (marked with ❌), while a ratio less
than 1.0 denotes a possible improvement (marked with ✅). Only significant results - results
that indicate possible regressions or improvements - are shown below (thus, an empty table means that all
benchmark results remained invariant between builds).
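
The marking rule can be sketched as follows. This assumes a result counts as significant only when the ratio lies outside 1 ± tolerance; the exact rule applied by the benchmark tooling is not stated in this report, and `classify` is a hypothetical helper.

```julia
# Classify a time or memory ratio against its noise tolerance.
# Assumption: ratios within 1 ± tolerance are treated as invariant.
function classify(ratio::Real, tolerance::Real)
    if ratio - tolerance > 1.0
        return :regression      # shown with ❌
    elseif ratio + tolerance < 1.0
        return :improvement     # shown with ✅
    else
        return :invariant       # not listed in the table
    end
end

classify(1.65, 0.05)  # e.g. a 1.65 ratio with a 5% tolerance
classify(0.48, 0.45)  # e.g. a 0.48 ratio with a 45% tolerance
classify(1.02, 0.05)  # within the noise band
```

Under this reading, the wide 40-45% tolerances on the `linalg` rows mean that only large swings are reported there at all.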

ID	time ratio	memory ratio
`["array", "any/all", ("all", "Vector{Float64} generator")]`	1.21 (5%) ❌	1.00 (1%)
`["array", "any/all", ("all", "Vector{Float64}")]`	1.21 (5%) ❌	1.00 (1%)
`["array", "equality", ("==", "BitArray")]`	1.65 (5%) ❌	1.00 (1%)
`["array", "equality", ("==", "UnitRange{Int64}")]`	1.19 (5%) ❌	1.00 (1%)
`["array", "equality", ("isequal", "Vector{Bool}")]`	0.71 (5%) ✅	1.00 (1%)
`["array", "equality", ("isequal", "Vector{Int16}")]`	0.67 (5%) ✅	1.00 (1%)
`["array", "equality", ("isequal", "Vector{Int64} isequal Vector{Int16}")]`	0.90 (5%) ✅	1.00 (1%)
`["array", "index", "2d"]`	1.07 (5%) ❌	1.00 (1%)
`["array", "index", ("sumlogical", "SubArray{Int32, 2, Array{Int32, 3}, Tuple{Int64, Base.Slice{Base.OneTo{Int64}}, Base.Slice{Base.OneTo{Int64}}}, true}")]`	2.71 (50%) ❌	1.00 (1%)
`["array", "reductions", ("maxabs", "Float64")]`	1.13 (5%) ❌	1.00 (1%)
`["array", "reductions", ("norminf", "Int64")]`	0.95 (5%) ✅	1.00 (1%)
`["array", "subarray", ("lucompletepivCopy!", 1000)]`	0.89 (5%) ✅	1.00 (1%)
`["collection", "initialization", ("Vector", "Any", "iterator")]`	1.30 (25%) ❌	1.00 (1%)
`["collection", "queries & updates", ("Vector", "Int", "in", "false")]`	1.38 (25%) ❌	1.00 (1%)
`["collection", "queries & updates", ("Vector", "Int", "in", "true")]`	0.53 (25%) ✅	1.00 (1%)
`["dates", "arithmetic", ("Date", "Year")]`	0.92 (5%) ✅	1.00 (1%)
`["dates", "parse", ("DateTime", "RFC1123Format", "Lowercase")]`	0.95 (5%) ✅	1.00 (1%)
`["dates", "parse", ("DateTime", "RFC1123Format", "Titlecase")]`	0.95 (5%) ✅	1.00 (1%)
`["find", "findall", ("> q0.5", "Vector{Float32}")]`	1.08 (5%) ❌	1.00 (1%)
`["find", "findall", ("> q0.5", "Vector{Int64}")]`	1.09 (5%) ❌	1.00 (1%)
`["find", "findall", ("> q0.5", "Vector{UInt64}")]`	1.10 (5%) ❌	1.00 (1%)
`["find", "findall", ("> q0.5", "Vector{UInt8}")]`	1.32 (5%) ❌	1.00 (1%)
`["find", "findall", ("> q0.8", "Vector{Bool}")]`	1.06 (5%) ❌	1.00 (1%)
`["find", "findall", ("> q0.8", "Vector{Float32}")]`	1.05 (5%) ❌	1.00 (1%)
`["find", "findall", ("> q0.8", "Vector{Float64}")]`	1.54 (5%) ❌	1.00 (1%)
`["find", "findall", ("> q0.8", "Vector{Int8}")]`	1.06 (5%) ❌	1.00 (1%)
`["find", "findall", ("> q0.8", "Vector{UInt8}")]`	1.28 (5%) ❌	1.00 (1%)
`["find", "findall", ("> q0.95", "Vector{Bool}")]`	1.07 (5%) ❌	1.00 (1%)
`["find", "findall", ("> q0.95", "Vector{Float32}")]`	1.05 (5%) ❌	1.00 (1%)
`["find", "findall", ("> q0.95", "Vector{Float64}")]`	1.11 (5%) ❌	1.00 (1%)
`["find", "findall", ("> q0.95", "Vector{Int8}")]`	1.09 (5%) ❌	1.00 (1%)
`["find", "findall", ("> q0.95", "Vector{UInt8}")]`	1.06 (5%) ❌	1.00 (1%)
`["find", "findall", ("> q0.99", "Vector{Bool}")]`	1.07 (5%) ❌	1.00 (1%)
`["find", "findall", ("> q0.99", "Vector{Float32}")]`	1.18 (5%) ❌	1.00 (1%)
`["find", "findall", ("> q0.99", "Vector{Float64}")]`	1.06 (5%) ❌	1.00 (1%)
`["find", "findall", ("> q0.99", "Vector{Int8}")]`	1.07 (5%) ❌	1.00 (1%)
`["find", "findall", ("> q0.99", "Vector{UInt64}")]`	1.41 (5%) ❌	1.00 (1%)
`["find", "findall", ("> q0.99", "Vector{UInt8}")]`	1.07 (5%) ❌	1.00 (1%)
`["find", "findall", ("BitVector", "10-90")]`	1.27 (5%) ❌	1.00 (1%)
`["find", "findall", ("Vector{Bool}", "10-90")]`	1.42 (5%) ❌	1.00 (1%)
`["find", "findall", ("Vector{Bool}", "50-50")]`	1.18 (5%) ❌	1.00 (1%)
`["find", "findall", ("ispos", "Vector{Float32}")]`	1.21 (5%) ❌	1.00 (1%)
`["find", "findall", ("ispos", "Vector{Int8}")]`	1.08 (5%) ❌	1.00 (1%)
`["find", "findnext", ("ispos", "Vector{Bool}")]`	0.92 (5%) ✅	1.00 (1%)
`["find", "findnext", ("ispos", "Vector{Float32}")]`	0.84 (5%) ✅	1.00 (1%)
`["find", "findnext", ("ispos", "Vector{Int64}")]`	0.90 (5%) ✅	1.00 (1%)
`["find", "findnext", ("ispos", "Vector{Int8}")]`	0.91 (5%) ✅	1.00 (1%)
`["find", "findnext", ("ispos", "Vector{UInt64}")]`	0.87 (5%) ✅	1.00 (1%)
`["find", "findnext", ("ispos", "Vector{UInt8}")]`	0.89 (5%) ✅	1.00 (1%)
`["find", "findprev", ("ispos", "Vector{Bool}")]`	0.92 (5%) ✅	1.00 (1%)
`["find", "findprev", ("ispos", "Vector{Float32}")]`	0.82 (5%) ✅	1.00 (1%)
`["find", "findprev", ("ispos", "Vector{Float64}")]`	0.84 (5%) ✅	1.00 (1%)
`["find", "findprev", ("ispos", "Vector{UInt64}")]`	1.39 (5%) ❌	1.00 (1%)
`["find", "findprev", ("ispos", "Vector{UInt8}")]`	0.94 (5%) ✅	1.00 (1%)
`["inference", "abstract interpretation", "rand(Float64)"]`	0.95 (5%) ✅	1.00 (1%)
`["inference", "abstract interpretation", "sin(42)"]`	0.95 (5%) ✅	1.00 (1%)
`["inference", "optimization", "abstract_call_gf_by_type"]`	1.08 (5%) ❌	1.07 (1%) ❌
`["io", "serialization", ("serialize", "Matrix{Float64}")]`	0.86 (5%) ✅	1.00 (1%)
`["linalg", "arithmetic", ("*", "Matrix", "Vector", 256)]`	0.48 (45%) ✅	1.00 (1%)
`["linalg", "arithmetic", ("*", "typename(LinearAlgebra.LowerTriangular)", "Vector", 1024)]`	1.62 (45%) ❌	1.00 (1%)
`["linalg", "arithmetic", ("*", "typename(LinearAlgebra.LowerTriangular)", "Vector", 256)]`	2.31 (45%) ❌	1.00 (1%)
`["linalg", "arithmetic", ("*", "typename(LinearAlgebra.LowerTriangular)", "typename(LinearAlgebra.LowerTriangular)", 256)]`	1.72 (45%) ❌	1.00 (1%)
`["linalg", "arithmetic", ("*", "typename(LinearAlgebra.SymTridiagonal)", "typename(LinearAlgebra.SymTridiagonal)", 256)]`	0.46 (45%) ✅	1.00 (1%)
`["linalg", "arithmetic", ("*", "typename(LinearAlgebra.Tridiagonal)", "Vector", 1024)]`	0.33 (45%) ✅	1.00 (1%)
`["linalg", "arithmetic", ("*", "typename(LinearAlgebra.Tridiagonal)", "typename(LinearAlgebra.Tridiagonal)", 256)]`	0.29 (45%) ✅	1.00 (1%)
`["linalg", "arithmetic", ("+", "typename(LinearAlgebra.Bidiagonal)", "typename(LinearAlgebra.Bidiagonal)", 256)]`	0.45 (45%) ✅	1.00 (1%)
`["linalg", "arithmetic", ("+", "typename(LinearAlgebra.Tridiagonal)", "typename(LinearAlgebra.Tridiagonal)", 1024)]`	1.45 (45%) ❌	1.00 (1%)
`["linalg", "arithmetic", ("+", "typename(LinearAlgebra.UpperTriangular)", "typename(LinearAlgebra.UpperTriangular)", 256)]`	0.47 (45%) ✅	1.00 (1%)
`["linalg", "arithmetic", ("-", "typename(LinearAlgebra.Tridiagonal)", "typename(LinearAlgebra.Tridiagonal)", 1024)]`	0.52 (45%) ✅	1.00 (1%)
`["linalg", "arithmetic", ("/", "Matrix", "Matrix", 1024)]`	1.90 (45%) ❌	1.00 (1%)
`["linalg", "arithmetic", ("/", "Matrix", "Matrix", 256)]`	95.01 (45%) ❌	1.00 (1%)
`["linalg", "arithmetic", ("/", "typename(LinearAlgebra.LowerTriangular)", "typename(LinearAlgebra.LowerTriangular)", 256)]`	2.23 (45%) ❌	1.00 (1%)
`["linalg", "arithmetic", ("/", "typename(LinearAlgebra.UpperTriangular)", "typename(LinearAlgebra.UpperTriangular)", 1024)]`	1.59 (45%) ❌	1.00 (1%)
`["linalg", "arithmetic", ("/", "typename(LinearAlgebra.UpperTriangular)", "typename(LinearAlgebra.UpperTriangular)", 256)]`	3.90 (45%) ❌	1.00 (1%)
`["linalg", "arithmetic", ("\\", "Matrix", "Vector", 1024)]`	2.16 (45%) ❌	1.00 (1%)
`["linalg", "arithmetic", ("\\", "Matrix", "Vector", 256)]`	0.05 (45%) ✅	1.00 (1%)
`["linalg", "arithmetic", ("\\", "typename(LinearAlgebra.Diagonal)", "Vector", 256)]`	1.48 (45%) ❌	1.00 (1%)
`["linalg", "arithmetic", ("\\", "typename(LinearAlgebra.Diagonal)", "typename(LinearAlgebra.Diagonal)", 1024)]`	1.65 (45%) ❌	1.00 (1%)
`["linalg", "arithmetic", ("\\", "typename(LinearAlgebra.UpperTriangular)", "typename(LinearAlgebra.UpperTriangular)", 1024)]`	1.52 (45%) ❌	1.00 (1%)
`["linalg", "arithmetic", ("cumsum!", "Int32", 256)]`	1.57 (45%) ❌	1.00 (1%)
`["linalg", "arithmetic", ("exp", "typename(LinearAlgebra.Hermitian)", 1024)]`	1.95 (45%) ❌	1.00 (1%)
`["linalg", "arithmetic", ("log", "typename(LinearAlgebra.Hermitian)", 1024)]`	2.27 (45%) ❌	1.00 (1%)
`["linalg", "blas", "gemm"]`	0.44 (40%) ✅	1.00 (1%)
`["linalg", "blas", "gemv"]`	1.93 (40%) ❌	1.00 (1%)
`["linalg", "blas", "syrk"]`	0.41 (40%) ✅	1.00 (1%)
`["linalg", "factorization", ("eigen", "Matrix", 256)]`	0.53 (45%) ✅	1.00 (1%)
`["linalg", "factorization", ("lu", "Matrix", 256)]`	0.01 (45%) ✅	1.00 (1%)
`["linalg", "factorization", ("svd", "Matrix", 1024)]`	0.35 (45%) ✅	1.00 (1%)
`["linalg", "factorization", ("svd", "typename(LinearAlgebra.Bidiagonal)", 1024)]`	4.49 (45%) ❌	1.00 (1%)
`["linalg", "factorization", ("svd", "typename(LinearAlgebra.UpperTriangular)", 1024)]`	2.44 (45%) ❌	1.00 (1%)
`["linalg", "factorization", ("svd", "typename(LinearAlgebra.UpperTriangular)", 256)]`	0.27 (45%) ✅	1.00 (1%)
`["linalg", "small exp #29116"]`	0.33 (5%) ✅	1.00 (1%)
`["micro", "randmatmul"]`	0.88 (5%) ✅	1.00 (1%)
`["misc", "allocation elision view", "conditional"]`	0.78 (5%) ✅	1.00 (1%)
`["misc", "allocation elision view", "no conditional"]`	0.77 (5%) ✅	1.00 (1%)
`["misc", "bitshift", ("Int", "Int")]`	0.86 (5%) ✅	1.00 (1%)
`["misc", "bitshift", ("Int", "UInt")]`	0.86 (5%) ✅	1.00 (1%)
`["misc", "bitshift", ("UInt", "UInt")]`	0.86 (5%) ✅	1.00 (1%)
`["misc", "foldl", "foldl(+, filter(...))"]`	0.41 (5%) ✅	1.00 (1%)
`["misc", "iterators", "zip(1:1, 1:1, 1:1, 1:1)"]`	0.87 (5%) ✅	1.00 (1%)
`["misc", "iterators", "zip(1:1000)"]`	1.19 (5%) ❌	1.00 (1%)
`["misc", "repeat", (200, 1, 24)]`	0.94 (5%) ✅	1.00 (1%)
`["misc", "repeat", (200, 24, 1)]`	0.85 (5%) ✅	1.00 (1%)
`["problem", "simplex", "simplex"]`	1.29 (5%) ❌	1.00 (1%)
`["problem", "spellcheck", "spellcheck"]`	0.94 (5%) ✅	1.00 (1%)
`["random", "types", ("rand!", "RandomDevice", "Int64")]`	1.85 (25%) ❌	1.00 (1%)
`["shootout", "binary_trees"]`	0.91 (5%) ✅	1.00 (1%)
`["simd", ("Cartesian", "axpy!", "Float32", 2, 64)]`	1.29 (20%) ❌	1.00 (1%)
`["simd", ("Cartesian", "axpy!", "Float64", 2, 63)]`	0.78 (20%) ✅	1.00 (1%)
`["simd", ("Cartesian", "axpy!", "Float64", 2, 64)]`	0.62 (20%) ✅	1.00 (1%)
`["simd", ("Cartesian", "manual_example!", "Float64", 2, 63)]`	1.41 (20%) ❌	1.00 (1%)
`["simd", ("Cartesian", "manual_example!", "Float64", 2, 64)]`	1.22 (20%) ❌	1.00 (1%)
`["simd", ("Cartesian", "manual_example!", "Int64", 2, 64)]`	1.37 (20%) ❌	1.00 (1%)
`["simd", ("CartesianPartition", "manual_example!", "Float64", 2, 63)]`	1.30 (20%) ❌	1.00 (1%)
`["simd", ("CartesianPartition", "manual_example!", "Float64", 2, 64)]`	1.41 (20%) ❌	1.00 (1%)
`["simd", ("CartesianPartition", "manual_example!", "Int64", 2, 64)]`	1.21 (20%) ❌	1.00 (1%)
`["simd", ("CartesianPartition", "two_reductions", "Int32", 2, 31)]`	1.22 (20%) ❌	1.00 (1%)
`["simd", ("CartesianPartition", "two_reductions", "Int32", 2, 32)]`	1.22 (20%) ❌	1.00 (1%)
`["simd", ("CartesianPartition", "two_reductions", "Int32", 2, 63)]`	1.22 (20%) ❌	1.00 (1%)
`["simd", ("CartesianPartition", "two_reductions", "Int32", 2, 64)]`	1.22 (20%) ❌	1.00 (1%)
`["simd", ("Linear", "auto_axpy!", "Float32", 4096)]`	1.25 (20%) ❌	1.00 (1%)
`["simd", ("Linear", "axpy!", "Float32", 4096)]`	1.22 (20%) ❌	1.00 (1%)
`["sparse", "arithmetic", ("unary minus", "(20000, 20000)")]`	0.69 (30%) ✅	1.00 (1%)
`["sparse", "constructors", ("Bidiagonal", 10)]`	0.94 (5%) ✅	1.00 (1%)
`["sparse", "constructors", ("Bidiagonal", 100)]`	0.73 (5%) ✅	1.00 (1%)
`["sparse", "constructors", ("Diagonal", 1000)]`	0.95 (5%) ✅	1.00 (1%)
`["sparse", "constructors", ("IJV", 1000)]`	0.75 (5%) ✅	1.00 (1%)
`["sparse", "constructors", ("Tridiagonal", 100)]`	0.91 (5%) ✅	1.00 (1%)
`["sparse", "index", ("spmat", "OneTo", 10)]`	1.34 (30%) ❌	1.00 (1%)
`["sparse", "index", ("spmat", "col", "OneTo", 10)]`	1.36 (30%) ❌	1.00 (1%)
`["sparse", "index", ("spmat", "col", "range", 1000)]`	1.32 (30%) ❌	1.00 (1%)
`["sparse", "index", ("spvec", "integer", 10000)]`	1.34 (30%) ❌	1.00 (1%)
`["sparse", "sparse matvec", "adjoint"]`	0.83 (5%) ✅	1.00 (1%)
`["sparse", "sparse solves", "least squares (default), matrix rhs"]`	0.40 (5%) ✅	1.00 (1%)
`["sparse", "sparse solves", "least squares (qr), matrix rhs"]`	1.14 (5%) ❌	1.00 (1%)
`["sparse", "sparse solves", "least squares (qr), vector rhs"]`	1.34 (5%) ❌	1.00 (1%)
`["sparse", "sparse solves", "square system (default), matrix rhs"]`	1.32 (5%) ❌	1.00 (1%)
`["sparse", "sparse solves", "square system (default), vector rhs"]`	1.18 (5%) ❌	1.00 (1%)
`["sparse", "sparse solves", "square system (lu), matrix rhs"]`	0.58 (5%) ✅	1.00 (1%)
`["sparse", "sparse solves", "square system (lu), vector rhs"]`	1.23 (5%) ❌	1.00 (1%)
`["sparse", "transpose", ("transpose", "(20000, 10000)")]`	0.66 (30%) ✅	1.00 (1%)
`["sparse", "transpose", ("transpose", "(600, 400)")]`	0.69 (30%) ✅	1.00 (1%)
`["string", "==(::SubString, ::String)", "different"]`	0.94 (5%) ✅	1.00 (1%)
`["string", "repeat", "repeat str len 16"]`	0.92 (5%) ✅	1.00 (1%)
`["tuple", "reduction", ("minimum", "(2, 2)")]`	0.69 (5%) ✅	1.00 (1%)
`["tuple", "reduction", ("minimum", "(2,)")]`	1.05 (5%) ❌	1.00 (1%)
`["union", "array", ("broadcast", "*", "BigFloat", "(false, false)")]`	0.93 (5%) ✅	1.00 (1%)
`["union", "array", ("broadcast", "*", "BigFloat", "(false, true)")]`	0.95 (5%) ✅	1.00 (1%)
`["union", "array", ("broadcast", "*", "BigFloat", "(true, true)")]`	0.93 (5%) ✅	1.00 (1%)
`["union", "array", ("broadcast", "abs", "BigFloat", 1)]`	1.06 (5%) ❌	1.00 (1%)
`["union", "array", ("broadcast", "abs", "Int8", 1)]`	1.28 (5%) ❌	1.00 (1%)
`["union", "array", ("broadcast", "identity", "ComplexF64", 0)]`	0.94 (5%) ✅	1.00 (1%)
`["union", "array", ("collect", "filter", "Bool", 1)]`	1.10 (5%) ❌	1.00 (1%)
`["union", "array", ("collect", "filter", "Int64", 1)]`	1.08 (5%) ❌	1.00 (1%)
`["union", "array", ("map", "*", "BigFloat", "(false, true)")]`	0.92 (5%) ✅	1.00 (1%)
`["union", "array", ("map", "*", "BigFloat", "(true, true)")]`	0.92 (5%) ✅	1.00 (1%)
`["union", "array", ("map", "abs", "BigInt", 1)]`	0.94 (5%) ✅	1.00 (1%)
`["union", "array", ("map", "identity", "ComplexF64", 0)]`	0.94 (5%) ✅	1.00 (1%)
`["union", "array", ("perf_binaryop", "*", "BigFloat", "(false, false)")]`	0.92 (5%) ✅	1.00 (1%)
`["union", "array", ("perf_binaryop", "*", "BigFloat", "(false, true)")]`	0.92 (5%) ✅	1.00 (1%)
`["union", "array", ("perf_binaryop", "*", "BigFloat", "(true, true)")]`	0.92 (5%) ✅	1.00 (1%)
`["union", "array", ("skipmissing", "collect", "Union{Nothing, Bool}", 0)]`	1.06 (5%) ❌	1.00 (1%)
`["union", "array", ("skipmissing", "collect", "Union{Nothing, ComplexF64}", 0)]`	1.08 (5%) ❌	1.00 (1%)
`["union", "array", ("skipmissing", "collect", "Union{Nothing, Int8}", 0)]`	1.06 (5%) ❌	1.00 (1%)
`["union", "array", ("sort", "BigFloat", 0)]`	1.13 (5%) ❌	1.00 (1%)
`["union", "array", ("sort", "Union{Missing, BigFloat}", 1)]`	1.11 (5%) ❌	1.00 (1%)
`["union", "array", ("sort", "Union{Nothing, BigFloat}", 0)]`	1.10 (5%) ❌	1.00 (1%)

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

["array", "accumulate"]
["array", "any/all"]
["array", "bool"]
["array", "cat"]
["array", "comprehension"]
["array", "convert"]
["array", "equality"]
["array", "growth"]
["array", "index"]
["array", "reductions"]
["array", "reverse"]
["array", "setindex!"]
["array", "subarray"]
["broadcast"]
["broadcast", "dotop"]
["broadcast", "fusion"]
["broadcast", "mix_scalar_tuple"]
["broadcast", "sparse"]
["broadcast", "typeargs"]
["collection", "deletion"]
["collection", "initialization"]
["collection", "iteration"]
["collection", "optimizations"]
["collection", "queries & updates"]
["collection", "set operations"]
["dates", "accessor"]
["dates", "arithmetic"]
["dates", "construction"]
["dates", "conversion"]
["dates", "parse"]
["dates", "query"]
["dates", "string"]
["find", "findall"]
["find", "findnext"]
["find", "findprev"]
["frontend"]
["inference", "abstract interpretation"]
["inference"]
["inference", "optimization"]
["io", "array_limit"]
["io", "read"]
["io", "serialization"]
["io"]
["linalg", "arithmetic"]
["linalg", "blas"]
["linalg", "factorization"]
["linalg"]
["micro"]
["misc"]
["misc", "23042"]
["misc", "afoldl"]
["misc", "allocation elision view"]
["misc", "bitshift"]
["misc", "foldl"]
["misc", "issue 12165"]
["misc", "iterators"]
["misc", "julia"]
["misc", "parse"]
["misc", "repeat"]
["misc", "splatting"]
["problem", "chaosgame"]
["problem", "fem"]
["problem", "go"]
["problem", "grigoriadis khachiyan"]
["problem", "imdb"]
["problem", "json"]
["problem", "laplacian"]
["problem", "monte carlo"]
["problem", "raytrace"]
["problem", "seismic"]
["problem", "simplex"]
["problem", "spellcheck"]
["problem", "stockcorr"]
["problem", "ziggurat"]
["random", "collections"]
["random", "randstring"]
["random", "ranges"]
["random", "sequences"]
["random", "types"]
["shootout"]
["simd"]
["sort", "insertionsort"]
["sort", "issorted"]
["sort", "mergesort"]
["sort", "quicksort"]
["sparse", "arithmetic"]
["sparse", "constructors"]
["sparse", "index"]
["sparse", "matmul"]
["sparse", "sparse matvec"]
["sparse", "sparse solves"]
["sparse", "transpose"]
["string", "==(::AbstractString, ::AbstractString)"]
["string", "==(::SubString, ::String)"]
["string", "findfirst"]
["string"]
["string", "readuntil"]
["string", "repeat"]
["tuple", "index"]
["tuple", "linear algebra"]
["tuple", "misc"]
["tuple", "reduction"]
["union", "array"]

Version Info

Primary Build

a79e40d61d

Comparison Build

85a6990a9c

Benchmark Report

Job Properties

Commits: https://github.com/JuliaLang/julia@a79e40d61d4d0861c8fcbf15709588fe18ee8f74 vs https://github.com/JuliaLang/julia@85a6990a9c1d49dd5aeaffeb4b38f881dc120823

Comparison Diff: link

Triggered By: link

Tag Predicate: !scalar

Results

ID	time ratio	memory ratio
`["arithmetic", ("*", "typename(LinearAlgebra.SymTridiagonal)", "Vector", 1024)]`	0.53 (45%) ✅	1.00 (1%)
`["arithmetic", ("*", "typename(LinearAlgebra.UpperTriangular)", "Vector", 256)]`	0.45 (45%) ✅	1.00 (1%)
`["arithmetic", ("+", "Vector", "Vector", 256)]`	1.76 (45%) ❌	1.00 (1%)
`["arithmetic", ("+", "typename(LinearAlgebra.Bidiagonal)", "typename(LinearAlgebra.Bidiagonal)", 1024)]`	0.48 (45%) ✅	1.00 (1%)
`["arithmetic", ("+", "typename(LinearAlgebra.Diagonal)", "typename(LinearAlgebra.Diagonal)", 256)]`	1.83 (45%) ❌	1.00 (1%)
`["arithmetic", ("-", "Vector", "Vector", 1024)]`	0.48 (45%) ✅	1.00 (1%)
`["arithmetic", ("\\", "Matrix", "Matrix", 1024)]`	0.17 (45%) ✅	1.00 (1%)
`["arithmetic", ("\\", "Matrix", "Vector", 1024)]`	0.27 (45%) ✅	1.00 (1%)
`["arithmetic", ("\\", "typename(LinearAlgebra.LowerTriangular)", "typename(LinearAlgebra.LowerTriangular)", 256)]`	0.52 (45%) ✅	1.00 (1%)
`["arithmetic", ("cumsum!", "Float32", 256)]`	1.46 (45%) ❌	1.00 (1%)
`["arithmetic", ("log", "typename(LinearAlgebra.Hermitian)", 1024)]`	4.02 (45%) ❌	1.00 (1%)
`["blas", "gemm!"]`	1.91 (40%) ❌	1.00 (1%)
`["factorization", ("schur", "Matrix", 1024)]`	1.89 (45%) ❌	1.00 (1%)
`["factorization", ("schur", "Matrix", 256)]`	2.75 (45%) ❌	1.00 (1%)
`["factorization", ("svd", "Matrix", 256)]`	1.71 (45%) ❌	1.00 (1%)
`["factorization", ("svd", "typename(LinearAlgebra.UpperTriangular)", 256)]`	1.47 (45%) ❌	1.00 (1%)
`["small exp #29116"]`	0.74 (5%) ✅	1.00 (1%)

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

["arithmetic"]
["blas"]
["factorization"]
[]

Version Info

Primary Build

a79e40d61d

Comparison Build

85a6990a9c

nanosoldier · 2022-01-05T15:33:15Z

Your benchmark job has completed - no performance regressions were detected. A full report can be found here.

aviatesk added the backport 1.7 and compiler:optimizer (Optimization passes, mostly in base/compiler/ssair/) labels Dec 6, 2021

aviatesk mentioned this pull request Dec 6, 2021

slow fallback array copyto! in Julia 1.7.0 #43287

Closed

vtjnash approved these changes Dec 6, 2021

KristofferC mentioned this pull request Dec 7, 2021

release-1.7: Backports for 1.7.1 #43297

Merged

15 tasks

timholy mentioned this pull request Dec 8, 2021

Inference/constant propagation regression in Julia 1.7 #43368

Closed

aviatesk force-pushed the avi/43287 branch from f820a6a to a7e53d1 Compare December 14, 2021 06:36

johnnychen94 mentioned this pull request Dec 16, 2021

Julia 1.7 performance regression on JPEG image JuliaIO/ImageMagick.jl#208

Closed

aviatesk added the needs nanosoldier run (This PR should have benchmarks run on it) label Dec 21, 2021

aviatesk added 2 commits January 5, 2022 15:56

inlining: follow up #43479, bail out from MethodResultPure assemble…

a79e40d

… earlier

aviatesk force-pushed the avi/43287 branch from a7e53d1 to a79e40d Compare January 5, 2022 06:57

aviatesk merged commit 1b600f0 into master Jan 5, 2022

aviatesk deleted the avi/43287 branch January 5, 2022 14:11

KristofferC mentioned this pull request Jan 5, 2022

release-1.7: Backports for 1.7.2 #43667

Merged

23 tasks

KristofferC mentioned this pull request Feb 15, 2022

Backports for 1.7.3 #44189

Merged

40 tasks
