@LilithHafner
Member

Master

julia> expr = Meta.parseall(read("base/show.jl", String));

julia> using ChairmarksExtras

julia> @btime expr hash
  817.216 μs (22422 allocs: 350.344 KiB)
0x0b71fceb4afc793a

PR

julia> expr = Meta.parseall(read("base/show.jl", String));

julia> using ChairmarksExtras

julia> @btime expr hash
  416.546 μs (23358 allocs: 364.969 KiB)
0x2b97d79da6e8e413

@adienes, @vtjnash are either of you willing to take a look at this?

@adienes
Member

adienes commented Aug 24, 2025

Is this needed because the Any fallback for hash is marked @nospecialize(data)?

@LilithHafner
Member Author

The reason is that calling f(x[i]) where x::Vector{Any} invokes runtime (dynamic) dispatch. At compile time, the compiler doesn't know which method of f will need to be called; that is only known at runtime.

The construct

elt = x[i]
if elt isa Expr
    f(elt)
else
    f(elt)
end

helps guide the compiler. Specifically, in the first branch, elt::Expr, so the method that f(elt) calls is known at compile time (in that branch).

At runtime, what happens here is:

If elt is an Expr, then the method chosen at compile time is called directly. Otherwise, control transfers to the runtime, which will slowly (double-digit nanoseconds) figure out which method to invoke.

(though this PR has a branch for Expr, Symbol, and LineNumberNode)

This optimization is only possible and helpful if we know substantially more than the compiler does about the plausible/likely set of types that an object could have.
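
The branching pattern described above can be sketched as a small self-contained example. This is only an illustration under stated assumptions: `kind_of` and `kinds` are hypothetical names invented here, not functions from Base or from this PR, and the sketch shows the shape of the manual split rather than the PR's actual implementation.

```julia
# Hypothetical function with several methods; stands in for `f` above.
kind_of(x::Expr) = "Expr with head $(x.head)"
kind_of(x::Symbol) = "Symbol $x"
kind_of(x::LineNumberNode) = "line $(x.line)"
kind_of(x) = "something else"

# Walk a Vector{Any} and branch on the types we expect to be common.
function kinds(xs::Vector{Any})
    out = String[]
    for elt in xs
        if elt isa Expr
            # In this branch elt::Expr, so the method is known at compile time.
            push!(out, kind_of(elt))
        elseif elt isa Symbol
            push!(out, kind_of(elt))
        elseif elt isa LineNumberNode
            push!(out, kind_of(elt))
        else
            # Unknown type: falls back to runtime (dynamic) dispatch.
            push!(out, kind_of(elt))
        end
    end
    return out
end

kinds(Any[:(a + b), :x, LineNumberNode(3, nothing), 1.5])
```

Every branch contains the same call; the only difference is how much the compiler knows about `elt` inside each one.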

@adienes
Member

adienes commented Aug 24, 2025

thanks for the explanation. does adding QuoteNode to that list do anything? also I probably would put union_split somewhere other than multidimensional.jl

but afaict seems good; the benchmark speaks for itself 🙂

Co-authored-by: Neven Sajko <4944410+nsajko@users.noreply.github.com>
@LilithHafner
Member Author

Adding QuoteNode did not improve runtime significantly so I didn't push it.

@LilithHafner
Member Author

I probably would put union_split somewhere other than multidimensional.jl

Where, though? The only ideas I've had are:

- meta.jl (but that file is only for defining Meta, which this isn't part of)
- tuple.jl (but I prefer multidimensional.jl over tuple.jl)

Given no good home, I'm inclined to keep it near its only use for now. I'm open to suggestions for a more fitting location.

@adienes
Member

adienes commented Aug 25, 2025

essentials.jl looks a little bit more canonical to me as that contains other "fancy compiler things." but of course this is just a bikeshed for which I'd support any decision you end up making

@LilithHafner
Member Author

That does look like a place where this function would "fit right in". OTOH, gosh, this functionality really isn't essential. And there's also the bootstrapping issue (if this is moved to essentials, then it can't depend on anything that depends on essentials).

At this point I'm inclined to leave it in multidimensional.jl. If this gets other users (esp. those earlier in the load order than multidimensional), then we could move it to essentials or somewhere else.

@adienes
Member

adienes commented Aug 25, 2025

just to double check, there should be no penalty whatsoever (vs master) when eltype_hint==() ? I tried it on some arrays of Int and it seemed like there was not, but just wanted to make sure
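
To make the empty-hint case concrete, here is a minimal sketch of a helper driven by a tuple of type hints. Everything in it is an assumption for illustration: `split_call` is a hypothetical name, and this is not the PR's `union_split` or its `eltype_hint` handling. Whether the compiler actually specializes each branch depends on inlining and constant propagation, which this sketch does not verify.

```julia
# Hypothetical helper (not the PR's `union_split`).
# With no hints there is nothing to branch on: just call f(x).
@inline split_call(f, x, ::Tuple{}) = f(x)

# Otherwise test the first hint and recurse on the remaining ones.
@inline function split_call(f, x, hints::Tuple)
    T = first(hints)
    if x isa T
        return f(x)   # intended to be a branch where x::T is known
    else
        return split_call(f, x, Base.tail(hints))
    end
end

split_call(string, :a, (Expr, Symbol))   # takes the Symbol branch
split_call(x -> x + 1, 1, ())            # empty hints: a plain call
```

In this sketch the empty-tuple method adds no branches at all, which is the behavior the question above is checking for.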

@oscardssmith added and removed the latency label Aug 26, 2025
@oscardssmith
Member

Does this have any noticeable performance impact on compilation?

@LilithHafner
Member Author

@oscardssmith I doubt it will improve anything because it would be silly for compilation to be bottlenecked on hashing Exprs.

Do you have a recommendation for testing if it has regressed anything?

@oscardssmith
Member

oh, I just assumed that if you were improving hashing Expr that it probably was a bottleneck of something you had run into and compilation seemed like the most likely candidate

@LilithHafner
Member Author

I was working with user code that does CSEL on Exprs which was bottlenecked by hash(::Expr)

Co-authored-by: Neven Sajko <4944410+nsajko@users.noreply.github.com>
Co-authored-by: Neven Sajko <4944410+nsajko@users.noreply.github.com>
@LilithHafner
Member Author

Commit message:

Union-split on `Expr`, `Symbol`, and `LineNumberNode` when hashing `Expr`s (#59378)

```julia-repl
x@fedora:~/.julia/dev/julia$ julia +pr59378
  o  | Version 1.13.0-DEV.1043 (2025-09-07)
 o o | lh/hash-expr-union-split/8a95cf82d3d (fork: 8 commits, 14 days)
julia> expr = Meta.parseall(read("base/show.jl", String));

julia> using ChairmarksExtras

julia> @btime expr hash
  395.753 μs (23358 allocs: 364.969 KiB)
0x8e1ffc47fe5dc80b

julia> @btime :(sin(x^2) + cos(x^2)) hash
  144.778 ns (15 allocs: 240 bytes)
0xc837adb769107933

julia> 
x@fedora:~/.julia/dev/julia$ julia +nightly
  o  | Version 1.13.0-DEV.1096 (2025-09-07)
 o o | Commit 8a384ab93e4 (0 days old master)
julia> expr = Meta.parseall(read("base/show.jl", String));

julia> using ChairmarksExtras

julia> @btime expr hash
  826.924 μs (22422 allocs: 350.344 KiB)
0xf4f9c5fc15a95298

julia> @btime :(sin(x^2) + cos(x^2)) hash
  275.557 ns (14 allocs: 224 bytes)
0xc837adb769107933
```

Notably, the hash of that big expression changes between these versions because it contains global refs that have different `objectid`s on these two versions.

---------

Co-authored-by: Neven Sajko <4944410+nsajko@users.noreply.github.com>
Co-authored-by: Andy Dienes <51664769+adienes@users.noreply.github.com>

@LilithHafner added the "merge me" label (PR is reviewed. Merge when all tests are passing) Sep 7, 2025
@nsajko
Member

nsajko commented Sep 7, 2025

For the record, GitHub interprets @ mentions in commit messages without interpreting Markdown. So the suggested commit message would ping https://github.com/btime.

@LilithHafner merged commit 5c93bf2 into master Sep 8, 2025
8 checks passed
@LilithHafner deleted the lh/hash-expr-union-split branch September 8, 2025 13:32
@LilithHafner
Member Author

🤷 oh well

@adienes removed the "merge me" label (PR is reviewed. Merge when all tests are passing) Sep 15, 2025
