Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add fast path for merge with Dicts #22737

Merged
merged 1 commit into from
Oct 1, 2017
Merged

Conversation

garborg
Copy link
Contributor

@garborg garborg commented Jul 10, 2017

A hair slower if length(Dict) == 1, faster generally.
Closes #22519

@KristofferC
Copy link
Member

Any benchmarks to share?

@garborg
Copy link
Contributor Author

garborg commented Jul 10, 2017

Can do when I get home tonight. The quick benchmark in #22519 shows best-case improvements.
With respect to the first argument, it's marginally faster for dicts with 3 elements, over an order of magnitude for dicts of 1000 elements. Speedup dampened when copy rehashes the dict argument, but then I guess you get that cleanup for free.

Methodology-wise, I saw no slowdown vs a specialized method restricted to known, shared key and value types.

@nalimilan
Copy link
Member

nalimilan commented Jul 10, 2017

There's already an emptymergedict function. Couldn't you add a specialized method for it?

EDIT: maybe giving it another name since it seems it's faster to make a copy than allocating a new dict, but you could keep the existing mechanism.

@garborg
Copy link
Contributor Author

garborg commented Jul 10, 2017

Exactly, emptymergedict seemed like a misnomer. Is making both mergedict, with an empty or fill keyword argument, what you're suggesting, more or less?

@nalimilan
Copy link
Member

Actually I think I'd suggest merging emptymergedict and your mergedict into a single function (maybe with a different name, since none of them seems very explicit about what they do), without a keyword argument. Indeed it seems emptymergedict could have the same behavior as mergedict, i.e. return a new dict of the right type with the contents of the first Associative argument.

@garborg
Copy link
Contributor Author

garborg commented Jul 13, 2017

Good call. Once I went to combine helper functions, I realized all three previous paths were covered by well by either Dict{K,V}(d) or associative_with_eltype.

Timings confused me benchmarking, though. I tried a few variations before I realized the noise was just a matter of whether or not BenchmarkTools inlined the merge function being tested. The functions not being inlined weren't necessarily the ones I would have guessed. Timings look as expected after forcing inlining.

I assume that's just an artifact of the benchmarking and I don't want to force merge to be inlined.

timings w/ & w/o inlining here.

@tkelman
Copy link
Contributor

tkelman commented Jul 13, 2017

That looks pretty simple. Dunno if any existing benchmarks would cover this, if not might be worth adding some new ones to BaseBenchmarks to track it. @nanosoldier runbenchmarks(ALL, vs=":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@garborg
Copy link
Contributor Author

garborg commented Jul 13, 2017

I'd be happy to add a benchmark or two, but I'm still a little confused about the interaction between BenchmarkTools and inlining from that benchmarking script: i.e. I needed to force inlining in benchmarking to see the comparison I expected, and I didn't think merge was something you'd force inlining on.

Copy link
Member

@Sacha0 Sacha0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

superficially lgtm :).

@tkelman
Copy link
Contributor

tkelman commented Jul 18, 2017

we should probably rename the existing benchmark label to needs-benchmark rather than adding a new near-duplicate one, since that's most of what it's been used for by jrevels and myself ever since it was first added

@StefanKarpinski
Copy link
Member

It was unclear to me that this was what the benchmark label was for.

@StefanKarpinski StefanKarpinski added potential benchmark Could make a good benchmark in BaseBenchmarks and removed needs benchmark labels Jul 18, 2017
@tkelman
Copy link
Contributor

tkelman commented Jul 18, 2017

#13893 (comment)

@StefanKarpinski
Copy link
Member

StefanKarpinski commented Jul 18, 2017

Ah, yes, how could I have missed that one-line comment from two years ago? I've renamed the label to make its meaning clear without ancient comment spelunking.

Copy link
Member

@nalimilan nalimilan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me too. I wish we could find an even more explicit name, but without using a full sentence that doesn't sound possible. :-)

@KristofferC
Copy link
Member

KristofferC commented Jul 31, 2017

I needed to force inlining in benchmarking to see the comparison I expected, and I didn't think merge was something you'd force inlining on.

Here, merge just hands over to merge! and where a bunch of splatting occurs so it is possible the inlining avoids the splatting penalty. If the benchmarks are better with forced inlining, I would keep it.

@KristofferC
Copy link
Member

I don't see any difference between explicit inlining or not anymore. Will rerun CI and merge if green.

@KristofferC KristofferC closed this Oct 1, 2017
@KristofferC KristofferC reopened this Oct 1, 2017
@KristofferC KristofferC merged commit e88b264 into JuliaLang:master Oct 1, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
potential benchmark Could make a good benchmark in BaseBenchmarks
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants