Same LLVM IR, radically different performance #22408
Comments
I'm trying to add this @noinline annotation to the _distributevals_halfperm! function, as I also noticed a significant slowdown when transposing large sparse matrices. How would I go about doing this to see if it makes a difference in performance on my tasks? (Sorry if this comment is basic, but I'm confused about how to edit the source code for the sparse matrix library locally.) Thank you for the help.
Do you have a git checkout of Julia or did you download as a binary? If you have the source you just edit the file and type |
I believe I have the binary, as I installed with Homebrew on macOS, and I'm not sure whether the git checkout supports this OS. How would I do it for this version?
Just try
eval(Base.SparseArrays, quote
    <Your revised function definition goes here>
end)
You can copy/paste the original version into the |
But macOS is well supported, just see https://github.com/JuliaLang/julia#source-download-and-compilation
You're right, I just upgraded to 0.6, so I'll try it with the eval method. My attempt at the git checkout method failed when running make, possibly due to proxy issues, but regardless I think eval should do the trick.
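For anyone else trying the eval approach, here is a minimal sketch of how redefining a function inside an existing module behaves. The module `Demo` and the functions `helper` and `caller` are hypothetical stand-ins, not anything from Base.SparseArrays, and the sketch assumes Julia 0.6 or later, where callers pick up the new definition.

```julia
# Hypothetical stand-in for a library module; the names are illustrative only.
module Demo
helper(x) = x + 1
caller(x) = helper(x) * 2
end

println(Demo.caller(1))  # uses the original helper

# Redefine `helper` inside the module, this time annotated with @noinline.
# In practice you would paste the original function body here and change
# only the annotation.
@eval Demo begin
    @noinline function helper(x)
        return x + 10
    end
end

println(Demo.caller(1))  # now goes through the redefined helper
```

The redefinition lasts only for the current session, so it has to be re-evaluated after restarting Julia.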
Running the original test locally, I can't reproduce this. @timholy, can we close?
This came up in the context of #22210, where I'm noticing a big performance hit on transpose for sparse matrices. A convenient test case comes from copying these lines to a separate file, annotating _computecolptrs_halfperm! with @noinline (not strictly necessary since it doesn't inline on master), and then comparing the result of using either @noinline or @inline on _distributevals_halfperm!.
Demo:
With @inline on _distributevals_halfperm!:
With @noinline on _distributevals_halfperm!:
Inspection does not suggest an immediate reason for this 40x performance gap; profiling places all the blame at this line with the function evaluation. It made me wonder whether there is some problem inlining the function call.
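The kind of comparison described above can be sketched in a self-contained way. The kernels below are hypothetical and much simpler than the functions in the sparse matrix code, so this only illustrates the measurement method (same loop, one helper marked @inline and one @noinline) and is not expected to reproduce the 40x gap.

```julia
# Hypothetical kernels for illustration; NOT the functions from Base.SparseArrays.
@inline function add_inlined!(out, v, i)
    @inbounds out[i] += v[i]
    return nothing
end

@noinline function add_outlined!(out, v, i)
    @inbounds out[i] += v[i]
    return nothing
end

function loop_inlined!(out, v)
    for i in eachindex(v)
        add_inlined!(out, v, i)
    end
    return out
end

function loop_outlined!(out, v)
    for i in eachindex(v)
        add_outlined!(out, v, i)
    end
    return out
end

v = rand(10^6)

# Warm-up calls so that compilation time is excluded from the timings.
loop_inlined!(zeros(length(v)), v)
loop_outlined!(zeros(length(v)), v)

out_inlined = zeros(length(v))
out_outlined = zeros(length(v))
t_inlined = @elapsed loop_inlined!(out_inlined, v)
t_outlined = @elapsed loop_outlined!(out_outlined, v)

println("inlined:     ", t_inlined, " s")
println("not inlined: ", t_outlined, " s")
```

A single @elapsed measurement is noisy; repeating the timed calls several times and taking the minimum gives a steadier comparison.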
However, the truly bizarre part is that, with @inline, @code_llvm _distributevals_halfperm!(X, A, 1:A.n, identity) is, for all practical purposes that I can see, identical to @code_llvm halfperm!(X, A, 1:A.n, identity) (aside from the obvious call to _computecolptrs_halfperm!). I am not at all good at reading assembly, but even there the differences do not seem dramatic to me (there are some constant differences to movq statements that might be problematic?).
This seems really puzzling. LLVM bug? Present at least on 0.6.0-rc3 and master.
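To compare IR side by side rather than by eye, one option is to capture the output of code_llvm as strings. This is a rough sketch with hypothetical toy functions, and it assumes Julia 0.7 or later, where code_llvm lives in InteractiveUtils (on 0.6 it is available from Base without the `using` line).

```julia
using InteractiveUtils  # assumption: Julia 0.7+; on 0.6 drop this line

# Hypothetical toy functions, not the sparse matrix code.
@inline square_inlined(x) = x * x
@noinline square_outlined(x) = x * x

call_inlined(x) = square_inlined(x) + 1
call_outlined(x) = square_outlined(x) + 1

# Capture the LLVM IR for each caller as a string for later inspection or diffing.
ir_inlined = sprint(code_llvm, call_inlined, (Int,))
ir_outlined = sprint(code_llvm, call_outlined, (Int,))

println(ir_inlined)
println(ir_outlined)
```

The generated function names differ between the two strings, so a plain equality check will not match; writing both to files and running a diff tool is usually more informative.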