Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move CHOLMOD.common_struct from Vector{UInt8} to an actual struct #38919

Merged
merged 2 commits into from
Apr 18, 2021

Conversation

ettersi
Copy link
Contributor

@ettersi ettersi commented Dec 17, 2020

@dkarrasch dkarrasch added linear algebra Linear algebra sparse Sparse arrays stdlib Julia's standard library labels Dec 17, 2020
@musm
Copy link
Contributor

musm commented Dec 18, 2020

@nanosoldier runbenchmarks(ALL, vs=":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @christopher-dG

@musm musm requested a review from andreasnoack December 19, 2020 18:01
@andreasnoack
Copy link
Member

While there is no doubt that it is nicer to work with a Julia struct like this, the potential issue here is that the common struct changes over time, see https://github.com/DrTimothyAldenDavis/SuiteSparse/blame/master/CHOLMOD/Include/cholmod_core.h. With a shim, we don't have to do anything but compile the shim for each SuiteSparse release but if we layout the struct manually in Julia then we'd have to keep it in sync with the C header file. I'm quite confident that we'll fail to do that.

@ettersi
Copy link
Contributor Author

ettersi commented Dec 21, 2020

I guess we could write a test which checks that the C and Julia structs are indeed equivalent. That's probably a good idea anyway, because it is quite easy to make a mistake when translating all the many many fields of cholmod_common.

Is it possible to get Make to compile a library which is used only for testing? Moving jl_cholmod_common_offset into such a library and expanding it to test all fields would probably be the cleanest way to implement this.

@ViralBShah
Copy link
Member

Yes, we can add a check at build time.

I guess we could also use Clang.jl to auto-generate the struct definition. Perhaps best done whenever we are able to move the sparse stuff into a separate repo.

@ettersi
Copy link
Contributor Author

ettersi commented Jan 24, 2021

I'm currently busy teaching, but I'd be happy to pick this up again once the semester is over (around mid-April) if there is any chance that this PR could be accepted. If not (Viral's comment sounds like you might want to wait for a more major overhaul), then I'd greatly appreciate if you let me know.

@ViralBShah
Copy link
Member

@andreasnoack seems to be not in favour of this for maintainability reasons, which I agree with as well. So the only way to address that is auto-generation with Clang.jl or something - in which case it would have to wait for the separation of repos.

@ettersi
Copy link
Contributor Author

ettersi commented Jan 24, 2021

All right, thanks for letting me know!

One minor remark: I interpreted @andreasnoack 's remark over in Issue JuliaLang/LinearAlgebra.jl#649 as encouragement for this PR, so I can't deny that I am a little disappointed that apparently the proposed improvement was actually doomed from the beginning. I'm sure this was just miscommunication and quite possibly I am the one to blame here, but it takes two to miscommunicate so I thought I'd share my feelings in case it could be helpful in the future.

@ViralBShah
Copy link
Member

ViralBShah commented Jan 24, 2021

Sorry, my intention was not to disappoint.

We really do want to get this PR in - but just with a guard against changes to the struct upstream. BTW, building a separate test library for this as you suggest ought to be something that will be acceptable here, since that will provide a guard against struct changes.

@ViralBShah
Copy link
Member

ViralBShah commented Feb 7, 2021

I'm changing my mind on this. This will let us get rid of the SuiteSparse wrapper - gets us really close to getting rid of the wrapper file. @ettersi Can you help us get this over the finish line? It would be good to add some comments in the code that point to where the struct is (perhaps a github link to Tim Davis' code?)

Can we add some checks to the code so that we can do a CHOLMOD version check so that if SuiteSparse is bumped without checking the code, a warning is given - so that whoever bumps will take a look at the warning? I think that would be sufficient to get this done.

@ettersi
Copy link
Contributor Author

ettersi commented Feb 8, 2021

I'd love to help, but I won't be able to do so before May. Any chance you guys can wait until then?

@ViralBShah
Copy link
Member

Yes, there's no urgency. We have a working system right now. So we just get it done when its ready.

@ettersi
Copy link
Contributor Author

ettersi commented Apr 12, 2021

Okay, I believe I addressed your concerns. Specifically, I added tests to CHOLMOD.__init__() to check that the size and offsets of the C cholmod_common and Julia Common structs match. I also added some comments to point out the dependencies between the various pieces of code. Let me know if you have any further concerns.

@ViralBShah
Copy link
Member

Is it possible to get rid of the C wrapper code completely?

@ViralBShah
Copy link
Member

We should also hard-code the version of cholmod for which this struct is coded up, so that if cholmod is bumped in the distribution, one has to bump the version required in cholmod.jl as well - to avoid accidental breakage when the struct goes out of sync when upstream libraries are bumped.

@ettersi
Copy link
Contributor Author

ettersi commented Apr 12, 2021

I have added tests in CHOLMOD.__init__() which check that the Julia and C struct definitions are compatible (have same size and offsets). That should address the issue of accidental breaks, no?

@ettersi
Copy link
Contributor Author

ettersi commented Apr 12, 2021

Is it possible to get rid of the C wrapper code completely?

Don't think so. I don't see how we could replicate any of the functions in
https://github.com/JuliaLang/julia/blob/cebee5ab93410f0b1784dfba7d35486a78659f86/deps/SuiteSparse_wrapper.c
in Julia.

@ViralBShah
Copy link
Member

If the sizes and offsets are the same in the C and Julia codes, can we not just get rid of the wrapper?

@ettersi
Copy link
Contributor Author

ettersi commented Apr 13, 2021

No, because we need the wrapper to check that the offsets are indeed the same (unless there's some trick to get around this that I am not aware of?). Also, SuiteSparse defines a SuiteSparse_long type that we need to extract.

@ViralBShah
Copy link
Member

ViralBShah commented Apr 13, 2021

I am imagining that we could write a little C program that calculates all these things and writes them out into a .jl file that can be part of our Julia wrappers for SuiteSparse. An attempt to do was made by @simonbyrne in JuliaPackaging/Yggdrasil#505

There's an overall issues to get rid of the wrapper: https://github.com/JuliaLang/julia/issues/34725

@ViralBShah
Copy link
Member

This PR will also fix #20985

@ettersi
Copy link
Contributor Author

ettersi commented Apr 14, 2021

I am imagining that we could write a little C program that calculates all these things and writes them out into a .jl file that can be part of our Julia wrappers for SuiteSparse.

I guess in this case we might as well auto-generate the struct definitions using Clang.jl. I'd be happy to have a go at this, but if I understand correctly then simonbyrne and Gnimuc already tried and got stuck so I doubt I'd get very far without some very detailed explanations.

@Gnimuc
Copy link
Contributor

Gnimuc commented Apr 15, 2021

I believe it's possible to do cross-compilation with pure Clang.jl, build flags(include dirs, defines, targets) and corresponding artifacts(system headers).

@ettersi
Copy link
Contributor Author

ettersi commented Apr 15, 2021

I believe the failed tests are because the SuiteSparse_wrapper.c file in the Julia repo is now out of sync with the one in Yggdrasil.
This hypothesis is based on the observation that the first error message in the failed tests is

��� Error: Error during initialization of module CHOLMOD
���   exception =
���    could not load symbol "jl_cholmod_method_offsets":
���    /buildworker/worker/tester_linux64/build/bin/../lib/julia/libsuitesparse_wrapper.so: undefined symbol: jl_cholmod_method_offsets

I know only little about how Yggdrasil works and how it interacts with base Julia, but Yggdrasil being out of sync with Julia is the only reason I can think of why some tests run into the above while others don't.

What is the procedure for resolving this issue? Should I create another PR in Yggdrasil with the required changes, and then whoever merges has to make sure they merge both at the same time? Or is there a safer way to do this?


Also, I've moved the offset checks from CHOLMOD.__init__() to the cholmod test file. Placing these checks in __init__() was the result of some misconceptions I had about how linking works, and I now believe that it makes no sense to run these checks every time someone loads the CHOLMOD package.

On a related topic, it seems that CHOLMOD occasionally rearranges the order of the cholmod_common fields without bumping the major version number (at least that's the impression I got looking at the blame history). So maybe this would require us to be a bit more strict when comparing version numbers on this line?

Long story short, it would probably be good if someone with a better understanding of versioning and linking would have another look at this to make sure I did not mess things up.

@Gnimuc
Copy link
Contributor

Gnimuc commented Apr 15, 2021

@Gnimuc's idea to use Clang.jl (JuliaPackaging/Yggdrasil#505 (comment)) looks like it works, but would require someone to integrate it with BinaryBuilder.jl (JuliaPackaging/BinaryBuilder.jl#673)

@simonbyrne I just wrote a concrete example using Clang.jl's new generator, indeed, with some hacks: https://github.com/Gnimuc/SuiteSparseGenerator.

I now get stuck with the sysroot flag, even I passed it to libclang via --sysroot=..., it just failed to recognize it.

@ViralBShah
Copy link
Member

@ettersi The new SuiteSparse_wrapper.c for now should just be a PR to the Yggdrasil repo - specifically, replace this file with the new one: https://github.com/JuliaPackaging/Yggdrasil/tree/master/S/SuiteSparse/SuiteSparse/bundled/SuiteSparse_wrapper

The PR will kick off the build process. Separately, we should tighten the version checking to closely follow CHOLMOD down to the patch level for now. Both the Yggdrasil PRs and this one should be merged in close succession - with Yggdrasil first. I'm happy to help with the merging.

@ettersi
Copy link
Contributor Author

ettersi commented Apr 16, 2021

All right, I tightened the version number check and created the Yggdrasil pull request (JuliaPackaging/Yggdrasil#2848). I believe this should be good to go now?

@ViralBShah
Copy link
Member

I've merged the Yggdrasil PR, which should propagate into the relevant places in the next hour or so. We will need all green here based on that PR and then merge this one.

@ViralBShah
Copy link
Member

I merged JuliaRegistries/General#34441 by hand.

@ViralBShah
Copy link
Member

This PR needs to update https://github.com/JuliaLang/julia/blob/master/stdlib/SuiteSparse_jll/Project.toml. The version needs to be changed to 5.8.1+2 so that it picks up the new SuiteSparse binaries.

@ettersi
Copy link
Contributor Author

ettersi commented Apr 16, 2021

Last remaining error:

make: *** No rule to make target 'win-extras'.  Stop.

Unrelated? 🤞

@ViralBShah
Copy link
Member

ViralBShah commented Apr 16, 2021

We should rebase to master, since this PR is against a very old master. Most likely that should fix the test failures on windows.

@ettersi
Copy link
Contributor Author

ettersi commented Apr 17, 2021

Did the rebase. The tests which are failing now were added in #39301, and it seems there are some issues with these tests, see the recent discussion in #40504. I guess we wait until these issues are resolved and try again.

@ettersi
Copy link
Contributor Author

ettersi commented Apr 18, 2021

All green! 🎉🎉🎉

@ViralBShah ViralBShah merged commit 671bccb into JuliaLang:master Apr 18, 2021
@ettersi
Copy link
Contributor Author

ettersi commented Apr 19, 2021

Awesome, thanks everyone for all your help! As you could probably tell, this was my first (significant) contribution to Julia, and thanks to all of you this has been a very enjoyable and rewarding experience!

@ViralBShah
Copy link
Member

Thanks @ettersi for the contribution and staying with the process - which can be a bit long at times. But as a result now we have one more person who understands our sparse solver infrastructure :-)

@ettersi ettersi deleted the cholmod branch April 22, 2021 02:52
ElOceanografo pushed a commit to ElOceanografo/julia that referenced this pull request May 4, 2021
…liaLang#38919)

* Move common_struct from Vector{UInt8} to an actual struct

* Compare full version numbers

Co-authored-by: Simon Etter <ettersi@nus.edu.sg>
antoine-levitt pushed a commit to antoine-levitt/julia that referenced this pull request May 9, 2021
…liaLang#38919)

* Move common_struct from Vector{UInt8} to an actual struct

* Compare full version numbers

Co-authored-by: Simon Etter <ettersi@nus.edu.sg>
johanmon pushed a commit to johanmon/julia that referenced this pull request Jul 5, 2021
…liaLang#38919)

* Move common_struct from Vector{UInt8} to an actual struct

* Compare full version numbers

Co-authored-by: Simon Etter <ettersi@nus.edu.sg>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
linear algebra Linear algebra sparse Sparse arrays stdlib Julia's standard library
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants