Restructure sparse backends and replace subtyping by traits #40
Something not mentioned in the top there is that this will require that ArrayInterface.jl take on an ADTypes.jl dependency, which is fine given that this is lightweight and all, but it should be noted that its |
The symbols instead of abstract types thing is weird. That needs some explanation. |
Thanks for the review! Indeed you identified two parts where I was unsure:
See my comment #40 (comment)
Either that or the coloring names are dropped from ArrayInterface altogether? See my comment #40 (comment) |
I can rename
Not sure how to deal with those though. I can also rename |
It's at least contained to two repos, so it's not too big of a deal if it's deprecated properly.
Yeah that's mostly by adjoint. But the bigger thing would be updating all of the downstream analytical solutions to have these functions as well. |
If we keep the name |
That's what we currently do, but if you're going to set up general sparse AD then it's not a good idea, since when someone passes a coloring pattern you won't easily know whether it is meant for A or A'. So I like the idea of two functions, but the work of updating all of the analytical solutions in the extensions to the two functions needs to be done as well. |
I agree. Maybe ArrayInterface can keep defining
    ADTypes.column_coloring(M) = ArrayInterface.matrix_colors(M)
    ADTypes.row_coloring(M) = ArrayInterface.matrix_colors(M')
without changing the names in the actual function definitions for the time being |
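To make the A versus A' point concrete, here is a minimal sketch of why a row coloring can be obtained from a column coloring of the transpose. `greedy_column_coloring` and `greedy_row_coloring` are hypothetical stand-ins written for this illustration; they are not `ArrayInterface.matrix_colors` and make no claim about how it is implemented.

```julia
# Hypothetical stand-in for a column-coloring routine (NOT ArrayInterface.matrix_colors).
# Two columns may share a color when no row has a nonzero in both of them.
function greedy_column_coloring(S::AbstractMatrix{Bool})
    n = size(S, 2)
    colors = zeros(Int, n)
    for j in 1:n
        forbidden = Set{Int}()
        for k in 1:(j - 1)
            # Columns j and k conflict if they overlap in at least one row.
            if any(S[:, j] .& S[:, k])
                push!(forbidden, colors[k])
            end
        end
        # Pick the smallest color not used by a conflicting column.
        c = 1
        while c in forbidden
            c += 1
        end
        colors[j] = c
    end
    return colors
end

# A row coloring of S is a column coloring of its transpose.
greedy_row_coloring(S::AbstractMatrix{Bool}) = greedy_column_coloring(permutedims(S))

# Sparsity pattern with a dense first row: every column overlaps in row 1.
S = Bool[1 1 1; 0 1 0; 0 0 1]
column_colors = greedy_column_coloring(S)
row_colors = greedy_row_coloring(S)
```

On this pattern the two colorings differ, which is why a bare coloring vector is ambiguous unless the caller knows which of the two functions produced it.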
Thanks for the comments! I'll update the PR with a trait mechanism and ping you again for validation if that's okay? |
Let's do the clean break. Such a clean break is the kind of thing that needs someone motivated in order to say yes. I don't see you stopping the DI push, so let's do it right. v0.3.0 here to signal the break, update the downstream in ArrayInterface and SparseDiffTools, and get sparse DI to do things the best way possible, and then integrate it into everything like NonlinearSolve.jl and DifferentialEquations.jl. You've got gusto and we've been sitting here waiting for something like DI to finally fix some of these remaining issues, so let's just pull the trigger and do it. |
Not ready yet but close |
@ChrisRackauckas I have implemented the trait mechanism and used the opportunity for other small breaking changes in the concrete structs. All of it is summarized in the first comment, which I have edited: #40 (comment) |
For testing, while it's a hassle to add Diffractor and Zygote to the dependencies, I did perform the following check of my […]
julia> import Diffractor, Zygote
julia> using ADTypes # on the PR branch
julia> ADTypes.mode(AutoChainRules(; ruleconfig=Zygote.ZygoteRuleConfig()))
ADTypes.ReverseMode()
julia> ADTypes.mode(AutoChainRules(; ruleconfig=Diffractor.DiffractorRuleConfig()))
ADTypes.ForwardOrReverseMode()
julia> @which ADTypes.mode(AutoChainRules(; ruleconfig=Zygote.ZygoteRuleConfig()))
mode(::AutoChainRules{RC}) where RC<:(ChainRulesCore.RuleConfig{>:ChainRulesCore.HasReverseMode})
     @ ADTypesChainRulesCoreExt ~/.julia/packages/ADTypes/bOklB/ext/ADTypesChainRulesCoreExt.jl:16
julia> @which ADTypes.mode(AutoChainRules(; ruleconfig=Diffractor.DiffractorRuleConfig()))
mode(::AutoChainRules{RC}) where RC<:(ChainRulesCore.RuleConfig{>:Union{ChainRulesCore.HasForwardsMode, ChainRulesCore.HasReverseMode}})
     @ ADTypesChainRulesCoreExt ~/.julia/packages/ADTypes/bOklB/ext/ADTypesChainRulesCoreExt.jl:22
Of course there is no ChainRules-compatible backend with a forward-only |
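For readers who want to try the dispatch pattern from the `@which` output without installing Zygote or Diffractor, the following is a rough, dependency-free sketch. Every name in it (`ToyRuleConfig`, `ToyChainRules`, `rc_mode`, the capability and mode structs) is a hypothetical stand-in invented here; only the shape of the `where RC<:...{>:...}` signatures is taken from the output above.

```julia
# Hypothetical stand-ins for the ChainRulesCore capability types (assumption, not the real API).
struct ToyHasReverse end
struct ToyHasForward end
struct ToyNoForward end
abstract type ToyRuleConfig{T} end

# Hypothetical stand-ins for the mode singletons.
struct RCReverse end
struct RCForwardOrReverse end

# Hypothetical stand-in for a backend that carries a rule config.
struct ToyChainRules{RC}
    ruleconfig::RC
end

# `>:` puts a lower bound on the capability union carried by the config type.
rc_mode(::ToyChainRules{RC}) where {RC<:ToyRuleConfig{>:ToyHasReverse}} = RCReverse()
function rc_mode(
    ::ToyChainRules{RC}
) where {RC<:ToyRuleConfig{>:Union{ToyHasForward,ToyHasReverse}}}
    # This signature is narrower than the one above, so it wins when both apply.
    return RCForwardOrReverse()
end

# Very simple test configs, one reverse-only and one supporting both directions.
struct ReverseOnlyConfig <: ToyRuleConfig{Union{ToyHasReverse,ToyNoForward}} end
struct BothWaysConfig <: ToyRuleConfig{Union{ToyHasForward,ToyHasReverse}} end

reverse_only = rc_mode(ToyChainRules(ReverseOnlyConfig()))
both_ways = rc_mode(ToyChainRules(BothWaysConfig()))
```

The reverse-only config matches only the first method, while the config with both capabilities matches both and is routed to the more specific one.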
I think we're there, the last few things to iron out are:
|
We should define mode and at least bump to v1.9. I'd say just go to v1.10 knowing it's the next LTS and skip all of the other pain, there's other stuff to worry about in life.
Yes
We just shouldn't run nightly, it's not for humans. We can set the new prerelease branch when it's ready, but since nightly isn't for packages (which is why the prerelease is being made) it's a waste of our time to be trying it. |
Co-authored-by: Christopher Rackauckas <accounts@chrisrackauckas.com>
Done.
Removed nightly from CI |
Once I get an approving review I'll merge, but I'd like to do two things before we register v1:
|
Yes no need to release right away, let's get downstream all set and make sure everyone is bought into this form. |
🎉🎉
Motivation
This is a breaking PR (bumping version to v0.3.0) that restructures the handling of sparse backends, and cleans up the package in the process.
Breaking API changes
Mode
- […] AbstractForwardMode and AbstractSparseForwardMode), and replace them with the mode trait. They were not documented but some packages used them anyway, like DifferentiationInterface.jl.
- […] mode(ad::AbstractADType) return one of the following singletons: ForwardMode(), ReverseMode(), SymbolicMode() or the ambiguous ForwardOrReverseMode() (for cases like AutoEnzyme and AutoChainRules)
- […] ForwardMode
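As an illustration of the trait idea described in this section, here is a small sketch of a `mode`-style function replacing abstract supertypes. All names (`TraitADType`, `TraitForwardDiff`, `trait_mode`, `pick_operator`, and so on) are hypothetical and chosen for this example; this is not the ADTypes source.

```julia
# Hypothetical stand-ins mirroring the names in this PR; not the ADTypes source.
abstract type TraitADType end

struct TraitForwardMode end
struct TraitReverseMode end
struct TraitForwardOrReverseMode end

struct TraitForwardDiff <: TraitADType end
struct TraitZygote <: TraitADType end
struct TraitEnzyme{M} <: TraitADType
    mode::M  # `nothing` stands for "mode not specified"
end

# The trait: one method per backend type, returning a singleton.
trait_mode(::TraitForwardDiff) = TraitForwardMode()
trait_mode(::TraitZygote) = TraitReverseMode()
trait_mode(::TraitEnzyme{Nothing}) = TraitForwardOrReverseMode()

# Downstream code dispatches on the trait value rather than on a supertype.
pick_operator(backend::TraitADType) = pick_operator(trait_mode(backend))
pick_operator(::TraitForwardMode) = "pushforward"
pick_operator(::TraitReverseMode) = "pullback"
pick_operator(::TraitForwardOrReverseMode) = "either"
```

With this layout a backend no longer has to sit under a single abstract mode type, so an ambiguous case such as an Enzyme backend with no mode specified can be expressed directly.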
Concrete structs
- […] AutoModelingToolkit
- […] mode field for AutoEnzyme: either nothing or a subtype of EnzymeCore.Mode (see What is the right way to specify the Enzyme mode? #24)
- […] fdm = nothing default for AutoFiniteDifferences, since it always has to be a finite difference method
- […] AutoPolyesterForwardDiff (see Don't use nothing Tag JuliaDiff/PolyesterForwardDiff.jl#18)
- […] AutoSparseForwardDiff type and others like it, and turn them into deprecated constructors:
Non-breaking API changes
- […] AbstractADType, which is useful for many downstream users
- […] AutoSymbolics to replace AutoModelingToolkit
- […] AutoTapir for https://github.com/withbayes/Tapir.jl
Sparsity
- […] AutoSparse struct that wraps a dense_ad backend with a sparsity_detector and a coloring_algorithm (as discussed in Change handling of sparse backends #38)
- […] AbstractSparsityDetector, NoSparsityDetector)
- […] AbstractColoringAlgorithm, NoColoringAlgorithm)
- […] AbstractSparsityDetector:
  - jacobian_sparsity
  - hessian_sparsity
- […] AbstractColoringAlgorithm:
  - column_coloring
  - row_coloring
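The wrapper design listed above could look roughly like the sketch below. It is only an illustration under stated assumptions: every `Toy*` name is hypothetical, the argument order of `toy_jacobian_sparsity` and `toy_column_coloring` is a guess (the description only names the functions), and the deprecated-style constructor is one possible shape, not the code from the PR.

```julia
# Hypothetical stand-ins; none of these are the ADTypes definitions.
abstract type ToySparsityDetector end
abstract type ToyColoringAlgorithm end
struct ToyNoSparsityDetector <: ToySparsityDetector end
struct ToyNoColoringAlgorithm <: ToyColoringAlgorithm end

# Wraps a dense backend together with a detector and a coloring algorithm.
struct ToyAutoSparse{D,S<:ToySparsityDetector,C<:ToyColoringAlgorithm}
    dense_ad::D
    sparsity_detector::S
    coloring_algorithm::C
end

# Keyword constructor with "do nothing" defaults.
function ToyAutoSparse(
    dense_ad;
    sparsity_detector=ToyNoSparsityDetector(),
    coloring_algorithm=ToyNoColoringAlgorithm(),
)
    return ToyAutoSparse(dense_ad, sparsity_detector, coloring_algorithm)
end

# A detector that pessimistically reports a fully dense Jacobian pattern.
struct ToyDenseDetector <: ToySparsityDetector end
toy_jacobian_sparsity(f, x, ::ToyDenseDetector) = trues(length(f(x)), length(x))

# A coloring that gives every column its own color.
struct ToyTrivialColoring <: ToyColoringAlgorithm end
toy_column_coloring(S::AbstractMatrix, ::ToyTrivialColoring) = collect(1:size(S, 2))

# One possible shape for a deprecated constructor replacing a removed sparse type.
struct ToyDenseForwardDiff end
function ToyAutoSparseForwardDiff()
    Base.depwarn(
        "ToyAutoSparseForwardDiff() is deprecated, wrap the dense backend in ToyAutoSparse",
        :ToyAutoSparseForwardDiff,
    )
    return ToyAutoSparse(ToyDenseForwardDiff())
end

backend = ToyAutoSparse(
    ToyDenseForwardDiff();
    sparsity_detector=ToyDenseDetector(),
    coloring_algorithm=ToyTrivialColoring(),
)
pattern = toy_jacobian_sparsity(x -> [x[1], x[1] + x[2]], [1.0, 2.0], backend.sparsity_detector)
colors = toy_column_coloring(pattern, backend.coloring_algorithm)
```

Keeping the detector and the coloring algorithm as separate fields means either one can be swapped without touching the dense backend.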
Other changes
Dependencies
- […] mode dispatches
  - ChainRulesCore.RuleConfig for ChainRulesCore.jl
  - EnzymeCore.Mode for EnzymeCore.jl
Package quality
Checklist
- […] contributor guidelines, in particular the SciML Style Guide and COLPRAC.
Questions
- […] AutoPolyesterForwardDiff{chunksize} into AutoPolyesterForwardDiff{chunksize,T} is breaking because someone might have used the former as a constructor. That is why I explicitly document keyword constructors and fields.
- […] RuleConfig paradigm? I define very simple RuleConfigs in runtests.jl but it's less reliable than taking the real ones
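The first question above, about adding a type parameter, can be illustrated with a small sketch. `ToyOld` and `ToyNew` are hypothetical types standing in for the before and after layouts; the real struct may differ in its fields and constructors.

```julia
# Hypothetical "before" layout: a single type parameter and no fields.
struct ToyOld{chunksize} end

# Hypothetical "after" layout: a second type parameter for the tag.
struct ToyNew{chunksize,T}
    tag::T
end

# Documented keyword constructor, which stays stable if parameters are added later.
ToyNew(; chunksize=nothing, tag=nothing) = ToyNew{chunksize,typeof(tag)}(tag)

# Using the partially parameterized type as a constructor worked before...
old_backend = ToyOld{4}()

# ...but fails after the change, because ToyNew{4} has no zero-argument method.
old_style_call_fails = try
    ToyNew{4}()
    false
catch err
    err isa MethodError
end

new_backend = ToyNew(; chunksize=4)
```

This is the sense in which the change is technically breaking, and why pointing users at keyword constructors gives more room for later changes.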