Get rid of extern? / Add unthunk, use extern only for testing? #56

Closed
oxinabox opened this issue Oct 17, 2019 · 4 comments · Fixed by #365

@oxinabox
Member

oxinabox commented Oct 17, 2019

@willtebbutt and I were talking:
extern doesn't mean much without context.
Maybe we should have unthunk, which is defined on AbstractThunks,
and everything else just handles this during +. Most of the other information you need is available at +, from the other object.

@oxinabox
Member Author

This would actually let us revert #55 / #10
if we wanted.
We don't want to right now because it is too complex for unproven gain,
but we could.

@oxinabox
Member Author

Repeating from #8 (comment)

I am thinking:
We have 3 kinds of objects*: (these do not correspond to Julia types)
primal values, their paired differentials,
and scale factors.
The core rule is that for
s a scale factor,
v a primal value of kind V,
d a differential (of kind V'),
v + sd must be a primal value in V.
In the simple case:
scale factors are scalars, subtyping Real,
primals and differentials are matrices of the same size.
Zero() is both a scale factor and a differential for all kinds of primals.
One() is a scale factor and a differential for scalar primals only.
The main reason to care about One() when acting as a differential is for computing partial derivatives in forward mode, or for the initial seed in reverse mode when you have gone down to a single value, like a neural network loss.
So One is not like LinearAlgebra.I, because when you multiply a matrix by One(), that matrix is a differential, d in the core formula above,
and the core formula only has scale-factor multiplication with a differential.
So it is scalar in that case.
We don't need to multiply two differentials by each other.
The core formula could be wrong though.
Thunk is a differential that is equivalent to the differential being thunked.
(If you thunk something else then it is invalid.)
Think of this like the delegation pattern, Wrapped{T<:AbstractT} <: AbstractT
*(plus maybe differentials that escape their pairings, like Wirtinger, but that hopefully will come home)
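
To make the core rule concrete, here is a minimal sketch. The Zero and One structs below are stand-ins defined for this example, and scale, add and update are hypothetical names, not the package's API. It only covers the simple case described above (Real scale factors, matrix or scalar primals).

```julia
# Stand-in types for illustration only; not the package's definitions.
struct Zero end   # scale factor and differential for every kind of primal
struct One end    # scale factor and differential for scalar primals only

# scale(s, d) plays the role of `s * d` in the core rule.
scale(s::Real, d::Union{Real,AbstractArray}) = s * d
scale(::Zero, d) = Zero()                         # zero scale factor kills any differential
scale(s::Union{Real,One}, ::Zero) = Zero()
scale(::One, d::Union{Real,AbstractArray}) = d    # One acts as a scalar, not like LinearAlgebra.I
scale(s::Real, ::One) = s
scale(::One, ::One) = One()

# add(v, x) plays the role of `v + x`; the result must stay in the primal kind V.
add(v, ::Zero) = v
add(v::Real, ::One) = v + one(v)
add(v, x) = v + x

# The core rule: v + s*d
update(v, s, d) = add(v, scale(s, d))

update([1.0, 2.0], 0.5, [2.0, 4.0])   # array primal, Real scale factor
update(3.0, Zero(), 5.0)              # Zero() leaves the primal untouched
update(3.0, One(), One())             # scalar primal with a One() seed
```

Note that nothing here multiplies two differentials together, which matches the claim that the core formula never needs it.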

The interesting one in this framework is the Composite,
which is the named-tuple-like differential that matches all structs.
It is nontrivial to make adding it to its primal (which could be any struct) work,
because constructing primals is hard.
But we can do the thing that mostly works and fails when someone makes a primal that can't be constructed,
and we can make it easy to overload that.
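
A rough sketch of the "mostly works" approach, under assumptions: the Composite struct here is a simplified stand-in, and construct and add are hypothetical names chosen for this example. It rebuilds the primal through its default positional constructor, which is exactly the step that fails for structs without one, and construct is the hook someone would overload in that case.

```julia
# Simplified stand-in for a named-tuple-like differential of primal type P.
struct Composite{P,NT<:NamedTuple}
    backing::NT
end
function Composite{P}(; kwargs...) where {P}
    backing = (; kwargs...)
    return Composite{P,typeof(backing)}(backing)
end

# Hypothetical overload point: assumes P has a default positional constructor.
construct(::Type{P}, fields::NamedTuple) where {P} = P(values(fields)...)

# Add the differential to its primal field by field, then rebuild the primal.
function add(primal::P, d::Composite{P}) where {P}
    names = fieldnames(P)
    vals = map(names) do n
        old = getfield(primal, n)
        # Fields missing from the differential are treated as zero.
        haskey(d.backing, n) ? old + getfield(d.backing, n) : old
    end
    return construct(P, NamedTuple{names}(vals))
end

struct Point
    x::Float64
    y::Float64
end

p = add(Point(1.0, 2.0), Composite{Point}(x = 0.5))
```

A type with only a custom inner constructor would need its own construct method; that is the "easy to overload" part.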

Under that framework,
extern is characterized as:
I will attempt to work out the zero of the primal kind (V) matching the differential kind (V') of the differential object I have. If I can't work it out, I will make some assumptions about the primal kind, which are documented in the differential's docstring (in particular, I think it will always assume scalar kind).

@oxinabox
Member Author

oxinabox commented Oct 29, 2019

Basically I think we keep extern around for testing purposes,
which tend to be cases that it works for.

This is also in line with the fact that it is now recursive.
(since #47 / #48)

But we shouldn't be using extern internally, to open up thunks.
Instead we should be using unthunk
which I think is already in #54
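
For illustration, a minimal sketch of what that split could look like. The types and the add function below are stand-ins written for this example, not the code in #54. The point is that unthunk only exists for thunks and needs no knowledge of the primal kind, while addition opens thunks itself.

```julia
# Stand-in types for illustration only.
abstract type AbstractThunk end

struct Thunk{F} <: AbstractThunk
    f::F   # zero-argument closure that computes the wrapped differential
end

# Only thunks can be unthunked; no guessing about the primal kind is involved.
unthunk(t::Thunk) = t.f()

# Addition opens thunks on either side, then defers to the ordinary `+`.
add(a::AbstractThunk, b::AbstractThunk) = add(unthunk(a), unthunk(b))
add(a::AbstractThunk, b) = add(unthunk(a), b)
add(a, b::AbstractThunk) = add(a, unthunk(b))
add(a, b) = a + b

t = Thunk(() -> [1.0, 2.0])
add(t, [1.0, 1.0])
add(t, t)
```

This is the delegation pattern: the thunk stands in for the differential it wraps, and everything it is asked to do gets forwarded after unthunking.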

@oxinabox oxinabox changed the title Get rid of extern? Get rid of extern? / Add unthunk, use extern only for testing? Oct 29, 2019
@oxinabox
Member Author

hmmm
I wonder if we should change:

@eval Base.:+(a::One, b::$T) = extern(a) + b

To:

@eval Base.:+(a::One, b::$T) = one(b) + b

docs:

  one(x)

Return a multiplicative identity for x: a value such that one(x)*x == x*one(x) == x
julia> one(NaN)
1.0

@simeonschaub would like it if it did return I for matrices,
which one(b) does do for square matrices (and it errors otherwise, which is fine as even if it didn't it would .

julia> one(rand(3,3))
3×3 Array{Float64,2}:
 1.0  0.0  0.0
 0.0  1.0  0.0
 0.0  0.0  1.0

Technically, rather than returning I it returns the concrete instance, which is a bit sad,
so we might want to make a custom version that fixes that.
Maybe one that acts on types instead, so it can be resolved at compile time.
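
A possible shape for such a custom version, as a sketch only: lazy_one is a hypothetical name, not an existing function. It dispatches on the type, returns LinearAlgebra.I for any matrix type instead of allocating an identity matrix, and falls back to one(T) for numbers.

```julia
using LinearAlgebra: I

# Hypothetical helper: a multiplicative identity chosen from the type alone.
lazy_one(::Type{T}) where {T<:Number} = one(T)
lazy_one(::Type{<:AbstractMatrix}) = I    # no concrete identity matrix is built
lazy_one(x) = lazy_one(typeof(x))         # convenience method for values

lazy_one(2.5)                  # same as one(Float64)
lazy_one(rand(3, 3))           # the UniformScaling I
lazy_one(rand(2, 2)) + [1.0 2.0; 3.0 4.0]
```

Because the matrix size is not part of the type, the squareness check cannot happen inside lazy_one; it is left to whatever operation later combines I with the matrix.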


3 participants