Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improvements to numeric exponent rules #224

Merged
merged 7 commits into from
Jul 8, 2020
Merged

Conversation

sethaxen
Copy link
Member

@sethaxen sethaxen commented Jul 2, 2020

This PR makes a few improvements to the rules for ^(::Number, ::Number).

For the general rule, it slightly improves the efficiency by removing the second ^ call.

It also adds a new real rule that avoids unnecessarily complexifying the tangents and cotangents. The general rule embeds the numbers in the complex plane for complex differentiation. However, for real negative base, exponentiation is undefined (literally throws an error) unless the exponent is exactly an integer. So for negative base, the derivative wrt the exponent is actually undefined (hence we can't even call FD on it). The new rule adopts the subgradient convention when the base is negative.

Oh, and since the rules are defined in fastmath.jl, I moved the test to the corresponding test file.

Here are are a few examples:

Example 1: frule with positive real base, real exponent

Only the type has changed. Instead of getting a purely real complex tangent, we just get the equivalent real tangent. The rrule has the same behavior.

using ChainRules
x, p = 2.5, 1.5
Δx, Δp = 0.33, -0.47
frule((Zero(), Δx, Δp), ^, x, p)

on master:

(3.952847075210474, -0.9196561346879987 - 0.0im)

this pr:

(3.952847075210474, -0.9196561346879987)

Example 2: frule with negative real base, integer exponent

We get the same tangent as we would have gotten had the input tangent on p been Zero()

x, p = -2.5, 3
frule((Zero(), Δx, Zero()), ^, x, p) # (-15.625, 6.1875)
frule((Zero(), Δx, Δp), ^, x, p)

on master:

(-15.625, 12.916510062200826 + 23.07107104980004im)

this pr:

(-15.625, 6.1875)

Example 3: rrule with negative real base, integer exponent

The cotangent on p is 0.

x, p = -2.5, 3
ΔΩ = 2.3
rrule(^, x, p)[2](ΔΩ)

on master:

(Zero(), 43.125, -32.929198176727446 - 112.90098598838317im)

this pr:

(Zero(), 43.125, -0.0)

@sethaxen sethaxen changed the title Improvements to real exponent rules Improvements to numeric exponent rules Jul 2, 2020
Copy link
Member

@willtebbutt willtebbutt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Just needs version bump.

@sethaxen
Copy link
Member Author

sethaxen commented Jul 2, 2020

Thanks! This function's pretty important, so I'll hold off on merging for a few days for others to review.

@sethaxen
Copy link
Member Author

sethaxen commented Jul 2, 2020

It's also not clear to me whether this should be considered a breaking change.

@willtebbutt
Copy link
Member

Hmm yeah, it depends on whether or not we call this a bug. I want to call it a bug because the previous behaviour is a bit odd and is inconsistent with what we're generally aiming for. However, it might cause some people's code to change behaviour.

@sethaxen
Copy link
Member Author

sethaxen commented Jul 2, 2020

Yeah, the main (only?) place a user could see a change would be if they passed the exponent as an int to an entry point of AD, e.g.:

julia> Zygote.gradient((x, p) -> x^real(p), -10.0, 2) # real's pullback drops the imaginary part

on master:

(-20.0, 230.25850929940458)

this pr

(-20.0, 0.0)

But realistically, the gradient on master cannot be used for anything like gradient descent, because if the 2 is perturbed by a non-integer value, then the gradient will raise a DomainError.

@oxinabox
Copy link
Member

oxinabox commented Jul 5, 2020

Sounds like old behavour was a bug.

@sethaxen
Copy link
Member Author

sethaxen commented Jul 6, 2020

I think so too. If there are no objections, I'll merge on Tuesday (after bumping version number).

@willtebbutt willtebbutt mentioned this pull request Jul 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants