Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

isapprox can give absurd result in case of integer overflow #50380

Open
nalimilan opened this issue Jul 1, 2023 · 8 comments · Fixed by #50730
Open

isapprox can give absurd result in case of integer overflow #50380

nalimilan opened this issue Jul 1, 2023 · 8 comments · Fixed by #50730
Labels
correctness bug ⚠ Bugs that are likely to lead to incorrect results in user code without throwing maths Mathematical functions

Comments

@nalimilan
Copy link
Member

nalimilan commented Jul 1, 2023

I was surprised to realize this:

julia> Int8(-68)  Int8(60)
true

This is because of the way the isapprox fallback is defined, and more precisely to:

julia> abs(Int8(-68) - Int8(60))
-128

Shouldn't we define a custom method for integers that is robust to overflow?

@oscardssmith
Copy link
Member

yes

@nalimilan
Copy link
Member Author

OK. I'm not sure how to handle overflow though. Using widen is the easy way, but the performance impact will be large for Int64/UInt64 and even more Int128/UInt128. The best I could find is a hybrid approach which simply calls == in the most common (default) case and relies on widen for other situations:

function isapprox(x::Integer, y::Integer;
                  atol::Real=0, rtol::Real=rtoldefault(x,y,atol),
                  nans::Bool=false, norm::Function=abs)
    if norm === abs && atol < 1 && rtol == 0
        return x == y
    else
        return norm(widen(x) - widen(y)) <= max(atol, rtol*max(norm(widen(x)), norm(widen(y))))
    end
end

Unfortunately not all integer types define widen (notably not Bool) so that method would have to be used only for types defined in Base.

This method will also have to be used/adapted for cases mixing integers and floats. That case is more complex as rtol will be nonzero by default. For performance, an intermediate branch could be added to check whether x - y will overflow and whether x == typemin(x) or y == typemin(y) (which makes abs return a negative value). If that's not the case (99% of cases) we can call the formula without widen. (An alternative solution would be to call float on the integer value, but for Int64/UInt64 and Int128/UInt128 this would give incorrect results for large numbers above maxintfloat(Float64), which is worse than the current overflow risk.)

@oscardssmith
Copy link
Member

can't you just check use a subtract with overflow checking?

@stevengj stevengj added the maths Mathematical functions label Jul 2, 2023
@nalimilan
Copy link
Member Author

Well that's no enough. First, norm(x) can also pose problems. For the default norm=abs we can check whether x == typemin(x). But that doesn't work for arbitrary user functions. Maybe we can say that it's the user's responsibility to pass a function that checks for overflow?

The second issue is what to do if overflow happens. In general that means that x and y are not approximately equal, but nothing prevents the user from passing atol=typemax(Int) or rtol=100, so for full correctness we have to compute the full formula in a wider type AFAICT, at least in a fallback branch which is almost never taken.

@vtjnash
Copy link
Member

vtjnash commented Jul 5, 2023

I guess part of the confusion here is that (except for exactly typemin as shown in the first answer) most of the answers actually are close, in the mod Int8 arithmetic base that is being expressed here.

julia> isapprox(Int8(127), Int8(-118), atol=10)
false

julia> isapprox(Int8(127), Int8(-119), atol=10)
true

@nalimilan
Copy link
Member Author

Yes, but do we really want isapprox to use modular arithmetic? That sounds confusing, and for the most common case we can implement a solution robust to overflow with a very small performance cost.

@LilithHafner
Copy link
Member

Here's another overflow:

julia> isapprox(UInt8(60), UInt8(61), atol=1)
false

julia> isapprox(UInt8(60), UInt8(59), atol=1)
true

Technically, the overflow behavior described in this thread is correct because the docstring defines the semantics in terms of the implementation:

isapprox returns true if norm(x-y) <= max(atol, rtol*max(norm(x), norm(y)))

But I still think some of this behavior is bad enough to warrant changing (and changing the documentation to reflect that)

KristofferC pushed a commit that referenced this issue Aug 14, 2023
Ensure that `isapprox` gives correct results when comparing an integer
with another integer or with a float. For comparison between integers,
the fix only works when keeping default values for `rtol` and `norm`,
and with `atol < 1`.

It is not possible to handle the (atypical) case where `norm !== abs`,
but that's OK since the user is responsible for providing a safe
function.

It would be possible to handle the case where `rtol > 0` or `atol >= 1`,
but with complex code which would check for overflow and handle all
possible corner cases; it would work only for types defined in Base and
would not be extensible by packages. So I'm not sure that's worth it. At
least with PR fixes the most common case.

Fixes #50380.
@nalimilan
Copy link
Member Author

#50730 does fix the case described in the last comment as it only covers atol < 1. I can try proposing more code to cover this kind of overflow, but it seems almost impossible to cover all situations, notably when mixing signed and unsigned.

@nalimilan nalimilan reopened this Aug 14, 2023
@nalimilan nalimilan added the correctness bug ⚠ Bugs that are likely to lead to incorrect results in user code without throwing label Aug 15, 2023
KristofferC pushed a commit that referenced this issue Aug 18, 2023
Ensure that `isapprox` gives correct results when comparing an integer
with another integer or with a float. For comparison between integers,
the fix only works when keeping default values for `rtol` and `norm`,
and with `atol < 1`.

It is not possible to handle the (atypical) case where `norm !== abs`,
but that's OK since the user is responsible for providing a safe
function.

It would be possible to handle the case where `rtol > 0` or `atol >= 1`,
but with complex code which would check for overflow and handle all
possible corner cases; it would work only for types defined in Base and
would not be extensible by packages. So I'm not sure that's worth it. At
least with PR fixes the most common case.

Fixes #50380.

(cherry picked from commit 5f03a18)
IanButterworth pushed a commit that referenced this issue Aug 19, 2023
Ensure that `isapprox` gives correct results when comparing an integer
with another integer or with a float. For comparison between integers,
the fix only works when keeping default values for `rtol` and `norm`,
and with `atol < 1`.

It is not possible to handle the (atypical) case where `norm !== abs`,
but that's OK since the user is responsible for providing a safe
function.

It would be possible to handle the case where `rtol > 0` or `atol >= 1`,
but with complex code which would check for overflow and handle all
possible corner cases; it would work only for types defined in Base and
would not be extensible by packages. So I'm not sure that's worth it. At
least with PR fixes the most common case.

Fixes #50380.

(cherry picked from commit 5f03a18)
IanButterworth pushed a commit that referenced this issue Aug 25, 2023
Ensure that `isapprox` gives correct results when comparing an integer
with another integer or with a float. For comparison between integers,
the fix only works when keeping default values for `rtol` and `norm`,
and with `atol < 1`.

It is not possible to handle the (atypical) case where `norm !== abs`,
but that's OK since the user is responsible for providing a safe
function.

It would be possible to handle the case where `rtol > 0` or `atol >= 1`,
but with complex code which would check for overflow and handle all
possible corner cases; it would work only for types defined in Base and
would not be extensible by packages. So I'm not sure that's worth it. At
least with PR fixes the most common case.

Fixes #50380.

(cherry picked from commit 5f03a18)
nalimilan added a commit that referenced this issue Nov 5, 2023
Ensure that `isapprox` gives correct results when comparing an integer
with another integer or with a float. For comparison between integers,
the fix only works when keeping default values for `rtol` and `norm`,
and with `atol < 1`.

It is not possible to handle the (atypical) case where `norm !== abs`,
but that's OK since the user is responsible for providing a safe
function.

It would be possible to handle the case where `rtol > 0` or `atol >= 1`,
but with complex code which would check for overflow and handle all
possible corner cases; it would work only for types defined in Base and
would not be extensible by packages. So I'm not sure that's worth it. At
least with PR fixes the most common case.

Fixes #50380.

(cherry picked from commit 5f03a18)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
correctness bug ⚠ Bugs that are likely to lead to incorrect results in user code without throwing maths Mathematical functions
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants