Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add safe acos and safe asin #76876

Closed
wants to merge 1 commit into from
Closed

Conversation

lawnjelly
Copy link
Member

@lawnjelly lawnjelly commented May 9, 2023

A common bug with using acos and asin is that input outside -1 to 1 range will result in Nan output. This can occur due to floating point error in the input.

The standard solution is to provide safe_acos function with clamped input. For Godot it may make more sense to make the standard functions safe, and add extra unsafe functions for rare occasions when needed.

Fixes #76857

Notes

  • I'm equally happy to explicitly use safe_acos and have the default unsafe (I've used this in the past), but given how often this mistake is made by contributors and users (basically nearly everywhere), I'd be inclined to think making the default safe might be wise
  • I've gone for both explicit versions (safe and unsafe) to make it clear in calling code whether clamping has taken place, as well as wrapping the default to the safe version. Would be happy to change if anyone has better suggestions for how to approach.
  • We could also consider replacing all acos and asin use in the core engine with explicit versions, to make it obvious to the reader / writer the problem, and the choice of the appropriate version.

@AThousandShips
Copy link
Member

AThousandShips commented May 9, 2023

The documentation should be updated for this I believe, GDScript uses the plain, i.e. now safe, acos/asin

@AThousandShips
Copy link
Member

AThousandShips commented May 9, 2023

Not sure if it makes much sense to add the unsafe version to GDScript, unless people specifically depend on the behaviour outside the defined range, as the performance benefits are likely outweighed by the general overhead, at least at present

If anything I'd say to use the unsafe version there and suggest using the clamp method in the documentation, or adding specifically safe_acos/asin to GDScript regardless of if we go by the safe or unsafe by default in the engine

I'd vote for making the default in the engine safe, and if there are any places that rely on the behaviour outside the defined range that'd be resolved by using the unsafe one there, or as stated for performance critical cases

@lawnjelly lawnjelly marked this pull request as ready for review May 9, 2023 13:38
@lawnjelly lawnjelly requested review from a team as code owners May 9, 2023 13:38
@Calinou Calinou added this to the 4.1 milestone May 9, 2023
core/math/math_funcs.h Outdated Show resolved Hide resolved
Copy link
Member

@clayjohn clayjohn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. I agree that the version exposed to GDScript should be the safe version as managing NaNs in GDScript can be a huge pain (not to mention the fact that many of our users are unaware of the risk of NaNs appearing in their code)

@lawnjelly
Copy link
Member Author

Have switched the implementation to @aaronfranke 's suggestion, and I'm tired, so that could do with a double check before merging. 😄

@AThousandShips
Copy link
Member

CI should be fixed by #76885, currently broken due to missing documentation

Comment on lines +83 to +88
static _ALWAYS_INLINE_ double asin(double p_x) { return safe_asin(p_x); }
static _ALWAYS_INLINE_ float asin(float p_x) { return safe_asin(p_x); }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the intent to have both an explicit safe_asin and the main asin which is safe by default, but having two synonymous methods just means that we'll introduce inconsistency across the codebase where some contributors use asin and others use safe_asin.

Copy link
Member Author

@lawnjelly lawnjelly May 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a tricky one. I added the explicit safe_acos to replace parts of code that were already clamping. This signifies areas where the author / bugfix has shown clamping to be absolutely necessary.

In general there are three cases rather than two:

  1. You absolutely know you must clamp (safe_acos)
  2. You aren't sure, and not a bottleneck, so might as well (acos)
  3. You have determined the Nan condition can't happen and / or this is a bottleneck and Nans are dealt with (unsafe_acos)

We could alternatively use a comment to specify that a clamp is necessary instead of safe_acos (1), but having all three is clear and easily searchable.

I can adjust the PR to use comments to distinguish (1) and (2) but imo I would caution against it. What happens when you later come to optimize a function that uses acos? You now no longer know whether it must be safe or not (unless you re-analyse the code, which may be complex for math) - has someone missed a comment etc. There is essentially a danger of e.g. 10 different programmers examine the same bit of code in the following years and waste time trying to work out the intent again and again, when the code just could have been explicit from day 1.

I can add some comment explanation to the math_funcs.h file to make it clearer when to use each version, as a guide for contributors. 🤔

The other alternative which I may have mentioned earlier, is to disallow the generic acos and force all (core) use to be explicit. This is a compromise but you do lose a little information, but might be viable. We can't easily remove the generic to prevent compat breaking in modules etc.

Anyway I am happy to modify the PR if broad opinion favours one of the other options, but just want to caution that "simplifying" potentially has a cost in terms of future maintenance / bug prevention. 🙂

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this might benefit from some deeper analysis of the uses of the trigonometric functions, from the case in #8111 there might be reasons to use atan2 instead in situations where it is possible, i.e. where a well defined x and y can be found, I think fixing the immediate problem of NaNs might be sufficient to merge this with a placeholder function and then investigate individual cases to follow up

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Not sure about real code) But at least in micro benchmarks the unsafe version is only about 0.5% faster then the branched version (if all inputs are actually in the -1 - 1 range, otherwise the branched version quickly wins), so I am not sure if the unsafe version of these functions is actually necessary at all.

----------------------------------------------------------
Benchmark                Time             CPU   Iterations
----------------------------------------------------------
BenchAsinClamp       30815 ns        30815 ns        22743
BenchAsinBranch      27037 ns        27037 ns        25850
BenchAsinUnsafe      26934 ns        26934 ns        26773

Copy link
Member Author

@lawnjelly lawnjelly May 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RedworkDE that's actually a good point. The unsafe version by definition shouldn't be hitting the -1 / +1 limit. I guess we could do this. The other option is that the unsafe version is so rare (not even used as yet) that we could directly call ::acos at the call site with a comment. 🤔

(This is difficult to benchmark because the benchmark with that data will presumably be using perfect branch prediction, and in the real world it is more likely to be mixed instructions.)

The other option is to just use safe version everywhere, and delay this discussion until we actually find a case where profiling shows would benefit from the unsafe(!). 😄

I'll create another simpler PR with just the safe version and will let you guys decide which to go for.

Copy link
Member Author

@lawnjelly lawnjelly May 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RedworkDE comparing unsafe to the branching, I'm getting unsafe being around:
4% faster (when some are outside range)
10% faster (when all inside range)

This probably depends very much on the benchmark / CPU / compiler / compiler options etc. But in general the clamped version shouldn't be that much slower (and in practice, often cache is the bottleneck rather than these instructions).

Also, as ever, benchmarking is notoriously easy to get wrong 😄 , yours may be more reliable if you are using google benchmark, I'm doing this in a test app.

A common bug with using acos and asin is that input outside -1 to 1 range will result in Nan output. This can occur due to floating point error in the input.

The standard solution is to provide safe_acos function with clamped input. For Godot it may make more sense to make the standard functions safe, and add extra unsafe functions for rare occasions when needed.
@kleonc
Copy link
Member

kleonc commented May 10, 2023

What about (the same for acos):

  • leaving the current implementation of Math::asin unchanged (calling ::asin) so it will still be unsafe,
  • adding a safe version like Math::asin_clamped (I think it would be more discoverable than Math::safe_asin; also more explicit),
  • changing the VariantUtilityFunctions::asin to use Math::asin_clamped so the binded asin would be safe

? 🤔

@lawnjelly
Copy link
Member Author

The name safe_acos is widely used, although less discoverable alphabetically, so there's some history here:
https://www.google.com/search?q=safe_acos

One problem we are trying to address is that most contributors (rather than just users) are unaware / forget the problem, as evidenced by the number of calls that don't use clamping in the codebase. If we continue to have the default be unsafe, then the number of such bugs will inevitably rise again.

Thus the argument for making safety the default (for core as well as bound), and having contributors make a conscious decision not to use the safe version, rather than the other way around. Especially as the unsafe version is probably only really of use for bottlenecks.

Copy link
Member

@kleonc kleonc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

(Fine to me if it's preferred over #76906, both make sense.)

@lawnjelly
Copy link
Member Author

Closing this in favour of #76906 , as that version seems to be far preferred in the poll.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

MotionCollision get_angle() returns NAN
8 participants