Add single source of truth for absolute and relative differences to Mantid::Kernel::FloatingPointComparison #37931

Merged
merged 3 commits into from
Sep 17, 2024

Conversation

rboston628
Contributor

@rboston628 rboston628 commented Sep 6, 2024

Description of work

There are many areas of the code which call for comparing two numbers within a tolerance. This is a first pass at standardizing and optimizing those operations.

Summary of work

Within Mantid::Kernel there exists an underused FloatingPointComparison module which has methods for comparing floating point values accounting for machine precision issues.

This seemed like a good place to add more functions that compare floating point values to within a specified tolerance.

There are two ways of comparing numbers: by an absolute difference, or a relative difference.

A comparison by absolute difference is straightforward: check $|x_2 - x_1| \leq \delta$.

However, comparison by relative difference is more complex: check $|x_2 - x_1|/(0.5*(x_1+x_2)) \leq \delta$, which, if not done intelligently, can require unneeded FLOPs, wasting time and energy (especially when done over hundreds of bin values across thousands of spectra).

An optimized method of performing this check has several return-early situations that prevent the full calculation. Further, the calculation can be optimized by avoiding the division in favor of an equivalent check with multiplication.
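As a rough illustration of the idea (a minimal sketch, not the code added in this PR), the two checks might look like the following. The namespace `sketch` and the names `isCloseAbsolute` and `isCloseRelative` are hypothetical, the denominator is assumed to be the average of the absolute values, and the treatment of non-finite inputs is an assumption made for the example.

```cpp
#include <cmath>

namespace sketch {

// Hypothetical helper: true if |x - y| <= tol.
// Any comparison involving NaN is false, so NaN is never "close".
inline bool isCloseAbsolute(double x, double y, double tol) {
  return std::abs(x - y) <= tol;
}

// Hypothetical helper: true if |x - y| / (0.5 * (|x| + |y|)) <= tol,
// evaluated without performing the division.
inline bool isCloseRelative(double x, double y, double tol) {
  // Return early: identical values (including 0 == 0) need no arithmetic.
  if (x == y)
    return true;
  // Assumption for this sketch: NaN or unequal infinities are never close.
  if (!std::isfinite(x) || !std::isfinite(y))
    return false;
  double const num = std::abs(x - y);
  // x != y here, so the denominator is strictly positive.
  double const denom = 0.5 * (std::abs(x) + std::abs(y));
  // num / denom <= tol is equivalent to num <= denom * tol when denom > 0.
  return num <= denom * tol;
}

} // namespace sketch
```

For example, `sketch::isCloseRelative(100.0, 101.0, 0.01)` compares 1 against 100.5 * 0.01 and accepts, while `sketch::isCloseRelative(100.0, 102.0, 0.01)` rejects.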

Purpose of work

There are many areas of the code which call for comparing two numbers within a tolerance. This can have unexpected results for NaNs, and for relative comparisons it takes some thought to implement optimally. This was seen in Issue #37877, which apparently tried an optimized relative difference that was erroneous and had repercussions for other areas of the code.

This work is meant to centralize optimized versions of the absolute and relative difference comparators. Future optimizations may be possible. This gives one place for the optimizations, which the rest of the code should be set up to refer to.

There is no associated issue.

EWM 7196

Further detail of work

For handling the comparison of NaN, it was decided to use the framework for isclose in Python, described in PEP 485. This closely follows the IEEE 754 treatment of NaNs.

In practical terms, NaN is not considered close to any value, not even NaN.
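To show why this needs care, here is a small sketch (hypothetical function names, not part of FloatingPointComparison) of two formulations that agree for ordinary numbers but disagree for NaN. Under IEEE 754 every ordered comparison involving NaN is false, so the outcome depends on which way the test is written.

```cpp
#include <cmath>

// Hypothetical: "close" when the difference is at most tol.
// With NaN input the comparison is false, so NaN is never close.
inline bool closeLessEqual(double x, double y, double tol) {
  return std::abs(x - y) <= tol;
}

// Hypothetical: "close" when the difference is not greater than tol.
// Looks equivalent, but with NaN input the inner comparison is false,
// so the negation reports NaN as close to everything.
inline bool closeNotGreater(double x, double y, double tol) {
  return !(std::abs(x - y) > tol);
}
```

Only the first form gives the PEP 485 behaviour described above. This assumes the compiler is not run with options that relax IEEE semantics (such as fast-math modes).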

To test:

The unit tests should be very convincing.

Ensure that the NaN cases considered enforce the correct behavior in comparisons.

The functionality inside FloatingPointComparison is currently only used in PolygonEdge, so this is a relatively safe PR.


Reviewer

Please comment on the points listed below (full description).
Your comments will be used as part of the gatekeeper process, so please comment clearly on what you have checked during your review. If changes are made to the PR during the review process then your final comment will be the most important for gatekeepers. In this comment you should make it clear why any earlier review is still valid, or confirm that all requested changes have been addressed.

Code Review

  • Is the code of an acceptable quality?
  • Does the code conform to the coding standards?
  • Are the unit tests small and test the class in isolation?
  • If there is GUI work does it follow the GUI standards?
  • If there are changes in the release notes then do they describe the changes appropriately?
  • Do the release notes conform to the release notes guide?

Functional Tests

  • Do changes function as described? Add comments below that describe the tests performed.
  • Do the changes handle unexpected situations, e.g. bad input?
  • Has the relevant (user and developer) documentation been added/updated?

Does everything look good? Mark the review as Approve. A member of @mantidproject/gatekeepers will take care of it.

Gatekeeper

If you need to request changes to a PR then please add a comment and set the review status to "Request changes". This will stop the PR from showing up in the list for other gatekeepers.

@jclarkeSTFC
Contributor

As well as the unit tests, I would suggest running the system tests on all OS's by using the build_packages_from_branch Jenkins job (just the build and test, not the package build bit).

@rboston628 rboston628 added this to the Release 6.12 milestone Sep 9, 2024
@rboston628 rboston628 marked this pull request as ready for review September 13, 2024 20:10
@rboston628
Contributor Author

I kicked off a build_packages_from_branch job as requested. I think I selected the correct options to just run the build and test and not the package bit.

return true;
} else {
// otherwise we have to calculate the denominator
T const denom = static_cast<T>(0.5 * (std::abs(x) + std::abs(y)));
Member

@peterfpeterson peterfpeterson Sep 16, 2024

In the case of the two values being on opposite sides of zero, this gives a very different value than the average of the two

    T const denom = static_cast<T>(0.5 * std::abs(x + y));

Is that intentional?

Contributor Author

It does lead to different values. I'm pretty convinced that the average of the absolute values is the correct norm to use. I am following what is used, for instance, in TableColumn.h. It's also safer.

Contributor Author

Another option is using the max of the absolute values. That is what is used in Python's math.isclose(). In the case of one positive and one negative value, the max and average of the absolute values will remain similar, while the absolute value of the average could be zero.

E.g. x=-2, y=3.

method           rel. diff.
max(|x|, |y|)    5/3 ≈ 1.67
avg(|x|, |y|)    5/2.5 = 2
|avg(x, y)|      5/0.5 = 10

E.g. x=-10, y=10.

method           rel. diff.
max(|x|, |y|)    20/10 = 2
avg(|x|, |y|)    20/10 = 2
|avg(x, y)|      20/0 = +inf

I'm willing to try the max instead, but that might change some expected behavior
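The three candidate norms can be compared directly with a short sketch. The function names below are hypothetical and exist only for this illustration; each returns the relative difference of x and y using one of the denominators discussed above.

```cpp
#include <algorithm>
#include <cmath>

// Denominator: max(|x|, |y|), the norm used by Python's math.isclose().
inline double relDiffMaxAbs(double x, double y) {
  return std::abs(x - y) / std::max(std::abs(x), std::abs(y));
}

// Denominator: avg(|x|, |y|), the average of the absolute values.
inline double relDiffAvgAbs(double x, double y) {
  return std::abs(x - y) / (0.5 * (std::abs(x) + std::abs(y)));
}

// Denominator: |avg(x, y)|, the absolute value of the average.
// This vanishes when x == -y; with IEEE 754 doubles the quotient is
// then +inf for x != 0.
inline double relDiffAbsAvg(double x, double y) {
  return std::abs(x - y) / (0.5 * std::abs(x + y));
}
```

Calling each function with (-2, 3) and (-10, 10) shows how the first two stay bounded for values on opposite sides of zero, while the third grows without limit as x + y approaches zero.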

Member

I feel sufficiently schooled.

@rboston628
Contributor Author

rboston628 commented Sep 16, 2024

I kicked off a build_packages_from_branch job as requested. I think I selected the correct options to just run the build and test and not the package bit.

For some reason it passed on Linux and Windows, but failed on Mac. Going to investigate now.

There was only one failing test, and the failure didn't seem to have anything to do with changes made here. Tried to re-run that test, but it fails as soon as it starts because of a credential issue.

Using the build package action instead, on just the osx platform.

Looks like it was a bug that was fixed in PR #37972, after I kicked off the jobs.

Contributor

@jclarkeSTFC jclarkeSTFC left a comment

The changes look good. I had a go at the 3D view to see if I could spot any weird polygon behaviour, but it looked okay. I just had a couple of points:

  • Were you thinking of putting these methods in the Python API? If not, then I think the release note isn't needed.
  • I'm not suggesting that you go back and change this now, but I don't think it's a good idea for existing files to have their const style changed when they are edited. Because otherwise the next person will come along, do something in this class and feel that they can change them all back again because they prefer the other way.

@peterfpeterson
Member

@jclarkeSTFC I generally agree with you on release notes, but this is functionality that we'd like to get more developers using. The release notes can be for them too.

When we move to a newer clang-format (>=14), east const vs west const becomes a configuration option.
