Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More efficient bisection for 1D Newton root finder #1012

Open
wants to merge 22 commits into
base: develop
Choose a base branch
from

Conversation

ryanelandt
Copy link
Contributor

@ryanelandt ryanelandt commented Aug 5, 2023

This is chunk #4 for #1000. Also closes #1008.

This PR adds tools that enable efficient bisection for the 1D Newton root finder.

The Situation

The 1D Newton root finder has a bisection fallback mechanism. If one of the bounds is large (i.e., std::numeric_limits::max()), naively bisecting the bounds is slow. For example, for type double, bisecting from 1.79769e+308 to 1 takes 1024 iterations. The situation is even worse for types with more precision. When the Newton solver runs out of (e.g., 200) iterations during bisection, the solver declares success even if far from a root. Again, see issue #1008.

Functionality

What's the maximum number of bisections needed to root find a double? A double is 64 bits. So the maximum number of iterations needed is 64. Efficient bisection is bisection in bit space. An interesting consequence of bisection in bitspace, is that infinity is a bisectable bound.

For types that do not conform to the standard (i.e., IEEE 754), an approximate version of the bisection algorithm above is used. The approximate version supports bisection with infinity if the type is specialized (i.e., if std::numeric_limits<T>::is_specialized = true).

Fun Facts

The mean of 0 and infinity is:

  • infinity (arithmetic mean)
  • NaN (geometric mean)
  • 1.5 (bitspace mean for double and float)
  • 1.0 (bitspace mean for long double using the approximate algorithm)

@ryanelandt ryanelandt mentioned this pull request Aug 5, 2023
include/boost/math/tools/roots.hpp Outdated Show resolved Hide resolved
Comment on lines 320 to 321
static_assert(!std::is_same<T, float>::value, "Need to use Midpoint754 solver when T is float");
static_assert(!std::is_same<T, double>::value, "Need to use Midpoint754 solver when T is double");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be an assertion for long double as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added an explanation as to why long double does not use the Midpoint754 solver.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A guess there are a couple cases:

  1. On some systems (e.g. Windows and MacOS) long doubles are just aliases to double. When long doubles are 64-bits we could just cast to double and use Midpoint754 to speed things up.

  2. On x86_64 linux systems you have the 80-bit long double. Normally you have unsigned __int128 but you'd have to shift the bits around to cast it back properly. Likely not worth the magic required.

  3. On systems with IEEE 128-bit long doubles (e.g. ARM and S390x) you have access to unsigned __int128 which should generally work. We have both of the aforementioned architectures running natively in the CI system.

Here is where I have defined macros before for determining the size of long double.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On some systems (e.g. Windows and MacOS) long doubles are just aliases to double. When long doubles are 64-bits we could just cast to double and use Midpoint754 to speed things up.

These systems will use the specialization for double.

On x86_64 linux systems you have the 80-bit long double. Normally you have unsigned __int128 but you'd have to shift the bits around to cast it back properly. Likely not worth the magic required.

I tried to make this work. Starting with numeric_limits<long double>::max() and adding 1 bit gives NaN, not Inf as it does for float, double and __float128. It's just not meant to be.

On systems with IEEE 128-bit long doubles (e.g. ARM and S390x) you have access to unsigned __int128 which should generally work. We have both of the aforementioned architectures running natively in the CI system.

I got this to work for both _float128 and boost::multiprecision::float128 (wrapper for __float128). The implementation does not require libquadmath.

include/boost/math/tools/roots.hpp Outdated Show resolved Hide resolved
include/boost/math/tools/roots.hpp Outdated Show resolved Hide resolved
include/boost/math/tools/roots.hpp Show resolved Hide resolved
include/boost/math/tools/roots.hpp Outdated Show resolved Hide resolved
include/boost/math/tools/roots.hpp Outdated Show resolved Hide resolved
}

template <typename U = T>
static typename std::enable_if<std::numeric_limits<U>::is_specialized, U>::type
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may be able to get away without checking has_denorm because it was deprecated in C++23

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want to be able to bisect with denormals because the solution to the root finding problem could be a denormal number.

Reading the depreciation document, it seems like the depreciation of has_denorm has a lot to do with it being a static constexpr. This is problematic because the actual behavior can change at runtime with e.g., _MM_SET_FLUSH_ZERO_MODE.

I think we can actually delete std::numeric_limits::min() because denorm_min returns numeric_limits::min() if denormals are off. Does that seen sensible?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can actually delete std::numeric_limits::min() because denorm_min returns numeric_limits::min() if denormals are off. Does that seen sensible?

I think so. Since you are already check for is_specialized you should avoid ever getting 0 instead of a finite minimum value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so. Since you are already check for is_specialized you should avoid ever getting 0 instead of a finite minimum value.

boostorg/multiprecision#562

test/test_roots.cpp Show resolved Hide resolved
test/test_roots.cpp Outdated Show resolved Hide resolved
@mborland
Copy link
Member

mborland commented Aug 7, 2023

@NAThompson I think it would be worthwhile for you to take a look at this as well, availability dependent of course.

@ryanelandt
Copy link
Contributor Author

@mborland thank you for the review. Please let me know if there are additional changes.

@ryanelandt
Copy link
Contributor Author

I added IEEE754 bisection support for __float128. I also updated the reason why the 80 bit long double cannot use the IEEE754 solver. Please take another look when you can.

@ryanelandt ryanelandt requested a review from mborland August 19, 2023 01:52
@ryanelandt
Copy link
Contributor Author

I realized that the code overlap between what is needed for find the midpoint between two IEEE754 floats and the code needed to find the fast float distance between two IEEE754 floats is high. I'm in the process of refactoring this functionality into a few classes that when operated on allow the following operations to be performed in a few lines of code: midpoint, float_distance, and float_advance. I'm putting this PR on hold until I can get this done.

@ryanelandt ryanelandt closed this Aug 22, 2023
@ryanelandt ryanelandt reopened this Aug 22, 2023
@ryanelandt
Copy link
Contributor Author

The introduced tools in ieee754_linear lower the programming effort required to perform bit-themed operations on IEEE754 types that include: float, double, and __float128. Runtime-detected value of denorm support are used in calculations. These tools were used to calculate the midpoint in bitspace for the above mentioned types. The tools can be applied in a slightly different way to calculate fast float distance:

      template <typename T, typename U>
      static T fast_float_distance(T x, T y, boost::math::tools::detail::ieee754_linear::Layout_IEEE754Linear<T, U>) {
         using boost::math::tools::detail::ieee754_linear::BitsInflated;
         using boost::math::tools::detail::ieee754_linear::Denormflator;
         const auto df = Denormflator<T, U>();  // Calculates if `has_denorm` is true at this instant

         // Step 1 -- Calculate inflated representations
         const BitsInflated<T, U> y_inflated(y);
         const BitsInflated<T, U> x_inflated(x);

         // Step 2 -- Calculate deflated representations
         const auto y_def = df.deflate(y_inflated);
         const auto x_def = df.deflate(x_inflated);

         // Step 3 -- Subtract the two deflated representations
         const auto xy_def_distance = (y_def - x_def)

         // Step 4 -- The distance between the deflated points is the fast float distance
         return xy_def_distance.static_cast_int_value_to_float();
      }

Or it can be used to perform the float advance operation:

      template <typename T, typename U>
      static T float_advance(T x, int n, boost::math::tools::detail::ieee754_linear::Layout_IEEE754Linear<T, U>) {
         using boost::math::tools::detail::ieee754_linear::BitsInflated;
         using boost::math::tools::detail::ieee754_linear::Denormflator;
         const auto df = Denormflator<T, U>();  // Calculates if `has_denorm` is true at this instant

         // Step 1 -- Calculate inflated representation
         const BitsInflated<T, U> x_inflated(x);

         // Step 2 -- Calculate deflated representation
         const auto x_def = df.deflate(x_inflated);

         // Step 3 -- Move over n bits
         const auto x_plus_n_def = x_def + n;

         // Step 4 -- Inflate
         const auto x_plus_n = df.inflate(x_plus_n_def);

         // Step 5 -- Convert to float
         return x_plus_n.reinterpret_as_float();
      }

@mborland
Copy link
Member

If you are looking for a different quick way to calculate float distance @NAThompson showed me this trick: https://github.com/boostorg/math/pull/814/files#diff-45f56077fb4aa15ca1aca5025321e5c275fd20b3e161585a25442eaa2c7952a3R724

@ryanelandt
Copy link
Contributor Author

The same idea of viewing floats in bitspace is used here. Seeing that trick used in #814 inspired this refactor. In addition, this implementation attempts to:

  • augment calculations for denorm support as detected at runtime
  • operate on numbers with different signs
  • use template parameters to allow it to work for a variety of types (e.g., std::float32_t and std::float64_t, or even boost::math::std_real_concept if it's set up to emulate a float type).



#ifdef BOOST_HAS_FLOAT128
#include <boost/multiprecision/float128.hpp>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this breaks standalone mode as BOOST_HAS_FLOAR128 may be set but the multiprecision headers will not be available. We also try to avoid mutual/cyclic dependencies like these. What is the include actually used for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch with the dependency. I've removed this include and replaced it with the following includes:

#include <boost/math/tools/config.hpp>  // For BOOST_MATH_FLOAT128_TYPE
#include <boost/cstdfloat.hpp>  // For numeric_limits support for 128 bit floats

There may be a better way of including the needed files that I'm unaware of.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This include, or one like it appears to be necessary in order to pass the std_real_concept_check_128 test associated with the file compile_test/std_real_concept_check.cpp.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless you've changed something in that test, it uses nothing from multiprecision at all so the include should not be needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As stated in a previous comment, I've removed the header #include <boost/multiprecision/float128.hpp> and replaced it with #include <boost/cstdfloat.hpp> // For numeric_limits support for 128 bit floats. The std_real_concept_check_128 test needs for std::numeric_limits specializations to be defined for the BOOST_MATH_FLOAT128_TYPE. If this header is removed, then test will not compile. (I tried this)

Why this occurs: the std_real_concept_check_128 test makes the std_real_concept emulate a 128 bit floating point type. The tools in ieee754_linear recognize this as an IEEE 754 floating point type. The comment on line 133 goes on to say:

// NOTE: returns Layout_IEEE754Linear<BOOST_MATH_FLOAT128_TYPE, ...> instead of Layout_IEEE754Linear<T, ...>
//       in order to work with non-trivial types that wrap trivial 128 bit floating point types.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I see the issue, but I think that we're looking at this the wrong way:

  1. Not depending on boost/cstdfloat.hpp is a "good thing" because while that header is genuinely very useful, it's also rather fragile to changes in GCC releases.
  2. I think the assumptions in IsFloat128 (and maybe elsewhere) are wrong: just because a type has the correct size and properties, does NOT mean it's a trivial wrapper around a native type. It could for example be a pointer (or two) to the actual implementation which just happens to have the same size. mpfr_t is 24 bytes on my system so doesn't quite qualify, but if I'm reading it correctly, MPFR could be configured in a such a way that it was a 16 bye / 128 bit type which was set up to emulate a 128 bit float. Another way of looking at this is that std_real_concept_type and the other concept checking types should be viewed as black boxes - their purpose is to emulate to some degree or another, some other random type about which we know nothing that the library may be instantiated with in the future. You may be able to actually see their internals, but you really shouldn't look ;)

So I would change

   class IsFloat128 {
   public:
      template <typename T>
      static constexpr bool value() {
         return std::numeric_limits<T>::is_iec559 &&
                std::numeric_limits<T>::radix == 2 &&
                std::numeric_limits<T>::digits == 113 &&  // Mantissa has 112 bits + 1 implicit bit
                sizeof(T) == 16;
      }
   };

To maybe:

   class IsFloat128 {
   public:
      template <typename T>
      static constexpr bool value() {
         return std::numeric_limits<T>::is_iec559 &&
                std::numeric_limits<T>::radix == 2 &&
                std::numeric_limits<T>::digits == 113 &&  // Mantissa has 112 bits + 1 implicit bit
#ifdef BOOST_HAS_FLOAT128
                (std::is_floating_Point<T>::value || std::is_same<__float128, T>::value)
#else
                std::is_floating_Point<T>::value
#endif
      }
   };

Likewise for the other IsFloatXX classes.

Then change:

   template <typename T, typename U>
   class Layout_IEEE754Linear {
      static_assert(std::numeric_limits<T>::is_iec559, "Type must be IEEE 754 floating point.");
      static_assert(std::is_unsigned<U>::value, "U must be an unsigned integer type.");
      static_assert(sizeof(T) == sizeof(U), "Type and uint size must be the same.");

   public:
      using type_float = T;  // The linear floating point type
   };

To:

   template <typename T, typename U>
   class Layout_IEEE754Linear {
      static_assert(std::numeric_limits<T>::is_iec559 
#ifdef BOOST_HAS_FLOAT128
       || std::is_same<T, __float128>::value
#endif 
       , "Type must be IEEE 754 floating point.");
      static_assert(std::is_unsigned<U>::value, "U must be an unsigned integer type.");
      static_assert(sizeof(T) == sizeof(U), "Type and uint size must be the same.");

   public:
      using type_float = T;  // The linear floating point type
   };

Which would allow the header dependency to be removed as well. I do appreciate it's less clean, but I suspect it's a safer option in the long run!

Copy link
Contributor Author

@ryanelandt ryanelandt Sep 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not depending on boost/cstdfloat.hpp is a "good thing" because while that header is genuinely very useful, it's also rather fragile to changes in GCC releases.

Does the following change make things less fragile? Changing the include from:

#include <boost/cstdfloat.hpp>

to

#if defined(BOOST_HAS_INT128) && defined(BOOST_HAS_FLOAT128)
#if !defined(BOOST_CSTDFLOAT_NO_LIBQUADMATH_LIMITS)
  #include <boost/math/cstdfloat/cstdfloat_limits.hpp>
#endif
#endif

I think the assumptions in IsFloat128 (and maybe elsewhere) are wrong: just because a type has the correct size and properties, does NOT mean it's a trivial wrapper around a native type. It could for example be a pointer (or two) to the actual implementation which just happens to have the same size. [...] MPFR could be configured in a such a way [...] to emulate a 128 bit float.

For the value() function of any of the IsFloat classes to return true, the following statement needs to be true:

std::numeric_limits<T>::is_iec559

The 32, 64, and 128 bit IEEE floats have a specific layout consisting of a sign bit, an exponent, and a fraction. A number type containing two pointers will not meet this standard. Skimming the MPFR codebase, this library is clearly set up with an awareness of IEEE 754 arithmetic. They don't appear to specialize numeric limits, and therefore don't set std::numeric_limits<T>::is_iec559 = true. I think the check that std::numeric_limits<T>::is_iec559 = true provides reasonable protection against accidental use of this functionality.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup apologies, you're correct that std::numeric_limits<T>::is_iec559 should provide a reasonable check given that that std defines a binary layout.

I would still prefer to avoid the cstdfloat includes if possible though - and we know that __float128 is an iec559 compatible type, so including it for that one case is overkill when we can just use is_same<T, __float128>::value ?

Other than that and one minor nit about inclusion this looks good to go.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a good way to do this. The code operates by duck typing a float type T. For example, something is determined to be a 32 bit binary floating point type based on the value of IsFloat32::value<T>(). Something is determined to be such a type if:

std::numeric_limits<T>::is_iec559 &&
std::numeric_limits<T>::radix == 2 &&
std::numeric_limits<T>::digits == 24 &&  // Mantissa has 23 bits + 1 implicit bit
sizeof(T) == 4;

evaluates to true. If any of the four conditions above cannot be evaluated, then the code will fail to compile. The include is needed for it to compile, even if the method were to be unused.


The alternative way of structuring this code is to manually match up binary floating point types with the appropriate methods, but this is messy and tends to miss valid use cases. is_same<T, __float128>::value might work on GCC, but what about an Intel compiler?


I'd like to avoid this include here too as it shouldn't be necessary. Perhaps the issue occurs here:

#ifdef BOOST_MATH_USE_FLOAT128
//
// Only enable this when the compiler really is GCC as clang and probably
// intel too don't support __float128 yet :-(
//
# if defined(__INTEL_COMPILER) && defined(__GNUC__)
# if (__GNUC__ > 4) || ((__GNUC__ == 4) && (__GNUC_MINOR__ >= 6))
# define BOOST_MATH_FLOAT128_TYPE __float128
# endif
# elif defined(__GNUC__)
# define BOOST_MATH_FLOAT128_TYPE __float128
# endif
# ifndef BOOST_MATH_FLOAT128_TYPE
# define BOOST_MATH_FLOAT128_TYPE _Quad
# endif
#endif

where a 128 bit floating point type is defined without defining std::numeric_limits for that type.

#include <boost/math/tools/toms748_solve.hpp>
#include <boost/math/policies/error_handling.hpp>

#ifdef BOOST_HAS_FLOAT128
#include <boost/multiprecision/float128.hpp>
using boost::multiprecision::float128;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, we shouldn't be forcing multiprecision headers to be included here, plus the using declaration at global header scope is evil ;)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This header and using declaration was unneeded. Good catch.

#include <boost/multiprecision/float128.hpp>
#endif

#if __has_include(<stdfloat>)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unfortunate, but __has_include is an insufficient check, because it may pass even when the translation mode is not C++23, so we will need to check the value of BOOST_CXX_VERSION as well (unfortunately __cplusplus is non-reliable for MSVC). I yes, must get around to adding proper Boost.Config macros and tests for these new C++23 headers sometime!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The include pattern:

#if __has_include(<stdfloat>)
#  include <stdfloat>
#endif

occurs in include/boost/math/tools/promotion.hpp, include/boost/math/concepts/real_concept.hpp, and about 30 test files including test/jacobi_test.cpp. I think it makes sense to allow it here for consistency. If required it should be fixed across the codebase in a separate PR.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This came up here: #1023 and the headers were fixed at that time. I hadn't realised this had been sprinkled across all the tests as well :(

The only remaining header I can see which is incorrect is here:

# if __has_include(<bit>)
@mborland ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be desirable for the pattern:

#if defined __has_include
#  if __cplusplus > 202002L || _MSVC_LANG > 202002L 
#    if __has_include (<X>)
#    include <X>
#    endif
#  endif
#endif

to me made into a macro similar to

INCLUDE_CPP23_HEADER(X)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but you still need to be able to determine whether the header has been included or not (ie whether you can use those contents or not), so the normal method is:

#ifndef BOOST_NO_CXX23_STDFLOAT
#include <stdfloat>
#endif 



#ifndef BOOST_NO_CXX23_STDFLOAT

// do something that uses the header

#endif 

I'm working on the C++23 macros for Boost.Config now...

Copy link
Contributor Author

@ryanelandt ryanelandt Sep 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The approach described above typically applies when one is manually defining methods. A manually defined method will only compile if certain headers have been included. For example, one might manually define a method for type __float128, another for _Quad, and another for std::float128_t. Careful preprocessor use is required because certain methods will only compile if the correct headers have been included.

In contrast to this approach, this approach relies on duck typing. In C++ duck typing happens at compile time and is implemented with templates and SFINAE. It also allows a single templatized method to be used for various types. For example, the 128 bit type is used by the following types: __float128, _Quad, std::float128_t, multiprecision::float128, as well as std_real_concept when it's set up to emulate a 128 bit float. I may be missing some, but the compiler will not.

The benefit of the duck type approach that the code only needs to compile for the template parameters that are actually used. For this reason, the approach does not require the careful preprocessor use of the manual approach.

Also, thank you for working on the C++23 macros.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1D Newton iteration count issue(s)
3 participants