-
Notifications
You must be signed in to change notification settings - Fork 444
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
help compiler folding for divideFloor/moduloFloor #4512
Conversation
cf13472
to
6d444bd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comments about possible simplifications and C++20-related usage
lib/bitrange.h
Outdated
* | ||
* This function is declared constexpr to allow it to be used in tests to optimize code generation. | ||
*/ | ||
constexpr bool isPowerOfTwo(int value) { return value && !(value & (value - 1)); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the intended behavior wrt negative numbers? This currently returns false
for e.g. -2
, but it takes int
as an argument. If negatives are not expected, I'd go with explicit unsigned
here, or maybe even something like this?
template <typename T, typename = std::enable_if_t<std::is_unsigned_v<T>>>
[[nodiscard]] constexpr inline bool isPowerOfTwo(T value) noexcept {
return (value != 0) && ((value & (value - 1)) == 0);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For divideFloor/moduloFloor, I think it makes sense to restrict the divisor to unsigned numbers. The dividend does need to be signed, but I don't call isPowerOfTwo or log2ForPowerOfTwo on the dividend.
The test code at least has a few tests for negative divisors, so I can't restrict divideFloor/moduloFloor to unsigned.
I've switched to absl::has_single_bit, which has forced me to do an explicit type cast when calling that function.
lib/bitrange.h
Outdated
*/ | ||
constexpr bool isPowerOfTwo(int value) { return value && !(value & (value - 1)); } | ||
|
||
constexpr int log2ForPowerOfTwo(int value) { return sizeof(value) * 8 - __builtin_clz(value) - 1; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This explicitly assumes that sizeof(value) == 4
as __builtin_clz
accepts unsigned argument. Also for the clarity I'd change the argument to unsigned
.
In fact, we can use abseil implementations of C++20 <bit
> routines: absl::countl_zero
and their friends. With FIXME to replace the implementation with standard ones.
Instead of isPowerOfTwo
above we could use absl::has_single_bit
.
Even if we'd use our own implementations I'd use C++20 names to ensure we can just drop them in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wish I'd known about the Abseil alternatives earlier: would have saved my attempting to roll my own equivalents to the STL functions that exist in C++20 and beyond.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, sorry. I would probably need to add some short summary to docs of what is available....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, do you know of a more portable alternative to __builtin_constant_p
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, do you know of a more portable alternative to
__builtin_constant_p
?
Unfortunately, not. I believe the question was raised few times, but there is no direct alternative. Maybe if consteval
from C++23 is the closest alternative.
@asl Do you happen to know what I need to adjust in cmake to get it to build p4ctoolkit (lib directory) correctly? It currently fails with:
|
I think you'd need to do |
f4927e4
to
67ac50b
Compare
lib/bitrange.h
Outdated
* | ||
* This function is declared constexpr to allow it to be used in tests to optimize code generation. | ||
*/ | ||
constexpr unsigned ffs(unsigned value) { return sizeof(value) * 8 - absl::countl_zero(value) - 1; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@asl I think this comment was on this line (I force-pushed a few times and it seemed to result in losing the comment in github):
Isn't it countr_zero essentially?
It is as long as value is a power of 2 🙂
I'm only using it in the case where value is a power of 2, so maybe I will switch. But I'll wait until I've got my build issues sorted out.
@@ -93,3 +93,4 @@ add_library(p4ctoolkit STATIC ${LIBP4CTOOLKIT_SRCS}) | |||
|
|||
# Disable libcall (realloc, malloc) optimizations which may cause infinite loops. | |||
set_target_properties(p4ctoolkit PROPERTIES COMPILE_FLAGS -fno-builtin) | |||
target_link_libraries(p4ctoolkit absl::bits) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@asl Another comment I lost due to force-pushing:
It should not be private as bits are used in the header. If you'd make it PUBLIC, then you'd not need to modify every user (like gtest below)
I copied and pasted from one of the other places where we were depending upon an Abseil library (before changing it to bits). That usage had the PRIVATE, which is why I copied it here originally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, right. In few other places we only depend inside .cpp, so PRIVATE
is a way to reduce visibility.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even though this seems to work, is there a "more correct" solution here?
My concern is that I'm trying to fix the problem of the .h file not being present at compile time. Doesn't target_link_libraries
only impact linking: couldn't cmake still try the cpp compilation before the header files are available?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't target_link_libraries only impact linking:
Despite the name – it impacts compile time as well. As it pulls the interface include paths and other properties. So, this is the correct way of doing things. See https://cmake.org/cmake/help/latest/manual/cmake-buildsystem.7.html#target-usage-requirements:
Generally, a dependency should be specified in a use of target_link_libraries() with the PRIVATE keyword if it is used by only the implementation of a library, and not in the header files. If a dependency is additionally used in the header files of a library (e.g. for class inheritance), then it should be specified as a PUBLIC dependency. A dependency which is not used by the implementation of a library, but only by its headers should be specified as an INTERFACE dependency. The target_link_libraries() command may be invoked with multiple uses of each keyword:
That seems to do the trick. |
46ecd9c
to
cf1fbd9
Compare
@asl I think I've got it down to a minimal set of changes using Abseil functionality as much as possible. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
cf1fbd9
to
fbe4fb8
Compare
Add additional test to assist compiler folding to perform these replacements when the divisor is a power-of-two compile-time constant: - divideFloor -> right-shift - moduloFloor -> bitwise and --------- Co-authored-by: Anton Korobeynikov <anton@korobeynikov.info>
fbe4fb8
to
72b535e
Compare
Add additional test to assist compiler folding to perform these replacements when the divisor is a power-of-two compile-time constant:
This is in response to this PR comment from @ChrisDodd:
The proposed solution adds a test that helps the compiler with optimization for a power-of-two compile-time constant:
__builtin_constant_p()
and__builtin_clz
are not supported by all compilers and so are protected by #if guards for gcc/clang.This Compiler Explorer test shows the impact of the extra test on both clang 15.0.0 and gcc 12.3. (These versions match those available on my test system.) The code contains multiple divideFloor/moduloFloor functions: