Add RISC-V support in cycleclock::Now #833
The RISC-V implementation of `cycleclock::Now` uses the user-space `rdcycle` instruction to query how many cycles have happened since the core started. The only complexity here is on 32-bit RISC-V, where `rdcycle` can only read the lower 32 bits of the 64-bit hardware counter. In this case, `rdcycleh` reads the higher 32 bits of the counter. We match the powerpc implementation to detect and correct for overflow in the high bits.
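A minimal sketch of the approach described above, assuming GCC/Clang-style inline assembly and the standard `__riscv` and `__riscv_xlen` predefined macros. The names `ReadCycleCounter` and `CombineCycleWords` are hypothetical and are not taken from the patch; on non-RISC-V hosts the function simply returns 0 so that the snippet still compiles.

```cpp
#include <cstdint>

// Hypothetical helper: joins the two 32-bit halves of the 64-bit counter.
inline uint64_t CombineCycleWords(uint32_t hi, uint32_t lo) {
  return (static_cast<uint64_t>(hi) << 32) | lo;
}

// Hypothetical wrapper around the cycle counter reads.
inline uint64_t ReadCycleCounter() {
#if defined(__riscv) && __riscv_xlen == 32
  uint32_t hi0, lo, hi1;
  // Read high, low, high so a carry into the high word can be detected.
  asm volatile("rdcycleh %0" : "=r"(hi0));
  asm volatile("rdcycle %0" : "=r"(lo));
  asm volatile("rdcycleh %0" : "=r"(hi1));
  // All-ones mask if the high word was stable, zero otherwise: the low
  // word is cleared when it wrapped between the two high reads.
  lo &= 0u - static_cast<uint32_t>(hi0 == hi1);
  return CombineCycleWords(hi1, lo);
#elif defined(__riscv)
  // On 64-bit RISC-V a single rdcycle returns the whole counter.
  uint64_t cycles;
  asm volatile("rdcycle %0" : "=r"(cycles));
  return cycles;
#else
  return 0;  // Not RISC-V: placeholder so the sketch compiles anywhere.
#endif
}
```

The 32-bit path brackets the low read with two high reads, which is what makes the later overflow correction possible.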
Thanks for your pull request. It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please visit https://cla.developers.google.com/ to sign. Once you've signed (or fixed any issues), please reply here.
@@ -164,6 +164,21 @@ inline BENCHMARK_ALWAYS_INLINE int64_t Now() {
uint64_t tsc;
Not directly related to this patch, but this preprocessor hell is such a maze :/
This needs refactoring into separate functions.
asm("rdcycle %0" : "=r"(cycles_lo));
asm("rdcycleh %0" : "=r"(cycles_hi1));
// This matches the PowerPC overflow detection, above
cycles_lo &= -static_cast<int64_t>(cycles_hi0 == cycles_hi1);
Now I'm curious, why is this different from the compiler lowering for `READCYCLECOUNTER`?
This hack matches the powerpc implementation higher in the file.
As far as @asb and I can tell, this clears all the low bits if the upper bits changed between the two reads, which will be approximately correct (when the upper bits changed, you know you overflowed and the lower bits passed through zero). This avoids branching and looping, which might change your cycle count more than this hack does. Both the loop and this hack compromise accuracy; it's not clear which compromises it less, but at least the hack executes in constant time regardless of overflow.
In the LLVM patch, I again copied the PPC lowering, which does use a loop, because that's what people will be expecting, and you know the returned low word is a bit pattern that was present in the `cycle` CSR (which you don't know with the hack: think of `rdcycle` taking two cycles and starting on a `UINT32_MAX`).
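The trade-off discussed above can be sketched on any host by replacing the CSR with a simulated counter. Everything here is illustrative: `FakeCycleCounter`, `ReadWithMask`, and `ReadWithLoop` are hypothetical names, and the assumption that each read advances the counter by a fixed `step` is a simplification of real hardware timing.

```cpp
#include <cstdint>

// Hypothetical stand-in for the 64-bit cycle CSR. Each 32-bit read
// advances the counter by `step` to model time passing between reads.
struct FakeCycleCounter {
  uint64_t value;
  uint64_t step;
  uint32_t ReadLow() {
    uint32_t word = static_cast<uint32_t>(value);
    value += step;
    return word;
  }
  uint32_t ReadHigh() {
    uint32_t word = static_cast<uint32_t>(value >> 32);
    value += step;
    return word;
  }
};

// Branchless variant: zero the low word if the high word changed.
// Always performs exactly three reads.
inline uint64_t ReadWithMask(FakeCycleCounter& counter) {
  uint32_t hi0 = counter.ReadHigh();
  uint32_t lo = counter.ReadLow();
  uint32_t hi1 = counter.ReadHigh();
  lo &= 0u - static_cast<uint32_t>(hi0 == hi1);
  return (static_cast<uint64_t>(hi1) << 32) | lo;
}

// Loop variant: retry until the high word is stable, so the returned
// low word is a value that was actually read from the counter.
inline uint64_t ReadWithLoop(FakeCycleCounter& counter) {
  uint32_t hi0, lo, hi1;
  do {
    hi0 = counter.ReadHigh();
    lo = counter.ReadLow();
    hi1 = counter.ReadHigh();
  } while (hi0 != hi1);
  return (static_cast<uint64_t>(hi1) << 32) | lo;
}
```

With the fake counter started at `0xFFFFFFFE` and a step of 1, the mask variant returns `0x100000000` after three reads, while the loop variant retries once and returns `0x100000002`. Neither is exact, which is the point made in the comment above.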
Looks ok but the CLA issue needs to be resolved.
Yeah, it's going through the bureaucracy at work as we speak. Updates when I hear something.
Reluctant bureaucrat checking in :) I've docusigned the corporate CLA on behalf of lowRISC CIC, I think I now need to await a counter-signed version and perhaps explicitly list contributors afterwards.
I signed it!
CLAs look good, thanks!
thank you!
Summary: Fixed by backporting the upstream fix from here: google/benchmark#833 Reviewers: lebedev.ri Reviewed By: lebedev.ri Subscribers: asb, kito-cheng, shiva0217, rogfer01, rkruppe, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64237 llvm-svn: 365610
Summary: Fixed by backporting the upstream fix from here: google/benchmark#833 Reviewers: lebedev.ri Reviewed By: lebedev.ri Subscribers: asb, kito-cheng, shiva0217, rogfer01, rkruppe, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64237 git-svn-id: https://llvm.org/svn/llvm-project/test-suite/trunk@365610 91177308-0d34-0410-b5e6-96231b3b80d8
Credit: adapted from google/benchmark#833
Adapted from google/benchmark#833, authored by Sam Elliot at lowRISC. This requires the RISC-V kernel to set the CY bit of the `mcounteren` register, which Linux does; it is documented here in case another OS hits a SIGILL. When the CY bit of the `mcounteren` register is unset, reading the cycle register causes an illegal instruction exception in the next privilege level (user mode or supervisor mode). See section 3.1.11 of the privileged ISA manual at https://github.com/riscv/riscv-isa-manual/releases/latest
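The condition described above can be written as a small predicate over the counter-enable register values. This is only a sketch: `kCounterEnableCY` and `UserModeCanReadCycle` are hypothetical names, and the additional `scounteren` check for cores that implement supervisor mode goes beyond what the commit message states; it is an assumption based on the privileged specification. User-space code cannot read these registers itself, so the values would have to come from the kernel or firmware.

```cpp
#include <cstdint>

// CY is bit 0 of the counter-enable registers (mcounteren, scounteren).
constexpr uint32_t kCounterEnableCY = 1u << 0;

// Hypothetical predicate: would `rdcycle` in user mode succeed rather than
// raise an illegal instruction exception (seen as SIGILL by the process)?
// Assumption: when supervisor mode exists, scounteren.CY must be set too.
constexpr bool UserModeCanReadCycle(uint32_t mcounteren, uint32_t scounteren,
                                    bool has_supervisor_mode) {
  return (mcounteren & kCounterEnableCY) != 0 &&
         (!has_supervisor_mode || (scounteren & kCounterEnableCY) != 0);
}
```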