-
Notifications
You must be signed in to change notification settings - Fork 112
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
aal::aal_arm implements tick for the 64 bits flavor. #564
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm. I'm a bit surprised to see you needed to do something like this.
Our Arm AAL includes NoCpuCycleCounters
in its aal_features
, so
snmalloc/src/snmalloc/aal/aal.h
Lines 156 to 185 in a060462
/** | |
* Return an architecture-specific cycle counter. | |
* | |
* If the compiler provides a portable prefetch builtin, use it directly, | |
* otherwise delegate to the architecture-specific layer. This allows new | |
* architectures to avoid needing to implement a custom `tick` method | |
* if they are used only with a compiler that provides the builtin. | |
*/ | |
static inline uint64_t tick() noexcept | |
{ | |
if constexpr ( | |
(Arch::aal_features & NoCpuCycleCounters) == NoCpuCycleCounters) | |
{ | |
auto tick = std::chrono::high_resolution_clock::now(); | |
return static_cast<uint64_t>( | |
std::chrono::duration_cast<std::chrono::nanoseconds>( | |
tick.time_since_epoch()) | |
.count()); | |
} | |
else | |
{ | |
#if __has_builtin(__builtin_readcyclecounter) && \ | |
!defined(SNMALLOC_NO_AAL_BUILTINS) | |
return __builtin_readcyclecounter(); | |
#else | |
return Arch::tick(); | |
#endif | |
} | |
} | |
}; |
std::chrono::high_resolution_clock
approach both prior to and with this patch. Am I misreading our #if
and if constexpr
maze?
Were we to remove NoCpuCycleCounters
, on aarch64
it seems like we'd be able to use __builtin_readcyclecounter
. Clang lowers that to reading from PMCCNTR_EL0
, which Arm describes as "[h]old[ing] the value of the processor Cycle Counter, CCNT, that counts processor clock cycles" and notes that it is mediated by PMCCFILTR_EL0
. On the other hand, CNTVCT_EL0
is described as "[h]old[ing] the 64-bit virtual count value. The virtual count value is equal to the physical count value minus the virtual offset visible in CNTVOFF_EL2
". Is there a reason the PMCCNTR_EL0
value is not sufficient for your use?
Last, while this is purely cosmetic at the moment, I'd prefer the #if
conditional to be on __aarch64__
rather than SNMALLOC_VA_BITS_64
.
That's the point I am conceding, I missed this flag indeed.
Because, as you mentioned, the offset is already removed.
Hm, I do not really see the point, it will make it just inconsistent with the rest. |
e9a60d6
to
a67ce3d
Compare
Changes have been done, and @nwf-msr is OOF
No description provided.