-
Notifications
You must be signed in to change notification settings - Fork 262
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add RISC-V support #272
Add RISC-V support #272
Conversation
@@ -49,6 +49,7 @@ set(PROCESSOR_IS_ARM FALSE) | |||
set(PROCESSOR_IS_AARCH64 FALSE) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gchatelet, should we update the Bazel file in this PR?
|
||
#include "cpuinfo_riscv.h" | ||
|
||
//////////////////////////////////////////////////////////////////////////////// |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
better to move duplicate logic to a generic place for other operating systems
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the moment there isn't a duplicate logic except for the GetRiscvInfo()
function (just a few lines of code). Do I nevertheless have to move it to another file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
char uarch[64]; // 0 terminated string | ||
char vendor[64]; // 0 terminated string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should consider API that will be consistent with other operating systems, what about FreeBSD and possible future operating systems? Can we get this info?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about other os compatibility. The more general way would be using misa
instruction but as we have seen in #247 it's not possible unless you have machine privileges. Do you have any suggestion? We could define custom codes for each uarch and vendor and fill them accordingly with informations retrieved from the os
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean that we should define only common info, and avoid features which on certain operating systems will return empty results.
for example, if both Linux and FreeBSD return hart
from the cpuinfo file, then we can add to our functionality, but if only one Linux/FreeBSD returns some info and the other doesn't provide that info we should avoid it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think your parser of the RISC-V ISA string is incorrect, because the ISA string can contain multi-letter extensions, such as "_zbc". Simply searching for single letters within the ISA string won't work, you need to parse it to find out where the multi-letter extensions begin.
Ok @michael-roe, Do I have to support these extensions as well? |
Probably ought to support all the user space ones. The supervisor state ones (starting with S) are less relevant to user space programs that call cpu_features, so can probably leave those out. |
value.size-=2; | ||
for (size_t i = 0; i < RISCV_LAST_; ++i){ | ||
tmp_sv.ptr = kCpuInfoFlags[i]; | ||
tmp_sv.size = strlen(kCpuInfoFlags[i]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are you using libc functions like strlen
, strcpy
? According to documentation cpu_features is suitable for implementing fundamental libc functions like malloc, memcpy, and memcmp
. Hence we use our implementation, see src/copy.inl
, src/equals.inl
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gchatelet, it seems I misunderstood. Is this the right point? Since I found several codes where memmove
and strlen
are used:
https://github.com/google/cpu_features/blob/main/src/stack_line_reader.c#L59
https://github.com/google/cpu_features/blob/main/include/internal/string_view.h#L47
https://github.com/google/cpu_features/blob/main/src/filesystem.c#L46
https://github.com/google/cpu_features/blob/main/src/filesystem.c#L57
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It used to be strongly enforced but it is not anymore as it is hard to prevent the compiler from generating calls to these functions. That is, even if the code contains no call to C
memory functions, the compiler may decide to generate them (e.g., https://godbolt.org/z/h5KKM7MEj)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we provide info to REAMDE that even with compiler flags such as freestanding
no-stdlib
no-builtin
there is a possibility that the compiler will generate C functions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep I think so, here is an example where the compiler generates a call even with all the compiler flags https://godbolt.org/z/rTTfzEq15
In fact, compilers require some C functions to exist.
StringView key, value; | ||
if (CpuFeatures_StringView_GetAttributeKeyValue(line, &key, &value)) { | ||
if (CpuFeatures_StringView_IsEquals(key, str("isa"))) { | ||
StringView tmp_sv; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if(CpuFeatures_StringView_StartsWith(value, "rv") == false) continue;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if(!CpuFeatures_StringView_StartsWith(value, "rv")) continue;
?
if (CpuFeatures_StringView_IsEquals(key, str("isa"))) { | ||
StringView tmp_sv; | ||
value.ptr+=2; // ignore 'rv' prefix at the beginning | ||
value.size-=2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
value = CpuFeatures_StringView_PopFront(value, 2)
value.ptr+=2; // ignore 'rv' prefix at the beginning | ||
value.size-=2; | ||
for (size_t i = 0; i < RISCV_LAST_; ++i){ | ||
tmp_sv.ptr = kCpuInfoFlags[i]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
StringView tmp_sv = str(kCpuInfoFlags[i]);
tmp_sv.ptr = kCpuInfoFlags[i]; | ||
tmp_sv.size = strlen(kCpuInfoFlags[i]); | ||
kSetters[i](&info->features, | ||
CpuFeatures_StringView_IndexOf(value, tmp_sv) != -1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
extract a variable:
bool is_set = CpuFeatures_StringView_IndexOf(value, tmp_sv) != -1;
kSetters[i](&info->features, is_set);
value.size-=2; | ||
for (size_t i = 0; i < RISCV_LAST_; ++i){ | ||
tmp_sv.ptr = kCpuInfoFlags[i]; | ||
tmp_sv.size = strlen(kCpuInfoFlags[i]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It used to be strongly enforced but it is not anymore as it is hard to prevent the compiler from generating calls to these functions. That is, even if the code contains no call to C
memory functions, the compiler may decide to generate them (e.g., https://godbolt.org/z/h5KKM7MEj)
} | ||
|
||
#endif // defined(CPU_FEATURES_OS_LINUX) || defined(CPU_FEATURES_OS_ANDROID) | ||
#endif // CPU_FEATURES_ARCH_RISCV |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add an empty line at EOF
@@ -408,6 +413,12 @@ static Node* CreateTree(void) { | |||
AddMapEntry(root, "microarchitecture", | |||
CreateString(strings.type.base_platform)); | |||
AddFlags(root, &info.features); | |||
#elif defined(CPU_FEATURES_ARCH_RISCV) | |||
const RiscvInfo info = GetRiscvInfo(); | |||
AddMapEntry(root, "arch", CreateString("risc-v")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With enough time people will rely on this string. Are we sure we want "risc-v"?
Alternative naming scheme I can think of :
- "riscv"
- "riscv-32"/ "riscv-64" / "riscv-128"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the second naming scheme seems better to me since this functionality is implemented with GetRiscvInfo
and more precise arch detection compared to riscv, but I also don't mind "riscv"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As an afterthought, it seems that the 32 / 64 / 128 address space is really seen as a feature so we could stick to "risc-v". As you said the information exists elsewhere. I don't mind either, I just wanted to check whether we could regret this choice later on.
TEST(CpuinfoRiscvTest, KendryteFromCpuInfo) { | ||
ResetHwcaps(); | ||
auto& fs = GetEmptyFilesystem(); | ||
fs.CreateFile("/proc/cpuinfo", R"(hart : 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
put hart : 0
on a new line for visibility
} | ||
|
||
} // namespace | ||
} // namespace cpu_features |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
new line at EOF
StringView tmp_sv; | ||
value.ptr+=2; // ignore 'rv' prefix at the beginning | ||
value.size-=2; | ||
for (size_t i = 0; i < RISCV_LAST_; ++i){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The complexity here is O(N^2)
I'd rather scan the characters after "rv" and check if the character corresponds to one of the riscv extensions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just read the comment from @michael-roe that the extensions can be strings and not single letters so I take my previous comment back.
Do we have a strong guarantee that no extension string can be embedded in another extension string ? If not, we cannot simply look at CpuFeatures_StringView_IndexOf(...) != -1
as we would take an extension for another one.
RISCV_32, | ||
RISCV_64, | ||
RISCV_128, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it right that we mixed isa extensions and arch to RiscvFeaturesEnum
should we separate them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wikipedia separates them into base
and extension
, I think we should do the same (see my comment on l.40)
| Linux | yes² | yes¹ | yes¹ | yes¹ | yes¹ | yes¹ | | ||
| MacOs | yes² | N/A | not yet | N/A | no | N/A | | ||
| Windows | yes² | not yet | not yet | N/A | N/A | N/A | | ||
| FreeBSD | yes² | not yet | not yet | not yet | not yet | N/A | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FreeBSD - not yet
RISCV_32, | ||
RISCV_64, | ||
RISCV_128, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wikipedia separates them into base
and extension
, I think we should do the same (see my comment on l.40)
CPU_FEATURES_START_CPP_NAMESPACE | ||
|
||
typedef struct { | ||
int riscv32 : 1; // Is a 32 bit architecture |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Is a 32 bit architecture
AFAIU this is more about address space
int m : 1; // Integer Multiplication and Division | ||
int v : 1; // Vector Operations | ||
int q : 1; // Quad Precision Floating Point | ||
} RiscvFeatures; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed lower in this file I think we should make a distinction between base and extensions.
https://en.wikipedia.org/wiki/RISC-V#ISA_base_and_extensions
I was tempted to create two Features
structs (i.e., BaseRiscvFeatures
and ExtRiscvFeatures
) but this will not play nicely with the current introspection facilities.
Let's embed the distinction in the name instead.
I see two paths forward : use the names as they are defined in the wikipedia page or prefix them with base_
(std_
?) and ext_
.
I think I prefer sticking to the original names (RVWMO
, RV32I
, RV32E
, RV64I
, RV128I
for base and M
,A
,F
,D
,Zicsr
,Zifencei
,G
,Q
,L
,C
,B
,J
,T
,P
,V
,K
,N
,H
,S
,Zam
,Ztso
for extensions) as long as riscv commits to this naming scheme.
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with you to stick naming according to riscv documentation, thus we eliminate the confusion with riscv32
/riscv64
/riscv128
, also we can add a comment with the isa definition (base/ext):
typedef struct {
int rvmwo : 1; // Base ISA. Weak Memory Ordering
int rv32i: 1; // Base ISA. Base Integer Instruction Set, 32-bit
...
int zicsr: 1; // Extension ISA. Control and Status Register (CSR)
...
} RiscvFeatures;
StringView tmp_sv; | ||
value.ptr+=2; // ignore 'rv' prefix at the beginning | ||
value.size-=2; | ||
for (size_t i = 0; i < RISCV_LAST_; ++i){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just read the comment from @michael-roe that the extensions can be strings and not single letters so I take my previous comment back.
Do we have a strong guarantee that no extension string can be embedded in another extension string ? If not, we cannot simply look at CpuFeatures_StringView_IndexOf(...) != -1
as we would take an extension for another one.
value.size-=2; | ||
for (size_t i = 0; i < RISCV_LAST_; ++i){ | ||
tmp_sv.ptr = kCpuInfoFlags[i]; | ||
tmp_sv.size = strlen(kCpuInfoFlags[i]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep I think so, here is an example where the compiler generates a call even with all the compiler flags https://godbolt.org/z/rTTfzEq15
In fact, compilers require some C functions to exist.
@@ -408,6 +413,12 @@ static Node* CreateTree(void) { | |||
AddMapEntry(root, "microarchitecture", | |||
CreateString(strings.type.base_platform)); | |||
AddFlags(root, &info.features); | |||
#elif defined(CPU_FEATURES_ARCH_RISCV) | |||
const RiscvInfo info = GetRiscvInfo(); | |||
AddMapEntry(root, "arch", CreateString("risc-v")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As an afterthought, it seems that the 32 / 64 / 128 address space is really seen as a feature so we could stick to "risc-v". As you said the information exists elsewhere. I don't mind either, I just wanted to check whether we could regret this choice later on.
Hi @DaniAffCH are you still interested in contributing to this branch? |
Hi @gchatelet, yes I'm still interested in contributing to this project but I have a really busy period, so I'm unable to fix the PR issues, I'll do it as soon as I'm free though. If you want to keep the PR list clean feel free to close my PR and I'll open it again when I'm ready. |
Ok no worries. I've started #287 based on this PR to retain your original work (I hope you don't mind!). I'm now working on fixing the comments we've made. Thx to @Mizux we also have a running CI for RISC-V 💪. |
Ok, good job! I can probably be back active in 2 weeks. Please keep track of the problems you have fixed (or I'm gonna figure it out from your commits). |
Merged as part of #287 . Thx a lot @DaniAffCH for the initial implementation and research, this helped ! |
I added RISC-V architecture support on linux by parsing
/proc/cpuinfo
. It has been tested using QEMU emulator.I have done the following points:
Tests have been added as well