-
Notifications
You must be signed in to change notification settings - Fork 314
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable new win-arm64 platform #82
Conversation
Build on win-arm64 |
5933e1a
to
cf12755
Compare
break; | ||
} | ||
/* Reset the lowest bit in affinity mask */ | ||
group_processors_mask &= (group_processors_mask - 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This resets the most significant bits, not the lowest bit.
Would it be clearer to shift right and test for 1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've inherited this code from x86 init: https://github.com/pytorch/cpuinfo/blob/master/src/x86/windows/init.c#L229
I mean it's absolutely not an excuse, I'll check that again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it really clears all bits, that are lower than the least set bit.
ff 11111111
fe 11111110
fc 11111100
f8 11111000
f0 11110000
e0 11100000
c0 11000000
80 10000000
0 00000000
0x10
0x00x20
0x00x40
0x00x80
0x0
} | ||
|
||
/* 2. Read topology information via MSDN API: packages, cores and caches*/ | ||
nr_of_packages = read_packages_for_processors( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there are many things that can go wrong in these calls, but there is no way to chain the error and exit the function on error. That's not great for defensive code.
Consider adding throw/catch if permitted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I see, there are not throw/catch in the current codebase and I'd think that's intentional, so I'd try to cleanup the error paths and make sure it's going to be safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes C
code mostly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The general principals of querying CPU details and assigning to the cpuinfo relevant structures looks ok to me. There are some issues with the error handling here that might need addressing.
5a0c4b0
to
9208c53
Compare
@malfet Could you review it? Thanks! |
related PR: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly looks good to me, but please explain why ARM64
needs a special treatment in ELSEIF(NOT CPUINFO_TARGET_PROCESSOR_MATCHES
And use bool
instead of BOOL
whenever possible (i.e. in callback its probably fine, but otherwise should not be used)
9208c53
to
24ef3fc
Compare
@malfet Can I do anything else to close this and the related PR? |
any updates on this? |
Not on my side, but I'd happily close this! |
Can this be merged, please? |
Any plans on merging this PR ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use dash -
to separate filename parts rather than underscore _
for consistency with other files in cpuinfo.
}; | ||
|
||
/* Please add new SoC/chip info here! */ | ||
static struct woa_chip_info woa_chips[] = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This covers literally two devices. We need a more generic detection method that doesn't require hard-coding every SoC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I absolutely agree that this solution is not ideal, but we haven't found any better and still robust option considering the existing Windows API. Also new SoCs won't turn up frequently, so even though it's a limited solution, it can be maintained.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a similar concern but there doesn't seem to be any better way to do it with the help of Windows API.
Ampere SoC also runs Windows/Arm64 - Does that need to be added?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be nice, but I can't really test that, so I'd ask to do as a follow-up commit.
CMakeLists.txt
Outdated
@@ -171,6 +171,9 @@ IF(CPUINFO_SUPPORTED_PLATFORM) | |||
LIST(APPEND CPUINFO_SRCS | |||
src/arm/android/properties.c) | |||
ENDIF() | |||
ELSEIF(CMAKE_SYSTEM_NAME STREQUAL "Windows" AND CPUINFO_TARGET_PROCESSOR STREQUAL "ARM64") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wrong indentation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Gabor, thanks for working on this. It will help to solve our build issue in WoA Chromium. I managed to backport your PR to Chromium to check how the build works.
I left some comments about my findings. Basically lot of them are warnings reported by Clang-cl which treat as errors. Thanks.
src/arm/windows/init.c
Outdated
static void set_cpuinfo_isa_fields(); | ||
static bool get_system_info_from_registry( | ||
struct woa_chip_info** chip_info, | ||
enum cpu_info_vendor* vendor); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: cpu_info_vendor -> cpuinfo_vendor
src/arm/windows/init.c
Outdated
|
||
struct vendor_info { | ||
char vendor_name[VENDOR_NAME_MAX]; | ||
enum cpu_info_vendor cpu_info_vendor; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above
src/arm/windows/init.c
Outdated
PINIT_ONCE init_once, PVOID parameter, PVOID* context) | ||
{ | ||
struct woa_chip_info *chip_info = NULL; | ||
enum cpu_info_vendor vendor = cpuinfo_vendor_unknown; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: cpu_info_vendor -> cpuinfo_vendor
src/arm/windows/init.c
Outdated
|
||
static bool get_system_info_from_registry( | ||
struct woa_chip_info** chip_info, | ||
enum cpu_info_vendor* vendor) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here
src/arm/windows/windows_arm_init.h
Outdated
@@ -0,0 +1,32 @@ | |||
#pragma once | |||
|
|||
bool get_core_uarch_for_efficiency( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please move these functions down to avoid 'forward references to enum' reported by Clang-cl.
src/arm/windows/init.c
Outdated
bool result = false; | ||
char* textBuffer = NULL; | ||
LPCTSTR cpu0_subkey = | ||
"HARDWARE\\DESCRIPTION\\System\\CentralProcessor\\0"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please mark these strings as wide-character literals by using 'L'.
L"HARDWARE\DESCRIPTION\System\CentralProcessor\0"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interestingly the MSVC compiler throws the following error to 'L' , even though I'd expected your proposal is the correct one:
warning C4133: 'initializing': incompatible types - from 'unsigned short [47]' to 'LPCTSTR'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ohh okay, seems using (LPCTSTR) cast works fine.
break; | ||
case CacheData: | ||
processors[processor_global_index].cache.l1d = current_cache; | ||
break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here as above
case CacheData: | ||
current_cache = l1d_base + l1d_counter; | ||
l1d_counter++; | ||
break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here as above
break; | ||
case CacheData: | ||
numbers_of_caches[cpuinfo_cache_level_1d]++; | ||
break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clang-cl complains about the missing other two enum values:
"error: enumeration values 'CacheUnified' and 'CacheTrace' not handled in switch [-Werror,-Wswitch]"
Could you add these values as empty cases as well?
src/arm/windows/init.c
Outdated
/* Please add new SoC/chip info here! */ | ||
static struct woa_chip_info woa_chips[] = { | ||
/* Microsoft SQ1 Kryo 495 4 + 4 cores (3 GHz + 1.80 GHz) */ | ||
"Microsoft SQ1", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Coud you add some extra braces to each struct elements?
../../third_party/cpuinfo/src/src/arm/windows/init.c(39,2): error: suggest braces around initialization of subobject [-Werror,-Wmissing-braces] "Microsoft SQ1", ^~~~~~~~~~~~~~~~ { ../../third_party/cpuinfo/src/src/arm/windows/init.c(52,2): error: suggest braces around initialization of subobject [-Werror,-Wmissing-braces] "Microsoft SQ2", ^~~~~~~~~~~~~~~~ {
(@gaborkertesz-linaro is on holiday now and planning to address the comments in couple of weeks) |
At the moment the Chrome build is broken for Windows on ARM, since ruy build relies on cpuinfo third_party library which doesn't support Windows on ARM port yet. There is a pull request which implements the required functions for WoA platfrom [1]. After it is landed we can re-enable cpuinfo build again. [1] pytorch/cpuinfo#82 Bug: 1329774 Change-Id: Idb619eaa3180fac5068a939704151da0f653ee89 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/3721652 Reviewed-by: Tarun Bansal <tbansal@chromium.org> Commit-Queue: Tarun Bansal <tbansal@chromium.org> Cr-Commit-Position: refs/heads/main@{#1018705}
f15be14
to
6551262
Compare
I hope I've fixed all of your findings, please let me know if I missed any of them! |
Gabor, thanks for updating it. Checking on my side, I get back to you asap. |
This patch implements the required APIs for the new win-arm64 platform by reading topology information via Windows API. Build config: cmake . -A ARM64
6551262
to
e693110
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, thanks. I'm not a reviewer, but I give you an informal LGTM.
@malfet What do you think? |
This patch implements the required APIs for the new win-arm64 platform by reading topology information via Windows API. Build config: cmake . -A ARM64 GitOrigin-RevId: 1baac2b
At the moment the Chrome build is broken for Windows on ARM, since ruy build relies on cpuinfo third_party library which doesn't support Windows on ARM port yet. There is a pull request which implements the required functions for WoA platfrom [1]. After it is landed we can re-enable cpuinfo build again. [1] pytorch/cpuinfo#82 Bug: 1329774 Change-Id: Idb619eaa3180fac5068a939704151da0f653ee89 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/3721652 Reviewed-by: Tarun Bansal <tbansal@chromium.org> Commit-Queue: Tarun Bansal <tbansal@chromium.org> Cr-Commit-Position: refs/heads/main@{#1018705} NOKEYCHECK=True GitOrigin-RevId: 2db8c63892cff0e269c00f2da300aff3843e5735
This PR enables the new Windows on Arm (win-arm64) platform, by implementing the required APIs, reading topology information via Windows API.