feat(lscq): add arm64 support #152
Use the CASP instruction to implement double-width CAS on arm64. CASP is available from Armv8.1 onward (for example Apple M1/M2/A12 and Snapdragon 845). All tests in lscq_test.go passed on a MacBook Air M1 (2020) running macOS 12.5.1. Change-Id: Ieb89fa9361f1fce8fb52b102dca556867f7e8e8a
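For readers unfamiliar with double-width CAS, the following is a minimal Go sketch of the semantics the CASP-based routine is meant to provide: compare two adjacent 64-bit words against expected values and replace both only if both match. The `uint128` type, the function name, and its signature are assumptions made for illustration, and the mutex only stands in for the atomicity that the hardware instruction gives; this is a reference model, not the assembly in this PR.

```go
package main

import (
	"fmt"
	"sync"
)

// uint128 is a hypothetical pair of 64-bit words treated as one 128-bit value.
type uint128 [2]uint64

// mu models the atomicity that a hardware double-width CAS provides.
var mu sync.Mutex

// compareAndSwapUint128 replaces both words of *addr with (new1, new2)
// only if they currently equal (old1, old2). It reports whether it swapped.
func compareAndSwapUint128(addr *uint128, old1, old2, new1, new2 uint64) bool {
	mu.Lock()
	defer mu.Unlock()
	if addr[0] != old1 || addr[1] != old2 {
		return false // at least one word differs: leave memory untouched
	}
	addr[0], addr[1] = new1, new2
	return true
}

func main() {
	v := uint128{1, 2}
	fmt.Println(compareAndSwapUint128(&v, 1, 2, 3, 4), v) // true [3 4]
	fmt.Println(compareAndSwapUint128(&v, 1, 2, 5, 6), v) // false [3 4]
}
```

The second call fails because the expected pair no longer matches, which is the property a queue relies on to detect a concurrent update to either word.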
…204/gopkg into feature/lscq_support_arm64
Use cpu.ARM64.HasATOMICS to determine whether the arm64 CPU has standalone CAS instructions (CASP is only available on Armv8.1+; on Armv8.0, use the LDAXP/STLXP instructions instead). Currently, the golang.org/x/sys/cpu package does not fully support darwin/arm64 (Apple M1/M2), so cpu.ARM64.HasATOMICS is reported as false even though the feature is present. runtime/internal/cpu fully supports detecting CPU features on darwin/arm64, so extracting code from it may be a good workaround.
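The selection described above can be sketched in Go as a choice between two implementations based on a feature flag. Everything named here (`cas128`, `impl`, `selectImpl`, `swapIfEqual`) is hypothetical, and both entries share a plain non-atomic stand-in body; in real code the flag would come from something like cpu.ARM64.HasATOMICS and the two paths would be assembly routines.

```go
package main

import "fmt"

// cas128 is a hypothetical signature for a 128-bit compare-and-swap.
type cas128 func(addr *[2]uint64, old, new [2]uint64) bool

// swapIfEqual is a NON-atomic stand-in used for both paths in this sketch.
// A real fast path would issue CASP; a real fallback would loop on
// LDAXP/STLXP until the exclusive store succeeds.
func swapIfEqual(addr *[2]uint64, old, new [2]uint64) bool {
	if *addr != old {
		return false
	}
	*addr = new
	return true
}

// impl pairs an implementation with a label so the choice is observable.
type impl struct {
	name string
	cas  cas128
}

// selectImpl picks the CASP path when the CPU reports the atomics
// extension and the load/store-exclusive path otherwise.
func selectImpl(hasAtomics bool) impl {
	if hasAtomics {
		return impl{name: "CASP", cas: swapIfEqual}
	}
	return impl{name: "LDAXP/STLXP", cas: swapIfEqual}
}

func main() {
	fmt.Println(selectImpl(true).name)  // CASP
	fmt.Println(selectImpl(false).name) // LDAXP/STLXP
}
```

Deciding once and reusing the result keeps the feature check off the hot path; whether the check lives in Go or in the assembly prologue is a design choice this sketch does not settle.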
… feature/lscq_support_arm64 # Conflicts: # collection/lscq/asm_arm64.s
Update: use |
Thanks for this exciting work! It looks good on my armv8.1 environment, just a few questions before merging this request :)
Fix bad indentation.
Use MOVD instead of ORR to copy between registers. Prettify the indentation between operators and operands.
On darwin, golang.org/x/sys/cpu.ARM64.HasATOMICS is set to false when it should be true. To use the faster CASPD instruction on darwin/arm64, we can therefore detect CPU features with sysctl/sysctlbyname; see https://developer.apple.com/documentation/apple-silicon/addressing-architectural-differences-in-your-macos-code. sysctlEnabled is exported from internal/cpu.sysctlEnabled, which calls sysctlbyname to check whether a specific CPU feature is enabled. Using golang.org/x/sys/cpu.ARM64.HasATOMICS remains fine on other operating systems. One more bug fixed: MOVD R2, R6 => MOVD R6, R2. The bug goes unnoticed when arm64HasAtomics is false. All tests and benchmarks passed.
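As a rough illustration of sysctl-based detection, the sketch below shells out to the `sysctl` command instead of linking to internal/cpu.sysctlEnabled as this PR does, so it is a swapped-in technique rather than the code under review. The key name `hw.optional.armv8_1_atomics` and the helper names are assumptions not taken from the commit message; on anything other than darwin/arm64 the function simply returns false.

```go
package main

import (
	"fmt"
	"os/exec"
	"runtime"
	"strings"
)

// parseSysctlFlag reports whether the output of `sysctl -n <key>`
// denotes an enabled feature (the value "1").
func parseSysctlFlag(out string) bool {
	return strings.TrimSpace(out) == "1"
}

// darwinHasAtomics asks the OS whether the Armv8.1 atomics are available.
// The sysctl key is an assumption; callers on other platforms get false
// and should rely on their usual detection instead.
func darwinHasAtomics() bool {
	if runtime.GOOS != "darwin" || runtime.GOARCH != "arm64" {
		return false
	}
	out, err := exec.Command("sysctl", "-n", "hw.optional.armv8_1_atomics").Output()
	if err != nil {
		return false // unknown key or missing tool: assume no atomics
	}
	return parseSysctlFlag(string(out))
}

func main() {
	fmt.Println("atomics reported by sysctl:", darwinHasAtomics())
}
```

Spawning a process is far heavier than a direct sysctlbyname call, so a check like this would be done once at startup and cached.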
Update: detect Before: After: All tests and benchmarks passed on darwin (M1) and Ubuntu 20.04 (Ampere Altra). More details are in the commit message.
Nice work! It works fine in my |
@kabu1204 It's fine, the test is passed on my |
* feat(lscq): add arm64 support
  Use the CASP instruction to implement double-width CAS on arm64. CASP is available from Armv8.1 onward (for example Apple M1/M2/A12 and Snapdragon 845). All tests in lscq_test.go passed on a MacBook Air M1 (2020) running macOS 12.5.1.
  Change-Id: Ieb89fa9361f1fce8fb52b102dca556867f7e8e8a
* use cpu.ARM64.HasATOMICS to determine whether the arm64 CPU has standalone CAS instructions (CASP is only available on Armv8.1+; on Armv8.0, use the LDAXP/STLXP instructions instead). Currently, the golang.org/x/sys/cpu package does not fully support darwin/arm64 (Apple M1/M2), so cpu.ARM64.HasATOMICS is reported as false even though the feature is present. runtime/internal/cpu fully supports detecting CPU features on darwin/arm64, so extracting code from it may be a good workaround.
* feat(lscq): use LDAXP/STLXP for armv8.0 instead of CASP
  Use cpu.ARM64.HasATOMICS to determine whether the arm64 CPU has standalone CAS instructions (CASP is only available on Armv8.1+; on Armv8.0, use the LDAXP/STLXP instructions instead). Currently, the golang.org/x/sys/cpu package does not fully support darwin/arm64 (Apple M1/M2), so cpu.ARM64.HasATOMICS is reported as false even though the feature is present. runtime/internal/cpu fully supports detecting CPU features on darwin/arm64, so extracting code from it may be a good workaround.
* fix(lscq): fix bad indention
  Fix bad indentation.
* fix(lscq): replace ORR with MOVD
  Use MOVD instead of ORR to copy between registers. Prettify the indentation between operators and operands.
* fix(lscq): detect atomics feature correctly on darwin
  On darwin, golang.org/x/sys/cpu.ARM64.HasATOMICS is set to false when it should be true. To use the faster CASPD instruction on darwin/arm64, we can therefore detect CPU features with sysctl/sysctlbyname; see https://developer.apple.com/documentation/apple-silicon/addressing-architectural-differences-in-your-macos-code.
  sysctlEnabled is exported from internal/cpu.sysctlEnabled, which calls sysctlbyname to check whether a specific CPU feature is enabled. Using golang.org/x/sys/cpu.ARM64.HasATOMICS remains fine on other operating systems. One more bug fixed: MOVD R2, R6 => MOVD R6, R2. The bug goes unnoticed when arm64HasAtomics is false. All tests and benchmarks passed.

Co-authored-by: yuchengye <yuchengye@bytedance.com>