feat(lscq): add arm64 support #152

Merged
merged 11 commits into from
Oct 11, 2022

Conversation

kabu1204
Contributor

Use the CASP instruction to implement double-width CAS on arm64.
CASP is available on Armv8.1 and later
(for example Apple M1/M2/A12 and Snapdragon 845).
All tests in lscq_test.go pass on a MacBook Air M1 (2020), macOS 12.5.1.

Change-Id: Ieb89fa9361f1fce8fb52b102dca556867f7e8e8a
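
For readers unfamiliar with double-width CAS, the following is a minimal reference model of the semantics only. It is not the PR's implementation: the names `uint128`, `slot`, and `compareAndSwap128` are hypothetical, and a mutex stands in for the atomicity that CASP provides in hardware.

```go
package main

import (
	"fmt"
	"sync"
)

// uint128 is a hypothetical 16-byte value made of two 64-bit words,
// standing in for the register pair that CASP compares and swaps together.
type uint128 [2]uint64

// slot guards a uint128 with a mutex. The lock only models atomicity;
// a real CASP-based implementation needs no lock.
type slot struct {
	mu  sync.Mutex
	val uint128
}

// compareAndSwap128 stores desired and returns true only if BOTH words of
// the current value equal old. Otherwise the slot is left untouched.
func (s *slot) compareAndSwap128(old, desired uint128) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.val != old {
		return false
	}
	s.val = desired
	return true
}

func main() {
	s := &slot{val: uint128{1, 2}}
	fmt.Println(s.compareAndSwap128(uint128{1, 2}, uint128{3, 4})) // true
	fmt.Println(s.compareAndSwap128(uint128{1, 9}, uint128{5, 6})) // false: second word differs
	fmt.Println(s.val)                                             // [3 4]
}
```

The point of the model is that a mismatch in either word makes the whole operation fail, which is what distinguishes a double-width CAS from two independent 64-bit CAS operations.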

kabu1204 and others added 5 commits August 27, 2022 17:28
Use cpu.ARM64.HasATOMICS to determine whether the arm64 CPU has standalone CAS instructions (CASP is only available on ARMv8.1+; on ARMv8.0, use the LDAXP/STLXP instructions instead).

Currently, the golang.org/x/sys/cpu package does not fully support darwin/arm64 (Apple M1/M2), so cpu.ARM64.HasATOMICS reports false there even though the feature is actually present.
runtime/internal/cpu fully supports detecting CPU features on darwin/arm64, so extracting code from it may be a good workaround.
… feature/lscq_support_arm64

# Conflicts:
#	collection/lscq/asm_arm64.s
kabu1204 commented Sep 7, 2022

Update: use cpu.ARM64.HasATOMICS to determine whether the arm64 CPU has standalone CAS instructions (CASP is only available on ARMv8.1+; on ARMv8.0, the LDAXP/STLXP instructions are used instead).
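
The selection described here can be sketched in plain Go. This is an illustration under assumptions, not the PR's code: `pair`, `casFunc`, `casWithCASP`, `casWithLLSC`, and `selectCAS` are hypothetical names, both "implementations" are ordinary non-atomic Go so the example runs on any machine, and the feature flag is a parameter rather than a read of cpu.ARM64.HasATOMICS. However the real code wires up the branch, the idea is the same: one flag chooses between the two paths.

```go
package main

import "fmt"

// pair is a hypothetical two-word value; the real code operates on
// 16 bytes in memory.
type pair [2]uint64

// casFunc is the shape shared by both hypothetical implementations.
type casFunc func(addr *pair, old, desired pair) bool

// swapIfEqual is a plain, NON-atomic stand-in for the assembly routines.
func swapIfEqual(addr *pair, old, desired pair) bool {
	if *addr != old {
		return false
	}
	*addr = desired
	return true
}

// casWithCASP stands in for the Armv8.1+ path (single CASP instruction).
func casWithCASP(addr *pair, old, desired pair) bool { return swapIfEqual(addr, old, desired) }

// casWithLLSC stands in for the Armv8.0 path (LDAXP/STLXP retry loop).
func casWithLLSC(addr *pair, old, desired pair) bool { return swapIfEqual(addr, old, desired) }

// selectCAS picks an implementation from a feature flag. In the PR the
// flag is derived from cpu.ARM64.HasATOMICS; here it is just an argument.
func selectCAS(hasAtomics bool) (string, casFunc) {
	if hasAtomics {
		return "CASP", casWithCASP
	}
	return "LDAXP/STLXP", casWithLLSC
}

func main() {
	// Force the flag both ways so both paths run regardless of the host CPU.
	for _, has := range []bool{true, false} {
		name, cas := selectCAS(has)
		v := pair{1, 2}
		ok := cas(&v, pair{1, 2}, pair{3, 4})
		fmt.Println(name, ok, v)
	}
}
```

Forcing the flag to both values in a test is a cheap way to exercise the path that the host CPU would not normally take.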

@zhangyunhao116 zhangyunhao116 left a comment

Thanks for this exciting work! It looks good on my armv8.1 environment, just a few questions before merging this request :)

collection/lscq/asm_arm64.s (resolved)
collection/lscq/asm_arm64.s (outdated, resolved)
collection/lscq/asm_arm64.s (outdated, resolved)
yuchengye and others added 5 commits October 10, 2022 00:52
Fix bad indentation.
Use MOVD instead of ORR to copy between registers. Tidy the indentation between operator and operand.
On darwin, golang.org/x/sys/cpu.ARM64.HasATOMICS is set to false
when it should actually be true. To use the faster
CASPD instruction on darwin_arm64, we can therefore use sysctl/sysctlbyname
to detect CPU features on darwin; see https://developer.apple.com/documentation/apple-silicon/addressing-architectural-differences-in-your-macos-code.
sysctlEnabled is taken from internal/cpu.sysctlEnabled,
which calls sysctlbyname to detect whether a specific CPU feature is enabled.
On other operating systems it is fine to keep using golang.org/x/sys/cpu.ARM64.HasATOMICS.
One more bug fixed: MOVD R2, R6 => MOVD R6, R2. The bug would not have been found
while arm64HasAtomics was set to false.
All tests and benchmarks pass.
kabu1204 commented Oct 10, 2022

Update: detect arm64HasAtomics correctly on darwin

Before:
arm64HasAtomics = golang.org/x/sys/cpu.ARM64.HasATOMICS. This is correct on every platform except darwin_arm64. After checking the source, I found that golang.org/x/sys/cpu simply does not support darwin_arm64 :), so it sets all optional arm64 CPU features to false on darwin.
This does not affect the correctness of LSCQ on darwin_arm64; Cas128bit just takes the slightly slower path (using the LDAXP/STLXP instructions).

After:
Use sysctlEnabled() to detect CPU features on darwin_arm64. The function is pulled in from internal/cpu via go:linkname. It uses sysctlbyname, a system interface provided by darwin, to detect whether a given CPU feature is enabled.
On other platforms, we still use golang.org/x/sys/cpu.ARM64.HasATOMICS.
As a result, the faster CASPD instruction can now be used on darwin_arm64.

All tests and benchmarks pass on darwin (M1) and Ubuntu 20.04 (Ampere Altra).

More details are in the commit message.
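
The before/after logic can be sketched as a small, platform-independent function. This is a hedged illustration rather than the PR's code: `detectAtomics` and `atomicsSysctl` are hypothetical names, the sysctl lookup is injected as a function so the example runs anywhere, its signature is an assumption (it is not claimed to match internal/cpu.sysctlEnabled), and the key `hw.optional.armv8_1_atomics` is assumed to be the one queried for this feature on darwin/arm64.

```go
package main

import "fmt"

// atomicsSysctl is ASSUMED to be the darwin sysctl key for Armv8.1 atomics.
const atomicsSysctl = "hw.optional.armv8_1_atomics"

// detectAtomics reports whether the CASP path may be used.
//
//   - goos is the operating system name (what runtime.GOOS would report).
//   - sysctlEnabled is a hypothetical stand-in for the function linked in
//     from internal/cpu; it is only consulted on darwin.
//   - xsysHasAtomics is the value cpu.ARM64.HasATOMICS would report; it is
//     trusted everywhere except darwin, where it is always false.
func detectAtomics(goos string, sysctlEnabled func(name string) bool, xsysHasAtomics bool) bool {
	if goos == "darwin" {
		return sysctlEnabled(atomicsSysctl)
	}
	return xsysHasAtomics
}

func main() {
	// A fake sysctl that reports the feature as present, as an M1 would.
	fakeSysctl := func(name string) bool { return name == atomicsSysctl }

	fmt.Println(detectAtomics("darwin", fakeSysctl, false)) // true: sysctl overrides the false flag
	fmt.Println(detectAtomics("linux", fakeSysctl, false))  // false: flag is trusted as-is
	fmt.Println(detectAtomics("linux", fakeSysctl, true))   // true
}
```

Injecting the lookup keeps the decision logic testable on any host; the real darwin-only call would sit behind a build constraint.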

@zhangyunhao116

Nice work! It works fine in my linux/arm64 and darwin/arm64 environments. Thanks!

@zhangyunhao116 zhangyunhao116 merged commit a5420d7 into bytedance:develop Oct 11, 2022
@zhangyunhao116

@kabu1204 It's fine. The tests pass on my linux/amd64 machine, and since this feature does not target linux/amd64, I think it's OK to merge. Thanks again!

joway pushed a commit to joway/gopkg that referenced this pull request Apr 17, 2024
* feat(lscq): add arm64 support
* use cpu.ARM64.HasATOMICS to determine whether the arm64 cpu has standalone CAS instructions(CASP is only available for ARMv8.1+, use LDAX/STLX instruction on ARMv8.0 instead).
* feat(lscq): use LDAXP/STLXP for armv8.0 instead of CASP
* fix(lscq): fix bad indention
* fix(lscq): replace ORR with MOVD
* fix(lscq): detect atomics feature correctly on darwin

Co-authored-by: yuchengye <yuchengye@bytedance.com>