-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
module: small fixes for FreeBSD/aarch64 #14715
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -507,6 +507,16 @@ CFLAGS.zstd_lazy.c+= ${__ZFS_ZSTD_AARCH64_FLAGS} | |
CFLAGS.zstd_ldm.c+= ${__ZFS_ZSTD_AARCH64_FLAGS} | ||
CFLAGS.zstd_opt.c+= ${__ZFS_ZSTD_AARCH64_FLAGS} | ||
|
||
sha256-armv8.o: sha256-armv8.S | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't like this hackish thing ;-) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Same, but I just copied the technique from the next couple of lines after it There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Also, I'm not really sure what you mean. We have to pass -mgeneral-regs-only normally to avoid unexpected fpu use, but we really need the neon instructions here. We don't have a better way to strip it from just these four files AFAICT, and removing the offending instructions completely defeats the point of the implementation. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There is a runtime test to check for fpu+armv8 crypto ... when this test is okay, the code gets executed. The current work arround may be some There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. -mgeneral-regs-only forbids use of floating-point instructions in assembly. These files, by design, use floating-point. Therefore the flag must be stripped specifically for these files, otherwise they will not build. What is your exact objection? It's unclear what your two paragraphs are actually saying specifically on a technical level. This isn't OS-specific; it's just a difference between LLVM, which applies -mgeneral-regs-only to the assembly, and GCC, which doesn't forward it on to GNU as (and there may not even be a flag to do it?). Compiling the Linux module with LLVM would hit the exact same issue. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It seems a bit odd to me that CFLAGS applies to assembly, but that is a complaint to file with LLVM. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. CFLAGS is just the name for it in the build system. And nothing says -mfoo is purely for the compiler and not the assembler; there's precedent with things like -march governing which instructions you get, with -mgeneral-regs-only being implemented in LLVM as disabling the fp-armv8, crypto and neon features, i.e. the same as what you get with -march=armv8-a+nofp+nosimd+nocrypto. It's just a difference of opinion. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. -mgeneral-regs-only does more than just that. It permanently prevents you from turning on these instructions via pragmas, which has caused me plenty of grief late last year when looking into aarch64 optimizations on key fletcher4 functions. :/ In any case, I have far too much to do to have time to complain to LLVM about this and even if I did and they agreed, we would still need to support the older versions anyway. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. |
||
${CC} -c ${CFLAGS:N-mgeneral-regs-only} ${WERROR} ${.IMPSRC} \ | ||
-o ${.TARGET} | ||
${CTFCONVERT_CMD} | ||
|
||
sha512-armv8.o: sha512-armv8.S | ||
${CC} -c ${CFLAGS:N-mgeneral-regs-only} ${WERROR} ${.IMPSRC} \ | ||
-o ${.TARGET} | ||
${CTFCONVERT_CMD} | ||
|
||
b3_aarch64_sse2.o: b3_aarch64_sse2.S | ||
${CC} -c ${CFLAGS:N-mgeneral-regs-only} ${WERROR} ${.IMPSRC} \ | ||
-o ${.TARGET} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Has this been tested? Maybe we can wait a bit, until I fixed blake3 and sha2 assembly.
I think I dig deeper into this tomorrow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, as noted in the x18 issue, I tested this with hacked-around blake3 and can actually boot again now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to enter the fpu state because the checksum benchmark is called from the boot thread. In this thread floating point instructions are disabled so we need to enter a fpu section.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Master being broken on FreeBSD is a big problem. Rather than leaving master broken while we figure out the ideal solution, we should fix it now and revise it to be more elegant later to allow development that depends on aarch64 working on FreeBSD to continue independently of this.