-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-36685: [R][C++] Fix illegal opcode failure with Homebrew #36705
Conversation
|
@github-actions crossbow submit homebrew-r-brew |
This comment was marked as outdated.
This comment was marked as outdated.
@github-actions crossbow submit homebrew-r-brew |
This comment was marked as outdated.
This comment was marked as outdated.
@github-actions crossbow submit homebrew-r-brew |
This comment was marked as outdated.
This comment was marked as outdated.
@github-actions crossbow submit homebrew-r-brew |
This comment was marked as outdated.
This comment was marked as outdated.
@github-actions crossbow submit homebrew-r-brew |
This comment was marked as outdated.
This comment was marked as outdated.
@github-actions crossbow submit homebrew-r-brew |
This comment was marked as outdated.
This comment was marked as outdated.
@github-actions crossbow submit homebrew-r-brew |
This comment was marked as outdated.
This comment was marked as outdated.
@github-actions crossbow submit homebrew-r-brew |
This comment was marked as outdated.
This comment was marked as outdated.
@github-actions crossbow submit homebrew-r-brew |
This comment was marked as outdated.
This comment was marked as outdated.
@pitrou @westonpace Could you check the PR description? |
This comment was marked as outdated.
This comment was marked as outdated.
You could pass |
The other solution probably has a performance penalty as well, otherwise it wouldn't work? (can you check the resulting binary and see if there any AVX2 routines?) |
I can't check it easily... (We need to upload binaries and check them locally.) I'll use |
@github-actions crossbow submit homebrew-r-brew |
This comment was marked as outdated.
This comment was marked as outdated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The description makes sense to me. I agree that 1 is a simpler approach. We probably shouldn't try to rely on strange build configurations to make runtime SIMD dispatch work. Instead we should have a software solution. Do we want a follow-up PR for a software fix in case someone wanted to try and improve the performance on macos?
There is a lot of debug info in this PR. I think this is probably a good thing (having more info in our build logs can help debugging) but I just want to verify it is intentional?
# TODO(ARROW-16907): apache/arrow@main seems to be installed already | ||
# so this does nothing on a branch/PR | ||
brew install -v --HEAD apache-arrow | ||
brew install -v --HEAD {{ '$(brew --repository homebrew/core)/Formula/apache-arrow.rb' }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to specify absolute path for a formula to use the local modified version. (We didn't need this with old Homebrew.)
I think that it's related to the "Using JSON files downloaded from formulae.brew.sh for package installation rather than local homebrew/core and homebrew/cask taps." change in Homebrew 4.0.0: https://brew.sh/2023/02/16/homebrew-4.0.0/
We may be able to use HOMEBREW_NO_INSTALL_FROM_API=1
instead of specifying an absolute path. But I think that the absolute path approach is straightforward.
@@ -57,7 +57,8 @@ class ApacheArrow < Formula | |||
fails_with gcc: "5" | |||
|
|||
def install | |||
# https://github.com/Homebrew/homebrew-core/issues/76537 | |||
# This isn't for https://github.com/Homebrew/homebrew-core/issues/76537 . |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is deleting the old comment enough? Is there a reason a future reader would think this is related to 76537?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we defer this to another PR?
To be honest, I want to remove ENV.runtime_cpu_detection if Hardware::CPU.intel?
entirely or remove if Hardware::CPU.intel?
. If I do the change, we can remove this.
But I want to keep this in this PR to clarify we choose "1. -DARROW_RUNTIME_SIMD_LEVEL=NONE
instead of "2. Remove ENV.runtime_cpu_detection if Hardware::CPU.intel?
".
@github-actions crossbow submit homebrew-r-brew |
Revision: 7e688a0 Submitted crossbow builds: ursacomputing/crossbow @ actions-6a66c6840c
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably shouldn't try to rely on strange build configurations to make runtime SIMD dispatch work. Instead we should have a software solution. Do we want a follow-up PR for a software fix in case someone wanted to try and improve the performance on macos?
I've opened a new issue for it: #36902
There is a lot of debug info in this PR. I think this is probably a good thing (having more info in our build logs can help debugging) but I just want to verify it is intentional?
Yes. It's intentional. I think that we should be able to get the debug information by default. I've also mentioned these changes in the PR description.
@@ -57,7 +57,8 @@ class ApacheArrow < Formula | |||
fails_with gcc: "5" | |||
|
|||
def install | |||
# https://github.com/Homebrew/homebrew-core/issues/76537 | |||
# This isn't for https://github.com/Homebrew/homebrew-core/issues/76537 . |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we defer this to another PR?
To be honest, I want to remove ENV.runtime_cpu_detection if Hardware::CPU.intel?
entirely or remove if Hardware::CPU.intel?
. If I do the change, we can remove this.
But I want to keep this in this PR to clarify we choose "1. -DARROW_RUNTIME_SIMD_LEVEL=NONE
instead of "2. Remove ENV.runtime_cpu_detection if Hardware::CPU.intel?
".
# TODO(ARROW-16907): apache/arrow@main seems to be installed already | ||
# so this does nothing on a branch/PR | ||
brew install -v --HEAD apache-arrow | ||
brew install -v --HEAD {{ '$(brew --repository homebrew/core)/Formula/apache-arrow.rb' }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to specify absolute path for a formula to use the local modified version. (We didn't need this with old Homebrew.)
I think that it's related to the "Using JSON files downloaded from formulae.brew.sh for package installation rather than local homebrew/core and homebrew/cask taps." change in Homebrew 4.0.0: https://brew.sh/2023/02/16/homebrew-4.0.0/
We may be able to use HOMEBREW_NO_INSTALL_FROM_API=1
instead of specifying an absolute path. But I think that the absolute path approach is straightforward.
If nobody objects this, I'll merge this in a few days. |
After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit af23f6a. There were 3 benchmark results indicating a performance regression:
The full Conbench report has more details. It also includes information about possible false positives for unstable benchmarks that are known to sometimes produce them. |
…ache#36705) ### Rationale for this change Summary of this problem: apache#31132 (comment) Why this problem is happen again? Because I removed `ENV["HOMEBREW_OPTIMIZATION_LEVEL"] = "O2"` in apache#36583. The solution we chose by apache#14342 was forcing to use `-O2` for SIMD related code. It works for `-DCMAKE_BUILD_TYPE=MinSizeRel` but it doesn't work for Homebrew. Because Homebrew's CC https://github.com/Homebrew/brew/blob/master/Library/Homebrew/shims/super/cc forces to use the same `-O` flag. The default is `-Os`. If we specify `-O2`, Homebrew's CC replaces it to `-Os`. If we use `ENV["HOMEBREW_OPTIMIZATION_LEVEL"] = "O2"`, Homebrew's CC always use `-O2`. So the solution we chose by apache#14342 isn't used for Homebrew. But Homebrew thinks that `ENV["HOMEBREW_OPTIMIZATION_LEVEL"] = "O2"` is a workaround. So we need another solution for Homebrew. Here are candidate solutions: 1. `-DARROW_RUNTIME_SIMD_LEVEL=NONE` 2. Remove `ENV.runtime_cpu_detection if Hardware::CPU.intel?` "1. `-DARROW_RUNTIME_SIMD_LEVEL=NONE`" works because we don't use the runtime SIMD dispatch feature (the problematic feature) entirely. "2. Remove `ENV.runtime_cpu_detection if Hardware::CPU.intel?`" works but I don't know why... If `ENV.runtime_cpu_detection` is called, Homebrew's CC stops replacing `-march=*`. If we call `ENV.runtime_cpu_detection`, `-march=haswell` is used for AVX2 related code and `-march=skylake-avx512` is used for AVX512 including BMI2 related code. If we don't call `ENV.runtime_cpu_detection`, `-march=nehalem` is always used. (Note that SIMD related flags such as `-mbmi2` aren't removed by Homebrew's CC. So I think that SIMD is enabled.) I don't know why but "the one-definition-rule violation" (see the summary for details: apache#31132 (comment) ) isn't happen. FYI: CPU info for GitHub Actions macOS hosted-runner: ```console $ sysctl hw.optional machdep.cpu hw.optional.adx: 0 hw.optional.aes: 1 hw.optional.avx1_0: 1 hw.optional.avx2_0: 0 hw.optional.avx512bw: 0 hw.optional.avx512cd: 0 hw.optional.avx512dq: 0 hw.optional.avx512f: 0 hw.optional.avx512ifma: 0 hw.optional.avx512vbmi: 0 hw.optional.avx512vl: 0 hw.optional.bmi1: 0 hw.optional.bmi2: 0 hw.optional.enfstrg: 0 hw.optional.f16c: 1 hw.optional.floatingpoint: 1 hw.optional.fma: 0 hw.optional.hle: 0 hw.optional.mmx: 1 hw.optional.mpx: 0 hw.optional.rdrand: 1 hw.optional.rtm: 0 hw.optional.sgx: 0 hw.optional.sse: 1 hw.optional.sse2: 1 hw.optional.sse3: 1 hw.optional.sse4_1: 1 hw.optional.sse4_2: 1 hw.optional.supplementalsse3: 1 hw.optional.x86_64: 1 machdep.cpu.address_bits.physical: 43 machdep.cpu.address_bits.virtual: 48 machdep.cpu.arch_perf.events: 127 machdep.cpu.arch_perf.events_number: 7 machdep.cpu.arch_perf.fixed_number: 0 machdep.cpu.arch_perf.fixed_width: 0 machdep.cpu.arch_perf.number: 4 machdep.cpu.arch_perf.version: 1 machdep.cpu.arch_perf.width: 48 machdep.cpu.cache.L2_associativity: 8 machdep.cpu.cache.linesize: 64 machdep.cpu.cache.size: 256 machdep.cpu.mwait.extensions: 3 machdep.cpu.mwait.linesize_max: 4096 machdep.cpu.mwait.linesize_min: 64 machdep.cpu.mwait.sub_Cstates: 16 machdep.cpu.thermal.ACNT_MCNT: 0 machdep.cpu.thermal.core_power_limits: 0 machdep.cpu.thermal.dynamic_acceleration: 0 machdep.cpu.thermal.energy_policy: 0 machdep.cpu.thermal.fine_grain_clock_mod: 0 machdep.cpu.thermal.hardware_feedback: 0 machdep.cpu.thermal.invariant_APIC_timer: 1 machdep.cpu.thermal.package_thermal_intr: 0 machdep.cpu.thermal.sensor: 0 machdep.cpu.thermal.thresholds: 0 machdep.cpu.tlb.data.small: 64 machdep.cpu.tlb.inst.large: 8 machdep.cpu.tlb.inst.small: 64 machdep.cpu.tlb.shared: 512 machdep.cpu.tsc_ccc.denominator: 0 machdep.cpu.tsc_ccc.numerator: 0 machdep.cpu.xsave.extended_state: 7 832 832 0 machdep.cpu.xsave.extended_state1: 0 0 0 0 machdep.cpu.brand: 0 machdep.cpu.brand_string: Intel(R) Xeon(R) CPU E5-1650 v2 @ 3.50GHz machdep.cpu.core_count: 3 machdep.cpu.cores_per_package: 4 machdep.cpu.extfamily: 0 machdep.cpu.extfeature_bits: 4967106816 machdep.cpu.extfeatures: SYSCALL XD EM64T LAHF RDTSCP TSCI machdep.cpu.extmodel: 3 machdep.cpu.family: 6 machdep.cpu.feature_bits: 18427078393948011519 machdep.cpu.features: FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH MMX FXSR SSE SSE2 SS HTT SSE3 PCLMULQDQ MON VMX SSSE3 CX16 SSE4.1 SSE4.2 x2APIC POPCNT AES VMM PCID XSAVE OSXSAVE TSCTMR AVX1.0 RDRAND F16C machdep.cpu.leaf7_feature_bits: 643 0 machdep.cpu.leaf7_feature_bits_edx: 3154117632 machdep.cpu.leaf7_features: RDWRFSGS TSC_THREAD_OFFSET SMEP ERMS MDCLEAR IBRS STIBP L1DF ACAPMSR SSBD machdep.cpu.logical_per_package: 4 machdep.cpu.max_basic: 13 machdep.cpu.max_ext: 2147483656 machdep.cpu.microcode_version: 1070 machdep.cpu.model: 58 machdep.cpu.processor_flag: 0 machdep.cpu.signature: 198313 machdep.cpu.stepping: 9 machdep.cpu.thread_count: 3 machdep.cpu.vendor: GenuineIntel ``` ### What changes are included in this PR? "1. `-DARROW_RUNTIME_SIMD_LEVEL=NONE`" because it's straightforward and "2. Remove `ENV.runtime_cpu_detection if Hardware::CPU.intel?`" may also disable runtime SIMD dispatch implicitly. This also adds the following debug information for easy to debug in future: * CPU information for GitHub Actions runner * Homebrew's build logs ### Are these changes tested? Yes. ### Are there any user-facing changes? No. * Closes: apache#36685 Authored-by: Sutou Kouhei <kou@clear-code.com> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
…ache#36705) ### Rationale for this change Summary of this problem: apache#31132 (comment) Why this problem is happen again? Because I removed `ENV["HOMEBREW_OPTIMIZATION_LEVEL"] = "O2"` in apache#36583. The solution we chose by apache#14342 was forcing to use `-O2` for SIMD related code. It works for `-DCMAKE_BUILD_TYPE=MinSizeRel` but it doesn't work for Homebrew. Because Homebrew's CC https://github.com/Homebrew/brew/blob/master/Library/Homebrew/shims/super/cc forces to use the same `-O` flag. The default is `-Os`. If we specify `-O2`, Homebrew's CC replaces it to `-Os`. If we use `ENV["HOMEBREW_OPTIMIZATION_LEVEL"] = "O2"`, Homebrew's CC always use `-O2`. So the solution we chose by apache#14342 isn't used for Homebrew. But Homebrew thinks that `ENV["HOMEBREW_OPTIMIZATION_LEVEL"] = "O2"` is a workaround. So we need another solution for Homebrew. Here are candidate solutions: 1. `-DARROW_RUNTIME_SIMD_LEVEL=NONE` 2. Remove `ENV.runtime_cpu_detection if Hardware::CPU.intel?` "1. `-DARROW_RUNTIME_SIMD_LEVEL=NONE`" works because we don't use the runtime SIMD dispatch feature (the problematic feature) entirely. "2. Remove `ENV.runtime_cpu_detection if Hardware::CPU.intel?`" works but I don't know why... If `ENV.runtime_cpu_detection` is called, Homebrew's CC stops replacing `-march=*`. If we call `ENV.runtime_cpu_detection`, `-march=haswell` is used for AVX2 related code and `-march=skylake-avx512` is used for AVX512 including BMI2 related code. If we don't call `ENV.runtime_cpu_detection`, `-march=nehalem` is always used. (Note that SIMD related flags such as `-mbmi2` aren't removed by Homebrew's CC. So I think that SIMD is enabled.) I don't know why but "the one-definition-rule violation" (see the summary for details: apache#31132 (comment) ) isn't happen. FYI: CPU info for GitHub Actions macOS hosted-runner: ```console $ sysctl hw.optional machdep.cpu hw.optional.adx: 0 hw.optional.aes: 1 hw.optional.avx1_0: 1 hw.optional.avx2_0: 0 hw.optional.avx512bw: 0 hw.optional.avx512cd: 0 hw.optional.avx512dq: 0 hw.optional.avx512f: 0 hw.optional.avx512ifma: 0 hw.optional.avx512vbmi: 0 hw.optional.avx512vl: 0 hw.optional.bmi1: 0 hw.optional.bmi2: 0 hw.optional.enfstrg: 0 hw.optional.f16c: 1 hw.optional.floatingpoint: 1 hw.optional.fma: 0 hw.optional.hle: 0 hw.optional.mmx: 1 hw.optional.mpx: 0 hw.optional.rdrand: 1 hw.optional.rtm: 0 hw.optional.sgx: 0 hw.optional.sse: 1 hw.optional.sse2: 1 hw.optional.sse3: 1 hw.optional.sse4_1: 1 hw.optional.sse4_2: 1 hw.optional.supplementalsse3: 1 hw.optional.x86_64: 1 machdep.cpu.address_bits.physical: 43 machdep.cpu.address_bits.virtual: 48 machdep.cpu.arch_perf.events: 127 machdep.cpu.arch_perf.events_number: 7 machdep.cpu.arch_perf.fixed_number: 0 machdep.cpu.arch_perf.fixed_width: 0 machdep.cpu.arch_perf.number: 4 machdep.cpu.arch_perf.version: 1 machdep.cpu.arch_perf.width: 48 machdep.cpu.cache.L2_associativity: 8 machdep.cpu.cache.linesize: 64 machdep.cpu.cache.size: 256 machdep.cpu.mwait.extensions: 3 machdep.cpu.mwait.linesize_max: 4096 machdep.cpu.mwait.linesize_min: 64 machdep.cpu.mwait.sub_Cstates: 16 machdep.cpu.thermal.ACNT_MCNT: 0 machdep.cpu.thermal.core_power_limits: 0 machdep.cpu.thermal.dynamic_acceleration: 0 machdep.cpu.thermal.energy_policy: 0 machdep.cpu.thermal.fine_grain_clock_mod: 0 machdep.cpu.thermal.hardware_feedback: 0 machdep.cpu.thermal.invariant_APIC_timer: 1 machdep.cpu.thermal.package_thermal_intr: 0 machdep.cpu.thermal.sensor: 0 machdep.cpu.thermal.thresholds: 0 machdep.cpu.tlb.data.small: 64 machdep.cpu.tlb.inst.large: 8 machdep.cpu.tlb.inst.small: 64 machdep.cpu.tlb.shared: 512 machdep.cpu.tsc_ccc.denominator: 0 machdep.cpu.tsc_ccc.numerator: 0 machdep.cpu.xsave.extended_state: 7 832 832 0 machdep.cpu.xsave.extended_state1: 0 0 0 0 machdep.cpu.brand: 0 machdep.cpu.brand_string: Intel(R) Xeon(R) CPU E5-1650 v2 @ 3.50GHz machdep.cpu.core_count: 3 machdep.cpu.cores_per_package: 4 machdep.cpu.extfamily: 0 machdep.cpu.extfeature_bits: 4967106816 machdep.cpu.extfeatures: SYSCALL XD EM64T LAHF RDTSCP TSCI machdep.cpu.extmodel: 3 machdep.cpu.family: 6 machdep.cpu.feature_bits: 18427078393948011519 machdep.cpu.features: FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH MMX FXSR SSE SSE2 SS HTT SSE3 PCLMULQDQ MON VMX SSSE3 CX16 SSE4.1 SSE4.2 x2APIC POPCNT AES VMM PCID XSAVE OSXSAVE TSCTMR AVX1.0 RDRAND F16C machdep.cpu.leaf7_feature_bits: 643 0 machdep.cpu.leaf7_feature_bits_edx: 3154117632 machdep.cpu.leaf7_features: RDWRFSGS TSC_THREAD_OFFSET SMEP ERMS MDCLEAR IBRS STIBP L1DF ACAPMSR SSBD machdep.cpu.logical_per_package: 4 machdep.cpu.max_basic: 13 machdep.cpu.max_ext: 2147483656 machdep.cpu.microcode_version: 1070 machdep.cpu.model: 58 machdep.cpu.processor_flag: 0 machdep.cpu.signature: 198313 machdep.cpu.stepping: 9 machdep.cpu.thread_count: 3 machdep.cpu.vendor: GenuineIntel ``` ### What changes are included in this PR? "1. `-DARROW_RUNTIME_SIMD_LEVEL=NONE`" because it's straightforward and "2. Remove `ENV.runtime_cpu_detection if Hardware::CPU.intel?`" may also disable runtime SIMD dispatch implicitly. This also adds the following debug information for easy to debug in future: * CPU information for GitHub Actions runner * Homebrew's build logs ### Are these changes tested? Yes. ### Are there any user-facing changes? No. * Closes: apache#36685 Authored-by: Sutou Kouhei <kou@clear-code.com> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
Rationale for this change
Summary of this problem: #31132 (comment)
Why this problem is happen again? Because I removed
ENV["HOMEBREW_OPTIMIZATION_LEVEL"] = "O2"
in #36583. The solution we chose by #14342 was forcing to use-O2
for SIMD related code. It works for-DCMAKE_BUILD_TYPE=MinSizeRel
but it doesn't work for Homebrew.Because Homebrew's CC https://github.com/Homebrew/brew/blob/master/Library/Homebrew/shims/super/cc forces to use the same
-O
flag. The default is-Os
. If we specify-O2
, Homebrew's CC replaces it to-Os
. If we useENV["HOMEBREW_OPTIMIZATION_LEVEL"] = "O2"
, Homebrew's CC always use-O2
. So the solution we chose by #14342 isn't used for Homebrew.But Homebrew thinks that
ENV["HOMEBREW_OPTIMIZATION_LEVEL"] = "O2"
is a workaround. So we need another solution for Homebrew.Here are candidate solutions:
-DARROW_RUNTIME_SIMD_LEVEL=NONE
ENV.runtime_cpu_detection if Hardware::CPU.intel?
"1.
-DARROW_RUNTIME_SIMD_LEVEL=NONE
" works because we don't use the runtime SIMD dispatch feature (the problematic feature) entirely."2. Remove
ENV.runtime_cpu_detection if Hardware::CPU.intel?
" works but I don't know why... IfENV.runtime_cpu_detection
is called, Homebrew's CC stops replacing-march=*
. If we callENV.runtime_cpu_detection
,-march=haswell
is used for AVX2 related code and-march=skylake-avx512
is used for AVX512 including BMI2 related code. If we don't callENV.runtime_cpu_detection
,-march=nehalem
is always used. (Note that SIMD related flags such as-mbmi2
aren't removed by Homebrew's CC. So I think that SIMD is enabled.) I don't know why but "the one-definition-rule violation" (see the summary for details: #31132 (comment) ) isn't happen.FYI: CPU info for GitHub Actions macOS hosted-runner:
What changes are included in this PR?
"1.
-DARROW_RUNTIME_SIMD_LEVEL=NONE
" because it's straightforward and "2. RemoveENV.runtime_cpu_detection if Hardware::CPU.intel?
" may also disable runtime SIMD dispatch implicitly.This also adds the following debug information for easy to debug in future:
Are these changes tested?
Yes.
Are there any user-facing changes?
No.