Commit 635d383

[X86] Enable -mprefer-vector-width=256 by default for Skylake-avx512 and later Intel CPUs.

AVX512 instructions can cause a frequency drop on these CPUs. This can negate the performance gains from using wider vectors. Enabling prefer-vector-width=256 will prevent generation of zmm registers unless explicit 512 bit operations are used in the original source code.

I believe gcc and icc both do something similar to this by default.

Differential Revision: https://reviews.llvm.org/D67259

llvm-svn: 371694

1 parent 55d86f0 commit 635d383

4 files changed: +20 -2 lines changed

clang/docs/ReleaseNotes.rst

Lines changed: 6 additions & 2 deletions
@@ -56,8 +56,12 @@ Improvements to Clang's diagnostics
 Non-comprehensive list of changes in this release
 -------------------------------------------------

-- ...
-
+- For X86 target, -march=skylake-avx512, -march=icelake-client,
+  -march=icelake-server, -march=cascadelake, -march=cooperlake will default to
+  not using 512-bit zmm registers in vectorized code unless 512-bit intrinsics
+  are used in the source code. 512-bit operations are known to cause the CPUs
+  to run at a lower frequency which can impact performance. This behavior can be
+  changed by passing -mprefer-vector-width=512 on the command line.

 New Compiler Flags
 ------------------
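
As a rough illustration of the release note above, the following is a minimal sketch of the kind of plain loop that the auto-vectorizer handles and that this default therefore affects. The function name `saxpy`, the file name in the comment, and the exact build lines are assumptions made for illustration; only the `-march` and `-mprefer-vector-width` flags come from the commit itself, and the actual code generation depends on the compiler version and optimization level.

```c
#include <stddef.h>

/*
 * Plain scalar loop with no intrinsics: a typical auto-vectorization candidate.
 *
 * Hypothetical build lines (the file name saxpy.c is an assumption):
 *
 *   clang -O2 -march=skylake-avx512 -c saxpy.c
 *       With this commit, vectorized code is expected to prefer
 *       256-bit ymm registers.
 *
 *   clang -O2 -march=skylake-avx512 -mprefer-vector-width=512 -c saxpy.c
 *       Opts back in to 512-bit zmm registers for vectorized code.
 */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y) {
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Comparing the assembly from the two build lines (for example with `-S`) is one way to check which register width the vectorizer chose. Code that calls 512-bit intrinsics directly is described by the release note as unaffected by the new default.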

llvm/docs/ReleaseNotes.rst

Lines changed: 4 additions & 0 deletions
@@ -96,6 +96,10 @@ Changes to the X86 Target
   be passed in ZMM registers for calls and returns. Previously they were passed
   in two YMM registers. Old behavior can be enabled by passing
   -x86-enable-old-knl-abi
+* -mprefer-vector-width=256 is now the default behavior skylake-avx512 and later
+  Intel CPUs. This tries to limit the use of 512-bit registers which can cause a
+  decrease in CPU frequency on these CPUs. This can be re-enabled by passing
+  -mprefer-vector-width=512 to clang or passing -mattr=-prefer-256-bit to llc.

 Changes to the AMDGPU Target
 -----------------------------

llvm/lib/Target/X86/X86.td

Lines changed: 2 additions & 0 deletions
@@ -601,6 +601,7 @@ def ProcessorFeatures {

   // Skylake-AVX512
   list<SubtargetFeature> SKXAdditionalFeatures = [FeatureAVX512,
+                                                  FeaturePrefer256Bit,
                                                   FeatureCDI,
                                                   FeatureDQI,
                                                   FeatureBWI,
@@ -634,6 +635,7 @@ def ProcessorFeatures {

   // Cannonlake
   list<SubtargetFeature> CNLAdditionalFeatures = [FeatureAVX512,
+                                                  FeaturePrefer256Bit,
                                                   FeatureCDI,
                                                   FeatureDQI,
                                                   FeatureBWI,

llvm/test/CodeGen/X86/min-legal-vector-width.ll

Lines changed: 8 additions & 0 deletions
@@ -1,6 +1,14 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=fast-variable-shuffle,avx512vl,avx512bw,avx512dq,prefer-256-bit | FileCheck %s --check-prefixes=CHECK,CHECK-AVX512
 ; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=fast-variable-shuffle,avx512vl,avx512bw,avx512dq,prefer-256-bit,avx512vbmi | FileCheck %s --check-prefixes=CHECK,CHECK-VBMI
+; Make sure CPUs default to prefer-256-bit. avx512vnni isn't interesting as it just adds an isel peephole for vpmaddwd+vpaddd
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mcpu=skylake-avx512 | FileCheck %s --check-prefixes=CHECK,CHECK-AVX512
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=-avx512vnni -mcpu=cascadelake | FileCheck %s --check-prefixes=CHECK,CHECK-AVX512
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=-avx512vnni -mcpu=cooperlake | FileCheck %s --check-prefixes=CHECK,CHECK-AVX512
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mcpu=cannonlake | FileCheck %s --check-prefixes=CHECK,CHECK-VBMI
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=-avx512vnni -mcpu=icelake-client | FileCheck %s --check-prefixes=CHECK,CHECK-VBMI
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=-avx512vnni -mcpu=icelake-server | FileCheck %s --check-prefixes=CHECK,CHECK-VBMI
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=-avx512vnni -mcpu=tigerlake | FileCheck %s --check-prefixes=CHECK,CHECK-VBMI

 ; This file primarily contains tests for specific places in X86ISelLowering.cpp that needed be made aware of the legalizer not allowing 512-bit vectors due to prefer-256-bit even though AVX512 is enabled.
