PR for llvm/llvm-project#53592 #5

Merged
merged 2 commits into from
Feb 7, 2022

Commits on Feb 5, 2022

  1. [x86] add test coverage for AMD Ryzen fast sqrt codegen; NFC

    (cherry picked from commit 1eb4f88)
    rotateright authored and llvmbot committed Feb 5, 2022
    5ce5c17
  2. [x86] enable fast sqrtss/sqrtps tuning for AMD Zen cores

    As discussed in D118534, all of the recent AMD CPUs have
    relatively fast (<14 cycle latency) "sqrtss" and "sqrtps"
    instructions:
    https://uops.info/table.html?search=sqrtps&cb_lat=on&cb_tp=on&cb_SNB=on&cb_SKL=on&cb_ZENp=on&cb_ZEN2=on&cb_ZEN3=on&cb_measurements=on&cb_avx=on&cb_sse=on
    
    So we should set this tuning flag to alter codegen of plain
    "sqrt(X)" expansion (as opposed to reciprocal-sqrt - there
    is other test coverage for that pattern). The expansion is
    both slower and less accurate than the hardware instruction.
    
    Differential Revision: https://reviews.llvm.org/D119001
    
    (cherry picked from commit fff3e1d)
    rotateright authored and llvmbot committed Feb 5, 2022
    324f21b
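
For context, the "expansion" that the second commit message contrasts with the hardware instruction can be modeled roughly as below. This is an illustrative Python sketch, not LLVM's code: the names `to_f32`, `rsqrt_estimate`, `sqrt_via_estimate`, and `sqrt_direct` are hypothetical, and the 12-bit truncation is only an assumed stand-in for a hardware reciprocal-sqrt estimate. It assumes the expansion takes the form of an estimate, one Newton-Raphson refinement step, and a final multiply by the input.

```python
import math
import struct


def to_f32(x: float) -> float:
    """Round a Python float to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def rsqrt_estimate(x: float) -> float:
    """Hypothetical model of a low-precision 1/sqrt(x) estimate.

    Assumption: keep roughly 12 mantissa bits by clearing the low
    11 bits of the single-precision result.
    """
    bits = struct.unpack("<I", struct.pack("<f", 1.0 / math.sqrt(x)))[0]
    bits &= 0xFFFFF800
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def sqrt_via_estimate(x: float) -> float:
    """sqrt(x) as x * rsqrt(x), with one Newton-Raphson refinement step."""
    if x == 0.0:
        # Guard the zero input: an infinite estimate times zero would be NaN.
        return 0.0
    e = rsqrt_estimate(x)
    # Newton-Raphson step for 1/sqrt(x): e' = e * (1.5 - 0.5 * x * e * e)
    e = to_f32(e * to_f32(1.5 - to_f32(0.5 * x * e * e)))
    return to_f32(x * e)


def sqrt_direct(x: float) -> float:
    """Model of the plain square-root instruction: one rounded sqrt."""
    return to_f32(math.sqrt(x))


if __name__ == "__main__":
    for x in (0.5, 2.0, 3.0, 10.0):
        exact = math.sqrt(x)
        err_est = abs(sqrt_via_estimate(x) - exact) / exact
        err_dir = abs(sqrt_direct(x) - exact) / exact
        print(f"x={x}: estimate-based error={err_est:.3e}, direct error={err_dir:.3e}")
```

The estimate-based path needs several dependent multiplies and a subtract on top of the estimate itself, and its result carries the accumulated rounding of each step. That is the trade-off the commit message describes when the direct instruction is already fast.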