Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Target] LLVM helper functions for any target info #15761

Merged
merged 1 commit into from
Sep 27, 2023
Merged

[Target] LLVM helper functions for any target info #15761

merged 1 commit into from
Sep 27, 2023

Conversation

cbalint13
Copy link
Contributor

@cbalint13 cbalint13 commented Sep 15, 2023

Hi folks,

This PR adds helper functions to query any LLVM information about: targets, arches and it's features.
It can enhance TVM target handling with information from LLVM module, just like partially was done in: #15685


Changes:

  • Introduces llvm_get_targets(), llvm_get_cpu_archlist(), llvm_get_cpu_features()& llvm_cpu_has_features()
  • Extends current target_has_feature() to work with any target, now also having -mtriple/-mattr awareness
  • Exposes booth C/FFI & PY interfaces for all the functions mentioned above.
  • Adds new unit tests for the function logic and API consistency stress.
  • Ensures proper API compatibility with older LLVM, tested with LLVM={10,14,15,16,17}.

Usage example

  • List all LLVM targets:
print( tvm.target.codegen.llvm_get_targets() )

["arm", "armeb", "aarch64", "aarch64_be", "aarch64_32", "avr", "bpfel", "bpfeb", "hexagon", 
"loongarch32", "loongarch64", "mips", "mipsel", "mips64", "mips64el", "msp430", "powerpc", 
"powerpcle", "powerpc64", "powerpc64le", "r600", "amdgcn", "riscv32", "riscv64", "sparc", 
"sparcv9", "sparcel", "s390x", "thumb", "thumbeb", "i386", "x86_64", "xcore", "nvptx", 
"nvptx64", "lanai", "wasm32", "wasm64"]
  • List all cpu architectures for a specific target:
print( tvm.target.codegen.llvm_get_cpu_archlist(tvm.target.Target("llvm -mtriple=hexagon--")))

["generic", "hexagonv5", "hexagonv55", "hexagonv60", "hexagonv62", "hexagonv65", 
"hexagonv66", "hexagonv67", "hexagonv67t", "hexagonv68", "hexagonv69", "hexagonv71", 
"hexagonv71t", "hexagonv73"]

print( tvm.target.codegen.llvm_get_cpu_archlist(tvm.target.Target("llvm -mtriple=riscv64-linux-gnu")))

["generic", "generic-rv32", "generic-rv64", "rocket", "rocket-rv32", "rocket-rv64", 
"sifive-7-series",  "sifive-e20", "sifive-e21", "sifive-e24", "sifive-e31", "sifive-e34", 
"sifive-e76", "sifive-s21", "sifive-s51", "sifive-s54", "sifive-s76", "sifive-u54", 
"sifive-u74", "sifive-x280", "syntacore-scr1-base",  "syntacore-scr1-max"]
  • List all cpu features for a specific cpu architecture for a specific target:
print( tvm.target.codegen.llvm_get_cpu_features(tvm.target.Target("llvm -mtriple=aarch64-linux-gnu -mcpu=neoverse-v2")) )

[ "altnzcv", "am", "bf16", "bti", "ccdp", "ccidx", "ccpp", "complxnum", "crc", "dit", "dotprod", 
"el2vmsa", "el3", "enable-select-opt", "ete", "flagm", "fp-armv8", "fp16fml", "fptoint", 
"fullfp16", "fuse-aes", "i8mm", "jsconv", "lor", "lse", "lse2", "lsl-fast", "mec", "mpam", "mte", 
"neon", "neoversev2", "nv", "pan", "pan-rwv", "pauth", "perfmon", "predictable-select-expensive", 
"predres", "rand", "ras", "rcpc", "rcpc-immo", "rdm", "rme", "sb", "sel2", "spe", "specrestrict", 
"ssbs", "sve", "sve2", "sve2-bitperm", "tlb-rmi", "tracev8.4", "trbe", "uaops", "use-postra-scheduler",
 "use-scalar-inc-vl", "v8.1a", "v8.2a", "v8.3a", "v8.4a", "v8.5a", "v8a", "v9a", "vh"]
  • Multiple features query:
$ cat tvm-check-avx512bw.py
#!/usr/bin/python3

import tvm
from tvm.target import codegen, Target
from tvm.target.codegen import target_has_features

import logging
logging.basicConfig(level=logging.DEBUG)

for mcpu in codegen.llvm_get_cpu_archlist(Target("llvm -mtriple=x86_64--")):
  with tvm.target.Target("llvm -mtriple=x86_64 -mcpu=%s" % mcpu):
    if target_has_features("avx512bw"):
      has_avx512f = target_has_features("avx512f")
      print("ARCH [%s] having `avx512bw` has avx512f=[%i]" % (mcpu, has_avx512f))

$ ./tvm-check-avx512bw.py 
ARCH [cannonlake] having `avx512bw` has avx512f=[1]
ARCH [cascadelake] having `avx512bw` has avx512f=[1]
ARCH [cooperlake] having `avx512bw` has avx512f=[1]
ARCH [emeraldrapids] having `avx512bw` has avx512f=[1]
ARCH [graniterapids] having `avx512bw` has avx512f=[1]
ARCH [graniterapids-d] having `avx512bw` has avx512f=[1]
ARCH [graniterapids_d] having `avx512bw` has avx512f=[1]
ARCH [icelake-client] having `avx512bw` has avx512f=[1]
ARCH [icelake-server] having `avx512bw` has avx512f=[1]
ARCH [icelake_client] having `avx512bw` has avx512f=[1]
ARCH [icelake_server] having `avx512bw` has avx512f=[1]
ARCH [rocketlake] having `avx512bw` has avx512f=[1]
ARCH [sapphirerapids] having `avx512bw` has avx512f=[1]
ARCH [skx] having `avx512bw` has avx512f=[1]
ARCH [skylake-avx512] having `avx512bw` has avx512f=[1]
ARCH [skylake_avx512] having `avx512bw` has avx512f=[1]
ARCH [tigerlake] having `avx512bw` has avx512f=[1]
ARCH [x86-64-v4] having `avx512bw` has avx512f=[1]
ARCH [znver4] having `avx512bw` has avx512f=[1]

  • Running on LLVM version:
print( "LLVM version = %s" % tvm.target.codegen.llvm_version_major() )

LLVM version = 17

Cc: @kparzysz-quic , @Lunderberg , @areusch , @junrushao , @tqchen , @elvin-n , @vvchernov , @echuraev

Thank you,
~Cristian.

@junrushao
Copy link
Member

This is awesome! Could you guys help review? @Lunderberg @vvchernov @kparzysz-quic @echuraev

Copy link
Contributor

@vvchernov vvchernov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @cbalint13! Thank you for your work! LGTM

Copy link
Contributor

@echuraev echuraev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cbalint13 Thank you for the nice improvements! LGTM!

Copy link
Contributor

@kparzysz-quic kparzysz-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code should be in LLVMTargetInfo. Could move it there?

@cbalint13
Copy link
Contributor Author

cbalint13 commented Sep 20, 2023

Hi @kparzysz-quic ,

Thank you for the review !

This code should be in LLVMTargetInfo. Could move it there?

Not sure about how to integrate there.

These (optional) helpers can be used independently, much earlier, prior emitting or llvm lowering steps.
I am looking at LLVMTargetInfo lifetime/scope here and not sure how to cope with FFI ins<->outs for any info requests.

Help me with a hint/idea about.

@kparzysz-quic
Copy link
Contributor

kparzysz-quic commented Sep 20, 2023

The idea behind LLVMInstance is that all interactions with the LLVM libraries only happen when an object of that class has been created. You can think of it as if "LLVM instance" was turning on LLVM, and turning it back off when the object is destroyed. There was a discussion about this a while back, motivated mostly by keeping track of the global state of LLVM. Long story short, LLVM code has a global state, and modifying the global state in one part of TVM may affect how LLVM works in other parts (e.g. setting up codegen for CPU may affect codegen for GPU). LLVM instance was created to encapsulate any potential global state changes into the lifetime period of the LLVM instance object.

The functions you wrote can be moved to llvm_instance.cc. They can create a temporary LLVMInstance object, get LLVMTargetInfo, get the llvm::Target object (or llvm::TargetMachine) from it, and do their work.

Btw, the functionality you're adding is great, what I'm asking for is to integrate it with the existing code.

Edit: the query functions you have written don't need to be members of LLVMTargetInfo , but they should create an LLVMInstance object and only call LLVM functions during the lifetime of that object.
Edit: Removed unfinished part of the sentence above.

@cbalint13
Copy link
Contributor Author

@kparzysz-quic ,

Thanks again taking time !

The idea behind LLVMInstance is that all interactions with the LLVM libraries only happen when an object of that class has been created. You can think of it as if "LLVM instance" was turning on LLVM, and turning it back off when the object is destroyed. There was a discussion about this a while back, motivated mostly by keeping track of the global state of LLVM. Long story short, LLVM code has a global state, and modifying the global state in one part of TVM may affect how LLVM works in other parts (e.g. setting up codegen for CPU may affect codegen for GPU). LLVM instance was created to encapsulate any potential global state changes into the lifetime period of the LLVM instance object.

  • I reflected on this way (as being the best place) when wrote this PR, but now explaining it is more clear.

The functions you wrote can be moved to llvm_instance.cc. They can create a temporary LLVMInstance object, get LLVMTargetInfo, get the llvm::Target object (or llvm::TargetMachine) from it, and do their work.

  • If it is fine to create-destroy a whole LLVMInstance() and appeal methods from it, lets go.
  • Worried about fact that even for single FFI query, maybe repeated many times in the hot-path iterations, to create/destroy/appeal whole LLVMInstance (plus sub-LLVM related) might be bad.
  • I was worrying that already have to go through (LLVM point of view) a whole 2*inits() ->register-target->target-machine->subtarget->query (LLVM way) just for the sake of "one simple" info. But that's it, if we want this.

Btw, the functionality you're adding is great, what I'm asking for is to integrate it with the existing code.

Thank you much for the feedback !

  • One thing I really keen was the ability to tell info on any LLVM version. I really insisted on making it backward compat, even for earliest LLVM. It makes sure about LLVM limits, e.g. some recentish "super-dooper" CPU cant use it's X or Y feature because your LLVM is older (and user doesn't know), so don't even try schedule whatever nice tensorization steps for the user. Perhaps even warn the user that his LLVM is old, but upgrading he will benefit.

Edit: the query functions you have written don't need to be members of LLVMTargetInfo if that , but they should create an LLVMInstance object and only call LLVM functions during the lifetime of that object.

  • Maybe, in future we could make this object a persistent one, even cache some things inside, but I would put this idea aside for future. Maybe at a time, it will be decided on some way to migrate more arches (all?) towards this way of query things, perhaps even at earliest tvm.target.Target("llvm -xxx") creation step.

The idea behind LLVMInstance is that all interactions with the LLVM libraries only happen when an object of that class has been created. You can think of it as if "LLVM instance" was turning on LLVM, and turning it back off when the object is destroyed. There was a discussion about this a while back, motivated mostly by keeping track of the global state of LLVM. Long story short, LLVM code has a global state, and modifying the global state in one part of TVM may affect how LLVM works in other parts (e.g. setting up codegen for CPU may affect codegen for GPU). LLVM instance was created to encapsulate any potential global state changes into the lifetime period of the LLVM instance object.

The functions you wrote can be moved to llvm_instance.cc. They can create a temporary LLVMInstance object, get LLVMTargetInfo, get the llvm::Target object (or llvm::TargetMachine) from it, and do their work.

Btw, the functionality you're adding is great, what I'm asking for is to integrate it with the existing code.

Edit: the query functions you have written don't need to be members of LLVMTargetInfo if that , but they should create an LLVMInstance object and only call LLVM functions during the lifetime of that object.

See now, again thanks much for the time !
Let me try (1-2 day), as you suggested, then if don't mind will re-ask to help again with a review.

@kparzysz-quic
Copy link
Contributor

Maybe, in future we could make this object a persistent one, even cache some things inside, but I would put this idea aside for future.

We're trying to prevent the LLVM state from being persistent. The specific use-case was using command-line options with LLVM. Once you set use some flags, their effects will persist, and LLVM may even prevent you from using the same flags again.

The workflow is

  1. Create LLVM instance object.
  2. Set up LLVM as you want (i.e. by command-line flags). This isn't required if you don't want to change anything.
  3. Use LLVM functions.
  4. Destroy the instance object.
  5. Don't use any LLVM functions again, and if you want to, repeat 1-4.

@cbalint13 cbalint13 closed this Sep 24, 2023
@cbalint13 cbalint13 reopened this Sep 24, 2023
@cbalint13
Copy link
Contributor Author

cbalint13 commented Sep 26, 2023

@kparzysz-quic

Added the fuctions as LLVMTargetInfo public members, now looks to be their ideal place.
Updated the first comment here with the example & usage of the exported FFI/PY interface interactions.

Changes:

  • LLVM 6.0 minimum is required, ever since, even before this PR, this can be seen in the old codegen_x86_64.cc
  • Since the context is LLVMTargetInfo class itself, getter/checker functions are now glued to: triple_, cpu_, attrs_
  • Updated all x86 checkers, as a continuation of [Target][TOPI] Use LLVM for x86 CPU feature lookup #15685
  • Added extra testcases to assure the new glue-logic against LLVMTargetInfo

Caevats:

  • llvm -mpcu=skylake works on any x86 host since -mtriple will be filled in automatically, but will fail on non-x86
  • Error would be: LLVM cpu architecture `-mcpu=skylake` is not valid in `-mtriple=aarch64-linux-gnu`
  • llvm -mtriple=x86_64-linux-gnu -mpcu=skylake is the explicit way to assure non-x86 hosts (updated CI tests)
  • The issue is also vice-versa: any missing -mtriple having only -mcpu on a foreign host will (explicitly) fail.

It is not possible to avoid explicit llvm -mtriple on foreign hosts, atempt to auto-deduce the -mtriple out of the explicit -mcpu by iterating all LLVM targets may lead to confusion of 32 vs. 64 bit targets (i.e. arm vs. aarch64), and the cpu in such wrongly infered context will have very different features.

@kparzysz-quic kparzysz-quic merged commit cf8521a into apache:main Sep 27, 2023
@junrushao
Copy link
Member

Hey @cbalint13, I got a compiler warning from this commit saying:

/Users/jshao/Projects/tvm-dev/src/target/llvm/llvm_instance.cc:76:81: warning: reference to stack memory associated with parameter 'Obj' returned [-Wreturn-stack-address]
   76 |   friend ArrayRef<SubtargetSubTypeKV>& archViewer(MCSubtargetInfo Obj) { return Obj.*Member; }
      |                                                                                 ^~~
/Users/jshao/Projects/tvm-dev/src/target/llvm/llvm_instance.cc:84:81: warning: reference to stack memory associated with parameter 'Obj' returned [-Wreturn-stack-address]
   84 |   friend ArrayRef<SubtargetFeatureKV>& featViewer(MCSubtargetInfo Obj) { return Obj.*Member; }
      |                                                                                 ^~~
/Users/jshao/Projects/tvm-dev/src/target/llvm/llvm_instance.cc:763:49: warning: cast from 'const llvm::MCSubtargetInfo *' to 'llvm::MCSubtargetInfo *' drops const qualifier [-Wcast-qual]
  763 |       llvm::archViewer(*(llvm::MCSubtargetInfo*)MCInfo);
      |                                                 ^
/Users/jshao/Projects/tvm-dev/src/target/llvm/llvm_instance.cc:784:49: warning: cast from 'const llvm::MCSubtargetInfo *' to 'llvm::MCSubtargetInfo *' drops const qualifier [-Wcast-qual]
  784 |       llvm::featViewer(*(llvm::MCSubtargetInfo*)MCInfo);
      |                                                 ^

I am a bit confused by the warning message, not completely sure if it is false positive from the compiler, so am following up here to double check

@cbalint13
Copy link
Contributor Author

cbalint13 commented Oct 16, 2023

/Users/jshao/Projects/tvm-dev/src/target/llvm/llvm_instance.cc:763:49: warning: cast from 'const llvm::MCSubtargetInfo *' to 'llvm::MCSubtargetInfo *' drops const qualifier [-Wcast-qual]

Hmm, if it's about const I think it can be fixed easily.
Let me know what compiler you have and I try address a fix for both warns.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants