Update LLVM release/16.x
github-actions[bot] committed Jul 10, 2023
1 parent f705553 commit eadead3
Showing 64 changed files with 7,804 additions and 2,852 deletions.
167 changes: 98 additions & 69 deletions README.md
@@ -1,93 +1,122 @@
# Capstone's LLVM with refactored TableGen backends
# The LLVM Compiler Infrastructure

This LLVM version has the purpose to generate code for the
[Capstone disassembler](https://github.com/capstone-engine/capstone).
This directory and its sub-directories contain the source code for LLVM,
a toolkit for the construction of highly optimized compilers,
optimizers, and run-time environments.

It refactors the TableGen emitter backends, so they can emit C code
in addition to the C++ code they normally emit.
The README briefly describes how to get started with building LLVM.
For more information on how to contribute to the LLVM project, please
take a look at the
[Contributing to LLVM](https://llvm.org/docs/Contributing.html) guide.

Please note that within LLVM we speak of a `Target` if we refer to an architecture.
## Getting Started with the LLVM System

## Code generation
Taken from [here](https://llvm.org/docs/GettingStarted.html).

### Relevant files
### Overview

The TableGen emitter backends are located in `llvm/utils/TableGen/`.
Welcome to the LLVM project!

The target definition files (`.td`) define the
instructions, operands, features and other things. This is the source of all our information.
If something is wrongly defined there, it will be wrong in the generated files.
You can find the `td` files in `llvm/lib/Target/<ARCH>/`.
The LLVM project has multiple components. The core of the project is
itself called "LLVM". This contains all of the tools, libraries, and header
files needed to process intermediate representations and convert them into
object files. Tools include an assembler, disassembler, bitcode analyzer, and
bitcode optimizer. It also contains basic regression tests.

### Code generation overview
C-like languages use the [Clang](http://clang.llvm.org/) frontend. This
component compiles C, C++, Objective-C, and Objective-C++ code into LLVM bitcode
-- and from there into object files, using LLVM.

Generating code for a target has 6 steps:
Other components include:
the [libc++ C++ standard library](https://libcxx.llvm.org),
the [LLD linker](https://lld.llvm.org), and more.

```
5 6
┌──────────┐ ┌──────────┐
│Printer │ │CS .inc │
1 2 3 4 ┌──►│Capstone ├─────►│files │
┌───────┐ ┌───────────┐ ┌───────────┐ ┌──────────┐ │ └──────────┘ └──────────┘
│ .td │ │ │ │ │ │ Code- │ │
│ files ├────►│ TableGen ├────►│ CodeGen ├────►│ Emitter │◄─┤
└───────┘ └──────┬────┘ └───────────┘ └──────────┘ │
│ ▲ │ ┌──────────┐ ┌──────────┐
└─────────────────────────────────┘ └──►│Printer ├─────►│LLVM .inc │
│LLVM │ │files │
└──────────┘ └──────────┘
```
### Getting the Source Code and Building LLVM

1. LLVM targets are defined in `.td` files. They describe instructions, operands,
features and other properties.
The LLVM Getting Started documentation may be out of date. The [Clang
Getting Started](http://clang.llvm.org/get_started.html) page might have more
accurate information.

2. [LLVM TableGen](https://llvm.org/docs/TableGen/index.html) parses these files
and converts them to an internal representation of [Classes, Records, DAGs](https://llvm.org/docs/TableGen/ProgRef.html)
and other types.
This is an example work-flow and configuration to get and build the LLVM source:

3. In the second step a TableGen component called [CodeGen](https://llvm.org/docs/CodeGenerator.html)
abstracts this even further.
The result is a representation which is _not_ specific to any target
(e.g. the `CodeGenInstruction` class can represent a machine instruction of any target).
1. Checkout LLVM (including related sub-projects like Clang):

4. Different code emitter backends use the result of the former two components to
generated code.
* ``git clone https://github.com/llvm/llvm-project.git``

5. Whenever the emitter emits code it calls a `Printer`. Either the `PrinterCapstone` to emit C or `PrinterLLVM` to emit C++.
Which one is controlled by the `--printerLang=[CCS,C++]` option passed to `llvm-tblgen`.
* Or, on windows, ``git clone --config core.autocrlf=false
https://github.com/llvm/llvm-project.git``

6. After the emitter backend is done, the `Printer` writes the `output_stream` content into the `.inc` files.
2. Configure and build LLVM and Clang:

### Emitter backends and their use cases
* ``cd llvm-project``

We use the following emitter backends
* ``cmake -S llvm -B build -G <generator> [options]``

| Name | Generated Code | Note |
|------|----------------|------|
| AsmMatcherEmitter | Mapping tables for Capstone | |
| AsmWriterEmitter | State machine to decode the asm-string for a `MCInst` | |
| DecoderEmitter | State machine which decodes bytes to a `MCInst`. | |
| InstrInfoEmitter | Tables with instruction information (instruction enum, instr. operand information...) | |
| RegisterInfoEmitter | Tables with register information (register enum, register type info...) | |
| SubtargetEmitter | Table about the target features. | |
| SearchableTablesEmitter | Usually used to generate tables and decoding functions for system operands. | **1.** Not all targets use this. |
Some common build system generators are:

## Developer notes
* ``Ninja`` --- for generating [Ninja](https://ninja-build.org)
build files. Most llvm developers use Ninja.
* ``Unix Makefiles`` --- for generating make-compatible parallel makefiles.
* ``Visual Studio`` --- for generating Visual Studio projects and
solutions.
* ``Xcode`` --- for generating Xcode projects.

- If you find C++ code within the generated files you need to extend `PrinterCapstone::translateToC()`.
If this still doesn't fix the problem, the code snipped wasn't passed through `translateToC()` before emitting.
So you need to figure out where this specific code snipped is printed and pass it to `translateToC()`.
Some common options:

- If the mapping files miss operand types or access information, then the `.td` files are incomplete (happens surprisingly often).
You need to search for the instruction or operands with missing or incorrect values and fix them.
```
Wrong access attributes for:
- Registers, Immediates: The instructions defines "out" and "in" operands incorrectly.
- Memory: The "mayLoad" or "mayStore" variable is not set for the instruction.
* ``-DLLVM_ENABLE_PROJECTS='...'`` and ``-DLLVM_ENABLE_RUNTIMES='...'`` ---
semicolon-separated list of the LLVM sub-projects and runtimes you'd like to
additionally build. ``LLVM_ENABLE_PROJECTS`` can include any of: clang,
clang-tools-extra, cross-project-tests, flang, libc, libclc, lld, lldb,
mlir, openmp, polly, or pstl. ``LLVM_ENABLE_RUNTIMES`` can include any of
libcxx, libcxxabi, libunwind, compiler-rt, libc or openmp. Some runtime
projects can be specified either in ``LLVM_ENABLE_PROJECTS`` or in
``LLVM_ENABLE_RUNTIMES``.

Operand type is invalid:
- The "OperandType" variable is unset for this operand.
```
For example, to build LLVM, Clang, libcxx, and libcxxabi, use
``-DLLVM_ENABLE_PROJECTS="clang" -DLLVM_ENABLE_RUNTIMES="libcxx;libcxxabi"``.

- If certain target features (e.g. architecture extensions) were removed from upstream LLVM or you want to add your own,
checkout [DeprecatedFeatures.md](DeprecatedFeatures.md).
* ``-DCMAKE_INSTALL_PREFIX=directory`` --- Specify for *directory* the full
path name of where you want the LLVM tools and libraries to be installed
(default ``/usr/local``). Be careful if you install runtime libraries: if
your system uses those provided by LLVM (like libc++ or libc++abi), you
must not overwrite your system's copy of those libraries, since that
could render your system unusable. In general, using something like
``/usr`` is not advised, but ``/usr/local`` is fine.

* ``-DCMAKE_BUILD_TYPE=type`` --- Valid options for *type* are Debug,
Release, RelWithDebInfo, and MinSizeRel. Default is Debug.

* ``-DLLVM_ENABLE_ASSERTIONS=On`` --- Compile with assertion checks enabled
(default is Yes for Debug builds, No for all other build types).

* ``cmake --build build [-- [options] <target>]`` or your build system specified above
directly.

* The default target (i.e. ``ninja`` or ``make``) will build all of LLVM.

* The ``check-all`` target (i.e. ``ninja check-all``) will run the
regression tests to ensure everything is in working order.

* CMake will generate targets for each tool and library, and most
LLVM sub-projects generate their own ``check-<project>`` target.

* Running a serial build will be **slow**. To improve speed, try running a
parallel build. That's done by default in Ninja; for ``make``, use the option
``-j NNN``, where ``NNN`` is the number of parallel jobs to run.
In most cases, you get the best performance if you specify the number of CPU threads you have.
On some Unix systems, you can specify this with ``-j$(nproc)``.

* For more information see [CMake](https://llvm.org/docs/CMake.html).

Consult the
[Getting Started with LLVM](https://llvm.org/docs/GettingStarted.html#getting-started-with-llvm)
page for detailed information on configuring and compiling LLVM. You can visit
[Directory Layout](https://llvm.org/docs/GettingStarted.html#directory-layout)
to learn about the layout of the source code tree.

## Getting in touch

Join [LLVM Discourse forums](https://discourse.llvm.org/), [discord chat](https://discord.gg/xS7Z362) or #llvm IRC channel on [OFTC](https://oftc.net/).

The LLVM project has adopted a [code of conduct](https://llvm.org/docs/CodeOfConduct.html) for
participants to all modes of communication within the project.
2 changes: 1 addition & 1 deletion llvm/CMakeLists.txt
@@ -22,7 +22,7 @@ if(NOT DEFINED LLVM_VERSION_MINOR)
set(LLVM_VERSION_MINOR 0)
endif()
if(NOT DEFINED LLVM_VERSION_PATCH)
set(LLVM_VERSION_PATCH 4)
set(LLVM_VERSION_PATCH 6)
endif()
if(NOT DEFINED LLVM_VERSION_SUFFIX)
set(LLVM_VERSION_SUFFIX)
14 changes: 11 additions & 3 deletions llvm/cmake/modules/AddLLVM.cmake
@@ -2320,7 +2320,8 @@ function(llvm_setup_rpath name)
# FIXME: update this when there is better solution.
set(_install_rpath "${LLVM_LIBRARY_OUTPUT_INTDIR}" "${CMAKE_INSTALL_PREFIX}/lib${LLVM_LIBDIR_SUFFIX}" ${extra_libdir})
elseif(UNIX)
set(_install_rpath "\$ORIGIN/../lib${LLVM_LIBDIR_SUFFIX}" ${extra_libdir})
set(_build_rpath "\$ORIGIN/../lib${LLVM_LIBDIR_SUFFIX}" ${extra_libdir})
set(_install_rpath "\$ORIGIN/../lib${LLVM_LIBDIR_SUFFIX}")
if(${CMAKE_SYSTEM_NAME} MATCHES "(FreeBSD|DragonFly)")
set_property(TARGET ${name} APPEND_STRING PROPERTY
LINK_FLAGS " -Wl,-z,origin ")
@@ -2334,9 +2335,16 @@ function(llvm_setup_rpath name)
return()
endif()

# Enable BUILD_WITH_INSTALL_RPATH unless CMAKE_BUILD_RPATH is set.
# Enable BUILD_WITH_INSTALL_RPATH unless CMAKE_BUILD_RPATH is set and not
# building for macOS or AIX, as those platforms seemingly require it.
# On AIX, the tool chain doesn't support modifying rpaths/libpaths for XCOFF
# on install at the moment, so BUILD_WITH_INSTALL_RPATH is required.
if("${CMAKE_BUILD_RPATH}" STREQUAL "")
set_property(TARGET ${name} PROPERTY BUILD_WITH_INSTALL_RPATH ON)
if(${CMAKE_SYSTEM_NAME} MATCHES "Darwin|AIX")
set_property(TARGET ${name} PROPERTY BUILD_WITH_INSTALL_RPATH ON)
else()
set_property(TARGET ${name} APPEND PROPERTY BUILD_RPATH "${_build_rpath}")
endif()
endif()

set_target_properties(${name} PROPERTIES
5 changes: 5 additions & 0 deletions llvm/cmake/modules/LLVM-Config.cmake
@@ -1,3 +1,6 @@
cmake_policy(PUSH)
cmake_policy(SET CMP0057 NEW)

function(get_system_libs return_var)
message(AUTHOR_WARNING "get_system_libs no longer needed")
set(${return_var} "" PARENT_SCOPE)
@@ -343,3 +346,5 @@ function(explicit_map_components_to_libraries out_libs)
endforeach(c)
set(${out_libs} ${result} PARENT_SCOPE)
endfunction(explicit_map_components_to_libraries)

cmake_policy(POP)
9 changes: 9 additions & 0 deletions llvm/include/llvm/Analysis/AliasAnalysis.h
@@ -116,6 +116,15 @@ class AliasResult {

operator Kind() const { return static_cast<Kind>(Alias); }

bool operator==(const AliasResult &Other) const {
return Alias == Other.Alias && HasOffset == Other.HasOffset &&
Offset == Other.Offset;
}
bool operator!=(const AliasResult &Other) const { return !(*this == Other); }

bool operator==(Kind K) const { return Alias == K; }
bool operator!=(Kind K) const { return !(*this == K); }

constexpr bool hasOffset() const { return HasOffset; }
constexpr int32_t getOffset() const {
assert(HasOffset && "No offset!");
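The hunk above adds two kinds of equality operator to `AliasResult`: one compares the kind, the offset flag and the offset value, and the other compares against a bare `Kind` and ignores the offset. A minimal sketch of that distinction follows. It assumes a simplified stand-in class; `AliasResultSketch` and `setOffset` are hypothetical names, and the real class stores its fields differently and has more members.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for llvm::AliasResult.
// Only the comparison logic mirrors the operators added in the diff.
class AliasResultSketch {
public:
  enum Kind : uint8_t { NoAlias = 0, MayAlias, PartialAlias, MustAlias };

  AliasResultSketch(Kind K) : Alias(K), HasOffset(false), Offset(0) {}

  // Illustrative helper (not taken from the diff): record an offset.
  void setOffset(int32_t NewOffset) {
    HasOffset = true;
    Offset = NewOffset;
  }

  // Full comparison: kind, offset presence and offset value must all match.
  bool operator==(const AliasResultSketch &Other) const {
    return Alias == Other.Alias && HasOffset == Other.HasOffset &&
           Offset == Other.Offset;
  }
  bool operator!=(const AliasResultSketch &Other) const {
    return !(*this == Other);
  }

  // Kind-only comparison: any offset is ignored.
  bool operator==(Kind K) const { return Alias == K; }
  bool operator!=(Kind K) const { return !(*this == K); }

private:
  Kind Alias;
  bool HasOffset;
  int32_t Offset;
};
```

Under these assumptions, two `PartialAlias` results with different offsets compare unequal to each other, while both still compare equal to the bare `PartialAlias` kind.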
12 changes: 6 additions & 6 deletions llvm/include/llvm/Analysis/TargetLibraryInfo.h
@@ -408,14 +408,14 @@ class TargetLibraryInfo {
ShouldExtI32Param = true;
ShouldExtI32Return = true;
}
// Mips and riscv64, on the other hand, needs signext on i32 parameters
// corresponding to both signed and unsigned ints.
if (T.isMIPS() || T.isRISCV64()) {
// LoongArch, Mips, and riscv64, on the other hand, need signext on i32
// parameters corresponding to both signed and unsigned ints.
if (T.isLoongArch() || T.isMIPS() || T.isRISCV64()) {
ShouldSignExtI32Param = true;
}
// riscv64 needs signext on i32 returns corresponding to both signed and
// unsigned ints.
if (T.isRISCV64()) {
// LoongArch and riscv64 need signext on i32 returns corresponding to both
// signed and unsigned ints.
if (T.isLoongArch() || T.isRISCV64()) {
ShouldSignExtI32Return = true;
}
}
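The change above extends the sign-extension rules for `i32` parameters and returns to LoongArch. The following sketch restates only the two conditions visible in the hunk. `ArchSketch`, `ExtFlagsSketch` and `computeExtFlags` are hypothetical names, not LLVM API; the real code queries a `Triple` and sets further flags that are omitted here.

```cpp
#include <cassert>

// Hypothetical architecture tag standing in for the Triple queries
// (isLoongArch(), isMIPS(), isRISCV64()) used in the diff.
enum class ArchSketch { Other, MIPS, RISCV64, LoongArch };

// Only the two flags touched by the hunk are modelled.
struct ExtFlagsSketch {
  bool ShouldSignExtI32Param = false;
  bool ShouldSignExtI32Return = false;
};

ExtFlagsSketch computeExtFlags(ArchSketch A) {
  ExtFlagsSketch F;
  // LoongArch, MIPS and riscv64 need signext on i32 parameters for both
  // signed and unsigned ints.
  if (A == ArchSketch::LoongArch || A == ArchSketch::MIPS ||
      A == ArchSketch::RISCV64)
    F.ShouldSignExtI32Param = true;
  // LoongArch and riscv64 also need signext on i32 returns.
  if (A == ArchSketch::LoongArch || A == ArchSketch::RISCV64)
    F.ShouldSignExtI32Return = true;
  return F;
}
```

In this sketch MIPS gets the parameter flag only, which matches the two separate `if` conditions in the hunk.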
2 changes: 1 addition & 1 deletion llvm/include/llvm/Support/Compiler.h
@@ -139,7 +139,7 @@
#define LLVM_ATTRIBUTE_USED
#endif

#if defined(__clang__) && !defined(__INTELLISENSE__)
#if defined(__clang__)
#define LLVM_DEPRECATED(MSG, FIX) __attribute__((deprecated(MSG, FIX)))
#else
#define LLVM_DEPRECATED(MSG, FIX) [[deprecated(MSG)]]
11 changes: 1 addition & 10 deletions llvm/include/llvm/TableGen/StringMatcher.h
@@ -13,7 +13,6 @@
#ifndef LLVM_TABLEGEN_STRINGMATCHER_H
#define LLVM_TABLEGEN_STRINGMATCHER_H

#include "PrinterTypes.h"
#include "llvm/ADT/StringRef.h"
#include <string>
#include <utility>
@@ -36,26 +35,18 @@ class StringMatcher {
StringRef StrVariableName;
const std::vector<StringPair> &Matches;
raw_ostream &OS;
PrinterLanguage PL;

public:
StringMatcher(StringRef strVariableName,
const std::vector<StringPair> &matches, raw_ostream &os)
: StrVariableName(strVariableName), Matches(matches), OS(os), PL(PRINTER_LANG_CPP) {}
StringMatcher(StringRef strVariableName,
const std::vector<StringPair> &matches, raw_ostream &os, PrinterLanguage PL)
: StrVariableName(strVariableName), Matches(matches), OS(os), PL(PL) {}
: StrVariableName(strVariableName), Matches(matches), OS(os) {}

void Emit(unsigned Indent = 0, bool IgnoreDuplicates = false) const;
void EmitCPP(unsigned Indent = 0, bool IgnoreDuplicates = false) const;

private:
bool EmitStringMatcherForChar(const std::vector<const StringPair *> &Matches,
unsigned CharNo, unsigned IndentCount,
bool IgnoreDuplicates) const;
bool EmitStringMatcherForCharCPP(const std::vector<const StringPair *> &Matches,
unsigned CharNo, unsigned IndentCount,
bool IgnoreDuplicates) const;
};

} // end namespace llvm
16 changes: 0 additions & 16 deletions llvm/include/llvm/TableGen/StringToOffsetTable.h
@@ -9,12 +9,10 @@
#ifndef LLVM_TABLEGEN_STRINGTOOFFSETTABLE_H
#define LLVM_TABLEGEN_STRINGTOOFFSETTABLE_H

#include "PrinterTypes.h"
#include "llvm/ADT/SmallString.h"
#include "llvm/ADT/StringExtras.h"
#include "llvm/ADT/StringMap.h"
#include "llvm/Support/raw_ostream.h"
#include "llvm/TableGen/Error.h"
#include <cctype>

namespace llvm {
@@ -24,14 +22,10 @@ namespace llvm {
/// It can then output this string blob and use indexes into the string to
/// reference each piece.
class StringToOffsetTable {
PrinterLanguage PL;
StringMap<unsigned> StringOffset;
std::string AggregateString;

public:
StringToOffsetTable() : PL(PRINTER_LANG_CPP) {};
StringToOffsetTable(PrinterLanguage PL) : PL(PL) {};

bool Empty() const { return StringOffset.empty(); }

unsigned GetOrAddStringOffset(StringRef Str, bool appendZero = true) {
@@ -48,16 +42,6 @@ class StringToOffsetTable {
}

void EmitString(raw_ostream &O) {
switch(PL) {
default:
PrintFatalNote("No StringToOffsetTable method defined to emit the selected language.\n");
case PRINTER_LANG_CPP:
EmitStringCPP(O);
break;
}
}

void EmitStringCPP(raw_ostream &O) {
// Escape the string.
SmallString<256> Str;
raw_svector_ostream(Str).write_escaped(AggregateString);
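The class comment in this header describes the idea behind `StringToOffsetTable`: strings are gathered into one blob and each is referenced by its index into that blob. A minimal sketch of that technique follows, using standard-library containers instead of `StringMap` and `StringRef`. `StringToOffsetTableSketch` and `blob()` are hypothetical names, and the emission step (`EmitString`) is left out.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for llvm::StringToOffsetTable built on std types.
class StringToOffsetTableSketch {
  std::unordered_map<std::string, unsigned> StringOffset;
  std::string AggregateString;

public:
  bool Empty() const { return StringOffset.empty(); }

  // Return the offset of Str in the blob, appending it on first use.
  unsigned GetOrAddStringOffset(const std::string &Str,
                                bool AppendZero = true) {
    auto Result = StringOffset.insert(std::make_pair(
        Str, static_cast<unsigned>(AggregateString.size())));
    if (Result.second) {
      // First occurrence: append the string, optionally NUL-terminated.
      AggregateString += Str;
      if (AppendZero)
        AggregateString += '\0';
    }
    return Result.first->second;
  }

  // Illustrative accessor (assumption): expose the blob for inspection.
  const std::string &blob() const { return AggregateString; }
};
```

Adding the same string twice returns the same offset, so the blob grows only for strings that have not been seen before.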
9 changes: 3 additions & 6 deletions llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
@@ -1724,12 +1724,9 @@ bool TargetLowering::SimplifyDemandedBits(
unsigned InnerBits = InnerVT.getScalarSizeInBits();
if (ShAmt < InnerBits && DemandedBits.getActiveBits() <= InnerBits &&
isTypeDesirableForOp(ISD::SHL, InnerVT)) {
EVT ShTy = getShiftAmountTy(InnerVT, DL);
if (!APInt(BitWidth, ShAmt).isIntN(ShTy.getSizeInBits()))
ShTy = InnerVT;
SDValue NarrowShl =
TLO.DAG.getNode(ISD::SHL, dl, InnerVT, InnerOp,
TLO.DAG.getConstant(ShAmt, dl, ShTy));
SDValue NarrowShl = TLO.DAG.getNode(
ISD::SHL, dl, InnerVT, InnerOp,
TLO.DAG.getShiftAmountConstant(ShAmt, InnerVT, dl));
return TLO.CombineTo(
Op, TLO.DAG.getNode(ISD::ANY_EXTEND, dl, VT, NarrowShl));
}
1 change: 0 additions & 1 deletion llvm/lib/Support/BLAKE3/CMakeLists.txt
@@ -10,7 +10,6 @@ if (LLVM_DISABLE_ASSEMBLY_FILES)
else()
set(CAN_USE_ASSEMBLER TRUE)
endif()
set(CAN_USE_ASSEMBLER FALSE)

macro(disable_blake3_x86_simd)
add_compile_definitions(BLAKE3_NO_AVX512 BLAKE3_NO_AVX2 BLAKE3_NO_SSE41 BLAKE3_NO_SSE2)