[Clacc] Merge branch 'main' into clacc/main
This merge brings in the following commits from upstream:

* 307bbd3, 4e34f06, b316126: These commits fix some
  exclusive access and race issues in libomptarget.  Many conflicts
  with Clacc's implementation result:
    * They make significant changes to the `HostDataToTargetMap` data
      structure in `openmp/libomptarget/include/device.h` and thus
      update code within `device.cpp`, `omptarget.cpp`, etc.  Parts of
      Clacc's OMPT offload prototype appear here.  This merge resolves
      conflicts in favor of upstream and then reapplies Clacc's
      changes based on the new data structure.  This merge also
      updates Clacc's `lookupHostPtr` and `getAccessibleBuffer`
      implementation in `device.cpp` to use the new
      `HostDataToTargetMap` interface.
    * They rearrange `InitLibrary` in
      `openmp/libomptarget/src/omptarget.cpp`.  Clacc adds an OMPT
      offload callback here.  This merge resolves conflicts in favor
      of upstream and then reinserts the OMPT callback.
    * They replace `DeallocTgtPtrInfo` with `PostProcessingInfo` in
      `openmp/libomptarget/src/omptarget.cpp`.  Clacc adds a
      `HstPtrName` field to `DeallocTgtPtrInfo` for OMPT support and
      thus to related `emplace_back` calls.  However, that's now
      available via `PostProcessingInfo`'s `TPR` field, so this merge
      drops Clacc's change here.
    * They rewrite `targetDataEnd` in
      `openmp/libomptarget/src/omptarget.cpp`.  Clacc instantiates an
      `OmptMapVarInfoRAII` before the `Device.deallocTgtPtr` call
      there.  This merge resolves conflicts in favor of upstream and
      adds back the `OmptMapVarInfoRAII` instantiation.
    * As a drive-by fix, this merge adds `TIMESCOPE` calls to Clacc's
      implementations for `omp_target_is_accessible`,
      `omp_get_mapped_ptr`, `omp_get_mapped_hostptr`, and
      `omp_get_accessible_buffer`.
* c1a6fe1: This commit changes the way mapped variables are
  looked up in libomptarget:
    * An effect is that, if `arr[N:M]` is currently mapped, then
      `omp_target_is_present` now returns true and
      `omp_get_mapped_ptr` now returns a non-null device pointer when
      passed a host pointer within `arr[0:N]`.
    * Whether this new behavior is correct is being discussed in
      <llvm#54899> and at
      <omp-lang@openmp.org>.  No consensus has yet been reached.
    * The behavior of Clacc's implementations of
      `omp_target_is_accessible` and `omp_get_accessible_buffer` (a
      Clacc extension) is also affected for the case of size=0.  This
      merge updates comments on those to explain the issue.
    * The above OpenMP behavior changes affect Clacc's implementation
      of `acc_is_present` and `acc_deviceptr`.  This merge updates
      comments on those (and related comments on `acc_hostptr`) and
      adjusts their implementations so that they are immune to the
      OpenMP behavior change, even if it is later reverted.  It also
      adjusts the implementation of `checkPresence` in `api.cpp` so it
      is immune as well, but this adjustment is currently NFC, as
      explained in the new comments there.
    * This merge updates references to the OpenACC and OpenMP specs in
      many related comments.
* 79f661e, 6bd8dc9, 2cedaee, and f82ec55: These
  commits define new AST `Stmt` nodes (for the OpenMP loop construct)
  and thus insert new enumerators into `CXCursorKind` immediately
  before `CXCursor_LastStmt`.  Clacc does the same.  This merge
  combines those, keeping Clacc's enumerators last.
* Various commits with which this merge resolves contextual conflicts.
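
The lookup change described for c1a6fe1 can be pictured with a small model. This is only a sketch of the observable behavior as stated above, assuming element indices instead of real addresses; `MappedSection`, `isPresentStrict`, and `isPresentExtended` are hypothetical names and do not reflect libomptarget's actual data structures or algorithm.

```cpp
#include <cstddef>

// Hypothetical model of one mapping of arr[N:M], using element indices.
struct MappedSection {
  std::size_t Base;  // index 0, the start of the whole array
  std::size_t Begin; // index N, the first mapped element
  std::size_t End;   // index N + M, one past the last mapped element
};

// Build the model entry for a mapping of arr[N:M].
MappedSection mapSection(std::size_t N, std::size_t M) {
  return MappedSection{0, N, N + M};
}

// Presence if only the mapped elements arr[N:M] count.
bool isPresentStrict(const MappedSection &S, std::size_t Index) {
  return Index >= S.Begin && Index < S.End;
}

// Presence as the commit message describes the new behavior:
// an index within arr[0:N] is reported as present too.
bool isPresentExtended(const MappedSection &S, std::size_t Index) {
  return Index >= S.Base && Index < S.End;
}
```

With `arr[4:8]` mapped, index 2 is absent under the strict check but present under the extended one, which is the case Clacc's `acc_is_present` and `acc_deviceptr` are adjusted to be immune to.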
jdenny-ornl committed Apr 15, 2022
2 parents c58310e + d16a631 commit ce30c07
Showing 13,951 changed files with 666,445 additions and 332,457 deletions.
The diff you're trying to view is too large. We only load the first 3000 changed files.
23 changes: 12 additions & 11 deletions README.md
@@ -335,7 +335,7 @@ Department of Energy under Contract No. DE-AC05-00OR22725.

# The LLVM Compiler Infrastructure

This directory and its sub-directories contain source code for LLVM,
This directory and its sub-directories contain the source code for LLVM,
a toolkit for the construction of highly optimized compilers,
optimizers, and run-time environments.

@@ -346,7 +346,7 @@ take a look at the

## Getting Started with the LLVM System

Taken from https://llvm.org/docs/GettingStarted.html.
Taken from [here](https://llvm.org/docs/GettingStarted.html).

### Overview

@@ -355,10 +355,10 @@ Welcome to the LLVM project!
The LLVM project has multiple components. The core of the project is
itself called "LLVM". This contains all of the tools, libraries, and header
files needed to process intermediate representations and convert them into
object files. Tools include an assembler, disassembler, bitcode analyzer, and
bitcode optimizer. It also contains basic regression tests.
object files. Tools include an assembler, disassembler, bitcode analyzer, and
bitcode optimizer. It also contains basic regression tests.

C-like languages use the [Clang](http://clang.llvm.org/) front end. This
C-like languages use the [Clang](http://clang.llvm.org/) frontend. This
component compiles C, C++, Objective-C, and Objective-C++ code into LLVM bitcode
-- and from there into object files, using LLVM.

@@ -368,7 +368,7 @@ the [LLD linker](https://lld.llvm.org), and more.

### Getting the Source Code and Building LLVM

The LLVM Getting Started documentation may be out of date. The [Clang
The LLVM Getting Started documentation may be out of date. The [Clang
Getting Started](http://clang.llvm.org/get_started.html) page might have more
accurate information.

@@ -435,12 +435,13 @@ This is an example work-flow and configuration to get and build the LLVM source:
* CMake will generate targets for each tool and library, and most
LLVM sub-projects generate their own ``check-<project>`` target.

* Running a serial build will be **slow**. To improve speed, try running a
parallel build. That's done by default in Ninja; for ``make``, use the option
``-j NNN``, where ``NNN`` is the number of parallel jobs, e.g. the number of
CPUs you have.
* Running a serial build will be **slow**. To improve speed, try running a
parallel build. That's done by default in Ninja; for ``make``, use the option
``-j NNN``, where ``NNN`` is the number of parallel jobs to run.
In most cases, you get the best performance if you specify the number of CPU threads you have.
On some Unix systems, you can specify this with ``-j$(nproc)``.

* For more information see [CMake](https://llvm.org/docs/CMake.html)
* For more information see [CMake](https://llvm.org/docs/CMake.html).

Consult the
[Getting Started with LLVM](https://llvm.org/docs/GettingStarted.html#getting-started-with-llvm)
2 changes: 2 additions & 0 deletions bolt/CMakeLists.txt
@@ -73,6 +73,8 @@ if (git_executable)
endif()
endif()

find_program(GNU_LD_EXECUTABLE NAMES ${LLVM_DEFAULT_TARGET_TRIPLE}-ld.bfd ld.bfd DOC "GNU ld")

# If we can't find a revision, set it to "<unknown>".
if (NOT BOLT_REVISION)
set(BOLT_REVISION "<unknown>")
20 changes: 19 additions & 1 deletion bolt/include/bolt/Core/BinaryContext.h
@@ -489,7 +489,9 @@ class BinaryContext {
void adjustCodePadding();

/// Regular page size.
static constexpr unsigned RegularPageSize = 0x1000;
unsigned RegularPageSize{0x1000};
static constexpr unsigned RegularPageSizeX86 = 0x1000;
static constexpr unsigned RegularPageSizeAArch64 = 0x10000;

/// Huge page size to use.
static constexpr unsigned HugePageSize = 0x200000;
@@ -772,6 +774,22 @@ class BinaryContext {
return Itr != GlobalSymbols.end() ? Itr->second : nullptr;
}

/// Return registered PLT entry BinaryData with the given \p Name
/// or nullptr if no global PLT symbol with that name exists.
const BinaryData *getPLTBinaryDataByName(StringRef Name) const {
if (const BinaryData *Data = getBinaryDataByName(Name.str() + "@PLT"))
return Data;

// The symbol name might contain versioning information e.g
// memcpy@@GLIBC_2.17. Remove it and try to locate binary data
// without it.
size_t At = Name.find("@");
if (At != std::string::npos)
return getBinaryDataByName(Name.str().substr(0, At) + "@PLT");

return nullptr;
}

/// Return true if \p SymbolName was generated internally and was not present
/// in the input binary.
bool isInternalSymbolName(const StringRef Name) {
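
The fallback in `getPLTBinaryDataByName` can be sketched in isolation as follows. This is a simplified illustration, assuming a plain `std::map` in place of BOLT's global symbol table; `SymbolTable` and `findPLTEntry` are hypothetical names, not BOLT API.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the global symbol table: name -> entry id.
using SymbolTable = std::map<std::string, int>;

// Look up "<Name>@PLT". If that fails and Name carries version information
// (e.g. "memcpy@@GLIBC_2.17"), drop everything from the first '@' and retry.
// Returns nullptr when no matching PLT entry exists.
const int *findPLTEntry(const SymbolTable &Table, const std::string &Name) {
  auto It = Table.find(Name + "@PLT");
  if (It != Table.end())
    return &It->second;

  std::string::size_type At = Name.find('@');
  if (At != std::string::npos) {
    It = Table.find(Name.substr(0, At) + "@PLT");
    if (It != Table.end())
      return &It->second;
  }
  return nullptr;
}
```

With a table that holds only `memcpy@PLT`, both `memcpy` and `memcpy@@GLIBC_2.17` resolve to the same entry, while `memset` yields `nullptr`.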
15 changes: 13 additions & 2 deletions bolt/include/bolt/Core/BinaryFunction.h
@@ -172,6 +172,9 @@ class BinaryFunction {

mutable MCSymbol *FunctionConstantIslandLabel{nullptr};
mutable MCSymbol *FunctionColdConstantIslandLabel{nullptr};

// Returns constant island alignment
uint16_t getAlignment() const { return sizeof(uint64_t); }
};

static constexpr uint64_t COUNT_NO_PROFILE =
@@ -2047,6 +2050,10 @@ class BinaryFunction {
return *std::prev(CodeIter) <= *DataIter;
}

uint16_t getConstantIslandAlignment() const {
return Islands ? Islands->getAlignment() : 1;
}

uint64_t
estimateConstantIslandSize(const BinaryFunction *OnBehalfOf = nullptr) const {
if (!Islands)
@@ -2074,9 +2081,13 @@ class BinaryFunction {
Size += NextMarker - *DataIter;
}

if (!OnBehalfOf)
for (BinaryFunction *ExternalFunc : Islands->Dependency)
if (!OnBehalfOf) {
for (BinaryFunction *ExternalFunc : Islands->Dependency) {
Size = alignTo(Size, ExternalFunc->getConstantIslandAlignment());
Size += ExternalFunc->estimateConstantIslandSize(this);
}
}

return Size;
}

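
The effect of the added `alignTo` call on the size estimate can be seen in a standalone sketch. It assumes each dependent island is reduced to a size and an alignment; `Island`, `alignUp`, and `estimateTotalSize` are hypothetical names, with `alignUp` standing in for LLVM's `alignTo`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Round Value up to the next multiple of Align.
std::uint64_t alignUp(std::uint64_t Value, std::uint64_t Align) {
  assert(Align != 0 && "alignment must be non-zero");
  return (Value + Align - 1) / Align * Align;
}

// Hypothetical summary of one dependent function's constant island.
struct Island {
  std::uint64_t Size;
  std::uint16_t Alignment;
};

// Start from the function's own island size, then pad to each dependent
// island's alignment before adding its size, as the patched loop does.
std::uint64_t estimateTotalSize(std::uint64_t OwnSize,
                                const std::vector<Island> &Deps) {
  std::uint64_t Size = OwnSize;
  for (const Island &Dep : Deps) {
    Size = alignUp(Size, Dep.Alignment);
    Size += Dep.Size;
  }
  return Size;
}
```

For an own size of 5 and two 8-byte-aligned islands of 12 and 4 bytes, the estimate is 28 bytes, whereas summing the sizes without padding would give 21 and underestimate the space needed.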
25 changes: 13 additions & 12 deletions bolt/include/bolt/Core/MCPlusBuilder.h
@@ -353,7 +353,7 @@ class MCPlusBuilder {
}

virtual bool isUnconditionalBranch(const MCInst &Inst) const {
return Analysis->isUnconditionalBranch(Inst);
return Analysis->isUnconditionalBranch(Inst) && !isTailCall(Inst);
}

virtual bool isIndirectBranch(const MCInst &Inst) const {
@@ -511,11 +511,6 @@ class MCPlusBuilder {
return 0;
}

virtual bool isADD64rr(const MCInst &Inst) const {
llvm_unreachable("not implemented");
return false;
}

virtual bool isSUB(const MCInst &Inst) const {
llvm_unreachable("not implemented");
return false;
@@ -526,11 +521,6 @@ class MCPlusBuilder {
return false;
}

virtual bool isMOVSX64rm32(const MCInst &Inst) const {
llvm_unreachable("not implemented");
return false;
}

virtual bool isLeave(const MCInst &Inst) const {
llvm_unreachable("not implemented");
return false;
@@ -1287,7 +1277,18 @@ class MCPlusBuilder {

/// Replace instruction with a shorter version that could be relaxed later
/// if needed.
virtual bool shortenInstruction(MCInst &Inst) const {
virtual bool shortenInstruction(MCInst &Inst,
const MCSubtargetInfo &STI) const {
llvm_unreachable("not implemented");
return false;
}

/// Convert a move instruction into a conditional move instruction, given a
/// condition code.
virtual bool
convertMoveToConditionalMove(MCInst &Inst, unsigned CC,
bool AllowStackMemOp = false,
bool AllowBasePtrStackMemOp = false) const {
llvm_unreachable("not implemented");
return false;
}
85 changes: 85 additions & 0 deletions bolt/include/bolt/Passes/CMOVConversion.h
@@ -0,0 +1,85 @@
//===- bolt/Passes/CMOVConversion.h ----------------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// This pass finds the following patterns:
// jcc
// / \
// (empty) mov src, dst
// \ /
//
// and replaces them with:
//
// cmovcc src, dst
//
// The advantage of performing this conversion in BOLT (compared to compiler
// heuristic driven instruction selection) is that BOLT can use LBR
// misprediction information and only convert poorly predictable branches.
// Note that branch misprediction rate is different from branch bias.
// For well-predictable branches, it might be beneficial to leave jcc+mov as is
// from microarchitectural perspective to avoid unneeded dependencies (CMOV
// instruction has a dataflow dependence on flags and both operands).
//
//===----------------------------------------------------------------------===//

#ifndef BOLT_PASSES_CMOVCONVERSION_H
#define BOLT_PASSES_CMOVCONVERSION_H

#include "bolt/Passes/BinaryPasses.h"

namespace llvm {
namespace bolt {

/// Pass for folding eligible hammocks into CMOV's if profitable.
class CMOVConversion : public BinaryFunctionPass {
struct Stats {
/// Record how many possible cases there are.
uint64_t StaticPossible = 0;
uint64_t DynamicPossible = 0;

/// Record how many cases were converted.
uint64_t StaticPerformed = 0;
uint64_t DynamicPerformed = 0;

/// Record how many mispredictions were eliminated.
uint64_t PossibleMP = 0;
uint64_t RemovedMP = 0;

Stats operator+(const Stats &O) {
StaticPossible += O.StaticPossible;
DynamicPossible += O.DynamicPossible;
StaticPerformed += O.StaticPerformed;
DynamicPerformed += O.DynamicPerformed;
PossibleMP += O.PossibleMP;
RemovedMP += O.RemovedMP;
return *this;
}
double getStaticRatio() { return (double)StaticPerformed / StaticPossible; }
double getDynamicRatio() {
return (double)DynamicPerformed / DynamicPossible;
}
double getMPRatio() { return (double)RemovedMP / PossibleMP; }

void dump();
};
// BinaryContext-wide stats
Stats Global;

void runOnFunction(BinaryFunction &Function);

public:
explicit CMOVConversion() : BinaryFunctionPass(false) {}

const char *getName() const override { return "CMOV conversion"; }

void runOnFunctions(BinaryContext &BC) override;
};

} // namespace bolt
} // namespace llvm

#endif
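
The hammock described in the header comment corresponds, at source level, to a guarded assignment, and the comment's distinction between misprediction rate and branch bias can be made concrete with two ratios. The following is a rough sketch under those assumptions; all function names are hypothetical, and which instructions a compiler actually emits for either form is not guaranteed.

```cpp
#include <cstdint>

// Branchy form: a conditional jump over a single move (jcc + mov).
std::int64_t selectWithBranch(bool Cond, std::int64_t Src, std::int64_t Dst) {
  if (Cond)
    Dst = Src;
  return Dst;
}

// Select form: the result depends on the condition and on both operands,
// which is the dataflow dependence the header comment attributes to CMOV.
std::int64_t selectWithoutBranch(bool Cond, std::int64_t Src,
                                 std::int64_t Dst) {
  return Cond ? Src : Dst;
}

// Fraction of executions in which the branch was taken.
double branchBias(std::uint64_t Taken, std::uint64_t Executed) {
  return Executed ? static_cast<double>(Taken) / Executed : 0.0;
}

// Fraction of executions in which the predictor guessed wrong.
double mispredictionRate(std::uint64_t Mispredicted, std::uint64_t Executed) {
  return Executed ? static_cast<double>(Mispredicted) / Executed : 0.0;
}
```

A branch taken 500 times out of 1000 has a bias of 0.5, yet if it follows a regular pattern it may be mispredicted only a handful of times, so a low misprediction rate can coexist with no bias at all. The sketch guards against a zero execution count, which the ratio getters in `Stats` do not.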
25 changes: 13 additions & 12 deletions bolt/include/bolt/Rewrite/RewriteInstance.h
@@ -96,7 +96,7 @@ class RewriteInstance {

/// Read info from special sections. E.g. eh_frame and .gcc_except_table
/// for exception and stack unwinding information.
void readSpecialSections();
Error readSpecialSections();

/// Adjust supplied command-line options based on input data.
void adjustCommandLineOptions();
@@ -260,9 +260,9 @@ class RewriteInstance {
void disassemblePLTSectionX86(BinarySection &Section, uint64_t EntrySize);

/// ELF-specific part. TODO: refactor into new class.
#define ELF_FUNCTION(FUNC) \
template <typename ELFT> void FUNC(object::ELFObjectFile<ELFT> *Obj); \
void FUNC() { \
#define ELF_FUNCTION(TYPE, FUNC) \
template <typename ELFT> TYPE FUNC(object::ELFObjectFile<ELFT> *Obj); \
TYPE FUNC() { \
if (auto *ELF32LE = dyn_cast<object::ELF32LEObjectFile>(InputFile)) \
return FUNC(ELF32LE); \
if (auto *ELF64LE = dyn_cast<object::ELF64LEObjectFile>(InputFile)) \
@@ -277,25 +277,25 @@ class RewriteInstance {
void patchELFPHDRTable();

/// Create section header table.
ELF_FUNCTION(patchELFSectionHeaderTable);
ELF_FUNCTION(void, patchELFSectionHeaderTable);

/// Create the regular symbol table and patch dyn symbol tables.
ELF_FUNCTION(patchELFSymTabs);
ELF_FUNCTION(void, patchELFSymTabs);

/// Read dynamic section/segment of ELF.
ELF_FUNCTION(readELFDynamic);
ELF_FUNCTION(Error, readELFDynamic);

/// Patch dynamic section/segment of ELF.
ELF_FUNCTION(patchELFDynamic);
ELF_FUNCTION(void, patchELFDynamic);

/// Patch .got
ELF_FUNCTION(patchELFGOT);
ELF_FUNCTION(void, patchELFGOT);

/// Patch allocatable relocation sections.
ELF_FUNCTION(patchELFAllocatableRelaSections);
ELF_FUNCTION(void, patchELFAllocatableRelaSections);

/// Finalize memory image of section header string table.
ELF_FUNCTION(finalizeSectionStringTable);
ELF_FUNCTION(void, finalizeSectionStringTable);

/// Return a name of the input file section in the output file.
template <typename ELFObjType, typename ELFShdrTy>
@@ -498,7 +498,8 @@ class RewriteInstance {
};

/// AArch64 PLT sections.
const PLTSectionInfo AArch64_PLTSections[2] = {{".plt"}, {nullptr}};
const PLTSectionInfo AArch64_PLTSections[3] = {
{".plt"}, {".iplt"}, {nullptr}};

/// Return PLT information for a section with \p SectionName or nullptr
/// if the section is not PLT.
1 change: 1 addition & 0 deletions bolt/include/bolt/Utils/CommandLineOpts.h
@@ -30,6 +30,7 @@ extern llvm::cl::OptionCategory BoltInstrCategory;
extern llvm::cl::OptionCategory HeatmapCategory;

extern llvm::cl::opt<unsigned> AlignText;
extern llvm::cl::opt<unsigned> AlignFunctions;
extern llvm::cl::opt<bool> AggregateOnly;
extern llvm::cl::opt<unsigned> BucketsPerLine;
extern llvm::cl::opt<bool> DiffOnly;
11 changes: 10 additions & 1 deletion bolt/lib/Core/BinaryContext.cpp
@@ -101,6 +101,7 @@ BinaryContext::BinaryContext(std::unique_ptr<MCContext> Ctx,
InstPrinter(std::move(InstPrinter)), MIA(std::move(MIA)),
MIB(std::move(MIB)), MRI(std::move(MRI)), DisAsm(std::move(DisAsm)) {
Relocation::Arch = this->TheTriple->getArch();
RegularPageSize = isAArch64() ? RegularPageSizeAArch64 : RegularPageSizeX86;
PageAlign = opts::NoHugePages ? RegularPageSize : HugePageSize;
}
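
The page-size selection introduced in this constructor can be sketched on its own. The constants are taken from `BinaryContext.h` in this diff; the `Arch` enum and the two free functions are hypothetical simplifications of the member-based logic.

```cpp
// Hypothetical stand-in for the target triple's architecture.
enum class Arch { X86_64, AArch64 };

// Values as declared in BinaryContext.h in this commit.
constexpr unsigned RegularPageSizeX86 = 0x1000;      // 4 KiB
constexpr unsigned RegularPageSizeAArch64 = 0x10000; // 64 KiB
constexpr unsigned HugePageSize = 0x200000;          // 2 MiB

// The regular page size now depends on the architecture.
unsigned regularPageSize(Arch A) {
  return A == Arch::AArch64 ? RegularPageSizeAArch64 : RegularPageSizeX86;
}

// Page alignment falls back to the regular page size when huge pages are
// disabled, mirroring the PageAlign assignment in the constructor.
unsigned pageAlign(Arch A, bool NoHugePages) {
  return NoHugePages ? regularPageSize(A) : HugePageSize;
}
```

This is why `RegularPageSize` had to stop being `static constexpr`: its value is only known once the target architecture has been determined.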

@@ -154,12 +155,17 @@ BinaryContext::createBinaryContext(const ObjectFile *File, bool IsPIC,
Twine("BOLT-ERROR: no register info for target ", TripleName));

// Set up disassembler.
std::unique_ptr<const MCAsmInfo> AsmInfo(
std::unique_ptr<MCAsmInfo> AsmInfo(
TheTarget->createMCAsmInfo(*MRI, TripleName, MCTargetOptions()));
if (!AsmInfo)
return createStringError(
make_error_code(std::errc::not_supported),
Twine("BOLT-ERROR: no assembly info for target ", TripleName));
// BOLT creates "func@PLT" symbols for PLT entries. In function assembly dump
// we want to emit such names as using @PLT without double quotes to convey
// variant kind to the assembler. BOLT doesn't rely on the linker so we can
// override the default AsmInfo behavior to emit names the way we want.
AsmInfo->setAllowAtInName(true);

std::unique_ptr<const MCSubtargetInfo> STI(
TheTarget->createMCSubtargetInfo(TripleName, "", FeaturesStr));
@@ -1528,6 +1534,9 @@ void BinaryContext::preprocessDebugInfo() {
}

bool BinaryContext::shouldEmit(const BinaryFunction &Function) const {
if (Function.isPseudo())
return false;

if (opts::processAllFunctions())
return true;
