Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade to LLVM 14 #11274

Closed
Tracked by #89
gwenzek opened this issue Mar 23, 2022 · 17 comments · Fixed by #12001
Closed
Tracked by #89

Upgrade to LLVM 14 #11274

gwenzek opened this issue Mar 23, 2022 · 17 comments · Fixed by #12001
Labels
backend-llvm The LLVM backend outputs an LLVM IR Module. enhancement Solving this issue will likely involve adding new logic or components to the codebase. frontend Tokenization, parsing, AstGen, Sema, and Liveness.
Milestone

Comments

@gwenzek
Copy link
Contributor

gwenzek commented Mar 23, 2022

Relevant LLVM doc: https://llvm.org/docs/OpaquePointers.html

Traditionally, LLVM IR pointer types have contained a pointee type. For example, i32* is a pointer that points to an i32 somewhere in memory. However, due to a lack of pointee type semantics and various issues with having pointee types, there is a desire to remove pointee types from pointers.
The opaque pointer type project aims to replace all pointer types containing pointee types in LLVM with an opaque pointer type. The new pointer type is tentatively represented textually as ptr.

This lead to the deprecation of a number of C API function that now required to be given both the pointer LLVMValueRef and the pointee LLVMTypeRef:

LLVMBuildLoad -> LLVMBuildLoad2
LLVMBuildCall -> LLVMBuildCall2
LLVMBuildInvoke -> LLVMBuildInvoke2
LLVMBuildGEP -> LLVMBuildGEP2
LLVMBuildInBoundsGEP -> LLVMBuildInBoundsGEP2
LLVMBuildStructGEP -> LLVMBuildStructGEP2
LLVMBuildPtrDiff -> LLVMBuildPtrDiff2
LLVMConstGEP -> LLVMConstGEP2
LLVMConstInBoundsGEP -> LLVMConstInBoundsGEP2
LLVMAddAlias -> LLVMAddAlias2
Additionally, it will no longer be possible to call LLVMGetElementType() on a pointer type.

codegen.cpp make extensive use of all the above functions so a number of changes need to happen there.
Trying to compile Zig with LLVM14 currentlly yield 200 warnings and 20 errors which look all related to the above changes.

I've started working on some changes, but I'm not really familiar with codegen.cpp and I'm afraid of making errors and passing the wrong types. But hopefully doing the most of the refactoring myself will allow @andrewrk to focus on the semantic.

gwenzek added a commit to gwenzek/zig that referenced this issue Mar 23, 2022
@gwenzek
Copy link
Contributor Author

gwenzek commented Mar 23, 2022

Preview of ongoing changes: #11374

@Vexu Vexu added this to the 0.10.0 milestone Mar 27, 2022
@Vexu Vexu added the backend-llvm The LLVM backend outputs an LLVM IR Module. label Mar 27, 2022
@gwenzek
Copy link
Contributor Author

gwenzek commented Mar 28, 2022

I've hit an issue, which I don't quite understand

zig0 fails to compile stage1:

(gdb) r
Starting program: /home/guw/github/zig-bootstrap/out/build-zig-host/zig0 src/stage1.zig -target native -mcpu=baseline --name zig1 --zig-lib-dir /home/guw/github/zig-bootstrap/zig/lib -femit-bin=/home/guw/github/zig-bootstrap/out/build-zig-host/zig1.o -fcompiler-rt -OReleaseFast --strip -lc --pkg-begin build_options /home/guw/github/zig-bootstrap/out/build-zig-host/config.zig --pkg-end --pkg-begin compiler_rt /home/guw/github/zig-bootstrap/zig/lib/std/special/compiler_rt.zig --pkg-end

zig0: /home/guw/github/zig-bootstrap/out/host/include/llvm/IR/InstrTypes.h:1535: void llvm::CallBase::addParamAttr(unsigned int, llvm::Attribute): Assertion `ArgNo < arg_size() && "Out of bounds"' failed.
Program received signal SIGABRT, Aborted.

(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:49
#1  0x00007ffff7a098a4 in __GI_abort () at abort.c:79
#2  0x00007ffff7a09789 in __assert_fail_base (fmt=<optimized out>, assertion=<optimized out>, 
    file=<optimized out>, line=<optimized out>, function=<optimized out>) at assert.c:92
#3  0x00007ffff7a18a16 in __GI___assert_fail (assertion=0x43ac8a8 "ArgNo < arg_size() && \"Out of bounds\"", 
    file=0x43ac808 "/home/guw/github/zig-bootstrap/out/host/include/llvm/IR/InstrTypes.h", line=1535, 
    function=0x43ac860 "void llvm::CallBase::addParamAttr(unsigned int, llvm::Attribute)") at assert.c:101
#4  0x0000000000931ca2 in llvm::CallBase::addParamAttr (this=0x170630450, ArgNo=1, Attr=...)
    at /home/guw/github/zig-bootstrap/out/host/include/llvm/IR/InstrTypes.h:1535
#5  0x000000000092d1e0 in ZigLLVMSetCallSret (Call=0x170630450, return_type=0x16e98e400)
    at /home/guw/github/zig-bootstrap/zig/src/zig_llvm.cpp:1091
#6  0x00000000007ee428 in ir_render_call (g=0x7626690, executable=0x897cdd8, instruction=0x27249e50)
    at /home/guw/github/zig-bootstrap/zig/src/stage1/codegen.cpp:5068
#7  0x00000000007f8022 in ir_render_instruction (g=0x7626690, executable=0x897cdd8, instruction=0x27249e50)
    at /home/guw/github/zig-bootstrap/zig/src/stage1/codegen.cpp:7492
#8  0x00000000007f89f6 in ir_render (g=0x7626690, fn_entry=0x897cd50)
    at /home/guw/github/zig-bootstrap/zig/src/stage1/codegen.cpp:7666
#9  0x00000000007fef96 in do_code_gen (g=0x7626690)
    at /home/guw/github/zig-bootstrap/zig/src/stage1/codegen.cpp:9116
#10 0x00000000008040f8 in codegen_build_object (g=0x7626690)
    at /home/guw/github/zig-bootstrap/zig/src/stage1/codegen.cpp:10354
#11 0x00000000007d5e25 in zig_stage1_build_object (stage1=0x7626690)
    at /home/guw/github/zig-bootstrap/zig/src/stage1/stage1.cpp:132
#12 0x00000000007d22af in main (argc=22, argv=0x7fffffffdb68)
    at /home/guw/github/zig-bootstrap/zig/src/stage1/zig0.cpp:477

This happen when trying to compile the empty_workaround function:

zig/lib/std/target.zig

Lines 648 to 650 in 5466e87

pub fn empty_workaround() Set {
return Set{ .ints = [1]usize{0} ** usize_count };
}

IIUC we are transforming this function of 0 arguments to a function that takes a pointer to return the value, then we are trying to add an attribute to this new parameter, but we crash because we are writing to the attribute array at a non exisiting offset.
To "fix" this I've changed the index in ZigLLVMSetCallSret to 0 instead of 1. Looking at downstream code it look like they are also offsetting the offset, so maybe that could explain the off by one error ? I need to check that the attribute ends up on the correct argument, but this allows me to go further.

void ZigLLVMSetCallSret(LLVMValueRef Call, LLVMTypeRef return_type) {
    CallInst *call_inst = unwrap<CallInst>(Call);
    Type *llvm_type = unwrap<Type>(return_type);
    // fixed to 0 ???
    call_inst->addParamAttr(1, Attribute::getWithStructRetType(call_inst->getContext(), llvm_type));
}

LLVM_NODISCARD AttributeList addParamAttribute(
      LLVMContext &C, unsigned ArgNo, Attribute::AttrKind Kind) const {
    return addAttributeAtIndex(C, ArgNo + FirstArgIndex, Kind);
  }

gwenzek added a commit to gwenzek/zig that referenced this issue Mar 28, 2022
gwenzek added a commit to gwenzek/zig that referenced this issue Mar 29, 2022
@gwenzek
Copy link
Contributor Author

gwenzek commented Apr 1, 2022

I think my change of attribute argument was indeed correct.
The rest of the code seems to consider the struct return value as being the first argument (see want_first_arg_sret)

I also had to change the corresponding AddSretArg to also modify the first argument.
WIth this changes I can know compile stage1 with zig0. But the resulting zig binary segfaults when trying to build stage2.

I need to run the tests for the stage1 and see what's going on, but enough for today.

@gwenzek
Copy link
Contributor Author

gwenzek commented May 4, 2022

Relevant video from Nikita Popov Opaque Pointers Are Coming

My own take on the video:

  • Typed pointer provide no semantic information (or redundant)
  • Typed pointer uses memory
  • Typed pointer often requires bitcasting between different typed pointers
  • Those bitcasting need to be ignored by the optimizations passes and most of them don't
  • Self refrencing struct types become simpler to instantiate for LLVM: struct T { *T } become simply a struct T { ptr }
  • Typed pointers are deprecated and "best-effort" starting from LLVM15 and will be removed in LLVM16 (fixed based on nikic comment)

Zig LLVM-backend isn't too much impacted AFAICT because we already store the pointee type in ZIR, we just need to pass the type down to LLVM. The "hard" part is just to pass the correct type in the correct place. And we should also be able to remove some pointer bitcast instructions.

The transition should be smooth:

  1. replace all untyped Load/GEP by their corresponding typed version by passing the correct pointee type.
    LLVM will assert that the receive type is the same than the pointer type (and therefore that we didn't change the semantic)
  2. replace pointer type creation to opaque pointers (at which point there won't be LLVM asserts)
  3. remove pointer to pointer bitcasts

@gwenzek
Copy link
Contributor Author

gwenzek commented May 8, 2022

Status update, after replacing most LLVMBuildLoad by LLVMBuildLoad2 I was able to run 971 behavior tests.

#11587 prevents me from running the vector related tests: (vector.zig, select.zig, shuffle.zig and math.zig)

There also seems to be some changes to Assembly in LLVM14 preventing me from running asm.zig and forcing me to rewrite x86BitRmw to a simple implementation not using inline assembly.

I also had to replace the call to zig ld.lld by a direct call to ld.lld.

Next steps for me:

  • understand the changes to assembly
  • finish replacing the calls to the deprecated LLVM C API
  • remove unnecessary bitcasts between pointers of different types

Then I'll dig deeper into the ld.lld regression.

@nikic
Copy link

nikic commented May 12, 2022

Typed pointers are deprecated and "best-effort" starting from LLVM14 and will be removed in LLVM15

This is off-by-one: Opaque pointers are the default in LLVM 15, and typed pointers will be removed in LLVM 16. LLVM 14 doesn't have production-ready opaque pointers support yet.

The deprecations in LLVM 14 are preparatory work -- the new APIs are compatible with both typed and opaque pointers, while the old ones only support typed pointers.

@gwenzek
Copy link
Contributor Author

gwenzek commented May 19, 2022

I've hit two issues which are related to a change of behavior between LLVM 13 and LLVM 14.

First one I believe it's a bug on LLVM side related to constant vector of bools. I've opened an issue there: llvm/llvm-project#55522

The second one is about the saturating left shift arithmetic (implemented in #9679)
From LLVM language ref doing a saturating left shift of 8 on an u8 results in a poison value.
But the tests expects a result of 255:

try expect((lhs_var <<| 8) == 255);

Under LLVM13, (8 <<| 8) was returning 255 like in the test case, but under LLVM 14 I observe other values.

To reproduce:

sat.ll

; Function Attrs: nofree nosync nounwind readnone speculatable willreturn
declare i8 @llvm.ushl.sat.i8(i8 %0, i8 %1) #3

define i8 @main() {
Entry:
  %lhs_var = alloca i8, align 1
  store i8 1, i8* %lhs_var, align 1
  %0 = load i8, i8* %lhs_var, align 1
  %1 = call i8 @llvm.ushl.sat.i8(i8 %0, i8 7)
  ret i8 %1
}
$ ~/Downloads/clang+llvm-13.0.1-x86_64-linux-gnu-ubuntu-18.04/bin/lli sat.ll ; echo $?
255

$ ~/Downloads/clang+llvm-14.0.0-x86_64-linux-gnu-ubuntu-18.04/bin/lli sat.ll ; echo $?
136

So what should be the Zig behavior for (lhs_var_u8 <<| 8) ? I guess in the case of constant we could detect it at comptime but what about vars ?

Tagging @andrewrk who approved #1284 and @travisstaloch who worked on the implementation.

@andrewrk
Copy link
Member

Thanks for looking into this @gwenzek.

So what should be the Zig behavior for (lhs_var_u8 <<| 8) ? I guess in the case of constant we could detect it at comptime but what about vars ?

The Zig language definition of saturating arithmetic is that there is no possibility of undefined or poison. Any kind of overflow or large value will saturate the destination type. In cases where that cannot lower directly to LLVM saturating arithmetic, the LLVM backend must emit different instructions that maintain the Zig semantics. In this case, the lowering will need to be different depending on whether the RHS is comptime known or not. If it is comptime known and the value does not exceed the number of bits of the LHS, then it can be lowered with llvm.ushl.sat. Otherwise, it will have to be lowered as something like this:

%a = call i8 @llvm.ushl.sat.i8(i8 %lhs, i8 %rhs)
%b = icmp gt i8 %rhs, 8
%result = select i1 %b, i8 255, i8 %a

I suspect when using the LLVM IR Builder API, it will do the comptime logic for us automatically with basic constant folding.

@gwenzek
Copy link
Contributor Author

gwenzek commented Jun 27, 2022

I've been working a bit more on this. The result are visible on gwenzek#4
The opaque pointer change is taking me some time because I need to understand every piece of code to see what is the correct type of each pointer and what's the best way to fetch the type information.
I'm also making heavy use of gdb, but the cycle compile / load in gdb isn't very fast especially with my new laptop.
I think I can handle it, but I also feel we should start thinking how to not block Zig 0.10 on this.
I don't think the opaque pointer stuff is a huge problem, just compiling Zig with -Wno-deprecated-declarations works.

But there are more important / harder blocker where I could need help.

  1. Saturating bit shift changes: we can't rely on ushl.sat anymore, we have to roll out our own (see Andrew previous message)
    is probably easy and is also easy for anyone to grab and implement. You don't need to use my branch, you can implement this in master and checks that the generate LLVM IR looks good.

  2. Inline assembly seems to be working differently: I had to disable behavior/asm.zig and rollback some changes in x86BitRmw. I was hoping to find documentation about the inline ASM changes, but didn't found any. If someone has pointers I'd appreciate it. LLVM14 release notes are silent on the subject.

  3. In my branch zig ld.lld behaves differently from ld.lld, I probably need to build lld with debug symbols to be able to understand what's going on. I'd be curious to see if I'm the only one else is seeing this issue.

  4. LLVM14.05 generates huge .rodata section when it sees constant vector of u1, u2, u4. Regression: huge .rodata section generated because of constant vector of bool. llvm/llvm-project#55522 This tends to create issues when compiling behavior/vector.zig and probably to build stage2 but I haven't checked.

@andrewrk andrewrk added frontend Tokenization, parsing, AstGen, Sema, and Liveness. enhancement Solving this issue will likely involve adding new logic or components to the codebase. labels Jun 28, 2022
@andrewrk andrewrk changed the title LLVM14: Opaque pointer changes Upgrade to LLVM 14 Jun 28, 2022
@andrewrk
Copy link
Member

andrewrk commented Jul 2, 2022

I pushed a few commits to the llvm14 branch. It's now synced to master, it has CPU features updated, and I believe I fixed 2 and 3 in your list above.

The biggest problem I'm dealing with now is getting a ton of these:

ld.lld: error: ../test/zig-cache/o/898ecf677898a0bb3f352d937beeea76/test.o:(function std.heap.PageAllocator.alloc: .text+0x320): relocation R_X86_64_32S out of range: 8593863000 is not in [-2147483648, 2147483647]
>>> referenced by heap.zig:323 (/home/andy/dev/zig/lib/std/heap.zig:323)

when running the behavior tests. Is this your issue (4) above?

Unfortunately for llvm/llvm-project#55522 we did not follow one crucial procedure which is ask the release manager to add the issue into the 14 milestone. So they are now done with 14 and will not be releasing any more bug fix releases. We have to live with this regression now.

@gwenzek
Copy link
Contributor Author

gwenzek commented Jul 2, 2022

Wow thanks a lot Andrew for your help.

I believe that the relocation errors you're seeing are a symptom of llvm/llvm-project#55522
If you disable vector.zig and math.zig behaviort tests you shouldn't see those anymore.

I'm sorry about not following the procedure, but I didn't knew about it. I spent some time to reduce the issue and make it easy to reproduce.

How long as 14.0.6 been out? It does seem something important enough, can we reach out to the release manager ? Who are they?

@andrewrk
Copy link
Member

andrewrk commented Jul 2, 2022

Wow thanks a lot Andrew for your help.

No problem! It is me who should be thanking you since I looked at your diff for inspiration, especially on the SRet one.

I believe that the relocation errors you're seeing are a symptom of llvm/llvm-project#55522 If you disable vector.zig and math.zig behaviort tests you shouldn't see those anymore.

Good to know, thank you!

I'm sorry about not following the procedure, but I didn't knew about it. I spent some time to reduce the issue and make it easy to reproduce.

This is entirely my fault. I was overwhelmed with trying to polish the self-hosted compiler that I dropped the ball on the LLVM 14 upgrade.

How long as 14.0.6 been out? It does seem something important enough, can we reach out to the release manager ? Who are they?

Here is a conversation I had on IRC with Tom Stellard:

<andrewrk> now that we're on GitHub do we have a release blocker issue like we did with bugzilla?
<tstellar> andrewrk: We use milestones, but we are done with 14.x releases.
<andrewrk> rip. yeah I was a bit late catching up on this one
<andrewrk> will try to make it in time for 15 release candidates
<andrewrk> can I still request an issue to be a release blocker for 15? https://github.com/llvm/llvm-project/issues/55522
<andrewrk> if you trust me with github perms I would be responsible with labels & milestones
<tstellar> andrewrk: I can create the milestone.
<tstellar> andrewrk: https://github.com/llvm/llvm-project/milestone/11
<tstellar> andrewrk: I think any members of the LLVM organization can create labels.
<andrewrk> I think I am not a member of the LLVM organization
<andrewrk> hmm, it does show me in there
<andrewrk> I'm listed as only on team bz-contributors
<tstellar> andrewrk: OK, I think you need write access to the repo, do you have commit access?
<andrewrk> I do not
<tstellar> andrewrk: You can email Chris to get it: https://llvm.org/docs/DeveloperPolicy.html#obtaining-commit-access
<andrewrk> on it, thanks

After this I sent an email to Chris Lattner asking for permissions to edit milestones in the LLVM organization.

I think that we are stuck with LLVM as it is until at least 15.0.0. My suggestion is that we should work around this bug in LLVM by lowering vector constants as if they were array constants.

@gwenzek
Copy link
Contributor Author

gwenzek commented Jul 2, 2022

I think that we are stuck with LLVM as it is until at least 15.0.0. My suggestion is that we should work around this bug in LLVM by lowering vector constants as if they were array constants.

OK, I'll try that. I'm moving next week so I won't have a lot of time for Zig in the next 10 days.

@andrewrk
Copy link
Member

andrewrk commented Jul 2, 2022

I hope the move goes well! I think in the meantime, I will take on this task and drive it to completion.

@JacekJagosz
Copy link

Sorry, might be an uninformed comment here. But I think even if patches to fix zig on LLVM 14 can't get accepted into next point release, I am sure some distro maintainers would still include include them, so they can get rid of LLVM 13.
On Solus zig is the last blocker, and Arch has llvm 13 just for zig too.
So even if they won't get into main release of LLVM, they will certainly get used to. But I know there are still a few bugs to fix.

@gwenzek
Copy link
Contributor Author

gwenzek commented Jul 19, 2022

Congrats @andrewrk on your work here ! IIUC LLVM14 migration is over, but I still need to work on the opaque pointer deprecation. I will create a new issue for that. Is there a list of LLVM bugs we want to be fixed for LLVM 15 (branch cutoff is next week) ?

@nektro
Copy link
Contributor

nektro commented Jul 19, 2022

we've been contributing to the official list now here https://github.com/llvm/llvm-project/milestone/11

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend-llvm The LLVM backend outputs an LLVM IR Module. enhancement Solving this issue will likely involve adding new logic or components to the codebase. frontend Tokenization, parsing, AstGen, Sema, and Liveness.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants