-
Notifications
You must be signed in to change notification settings - Fork 928
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for the recover()
builtin function
#2331
Conversation
// * The return value (eax, rax, r0, etc) is set to zero in the inline | ||
// assembly but set to an unspecified non-zero value when jumping using | ||
// a longjmp. | ||
asmType := llvm.FunctionType(b.uintptrType, []llvm.Type{b.deferFrame.Type()}, false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we sure that this is safe? As is, I think the compiler is still permitted to assume that this returns once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have tried several alternatives, and this was the most reliable one so far.
I've tried:
setjmp
at the beginning of a function. LLVM is smart enough to optimize away somedefer
instructions later after alongjmp
, so this is not usable.- Using exception handling instructions together with
blockaddress
. Basically, act as if we're unwinding as usual withinvoke
andlandingpad
, but instead of actually unwinding, return back to the function by capturing the stack pointer at function entry and doinglongjmp
by jumping back to the basic block with thelandingpad
instruction with a capturedblockaddress
. Doesn't work because LLVM moves thelandingpad
instruction to a different basic block after which the entire system falls down.
The only other alternative is using setjmp
directly, which would be much more expensive than it is already (because we need to do it before every function call that might panic).
I think this is safe because this is what LLVM says about returns_twice
:
This attribute indicates that this function can return twice. The C
setjmp
is an example of such a function. The compiler disables some optimizations (like tail calls) in the caller of these functions.
Tail calls are not relevant to inline assembly. I'm not 100% convinced, but it appears to be mostly safe.
(Sidenote: I might investigate using the callbr
instruction instead, which might be slightly less expensive by avoiding a compare and branch).
... actually, this gives me an idea. It might be possible to change this scheme to do the setjmp
after every defer instruction, instead of before every call. Not entirely sure whether that's safe though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does gollvm support defer/recover? How does it convince llvm to do the Right Thing? ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like gollvm uses full C++ exceptions. I don't think that is practical here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might start using C++ exceptions on some platforms (Windows, Linux, MacOS) but wanted to start with something simple and mostly portable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@niaow I'm afraid you are correct. It looks like the remaining issue is a tail call on RISC-V, which would probably have been prevented with a returns_twice
. Now thinking what the best solution to this is...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well that was easy. Apparently it's possible to add returns_twice
as a call site attribute, even to inline assembly.
}
asm := llvm.InlineAsm(asmType, asmString, constraints, false, false, 0)
result := b.CreateCall(asm, []llvm.Value{b.deferFrame}, "setjmp")
+ result.AddCallSiteAttribute(-1, b.ctx.CreateEnumAttribute(llvm.AttributeKindID("returns_twice"), 0))
isZero := b.CreateICmp(llvm.IntEQ, result, llvm.ConstInt(b.uintptrType, 0, false), "setjmp.result")
continueBB := b.insertBasicBlock("")
b.CreateCondBr(isZero, continueBB, b.landingpad)
This fixes the miscompilation on RISC-V.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Sidenote: I might investigate using the callbr instruction instead, which might be slightly less expensive by avoiding a compare and branch).
It seems like that might avoid more than just a compare and branch. I think something like this could theoretically work:
callbr void asm returns_twice "", "=~{rax},..."()
to label %fallthrough [label %indirect]
Seems a bit cursed though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what the code does there? I see no assembly.
My biggest hope is that I can manage to avoid clobbering all registers for every function call, and instead can hopefully do that in the %lpad
block. That will certainly improve code quality. But I'd like to do that later, first I'd like to get this PR actually working.
Thank you for working on this. Verified, works for my use case: reset system on panic. Stats for example code below
|
Rebased and fixed two bugs, hopefully the tests pass this time.
Note that this PR is only for when the code explicitly calls |
Oh, I see. Means I can’t use it yet, but thanks for the clarification! |
I've now also added a new flag: Feel free to suggest different names/values for this flag :) |
compiler/compiler_test.go
Outdated
@@ -56,6 +56,7 @@ func TestCompiler(t *testing.T) { | |||
{"float.go", "", ""}, | |||
{"interface.go", "", ""}, | |||
{"func.go", "", "coroutines"}, | |||
{"defer.go", "", ""}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we maybe do this on a target that can actually run recover so that we know that the code gen is actually working?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea.
3cb1da9
to
95011c6
Compare
For me, as an end user who knows little to nothing about compilers and tinygo internals, Will defer at all work with "none" value, but "recover()" will be no-op? In that case probably |
With I'm not sure about So in summary, before this PR and with
With
So in summary, the flag controls whether jumping back up the stack and running defers works when |
I see. Any benefit of making defer no-op, size-wise or else? If so, I can imagine
|
Not much. I don't think we should do that anyway, because it diverges from the Go spec too much. If the code size increase is too much, then you shouldn't use
I would like to add support for recovering from most runtime panics, such as nil pointer dereference, slice out of bounds, etc. My idea about the
But the only thing most people need to know is Alternative idea: add a possible value to the |
Add In microcontroller world, IoT specifically, it can be beneficial to have a possibility and make the tiny device restart on some unexpected state instead of hanging up. Even big OSes restart, sometimes, on hard faults :D |
What kind of issue were you hitting, a panic or a hard fault? Those two are similar but different. |
I believe runtime panics nowadays, like nil pointer. Mostly from network stack, I recon, wifinina driver, http client, all that. I did have hard faults all over due to lack of ram while prototyping on Nano 33 IoT. |
Hmm, yeah, I see.
Lack of RAM should not result in a hard fault! It should result in an out of memory panic. However, a stack overflow might not be immediately obvious (it's checked at every blocking operation, not right away) and result in weird behavior. |
Not really viable. Would need to have the debugger attached to the board for days and even then catching the error requires some luck, chances are it be non-reproducible in the lab at all... |
Hmm, yeah maybe we should change the default from hanging on panic to rebooting. But that's a different issue :) But what do you think of my |
I can make a separate ticket to keep things organised.
It's an improvement, for sure. |
Good point. At the moment, yes. But eventually, all panics (including nil pointer dereference panics) should be recoverable to match upstream Go. (But note that a HardFault is not a panic - at least not currently). |
What's |
A The upstream Go implementation does have |
I guess this is the subject of #891 |
So, how do we use this? What flag is implemented with what values? |
@ysoldak there is no flag, if it is supported on a given platform it is enabled. I might add a flag in the future to control it. |
@aykevl looks like possibly legit test failure here: https://github.com/tinygo-org/tinygo/runs/6834101399?check_suite_focus=true#step:19:51 |
Any chance to look into this @aykevl seems like so close! |
I have reverted the suggestion from @niaow to see whether that fixes the issue. I can't see how but if it does, it at least reduces the scope a lot to search for this issue. EDIT: it sadly does not. |
It did pass before. Not sure what happened in the interim... |
Converted to draft while I'm trying to figure out what's going on... |
Mystery solved. The problem only happens in CI because GitHub Actions merges the branch with the dev branch before testing. Rebasing the recover3 branch on top of the dev branch locally results in the same issue locally. Now I just need to figure out what changed recently in the dev branch to cause this issue... |
Found the offending commit: #2879 |
Interestingly #2884 also fails with a weird error in archive/zip, this could be related. |
All the globals are between _etext and _end. We were scanning only between _edata and _end, which mainly consists of the .bss section. Scanning from _etext ensures that the .data section is also included. This bug didn't result in issues in CI, but did result in a bug in the recover branch: #2331. This patch fixes this bug.
Found it! It's a very small bug with big implications: #2909 |
Previously we used to scan between _edata and _end. This is not correct: the .data section starts *before* _edata. Fixing this would mean changing _edata to _etext, but that didn't quite work either. It appears that there are inaccessible pages between _etext and _end on ARM. Therefore, a different solution was needed. What I've implemented is similar to Windows and MacOS: namely, finding writable segments by parsing the program header of the currently running program. It's a lot more verbose, but it should be correct on all architectures. It probably also reduces the globals to scan to those that _really_ need to be scanned. This bug didn't result in issues in CI, but did result in a bug in the recover branch: #2331. This patch fixes this bug.
llvm.AddBasicBlock should never be used. Instead, we should use the AddBasicBlock method of the current LLVM context. This didn't lead to any bugs... yet. But probably would, eventually.
For example, this commit moves the 'throw' branch of an assertion (nil check, slice index check, etc) to the end of the function while inserting the "continue" branch right after the insert location. This makes the resulting IR easier to follow. For some reason, this also reduces code size a bit on average. The TinyGo smoke tests saw a reduction of 0.22%, mainly from WebAssembly. The drivers repo saw little average change in code size (-0.01%). This commit also adds a few compiler tests for the defer keyword.
At last! Great work @aykevl and thank you everyone who helped get this PR completed. Now merging. |
Previously we used to scan between _edata and _end. This is not correct: the .data section starts *before* _edata. Fixing this would mean changing _edata to _etext, but that didn't quite work either. It appears that there are inaccessible pages between _etext and _end on ARM. Therefore, a different solution was needed. What I've implemented is similar to Windows and MacOS: namely, finding writable segments by parsing the program header of the currently running program. It's a lot more verbose, but it should be correct on all architectures. It probably also reduces the globals to scan to those that _really_ need to be scanned. This bug didn't result in issues in CI, but did result in a bug in the recover branch: #2331. This patch fixes this bug.
Not all architectures are supported yet. In particular, the current design doesn't support WebAssembly. Other architectures (avr, xtensa, riscv64) can be supported but are left as a TODO.
This PR does result in an increase in code size if either
defer
orpanic
is used. There is no way to turn this feature off with this PR, but I don't think we should do that for the following reasons:defer
orpanic
is used (especiallydefer
). If you really care about code size, you may want to avoid those.-gc
,-scheduler
).To give an idea of the code size change, here is the difference in binary size before and after the last commit of this PR:
What you can see here:
defer
orpanic
.defer
and /recover
.