shutdown gracefully without os.Exit #7480
Codecov Report
@@ Coverage Diff @@
## master #7480 +/- ##
==========================================
- Coverage 55.01% 54.82% -0.19%
==========================================
Files 448 597 +149
Lines 30321 37996 +7675
==========================================
+ Hits 16681 20833 +4152
- Misses 11994 15049 +3055
- Partials 1646 2114 +468
Makes sense to me. utACK.
I don't quite follow why this approach is better. Can you provide concrete examples? Having a single signal trap that encapsulates all cleanup logic is preferable to me.
The method we use to collect test coverage during external testing is described here. The example code is here. Basically, we wrap the main function in a test function:
func TestBincoverRunMain(t *testing.T) {
	bincover.RunTest(main)
}
And built it with
Is there a way in TrapSignal to ensure you capture the coverage report?
Report generation is handled by Go internals after the test functions return. What I currently observe is that it's not generated with
I see. What if we added a time/sleep before os.Exit?
I think it won't work unless the main function returns normally, because otherwise the wrapping code doesn't get a chance to run.
And it'd be very, very ugly...
server/start.go
Outdated
	if tmNode.IsRunning() {
		_ = tmNode.Stop()
	}

	if apiSrv != nil {
		_ = apiSrv.Close()
	}
	if cpuProfileCleanup != nil {
		cpuProfileCleanup()
	}

	if grpcSrv != nil {
		grpcSrv.Stop()
	}
	if apiSrv != nil {
		_ = apiSrv.Close()
	}

	ctx.Logger.Info("exiting...")
	})
	if grpcSrv != nil {
		grpcSrv.Stop()
	}
Ok can we at least group these in a single cleanup function?
Made it into a deferred function now.
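As a rough illustration of the request above, cleanup grouped into a single deferred function might look like the sketch below. The `service` type, the `run` function, and the event log are hypothetical stand-ins for `tmNode`, `apiSrv`, and `grpcSrv`; this is not the code from the PR.

```go
package main

import "fmt"

// service is a hypothetical stand-in for tmNode, apiSrv and grpcSrv.
type service struct{ name string }

// Stop pretends to shut the service down and records that it did so.
func (s *service) Stop(log *[]string) {
	*log = append(*log, "stopped "+s.name)
}

// run starts the (fake) services, blocks in wait, and relies on a single
// deferred function to release everything on every return path.
func run(wait func(), log *[]string) {
	node := &service{name: "node"}
	api := &service{name: "api"}
	var grpc *service // nil: the gRPC server is disabled in this sketch

	defer func() {
		node.Stop(log)
		if api != nil {
			api.Stop(log)
		}
		if grpc != nil {
			grpc.Stop(log)
		}
		*log = append(*log, "exiting...")
	}()

	wait() // in the PR, this is where the process waits for SIGINT/SIGTERM
}

func main() {
	var log []string
	run(func() {}, &log)
	fmt.Println(log)
}
```

Because the cleanup lives in a `defer`, it runs whenever `run` returns, which is only possible if nothing on the way calls `os.Exit`.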
server/start.go
Outdated
	if tmNode.IsRunning() {
		_ = tmNode.Stop()
	}
	WaitForQuitSignals()
We're not using the return value here.
I added the _ = now.
LGTM. But @alessio, don't we want to exit with the status code resulting from WaitForQuitSignals?
we might need a
That'd be great
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGINT, syscall.SIGTERM)
	sig := <-sigs
	return ErrorCode{Code: int(sig.(syscall.Signal)) + 128}
❤️
Why add 128?
Because that is the conventional way to report termination by a signal on UNIX: the exit status is n+128, where n is the signal number.
Tests are failing - looks great otherwise
	"github.com/cosmos/cosmos-sdk/simapp/simd/cmd"
)

func main() {
	rootCmd, _ := cmd.NewRootCmd()
	if err := cmd.Execute(rootCmd); err != nil {
		os.Exit(1)
		switch e := err.(type) {
Should we log an error as well?
Exit code + logging information already provided by the underlying goroutines should do it all.
The current error output on receiving SIGTERM looks like this:
Error: 143
Usage:
chain-maind start [flags]
Flags:
...
Maybe add an error string to ErrorCode?
looks good to me
* shutdown gracefully without os.Exit
* Update server/util.go
Co-authored-by: Alessio Treglia <quadrispro@ubuntu.com>
Co-authored-by: Alessio Treglia <alessio@tendermint.com>
Description
Change the signal handling to shut down gracefully rather than call os.Exit. The purpose is to give deferred functions or wrapper functions a chance to run after the server quits.
In our case in particular, we want to measure the test coverage of our externally run integration tests. To do that, we wrap the main function in a test function and rely on main returning normally when signaled with SIGINT/SIGTERM, so that the test binary can output the coverage report.