Exit Code 137 for newest release #11

Closed
moritz31 opened this issue Nov 27, 2019 · 4 comments
Labels: bug (Something isn't working)

Comments

@moritz31

We are triggering Fargate tasks with a step function. With the newest Fluent Bit container this fails, because the container always exits with exit code 137 when the actual container in the task finishes.
With the previous version, 1.3.2, this is not a problem.

@PettitWesley
Contributor

@moritz31 I've reproduced this. It is very strange: 137 usually means Out of Memory, AFAIK.

PettitWesley self-assigned this Dec 3, 2019
PettitWesley added the bug (Something isn't working) label Dec 3, 2019
@PettitWesley
Contributor

PettitWesley commented Dec 6, 2019

I think I've got a fix for this.

One key thing to note: the old behavior (afaict) was that the container would always exit with code 0, even if Fluent Bit exited with a non-zero code.

@moritz31 Is that what you've seen? My fix will restore that old behavior: the container will (afaict) always exit 0.

Edit: It seems that with 1.3.2 the container exits with 0 if it receives a SIGTERM and shuts down gracefully. If it does not (and Docker has to send it a SIGKILL), it exits with 137.

@PettitWesley
Contributor

PettitWesley commented Dec 12, 2019

I think I partly understand this.

137 means that the container did not gracefully handle the SIGTERM, and docker had to SIGKILL it: http://tldp.org/LDP/abs/html/exitcodes.html
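
For context, 137 is 128 + 9, and 9 is the signal number of SIGKILL, so the code follows the shell convention of reporting "killed by signal N" as 128 + N. The sketch below is only an illustration under assumptions, not anything from the Fluent Bit image: `GRACEFUL_CHILD` and `container_style_exit_code` are hypothetical names, the child is a stand-in for a process with SIGTERM handling built in, and a POSIX system is assumed.

```python
import signal
import subprocess
import sys

# Hypothetical stand-in for a process with SIGTERM handling built in:
# it installs a handler, announces that it is ready, then waits.
GRACEFUL_CHILD = """
import signal, sys, time
signal.signal(signal.SIGTERM, lambda signum, frame: sys.exit(0))
print("ready", flush=True)
time.sleep(30)
"""


def container_style_exit_code(sig):
    """Start the child, send it `sig`, and return the exit code in shell style."""
    child = subprocess.Popen(
        [sys.executable, "-c", GRACEFUL_CHILD],
        stdout=subprocess.PIPE,
        text=True,
    )
    child.stdout.readline()  # block until the SIGTERM handler is installed
    child.send_signal(sig)
    rc = child.wait()
    child.stdout.close()
    # Popen reports death by signal N as -N; shells report it as 128 + N.
    return 128 - rc if rc < 0 else rc


if __name__ == "__main__":
    print(container_style_exit_code(signal.SIGTERM))  # 0: handled gracefully
    print(container_style_exit_code(signal.SIGKILL))  # 137: cannot be handled
```

A SIGTERM that reaches the process lets it exit 0, while a SIGKILL cannot be caught and surfaces as 137, which matches the two outcomes described for 1.3.2 above.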

Normally, when Fluent Bit receives a SIGTERM, it prints a warning that it will quit soon; it has SIGTERM handling built in. I checked my logs and noticed that with 2.0.0 this does not happen. This means that something about how I set up the entry point script prevents the signals from being passed to it properly.

The exec -c I added in #14 fixes the signal passing somehow.
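
One general reason an exec in an entry point can help with signal passing: exec replaces the launcher process with the program instead of starting it as a child, so the program keeps the launcher's PID and receives signals sent to that PID directly. The sketch below illustrates only that general behavior in Python; it is not the shell change from #14, and `PROGRAM`, `ENTRYPOINT`, and `pids_seen` are hypothetical names. A POSIX system is assumed.

```python
import subprocess
import sys

# Hypothetical stand-in for the real program: it just reports its own PID.
PROGRAM = "import os; print(os.getpid())"

# Hypothetical entry point: reports its PID, then replaces itself with PROGRAM
# via exec instead of spawning PROGRAM as a child process.
ENTRYPOINT = """
import os, sys
print(os.getpid(), flush=True)
os.execv(sys.executable, [sys.executable, "-c", sys.argv[1]])
"""


def pids_seen():
    """Return (entry point PID, program PID) from one launch."""
    result = subprocess.run(
        [sys.executable, "-c", ENTRYPOINT, PROGRAM],
        capture_output=True,
        text=True,
        check=True,
    )
    first, second = result.stdout.split()
    return int(first), int(second)


if __name__ == "__main__":
    entrypoint_pid, program_pid = pids_seen()
    # Both PIDs match because exec reuses the calling process.
    print(entrypoint_pid == program_pid)
```

Because the PID is unchanged, a signal aimed at the entry point lands on the program itself, with no intermediate process that would have to forward it.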

@allisaurus

Fix released in v2.1.0
