Support graceful shutdown of haproxy #156

Open
wants to merge 4 commits into
base: master

drewrobb

The purpose of this feature is to allow bamboo to shut down haproxy gracefully in response to a SIGTERM. In my particular use case, we have bamboo running in docker behind an AWS ELB. The goal is to expose a health check that lets the ELB remove bamboo from rotation before bamboo actually exits, so that we can redeploy bamboo without losing any requests. My particular way of shutting down bamboo is to run docker stop on the container. By default this sends a SIGTERM followed by a SIGKILL 10 seconds later, so a GraceSeconds value below 10s is reasonable, but the value should be large enough for an upstream balancer to detect that bamboo is unhealthy. Some changes to the dockerization were necessary so that bamboo would actually receive the signal: child processes of bash or sh need to be run with 'exec'.
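
The 'exec' point can be illustrated with a small shell sketch. It is not taken from this PR's Dockerfile, and `start_service` and `demo_exec_keeps_pid` are hypothetical helper names. When a wrapper shell execs its final command, that command takes over the shell's PID, so a signal sent to that PID (as docker stop does) reaches the command directly.

```shell
#!/bin/sh
# Hypothetical entrypoint helper (not from this PR): replace the wrapper
# shell with the real command so it keeps the shell's PID and receives
# SIGTERM directly instead of the signal stopping at the shell.
start_service() {
  exec "$@"
}

# Demonstration: print the wrapper shell's PID, then exec a second shell
# that prints its own PID. With exec, both lines show the same PID.
demo_exec_keeps_pid() {
  sh -c 'echo $$; exec sh -c "echo \$\$"'
}
```

An entrypoint script would end with a call such as `start_service "$@"`, where the arguments are whatever command the container is meant to run.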

I've been testing this by running the container, then running something like:

while true; do curl --connect-timeout 2 --max-time 2 localhost:2000/health  -sL -w "%{http_code} %{time_total}  " -o /dev/null; echo $(($(date +%s%N)/1000000)); sleep 0.2; done

Then I run docker stop $(docker ps | grep bamboo | awk '{print $1}'). The HTTP status should change from 200 to 503 for 5 seconds.

I'm not sure if people would want this on by default, but GraceSeconds is configurable, and setting it to 0 allows immediate exit. Also, port 2000 is used for health checking. This could be problematic when not running in docker, so maybe my changes to the haproxy_template should be commented out by default.

@@ -34,6 +36,24 @@ defaults
errorfile 504 /etc/haproxy/errors/504.http


frontend graceful_stop_check
Author

I couldn't think of a way to get rid of this extra frontend just for checking if shutdown is happening

@drewrobb
Author

There is a tiny issue here: when using GraceSeconds, the old haproxy process will continue to bind port 80 after a restart. In this case the kernel distributes requests between the processes rather than sending them only to the newest process, as we would want. If servers change sufficiently quickly, you might get 503s. I'm looking at a workaround: sending SIGTTOU and SIGUSR1 to the old haproxy PID to force it to unbind after restarting with the -sf option. The haproxy docs say that this should be necessary, but I'm seeing otherwise.
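
A minimal sketch of that workaround, assuming the old haproxy PID is already known (for example, read from a pidfile before the restart). `release_old_haproxy` is a hypothetical helper name and is not part of this PR. In haproxy, SIGTTOU asks a process to pause its listeners, and SIGUSR1 asks it to stop once existing connections finish.

```shell
# Hypothetical helper (not from this PR): ask an old haproxy process to
# release its listening sockets and then exit once its connections drain.
release_old_haproxy() {
  old_pid=$1
  # Skip quietly if the process is already gone.
  kill -0 "$old_pid" 2>/dev/null || return 0
  kill -s TTOU "$old_pid"   # pause listeners so the port is released
  kill -s USR1 "$old_pid"   # soft stop: exit after existing connections finish
}
```

It would be called with the PID that was running before the reload, for example `release_old_haproxy "$old_pid"`.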

@j1n6
Contributor

j1n6 commented Aug 31, 2015

This is an interesting and valid use case. My only concern is that Bamboo should avoid shutting down HAProxy; keeping HAProxy running would help with upgrading and maintenance.

@timoreimann
Contributor

@drewrobb, IIUC, your intention is to provide a way to disable Bamboo smoothly for maintenance without any downtime. I'm just wondering whether you could tell ELB to take whatever Bamboo/HAProxy combo you want to run maintenance on out of balancing, thus avoiding any Bamboo-stopping-HAProxy control flows.

I am in no way familiar with ELB, so let me know if there's a blocker on the AWS end I am missing.

@drewrobb
Author

@timoreimann, yes, that is my intention. Your idea would work as well; I wanted to implement it this way so that I didn't have to worry about that process. In fact, I'm running bamboo on marathon as well (on a subset of mesos slaves), so I don't have any special procedure to decommission a mesos slave.

@activars it would be possible to have the signal handler shut down haproxy only on a SIGTERM, and just shut down bamboo on a SIGINT (although that convention would be a bit weird?). Another idea: have GraceSeconds = -1 by default, and in that case don't shut down haproxy, just shut down bamboo?
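
The GraceSeconds = -1 idea could follow a convention like the one sketched below. This only illustrates the proposed semantics and is not bamboo's implementation; `shutdown_action` and the action names it prints are hypothetical.

```shell
# Hypothetical sketch of the proposed GraceSeconds convention:
#   negative -> stop bamboo only and leave haproxy running
#   zero     -> stop haproxy immediately
#   positive -> report unhealthy for that many seconds, then stop haproxy
shutdown_action() {
  grace_seconds=$1
  if [ "$grace_seconds" -lt 0 ]; then
    echo "stop-bamboo-only"
  elif [ "$grace_seconds" -eq 0 ]; then
    echo "stop-haproxy-now"
  else
    echo "drain-${grace_seconds}s-then-stop-haproxy"
  fi
}
```

With a default of -1, existing deployments would keep haproxy running when bamboo exits, and the graceful shutdown would be opt-in.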

@timoreimann
Contributor

@drewrobb: How do you make sure that you do not lose any requests when Bamboo shuts down HAProxy (presumably gracefully) on the load balancer end? Does ELB come with some kind of mechanism to retransmit packets to other hosts if one is deemed unavailable?

@drewrobb
Author

@timoreimann I use the /health endpoint as defined in this PR as a health check for the ELB, with settings such that bamboo is marked unhealthy in less than GraceSeconds. I also made sure that the mesos setting docker_stop_timeout is large enough. Thus, the ELB will stop sending requests to bamboo well before it has shut down. It is important to note that during the shutdown process the bamboo instance keeps handling requests as usual; it just stops getting new requests from the ELB once it is marked unhealthy. This approach wouldn't work for long-running connections such as websockets, but it works for any request that takes less than a bounded amount of time (GraceSeconds minus the time it takes for bamboo to be marked unhealthy).
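
The timing argument can be written as simple arithmetic. This sketch assumes the balancer marks an instance unhealthy after a number of consecutive failed checks spaced a fixed interval apart, so detection takes roughly interval times threshold seconds. The function name and the sample numbers are illustrative, not values from this PR.

```shell
# Hypothetical helper: seconds left for in-flight requests after the
# upstream balancer has marked the instance unhealthy.
# Arguments: grace seconds, check interval (s), unhealthy threshold (count).
request_budget_seconds() {
  grace=$1
  interval=$2
  threshold=$3
  detect=$((interval * threshold))
  budget=$((grace - detect))
  # A negative budget means the grace period ends before detection completes.
  if [ "$budget" -lt 0 ]; then
    budget=0
  fi
  echo "$budget"
}
```

For example, `request_budget_seconds 30 5 2` prints 20: with a 30 second grace period and detection after two failed checks 5 seconds apart, requests that finish within about 20 seconds should complete before haproxy stops.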

@mlerner

mlerner commented Mar 1, 2016

This would be great to have, @drewrobb!

@KidkArolis

Cleaning up old PRs, feel free to reopen if still relevant.

@KidkArolis KidkArolis closed this Aug 24, 2016
@j1n6 j1n6 reopened this Sep 1, 2016