Severe memory leak in beta.470 and beyond crashes Traefik server #462
Comments
Hi @tooda02
Here is traefik.toml
I have confirmed that the leak was introduced by PR #305.
OK, thanks. From my tests, it seems the leak is not due to the Marathon backend. Still investigating.
@tooda02 if possible, it would be helpful if, during a leak, you could capture a goroutine dump (for example by sending SIGQUIT to the Traefik process) and attach the output here.
Thanks for your help :)
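For reference, here is a minimal standalone sketch (not Traefik code) of how a Go process can write a goroutine dump on demand without exiting; the use of SIGUSR1 and the pprof "goroutine" profile here are just illustrative choices. Simply sending SIGQUIT works too, since the Go runtime prints goroutine stacks and exits when it receives that signal, which is what the dumps attached below come from.

```go
package main

import (
	"os"
	"os/signal"
	"runtime/pprof"
	"syscall"
	"time"
)

func main() {
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGUSR1)

	go func() {
		for range sigs {
			// debug=2 prints a full stack for every goroutine, similar to
			// the dump the runtime produces when the process gets SIGQUIT.
			pprof.Lookup("goroutine").WriteTo(os.Stderr, 2)
		}
	}()

	for { // stand-in for the real server loop
		time.Sleep(time.Hour)
	}
}
```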
Here is SIGQUIT output from killing the published beta.470 binary after about 10 minutes of execution; it was already showing 50% more memory usage than a pre-beta.470 version that had been running for days. Prior to this, I also tried running a binary built from beta.470 source with the marathon.go changes reverted. It still showed the leak, so you're likely right that it's not Marathon.
Here is a second dump, this time after running for about two hours, with the Traefik process using roughly ten times as much memory as in the previous dump.
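Purely as an illustration (this is not anything in Traefik or in our deployment), a small loop like the following logs heap growth and goroutine counts from inside the process, if anyone wants to corroborate the externally observed numbers in a test build:

```go
package main

import (
	"log"
	"runtime"
	"time"
)

func main() {
	// Log heap usage and goroutine count every 30 seconds so growth
	// between two dumps is visible over time.
	for range time.Tick(30 * time.Second) {
		var m runtime.MemStats
		runtime.ReadMemStats(&m)
		log.Printf("heap=%d MiB goroutines=%d", m.HeapAlloc>>20, runtime.NumGoroutine())
	}
}
```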
Thanks a lot @tooda02 :)
I've tested PR #464 (Fix memory leak in listenProviders) on the system where I encountered the problem, and I no longer see a memory leak. Thanks very much for the quick and effective response.
For those who want to test the fix from PR #464, you can grab the docker image:
@emilevauge yes, tested & it looks 👍
Cisco recently attempted to deploy rc2 in production and was forced to back off due to a severe memory leak that brought down the entire VM running Traefik after about 14 hours. We've done some research and found that the leak was introduced in v1.0.0-beta.470 (Merge pull request #305 from containous/fix-races). Here is a comparison between beta.470 and beta.453; beta.453 does not exhibit the leak.
The issue may be Marathon-related. We are using the Marathon provider, and its configuration currently refreshes very frequently because some of the tasks it runs are unstable. We currently suspect PR #305 (Fix races), but are still researching to narrow it down. I will update this issue with additional information as we find it.
This issue is a complete showstopper for us.
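To make the suspected failure mode concrete, here is a purely hypothetical sketch. The listenProviders name echoes the function mentioned in PR #464, but the body, configMessage type, and numbers below are invented for illustration and are not Traefik's actual code: if the configuration listener keeps a goroutine (or other state) alive for every message it receives, then a backend that refreshes its configuration very frequently makes memory grow without bound.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// configMessage is a stand-in for a parsed provider configuration.
type configMessage struct{ payload [1 << 16]byte }

// listenProviders illustrates the leak pattern: one goroutine per message,
// each blocked on a signal that is never sent, so every goroutine (and the
// message it captured) stays reachable forever.
func listenProviders(configs <-chan configMessage) {
	for msg := range configs {
		go func(m configMessage) {
			<-make(chan struct{})
		}(msg)
	}
}

func main() {
	configs := make(chan configMessage)
	go listenProviders(configs)

	// Simulate a backend whose configuration refreshes very frequently.
	for i := 0; i < 1000; i++ {
		configs <- configMessage{}
	}
	time.Sleep(100 * time.Millisecond)

	// The goroutine count climbs with every refresh and never comes back down.
	fmt.Println("goroutines:", runtime.NumGoroutine())
}
```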