Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix sandbox cleanup #1805

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Conversation

fcrisciani
Copy link

Driver and Sanbox have 2 different stores where the endpoints are saved
It is possible that the 2 store go out of sync if the endpoint is added to the driver
but there is a crash before the sandbox join.
On restart now we take the list of endpoints from the network and we assign
them back to the sandbox

Signed-off-by: Flavio Crisciani flavio.crisciani@docker.com

Driver and Sanbox have 2 different stores where the endpoints are saved
It is possible that the 2 store go out of sync if the endpoint is added to the driver
but there is a crash before the sandbox join.
On restart now we take the list of endpoints from the network and we assign
them back to the sandbox

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
for _, ep := range epl {
ep, err = n.getEndpointFromStore(ep.id)
if err != nil {
logrus.Warnf("Could not get endpoint in network %s during sandbox cleanup: %v", n.name, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe logrus.Warnf("Could not get endpoint %s in network %s during sandbox cleanup: %v", ep.id, n.name, err)

@vieux
Copy link
Contributor

vieux commented Aug 11, 2017

besides a small nit, LGTM

lmbarros pushed a commit to balena-os/balena-libnetwork that referenced this pull request Apr 17, 2023
Driver and Sanbox have 2 different stores where the endpoints are saved
It is possible that the 2 store go out of sync if the endpoint is added to the driver
but there is a crash before the sandbox join.
On restart now we take the list of endpoints from the network and we assign
them back to the sandbox

(This is a balena cherry-pick of moby#1805)

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
lmbarros pushed a commit to balena-os/balena-libnetwork that referenced this pull request Apr 18, 2023
Driver and Sandbox have 2 different stores where the endpoints are saved
It is possible that the 2 store go out of sync if the endpoint is added to the driver
but there is a crash before the sandbox join.
On restart now we take the list of endpoints from the network and we assign
them back to the sandbox

(This is a balena cherry-pick of moby#1805)

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
lmbarros added a commit to balena-os/balena-engine that referenced this pull request Apr 18, 2023
This new version has a patch cherry-picked from here:
moby/libnetwork#1805

This patch is meant to avoid cases in which libnetwork internal state
gets inconsistent in case of crashes.

Signed-off-by: Leandro Motta Barros <leandro@balena.io>
Change-type: patch
lmbarros added a commit to balena-os/balena-engine that referenced this pull request Apr 20, 2023
This new version has a patch cherry-picked from here:
moby/libnetwork#1805

This patch is meant to avoid cases in which libnetwork internal state
gets inconsistent in case of crashes.

Signed-off-by: Leandro Motta Barros <leandro@balena.io>
Change-type: patch
@lmbarros
Copy link

Hi folks! Not sure what are your plans regarding this PR, but let me share some positive experience we had with it at balena:

  • Historically, we saw the port already allocated #1790 issue occasionally in some of our users. Recently, however, one customer reported quite frequent occurrences.
  • We fixed the conflicts with recent libnetwork versions. Small stuff, no big deal.
  • We did some work to reproduce the issue and test this patch, with good results. This is described here.
  • So we decided to integrated this PR into balenaEngine.
  • This has been working fine for just over a month now. The one user that was seeing this issue frequently hasn't seen any other case since upgrading.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants