Skip to content
This repository has been archived by the owner on Jun 20, 2024. It is now read-only.

Fix panic after UpdateAnnotation error #3612

Closed
wants to merge 2 commits into from
Closed

Fix panic after UpdateAnnotation error #3612

wants to merge 2 commits into from

Conversation

scritchley
Copy link

@scritchley scritchley commented Mar 11, 2019

Fixes a panic in kube-utils. If there is an error calling (*configMapAnnotations).UpdateAnnotation it currently assigns an empty v1.ConfigMap to the receiver. This results in a panic when assigning to the annotations map on further updates.

The impact of this issue is fairly large as it results in the reclaim daemon exiting, which in turn results in dead peers not being removed. In a cluster that uses auto scaling this is a big problem.

The proposed fix checks for errors before assigning the ConfigMap and adds an additional nil check before making updates to the annotation map.

Panic attached:
kube-utils-panic.txt

@scritchley scritchley changed the title Fix panic after UpdateAnnotations error Fix panic after UpdateAnnotation error Mar 11, 2019
@bboreham
Copy link
Contributor

Hi @scritchley, and thanks for this PR.

I don't think this change is right - the code changes the local copy of the map then contacts the server, expecting the server will send back an up-to-date map. (And, as you say, it may send back nil, which is not what the code is expecting)

If you simply exit the function on error the map is now wrong, since it contains the "speculative" update.

Can I suggest you open an issue indicating the path that led to a crash, in case that suggests any other ways to address it.

@scritchley
Copy link
Author

Ok, will do.

@bboreham
Copy link
Contributor

Linking to issue: #3613

Do you think you will come back to this? If not, we will pick it up.

@loshz
Copy link

loshz commented Mar 20, 2019

We are more than happy to help fix this if you want to have a discussion about potential solutions? :)

@bboreham
Copy link
Contributor

Two possible routes:

  • set the map to nil on error, update all callers to re-fetch from source, perhaps update all uses of the map to error on nil instead of crashing
  • copy the map before changing it, so it can be left as it was on failure to update.

I think I prefer the first, but it's hard to be sure without seeing an implementation

@loshz
Copy link

loshz commented Mar 20, 2019

Route 1 sounds good to me.

Could you clarify what you mean by:

update all callers to re-fetch from source

Do you mean every function that calls UpdateAnnotation? If so, how to envisage doing this?

@bboreham
Copy link
Contributor

Like this for example:

err := cml.LoopUpdate(func() error {
var err error
list, err = cml.GetPeerList()
if err != nil {
return err
}
if !list.contains(peerName) {
list.add(peerName, name)
err = cml.UpdatePeerList(*list)
if err != nil {
return err
}
}
return nil
})

Every time round, LoopUpdate() calls cml.Init() which re-fetches from the server.

@bboreham
Copy link
Contributor

I'm going to close this in favour of #3623, for reasons given at #3612 (comment)

I didn't follow my own suggestion above to use LoopUpdate in #3623, because we want to abort a reclaim on a conflict error, not keep trying.

Many thanks for your interest, and I hope you will consider contributing again.

@bboreham bboreham closed this Mar 26, 2019
murali-reddy added a commit that referenced this pull request Apr 24, 2019
…ecedence

(if there is any conflict in merging entry) over other opertaions on the entry
that increments the version

Fixes #3612
murali-reddy added a commit that referenced this pull request Apr 26, 2019
…ecedence

(if there is any conflict in merging entry) over other opertaions on the entry
that increments the version

Fixes #3612
murali-reddy added a commit that referenced this pull request Apr 26, 2019
takes precedence (if there is any conflict in merging entry) over
other operations on the entry that increments the version

Fixes #3612
@bboreham bboreham added this to the n/a milestone May 16, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants