Fix panic after UpdateAnnotation error #3612

scritchley · 2019-03-11T15:34:00Z

Fixes a panic in kube-utils. If there is an error calling (*configMapAnnotations).UpdateAnnotation it currently assigns an empty v1.ConfigMap to the receiver. This results in a panic when assigning to the annotations map on further updates.

The impact of this issue is fairly large as it results in the reclaim daemon exiting, which in turn results in dead peers not being removed. In a cluster that uses auto scaling this is a big problem.

The proposed fix checks for errors before assigning the ConfigMap and adds an additional nil check before making updates to the annotation map.

Panic attached:
kube-utils-panic.txt

bboreham · 2019-03-11T15:43:39Z

Hi @scritchley, and thanks for this PR.

I don't think this change is right - the code changes the local copy of the map then contacts the server, expecting the server will send back an up-to-date map. (And, as you say, it may send back nil, which is not what the code is expecting)

If you simply exit the function on error the map is now wrong, since it contains the "speculative" update.

Can I suggest you open an issue indicating the path that led to a crash, in case that suggests any other ways to address it.

scritchley · 2019-03-11T15:51:09Z

Ok, will do.

bboreham · 2019-03-19T10:35:22Z

Linking to issue: #3613

Do you think you will come back to this? If not, we will pick it up.

loshz · 2019-03-20T16:50:04Z

We are more than happy to help fix this if you want to have a discussion about potential solutions? :)

bboreham · 2019-03-20T17:04:24Z

Two possible routes:

set the map to nil on error, update all callers to re-fetch from source, perhaps update all uses of the map to error on nil instead of crashing
copy the map before changing it, so it can be left as it was on failure to update.

I think I prefer the first, but it's hard to be sure without seeing an implementation

loshz · 2019-03-20T17:39:50Z

Route 1 sounds good to me.

Could you clarify what you mean by:

update all callers to re-fetch from source

Do you mean every function that calls UpdateAnnotation? If so, how to envisage doing this?

bboreham · 2019-03-20T17:54:09Z

Like this for example:

weave/prog/kube-utils/peerlist.go

Lines 76 to 90 in 0d481ef

    
           err := cml.LoopUpdate(func() error { 
        
           	var err error 
        
           	list, err = cml.GetPeerList() 
        
           	if err != nil { 
        
           		return err 
        
           	} 
        
           	if !list.contains(peerName) { 
        
           		list.add(peerName, name) 
        
           		err = cml.UpdatePeerList(*list) 
        
           		if err != nil { 
        
           			return err 
        
           		} 
        
           	} 
        
           	return nil 
        
           })

Every time round, LoopUpdate() calls cml.Init() which re-fetches from the server.

bboreham · 2019-03-26T17:33:35Z

I'm going to close this in favour of #3623, for reasons given at #3612 (comment)

I didn't follow my own suggestion above to use LoopUpdate in #3623, because we want to abort a reclaim on a conflict error, not keep trying.

Many thanks for your interest, and I hope you will consider contributing again.

…ecedence (if there is any conflict in merging entry) over other opertaions on the entry that increments the version Fixes #3612

takes precedence (if there is any conflict in merging entry) over other operations on the entry that increments the version Fixes #3612

scritchley and others added 2 commits March 11, 2019 12:54

fixes subsequent panic after any UpdateAnnotation error

b3f2c98

kube-utils: check for nil Annotations

4cd7f06

scritchley changed the title ~~Fix panic after UpdateAnnotations error~~ Fix panic after UpdateAnnotation error Mar 11, 2019

bboreham closed this Mar 26, 2019

murali-reddy added a commit that referenced this pull request Apr 24, 2019

bump up the version by large value so that transfer of range takes pr…

e0aca6a

…ecedence (if there is any conflict in merging entry) over other opertaions on the entry that increments the version Fixes #3612

murali-reddy mentioned this pull request Apr 24, 2019

bump up the version of entry by large value so that transfer of range takes precedence #3635

Merged

murali-reddy added a commit that referenced this pull request Apr 26, 2019

bump up the version by large value so that transfer of range takes pr…

3a3ed6a

…ecedence (if there is any conflict in merging entry) over other opertaions on the entry that increments the version Fixes #3612

murali-reddy added a commit that referenced this pull request Apr 26, 2019

bump up the version of entry by large value so that transfer of range

873781a

takes precedence (if there is any conflict in merging entry) over other operations on the entry that increments the version Fixes #3612

bboreham added this to the n/a milestone May 16, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix panic after UpdateAnnotation error #3612

Fix panic after UpdateAnnotation error #3612

scritchley commented Mar 11, 2019 •

edited

Loading

bboreham commented Mar 11, 2019

scritchley commented Mar 11, 2019

bboreham commented Mar 19, 2019

loshz commented Mar 20, 2019

bboreham commented Mar 20, 2019

loshz commented Mar 20, 2019

bboreham commented Mar 20, 2019

bboreham commented Mar 26, 2019

Fix panic after UpdateAnnotation error #3612

Fix panic after UpdateAnnotation error #3612

Conversation

scritchley commented Mar 11, 2019 • edited Loading

bboreham commented Mar 11, 2019

scritchley commented Mar 11, 2019

bboreham commented Mar 19, 2019

loshz commented Mar 20, 2019

bboreham commented Mar 20, 2019

loshz commented Mar 20, 2019

bboreham commented Mar 20, 2019

bboreham commented Mar 26, 2019

scritchley commented Mar 11, 2019 •

edited

Loading