Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[sdk#918] Enable restore NSE name in setid #932

Closed

Conversation

Bolodya1997
Copy link

Description

Now setid checks if it is able to restore NSE name from registry instead of generating a new one.

Issue link

#918

How Has This Been Tested?

  • Added unit testing to cover
  • Tested manually
  • Tested by integration testing
  • Have not tested

Types of changes

  • Bug fix
  • New functionallity
  • Documentation
  • Refactoring
  • CI

Signed-off-by: Vladimir Popov <vladimir.popov@xored.com>
@Bolodya1997 Bolodya1997 marked this pull request as draft May 18, 2021 11:51
@edwarnicke
Copy link
Member

This looks good as far as it goes... but it feels like our basic underlying problem here is that the nse name that we are given by the NSE registering is not the one we actually put in the registry. I feel like we may need to simply respect the name we are given, and treat the issue of duplication as one of authorization from the upstream registry.

Please note in the K8s case, a good approach to this is likely to start with simply having the NSEs register themselves with their Pod name.

Thoughts?

@Bolodya1997
Copy link
Author

@edwarnicke

I feel like we may need to simply respect the name we are given, and treat the issue of duplication as one of authorization from the upstream registry.

Do you mean simply not checking for duplication at all?

Our logic for now is the following - it is OK to take NSE registration with already registered name if it has the same URL as the existing registry entry. Here is a problem with simple using pod name for the NSE registration without some additional generation in such case - in unix + restart case we will get the following:

1. Register NSE (pod-name, unix-socket-1)
2. Endpoint restarts
3. Register NSE (pod-name, unix-socket-2) <-- failure

Now there is a chain element on the Endpoint side that is responsible to generate a unique name for the NSE registration and a server chain element on the NSMgr, Registry sides that is responsible to give a feedback of uniqueness of the name:

-(name)-> setid -(uuid-name)-> checkid
* found duplication *
setid <-(aready exists error)- checkid
setid -(uuid2-name)-> checkid --> registry
<-(new uuid-name)- setid

So it solves the problem with the unix + restart case, but provides a new problem with the IP + restart case:

1. Register NSE (uuid-pod-name, ip)
2. Endpoint restart
3. Register NSE (uuid2-pod-name, ip) <-- not a failure in terms of registration but it is now
                                         a new NSE instead of refreshing the old one

And this PR should solve this problem with restoring back the same name in such case.

Not checking for duplication should solve both of these problems, but here can be an issue, when for some reason we have 2 Endpoints registering with the same name (probably from the different clusters, so it even can be the pod name) and it would totally break NSM work for both of these Endpoints.

What do you think of this?

@edwarnicke
Copy link
Member

@Bolodya1997 I understand the issue with NSE restart causing a different URL for the unix file socket, which means you can't match it.

I do not understand

-(name)-> setid -(uuid-name)-> checkid
* found duplication *
setid <-(aready exists error)- checkid
setid -(uuid2-name)-> checkid --> registry
<-(new uuid-name)- setid

Its not clear to me at all what's happening there, other than a uuid is coming from somewhere... and appears to be being prepended to the name?

@edwarnicke
Copy link
Member

@Bolodya1997 Is 'uuid' for you the Pod.metadata.uid ?

@Bolodya1997
Copy link
Author

@edwarnicke
Sorry for this raw scheme. We have such NSE registry chain:

setid -> ... -> checkid (NSMgr) -> ... -> checkid (Registry) -> ... -> memory
  1. Endpoint registers itself with some name name, in k8s case it would be pod-name.
    -(name)-> setid
  2. setid generates some random prefix, so uuid here is just some random string.
    setid -(uuid-name)-> ... -> checkid
  3. checkid checks if it already has such name registered with another URL, if so, it returns gRPC error containing already exists error code
    setid <-(aready exists error)- checkid
  4. setid receives this error, generates a new random prefix for the name and tries to register again.
    setid -(uuid2-name)-> checkid
  5. checkid doesn't find any duplication and so simply passes this registration further to the chain
    checkid -> ... -> memory
  6. So the final name returned back with the registration will be uuid2-name.

@edwarnicke edwarnicke closed this Dec 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants