Multiple IP addresses appearing in NSE and NSC after upgrade #339

Closed
rpiceage opened this issue Nov 22, 2021 · 13 comments
@rpiceage

During an upgrade test with nse-icmp-responder and the kernel2kernel example, multiple interfaces with multiple IP addresses appear in the NSE. NSC pods also get multiple addresses.
Before upgrade, the NSE looks like:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000
    link/ipip 0.0.0.0 brd 0.0.0.0
4: eth0@if28559: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1430 qdisc noqueue state UP group default 
    link/ether f6:78:de:28:16:66 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 192.168.149.48/32 scope global eth0
       valid_lft forever preferred_lft forever
5: icmp-respo-37e0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UNKNOWN group default qlen 1000
    link/ether 02:fe:4f:e2:31:12 brd ff:ff:ff:ff:ff:ff
    inet 172.16.1.96/32 scope global icmp-respo-37e0
       valid_lft forever preferred_lft forever
    inet6 fe80::fe:4fff:fee2:3112/64 scope link 
       valid_lft forever preferred_lft forever

After upgrade:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000
    link/ipip 0.0.0.0 brd 0.0.0.0
4: eth0@if28559: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1430 qdisc noqueue state UP group default 
    link/ether f6:78:de:28:16:66 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 192.168.149.48/32 scope global eth0
       valid_lft forever preferred_lft forever
6: icmp-respo-fb43: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UNKNOWN group default qlen 1000
    link/ether 02:fe:60:83:30:c0 brd ff:ff:ff:ff:ff:ff
    inet 172.16.1.96/32 scope global icmp-respo-fb43
       valid_lft forever preferred_lft forever
    inet 172.16.1.98/32 scope global icmp-respo-fb43
       valid_lft forever preferred_lft forever
    inet6 fe80::fe:60ff:fe83:30c0/64 scope link 
       valid_lft forever preferred_lft forever
7: icmp-respo-6de2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UNKNOWN group default qlen 1000
    link/ether 02:fe:74:98:66:90 brd ff:ff:ff:ff:ff:ff
    inet 172.16.1.96/32 scope global icmp-respo-6de2
       valid_lft forever preferred_lft forever
    inet 172.16.1.98/32 scope global icmp-respo-6de2
       valid_lft forever preferred_lft forever
    inet 172.16.1.100/32 scope global icmp-respo-6de2
       valid_lft forever preferred_lft forever
    inet6 fe80::fe:74ff:fe98:6690/64 scope link 
       valid_lft forever preferred_lft forever

NSC before upgrade:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000
    link/ipip 0.0.0.0 brd 0.0.0.0
4: eth0@if28558: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1430 qdisc noqueue state UP group default 
    link/ether aa:b1:79:d5:54:8c brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 192.168.149.18/32 scope global eth0
       valid_lft forever preferred_lft forever
5: nsm-1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UNKNOWN group default qlen 1000
    link/ether 02:fe:f8:1e:a1:43 brd ff:ff:ff:ff:ff:ff
    inet 172.16.1.97/32 scope global nsm-1
       valid_lft forever preferred_lft forever
    inet6 fe80::fe:f8ff:fe1e:a143/64 scope link tentative 
       valid_lft forever preferred_lft forever

NSC after upgrade:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000
    link/ipip 0.0.0.0 brd 0.0.0.0
4: eth0@if28558: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1430 qdisc noqueue state UP group default 
    link/ether aa:b1:79:d5:54:8c brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 192.168.149.18/32 scope global eth0
       valid_lft forever preferred_lft forever
6: nsm-1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UNKNOWN group default qlen 1000
    link/ether 02:fe:10:f1:00:35 brd ff:ff:ff:ff:ff:ff
    inet 172.16.1.97/32 scope global nsm-1
       valid_lft forever preferred_lft forever
    inet 172.16.1.99/32 scope global nsm-1
       valid_lft forever preferred_lft forever
    inet 172.16.1.101/32 scope global nsm-1
       valid_lft forever preferred_lft forever
    inet6 fe80::fe:afff:fe2e:dad6/64 scope link 
       valid_lft forever preferred_lft forever

Traffic was OK during the upgrade using the original IP (it stopped during the pod restart, but came back afterwards as it should). After the upgrade, however, the multiple addresses confused our tests, which is how the issue was detected.
The issue is reproducible. Sometimes only one new IP appears, sometimes two, as in the example.
I attached logs for the test case execution.
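
A minimal sketch of how a test could flag this condition, assuming the test can capture the text output of `ip addr` from the pod. The function names and the regex-based parsing are illustrative assumptions and are not taken from NSM or its test suite. The sample input is an excerpt of the "NSC after upgrade" output above.

```python
import re
from typing import Dict, List

# Interface header lines, e.g. "6: nsm-1: <BROADCAST,...>" or "4: eth0@if28558: <...>"
IFACE_RE = re.compile(r"^\d+:\s+([^:@\s]+)(?:@[^:\s]+)?:")
# IPv4 address lines, e.g. "    inet 172.16.1.97/32 scope global nsm-1"
# ("inet6" lines do not match because whitespace must follow "inet")
INET_RE = re.compile(r"^\s+inet\s+(\d+\.\d+\.\d+\.\d+)/\d+")


def ipv4_by_interface(ip_addr_output: str) -> Dict[str, List[str]]:
    """Map each interface name to the IPv4 addresses listed under it."""
    addrs: Dict[str, List[str]] = {}
    current = None
    for line in ip_addr_output.splitlines():
        header = IFACE_RE.match(line)
        if header:
            current = header.group(1)
            addrs.setdefault(current, [])
            continue
        inet = INET_RE.match(line)
        if inet and current is not None:
            addrs[current].append(inet.group(1))
    return addrs


def interfaces_with_extra_addresses(ip_addr_output: str) -> Dict[str, List[str]]:
    """Return only the interfaces that carry more than one IPv4 address."""
    return {
        name: ips
        for name, ips in ipv4_by_interface(ip_addr_output).items()
        if len(ips) > 1
    }


# Excerpt of the "NSC after upgrade" output from this report.
SAMPLE = """\
4: eth0@if28558: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1430 qdisc noqueue state UP group default
    inet 192.168.149.18/32 scope global eth0
6: nsm-1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UNKNOWN group default qlen 1000
    inet 172.16.1.97/32 scope global nsm-1
    inet 172.16.1.99/32 scope global nsm-1
    inet 172.16.1.101/32 scope global nsm-1
    inet6 fe80::fe:afff:fe2e:dad6/64 scope link
"""

if __name__ == "__main__":
    print(interfaces_with_extra_addresses(SAMPLE))
```

With the excerpt above, only `nsm-1` is reported, since `eth0` has a single IPv4 address.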

Versions used:
nsmgr=f2f421a
registry-memory=4b0bb64
forwarder-vpp=4e1d713
nse-icmp-responder-vpp=9b4b3fa
nsc-vpp=4c53be1

As the upgrade base, we used versions from November 19th:
nsmgr=47451a4
registry-memory=f24d424
forwarder-vpp=1b74b40
nse-icmp-responder-vpp=5ac470e
nsc-vpp=888fbd4

endpoint-nsc-684f87c977-2npct-nsc.txt
endpoint-nse-588cc4f7bb-xrnts-nse.txt
forwarder-vpp-jv7xh-forwarder-vpp.txt
nsmgr-v7spq-nsmgr.txt
nsm-registry-5c6dc57b69-hd7wz-registry-memory.txt

@denis-tingaikin
Member

@edwarnicke Could you provide priority for this one?

@denis-tingaikin
Member

@rpiceage Hmm. Am I understanding correctly that you upgraded nsmgr/forwarder?

@rpiceage
Author

Yes, nsmgr, forwarder and memory-registry, and also Spire. NSE and NSC run traffic during the operation.

@denis-tingaikin
Member

@rpiceage

Got it.

So the current behaviour looks valid to me. The old interface from the previous connection (before the upgrade) should be deleted by the NSE timeout.

@denis-tingaikin
Member

denis-tingaikin commented Nov 22, 2021

@glazychev-art Could you check whether this version of deployments includes our fix for the healing path?

@rpiceage
Author

> So the current behaviour looks valid to me. The old interface from the previous connection (before the upgrade) should be deleted by the NSE timeout.

Do you mean the deleted icmp-respo-37e0 in the NSE?
But I would expect a single new interface instead of two (icmp-respo-fb43 and icmp-respo-6de2) with multiple colliding addresses.
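
To make "colliding" concrete, here is a small sketch that inverts an interface-to-address mapping and lists every address assigned to more than one interface. The function name and the dictionary layout are assumptions made for illustration. The data is copied from the "After upgrade" NSE output in the report.

```python
from collections import defaultdict
from typing import Dict, List


def colliding_addresses(addrs_by_iface: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return each address that is assigned to more than one interface."""
    owners: Dict[str, List[str]] = defaultdict(list)
    for iface, addrs in addrs_by_iface.items():
        for addr in addrs:
            owners[addr].append(iface)
    return {addr: ifaces for addr, ifaces in owners.items() if len(ifaces) > 1}


# IPv4 addresses per interface, taken from the NSE "After upgrade" output.
NSE_AFTER_UPGRADE = {
    "eth0": ["192.168.149.48"],
    "icmp-respo-fb43": ["172.16.1.96", "172.16.1.98"],
    "icmp-respo-6de2": ["172.16.1.96", "172.16.1.98", "172.16.1.100"],
}

if __name__ == "__main__":
    for addr, ifaces in colliding_addresses(NSE_AFTER_UPGRADE).items():
        print(addr, "is assigned to", ", ".join(ifaces))
```

For this data, 172.16.1.96 and 172.16.1.98 are each present on both new interfaces, while 172.16.1.100 appears only on icmp-respo-6de2.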

@denis-tingaikin
Member

denis-tingaikin commented Nov 22, 2021

> Do you mean the deleted icmp-respo-37e0 in the NSE?
> But I would expect a single new interface instead of two (icmp-respo-fb43 and icmp-respo-6de2) with multiple colliding addresses.

OK, I looked at it too quickly. You're correct, it looks unhealthy.

@glazychev-art
Contributor

@denis-tingaikin
No, this version doesn't include the fix for the healing path.

@denis-tingaikin
Member

denis-tingaikin commented Nov 22, 2021

@rpiceage Could you please retest it with the latest version of NSM? We added a major fix for healing recently.

@denis-tingaikin
Member

The latest version at the moment is networkservicemesh/deployments-k8s@445fe16

denis-tingaikin moved this to Potentially fixed in Release 1.1.0 on Nov 22, 2021
@rpiceage
Author

We build the images from source. We used the versions listed above for the build, and I was under the impression they were quite new (Friday for the base, and yesterday for the one to upgrade to). Is the healing fix even more recent than that?

rpiceage changed the title from "Multiple IP addresses appearin in NSE and NSC after upgrade" to "Multiple IP addresses appearing in NSE and NSC after upgrade" on Nov 22, 2021
@rpiceage
Author

OK, tested with versions from today. It's OK. We can close this.

Repository owner moved this from Potentially fixed to Done in Release 1.1.0 on Nov 22, 2021
@edwarnicke
Member

@rpiceage Woot!
