Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ioBundle->NetworkPortStatus matching tightening for IsPort() #4460

Closed

Conversation

andrewd-zededa
Copy link
Contributor

@andrewd-zededa andrewd-zededa commented Dec 4, 2024

Check all three identifiers: ifname, usbaddr, pciaddr
Some io types can have an empty string in one of these fields.
Expand matching conditions to ensure ioBundles are not incorrectly matched to a different NetworkPort.

While modifying the parameters to types.IsPort() I discovered an unused isPort() which referenced a lock that appears to be used nowhere else and removed both.

@github-actions github-actions bot requested a review from eriknordmark December 4, 2024 16:19
@andrewd-zededa andrewd-zededa changed the title Guard against empty ifname argument. ioBundle->NetworkPortStatus matching tightening for IsPort() Dec 9, 2024
Copy link
Contributor

@eriknordmark eriknordmark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but are there other functions which compare on ifname which should also compare on all three fields?

@andrewd-zededa
Copy link
Contributor Author

andrewd-zededa commented Dec 9, 2024

LGTM but are there other functions which compare on ifname which should also compare on all three fields?

@eriknordmark there are a series of other helper functions in types/dns.go (possibly elsewhere too): IsL3Port(), IsMgmtPort(), GetPortCost(), ...and all of these appear to only check Ifname for equivalence of the device. At the moment I'm not sure how many of these are relevant for device passthrough specifically.

They all follow the same pattern of

		if us.IfName != ifname {
			continue
		}

Maybe this should be split out to some form of a IsSameDevice() call (or some better name).

@andrewd-zededa andrewd-zededa marked this pull request as ready for review December 10, 2024 00:02
@eriknordmark
Copy link
Contributor

@eriknordmark there are a series of other helper functions in types/dns.go (possibly elsewhere too): IsL3Port(), IsMgmtPort(), GetPortCost(), ...and all of these appear to only check Ifname for equivalence of the device. At the moment I'm not sure how many of these are relevant for device passthrough specifically.

Maybe this should be split out to some form of a IsSameDevice() call (or some better name).

You'd actually need to create some Lookup function if you want to break out the comparison of the different identifiers.

Looking at the functions you listed:

  1. IsMgmtPort() is only called for ifnames which are in DeviceNetworkStatus.Ports. But based on Milan's comment we can have modems which do not have an ifname, so makes sense to add the args there.
  2. IsL3Port() is no longer used hence can be deleted.
  3. GetPortCost() has the same considerations as IsMgmtPort.

@eriknordmark
Copy link
Contributor

But I don't understand if we can have empty ifname arguments to these functions. There are other functions in dns.go which build a list of ifnames and then use that to get a list of IP addresses, and that will produce the wrong thing if a modem interface without an ifname is an actual Port which has an IP address. So I'm a bit confused - need to re-read Milan's earlier comment.

@andrewd-zededa
Copy link
Contributor Author

@eriknordmark Milan's comment references propagatePhyioAttrsToPort() which calls handleMissingIfname() so the NetworkPortConfig seems like it should be safe but somehow NetworkPortStatus had an empty field.

@andrewd-zededa
Copy link
Contributor Author

Latest pushes include USBProd plumbing through NetworkPortConfig/Status removal of unused IsL3Port() and spreading the expanded check of usbaddr, usbprod, pciaddr to IsMgmtPort() and GetPortCost()

@andrewd-zededa
Copy link
Contributor Author

@eriknordmark @milan-zededa when this is approved/merged I need to backport to one or more stable/lts branches.

@rene rene added the stable Should be backported to stable release(s) label Dec 11, 2024
@rene
Copy link
Contributor

rene commented Dec 11, 2024

@andrewd-zededa , please, update the commit message to add USBProd as well and mention why you have removed the isPort() function (not used anywhere)...

Copy link
Contributor

@rene rene left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM... kicking off tests...

@rucoder
Copy link
Contributor

rucoder commented Dec 11, 2024

LGTM... kicking off tests...

@rene still not clear whether @andrewd-zededa has tested on real device

@rene
Copy link
Contributor

rene commented Dec 11, 2024

LGTM... kicking off tests...

@rene still not clear whether @andrewd-zededa has tested on real device

ok, let's run the tests at least, he still needs to update the commit message, so when he pushes the final version we can review it again....

Check all four identifiers: ifname, usbaddr, usbprod, pciaddr
Some io types can have an empty string in one of these fields.
Expand matching conditions to ensure ioBundles are not incorrectly
matched to a different NetworkPort.

Removed unused code isPort() in domainmgr and IsL3Port().

Signed-off-by: Andrew Durbin <andrewd@zededa.com>
@andrewd-zededa
Copy link
Contributor Author

@rene updated commit message with your requests

Copy link
Contributor

@eriknordmark eriknordmark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if this is all we need - I'm concerned that we do not set ifname for all WWAN ports which would make them not be used in zedcloud/send.go. But hard to tell how much of an issue this is when we erroneously think COM ports should be used by TCP/IP. So it makes sense to merge this and keep on testing.

@andrewd-zededa
Copy link
Contributor Author

I'm reviewing the eden test failures and here are the first three failures all in the Log/Info/Metric lim test, unable to ssh. Not sure why yet

  1. smoke ext4 true
    FAIL: ../lim/testdata/log_test.txt:16: command failure
    [background] bash ssh.sh: signal: killed
    [stdout]
    time="2024-12-11T21:11:25Z" level=fatal msg="command ssh failed: exit status 255"
    Connection timed out during banner exchange
    Connection to 127.0.0.1 port 2222 timed out
  2. smoke ext4 false
    FAIL: ../lim/testdata/log_test.txt:16: command failure
    [background] bash ssh.sh: exit status 0
    [stdout]
    time="2024-12-11T21:11:30Z" level=fatal msg="command ssh failed: exit status 255"
    Connection timed out during banner exchange
    Connection to 127.0.0.1 port 2222 timed out
  3. smoke zfs true
    FAIL: ../lim/testdata/log_test.txt:16: command failure
    [background] bash ssh.sh: exit status 0
    [stdout]
    time="2024-12-11T21:11:01Z" level=fatal msg="command ssh failed: exit status 255"
    Connection timed out during banner exchange
    Connection to 127.0.0.1 port 2222 timed out

@OhmSpectator
Copy link
Member

@uncleDecart, do you have any ideas on why the test failed? =)

@uncleDecart
Copy link
Member

@uncleDecart, do you have any ideas on why the test failed? =)

@OhmSpectator, @andrewd-zededa does it run locally? Could be that EVE crashes and then you can't ssh

@eriknordmark
Copy link
Contributor

FWIW I kicked off ztests on old of the old test machines in the lab and that seems to be fine.

But I also did a qemu run on my laptop using
make rootfs installer-raw run-installer-raw run-target
and after that virtual device is onboarded to the controller it gets stuck.

tail -F /run/diag.out shows among other things
ERROR: App test-ztest-ubunu uuid 91a07a2f-2606-4a50-8067-84a9300e21b9 state DOWNLOADING error: Found error in content tree Ztest-xenial-amd64 attached to volume test-ztest-ubunu_0_m_1: No IP management port addresses with cost <= 0

cat /run/nim/DeviceNetworkStatus/* |jq shows an error
"LastError": "link not up for interface eth0 (down)",

And ip link has eth0 but not the actual hardware interface keth0 (eth0 is a bridge so we can attach e.g., switch network instances to eth0, while keth0 is the actual hardware device driver).

So something is seriously amiss here, even though I can update a running physical device to the image from this PR and it runs fine.

func IsL3Port(dns DeviceNetworkStatus, ifname string) bool {
for _, us := range dns.Ports {
if us.IfName != ifname {
if us.USBAddr != usbaddr {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether we need to treat empty strings as wildcards in some of these comparisons.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out that is at least required for PCIAddr in IsPort since the comparison is between the model aka SystemAdapter, which might have an empty PCIAddr and the discovered PCIAddr on the running system, which is not empty.

@@ -732,7 +751,7 @@ func GetPortCost(dns DeviceNetworkStatus, ifname string) uint8 {

func GetPort(dns DeviceNetworkStatus, ifname string) *NetworkPortStatus {
for _, us := range dns.Ports {
if us.IfName != ifname {
if ifname != "" && us.IfName != ifname {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have callers which pass an empty string as the ifname argument? If they do this will match on the first management port which might not be what we want.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getLocalAddrListImpl() only calls it with a non empty ifname.

Could VerifyAllIntf() pass an empty string?

@andrewd-zededa
Copy link
Contributor Author

andrewd-zededa commented Dec 12, 2024

@eriknordmark @uncleDecart @mikem-zed @milan-zededa @OhmSpectator @rene

Maybe a simpler and quicker to test fix would be to guard against the ib.Type == types.IoCom

In the scope of domainmgr.go where it calls types.IsPort() change it to

if types.IsPort(ctx.deviceNetworkStatus, ib.Ifname) && (ib.Type != types.IoCom){
    isPort = true
    keepInHost = true
}

Or something similar.

for _, us := range dns.Ports {
if us.IfName != ifname {
if ifname != "" && us.IfName != ifname {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change? It causes the complete breakage since a lookup of e.g. COM1 or audio0 (which do not have an ifname by definition) will match the first item in dns.Ports. If that first item is a network port like eth0, then IsPort will incorrectly return true.

@OhmSpectator
Copy link
Member

Do we still need this PR after merging #4482 ?

@eriknordmark
Copy link
Contributor

Superceeded by #4482

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stable Should be backported to stable release(s)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants