Describe the bug
TAPA has a service to announce application Targets with opened Stream(s) towards the nVIP cluster. The information is consumed by the LB component to provide a load balancing functionality towards available application Targets. Reliable operation of the load balancing feature requires data connectivity, which is supplied by NSM.
Currently, once the initial NSM connection request successfully connects the TAPA to a Proxy, it is considered safe to announce the application Target from a data connectivity point of view. However, the NSM connection between the TAPA and a Meridio Proxy can later run into problems that lead to traffic disturbance, including outage.
Examples of such problems are a restart or upgrade of NSM infrastructure components, or a restart or upgrade of the Meridio Proxy serving the application's TAPA sidecar. (Other infrastructure issues not related to POD availability also belong here.)
It should be investigated if the current behaviour could be improved.
Improvement idea:
NSM offers a monitoring feature that reports NSM connection state changes. This monitoring "tool" could be used to update consumers of Target announcements, such as the LB, so that Targets with possible connectivity problems are excluded from load balancing.
Also, if the NSM connection between TAPA and Proxy used datapath monitoring as part of NSM heal, other infrastructure-related issues causing datapath connectivity problems could be detected through NSM connection monitoring as well. (That is because NSM heal would first close the non-working connection as part of the heal procedure, which would trigger a monitoring event.)
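The idea above can be sketched roughly as follows. This is only an illustration in Python, not Meridio or NSM code: ConnState, TargetRegistry and ConnectionWatcher are hypothetical names, and the real NSM monitoring events and the real Target announcement API are not modelled here. It assumes that each monitoring event carries a connection identifier and a state, and that any state other than "up" means the Target should not be announced.

```python
from dataclasses import dataclass, field
from enum import Enum


class ConnState(Enum):
    """Hypothetical connection states; not the real NSM event types."""
    UP = "up"
    DOWN = "down"  # connection closed, e.g. as the first step of a heal


@dataclass
class TargetRegistry:
    """Hypothetical stand-in for the Target announcements consumed by the LB."""
    announced: set = field(default_factory=set)

    def announce(self, target: str) -> None:
        self.announced.add(target)

    def withdraw(self, target: str) -> None:
        # discard() is a no-op if the Target was never announced
        self.announced.discard(target)


@dataclass
class ConnectionWatcher:
    """Maps connection state changes to Target announcements."""
    registry: TargetRegistry
    # hypothetical mapping: connection id -> Target name
    targets_by_conn: dict = field(default_factory=dict)

    def on_event(self, conn_id: str, state: ConnState) -> None:
        target = self.targets_by_conn.get(conn_id)
        if target is None:
            return  # event for a connection this watcher does not track
        if state is ConnState.UP:
            self.registry.announce(target)
        else:
            self.registry.withdraw(target)


registry = TargetRegistry()
watcher = ConnectionWatcher(registry, {"conn-1": "target-a"})
watcher.on_event("conn-1", ConnState.UP)    # Target becomes eligible
watcher.on_event("conn-1", ConnState.DOWN)  # Target is withdrawn again
```

With this shape, the announcement follows the latest known connection state rather than only the result of the initial connection request, so a re-established connection after heal would re-announce the Target.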
To Reproduce
Steps to reproduce the behavior:
Deploy a working Trench (with Conduit, Attractor, Stream etc.). Deploy target-example that opens the Stream.
Delete the vpp-forwarder POD located on the same worker as a target-example POD. The application Target belonging to the affected target-example POD will remain available in NSP and thus in LBs.
Expected behavior
Targets with non-working TAPA->Proxy NSM connections should not be announced, so that LBs can exclude them from the pool of working Targets.
Context
Network Service Mesh: v1.12.0
Meridio: v1.0.16
...
Logs
Add logs here.