Wait for 5 seconds when getting of_port number #830
Wait for 5 seconds instead of 1 second on Linux. We have observed on some production clusters that it sometimes takes more than one second for ovs-vswitchd to report the port number to OVSDB, although we are not yet sure why. Because the wait operation returns as soon as the port is available, this does not increase the execution time of CNI Add in the general case.
I am fine with this quick solution.
Have we observed how long it takes for a container to start after CNI ADD returns? If it takes long, one possibility is to make the OFPort read asynchronous so that it does not block CNI ADD. But if the container can start quickly, this could increase the risk of application failures right after startup.
@tnqn
Setting up the network is the last step of creating the sandbox container, and the next step is creating the init container (or a normal container), so a container could start very quickly after that. I feel the original 1 second might be too short to get a response, given that the worker Node could be overloaded. I see that OpenStack sets the vsctl timeout to 10 seconds by default and Ansible sets it to 5 seconds. I think we could add Prometheus metrics for the durations of CNI requests to understand how long they take.
/test-all
/test-windows-conformance
The Windows job seems stuck. I will merge this now but will make sure the test passes before releasing 0.7.2.