Add retry around wclayer operations for process isolated containers #1091
Should hopefully help #919
@msscotb I did, when trying to get the PrepareLayer issue to reproduce 😆 I got ERROR_DEVICE_NOT_CONNECTED.
@msscotb Any other feedback for this?
This change adds a simple retry loop to handle some behavior on RS5. Loopback VHDs used to be mounted in a different manner on RS5 (ws2019), which led to some very odd cases where things would succeed when they shouldn't have, or we'd simply time out if an operation took too long. Many parallel invocations of this code path and stressing the machine seem to bring out the issues, but not all of the possible failure paths that bring about the errors we have observed are known. On 19h1+ this retry loop shouldn't be needed, but the loop exits as soon as everything succeeds, so it is harmless and shouldn't need a version check.
Signed-off-by: Daniel Canter <dcanter@microsoft.com>
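For readers following along, the retry behavior described in the commit message can be sketched roughly as below. This is only an illustration, not the code from this PR: `retry`, `flakyOp`, the attempt count, and the delay are all hypothetical names and values chosen for the example. It shows the one property the description relies on, namely that the loop exits on the first success, so wrapping an operation that never fails costs nothing extra.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// retry is a hypothetical helper (not part of hcsshim). It calls op up to
// attempts times, sleeping delay between failed tries, and returns nil as
// soon as op succeeds. If every attempt fails it wraps the last error.
func retry(attempts int, delay time.Duration, op func() error) error {
	if attempts < 1 {
		attempts = 1 // always try at least once
	}
	var err error
	for i := 0; i < attempts; i++ {
		if err = op(); err == nil {
			return nil // success: leave the loop immediately
		}
		if i < attempts-1 {
			time.Sleep(delay) // back off before the next attempt
		}
	}
	return fmt.Errorf("operation failed after %d attempts: %w", attempts, err)
}

// flakyOp is a hypothetical stand-in for a layer operation that fails the
// first `failures` times it is called and succeeds afterwards. It also
// returns a pointer to its call counter so callers can inspect it.
func flakyOp(failures int) (func() error, *int) {
	calls := 0
	op := func() error {
		calls++
		if calls <= failures {
			return errors.New("transient failure")
		}
		return nil
	}
	return op, &calls
}

func main() {
	op, calls := flakyOp(2)
	err := retry(5, 10*time.Millisecond, op)
	fmt.Println(err, *calls) // prints "<nil> 3"
}
```

On a system where the operation succeeds the first time, `op` is called exactly once, which is why no version check is needed around a loop of this shape.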
}

defer func() {
	if err != nil {
Doesn't err need to be set to the PrepareLayer result for the deferred DeactivateLayer to execute?
Nope, if you have a named return value, e.g. (err error), then the return value of line 107 or the PrepareLayer call will get assigned to err after completion. So when the defer runs it will have the return value of PrepareLayer to check against.
Here's a quick example: https://play.golang.org/p/cID3RHPwl88
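In case the playground link goes stale, here is a minimal sketch of the same point. The names `mount`, `prepare`, and `errPrepare` are hypothetical stand-ins, not the functions touched by this PR. With a named result `(err error)`, a `return expr` statement assigns `expr` to `err` before deferred functions run, so the deferred closure sees the error that is about to be returned.

```go
package main

import (
	"errors"
	"fmt"
)

// errPrepare is a hypothetical error used only for this illustration.
var errPrepare = errors.New("prepare failed")

// prepare is a hypothetical stand-in for a call such as PrepareLayer.
// It simply returns whatever error it is given.
func prepare(e error) error { return e }

// mount is a hypothetical stand-in for the function under review. Because
// the result is named err, the value of the return statement is assigned to
// err before the deferred closure runs.
func mount(prepareErr error, cleanedUp *bool) (err error) {
	defer func() {
		// err already holds the result of prepare(prepareErr) here.
		if err != nil {
			*cleanedUp = true // stand-in for DeactivateLayer-style cleanup
		}
	}()
	return prepare(prepareErr)
}

func main() {
	var cleaned bool

	err := mount(errPrepare, &cleaned)
	fmt.Println(err, cleaned) // prints "prepare failed true"

	cleaned = false
	err = mount(nil, &cleaned)
	fmt.Println(err, cleaned) // prints "<nil> false"
}
```

Note that this only works with a named result. With an unnamed result and a separate local `err` variable, the deferred closure would not see the value produced by the return statement.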
lgtm
Related work items: microsoft#930, microsoft#962, microsoft#1004, microsoft#1008, microsoft#1039, microsoft#1045, microsoft#1046, microsoft#1047, microsoft#1052, microsoft#1053, microsoft#1054, microsoft#1057, microsoft#1058, microsoft#1060, microsoft#1061, microsoft#1063, microsoft#1064, microsoft#1068, microsoft#1069, microsoft#1070, microsoft#1071, microsoft#1074, microsoft#1078, microsoft#1079, microsoft#1081, microsoft#1082, microsoft#1083, microsoft#1084, microsoft#1088, microsoft#1090, microsoft#1091, microsoft#1093, microsoft#1094, microsoft#1096, microsoft#1098, microsoft#1099, microsoft#1102, microsoft#1103, microsoft#1105, microsoft#1106, microsoft#1108, microsoft#1109, microsoft#1115, microsoft#1116, microsoft#1122, microsoft#1123, microsoft#1126