-
Notifications
You must be signed in to change notification settings - Fork 259
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Some computestorage.FormatWritableLayerVhd breakage in master vs 0.9.2? #1375
Comments
Okay, that's fun. I tried the obvious fix and now on Windows Server 2019, we get
and on Windows Server 2022 we get
i.e. both fail the way Windows Server 2022 did earlier, which means that something has changed elsewhere. The earlier On Windows Server 2019, this might be another behaviour change, that the hcshim API expects the VHD to not be attached now, but used to expect the VHD to be attached? Actually, looking closer, this is probably also related to the workaround lowering, since the containerd-level workaround didn't attach the VHD first, because the calling code already did. The comment on Oddly enough, I do note that the containerd version of the workaround explicitly avoided relying on being manifested correctly, so perhaps the issue was that (without my quick-fix) on Server 2019 the workaround was being applied twice instead of once, and on Windows Server 2022 irrespective of my quick-fix, the workaround is being applied once instead of nonce because the containerd unit tests are not manifested so |
@TBBle It's likely because containerd isn't manifested as you mentioned in the end. I believe we'll always enter the if statement as it'll return windows 8's build num. I think doing the same build number check as we do in the test might make sense for this package to avoid anyone using it to manifest, but maybe the right move is to just manifest containerd like we do for docker. Let me think about this |
On my last point, I also changed the containerd test to not attach the VHD before calling the format API, and now Windows Server 2019 passes, and Windows Server 2022 reports
which makes sense as Windows Server 2022 was expected a VHD Handle, but because this code relies on the manifested caller, and containerd's unit tests are not that, it got a Disk handle through the RS5 code-path. Also, for the record, we hit I'm mildly concerned about hcsshim now requiring its callers to be manifested when they aren't using There was also the recent discussion (I forget if it was here, containerd, or moby...) that perhaps libraries should use the registry to determine the Windows build rather than requiring callers to be manifested to return the right results. I feel like this is more user-friendly when the actual "os version check" interaction is not exposed to the user, as it's easier to not get wrong, particularly if your test coverage is not as wide as containerd's is. |
I'm not ecstatic about this either 😔 I was hoping we could find a way around or converge on changing osversion to not use GetVersion; I hear conflicting opinions internally on what's the safe way to version check. My ideal world is anything that has branching behavior based on version differences and is just a library we expose should grab a version number in a different manner, but that might not be the "right" way. |
@TBBle I still think the easiest route here is finding what the downsides are of swapping osversion to a non GetVersion approach, but we'll see. Glad this didn't slip into a released tag though thankfully |
Took me a while, but the discussion I was thinking of was containerd/containerd#6697 (comment), for which you already have context. ^_^ |
@TBBle Reached out to some peeps internally to try and close on this, waiting to hear back. Using the build number for feature detection is generally frowned upon but sometimes it's the easiest route.. so it'd be nice to not require manifesting your application because of leakage of the osversion pkg to others in this repo. |
#ugh... |
Isn't this fun 😌 |
@TBBle Well the internal vote has been cast and.. manifesting is still the only supported approach 😐 Looks like we'll have to manifest our friend Containerd it seems |
I opened containerd/containerd#6896 for the containerd side of this discussion. |
A passing thought: How would we feel about panicking if |
@TBBle Not a bad idea, I was almost thinking for cases like this to make it more flexible on our side for APIs that call it internally, we just have a constant for the build number that will be returned if it's not manifested. So for computestorage we can grab the build number and if it's the non-manifested number, we just take the handle as is and hope the caller handled things. wdyt? |
e.g. // Not the final naming 😆
if build := osversion.Build(); build == osversion.PleaseManifest {
// Don't do anything fancy if we were going to, up to the caller to handle
// anything needed on a different os version.
} else if build < osversion.V19H1 {
// Do stuff
} |
@jterry75 I'm curious on your thoughts on the above also. We've got some folks out this week but I'm thinking of just handling the computestorage situation for now in the meantime |
I kinda feel it's bordering on too much to worry about, but needing to do the dance for FormatWritableLayerVhd yourself because of an OS breaking change is also gross. I think ideally we should handle it for you |
Do any of the HCS API's that check for capabilities return the OS version? Could we proxy the call to the HCS basically, get the result, and then act in our code base. Do it in a |
@jterry75 We might be able to do something with grabbing the current supported schema version, but the internal docs for RS5 for what this returns are a bit odd. It looks like at one point it started off at a lower version and then got bumped (probably from a KB, or the docs are wrong lol). I also don't know if they ever plan to bump schema versions on past OS'. Trying to get an answer here |
Ok, horrid answer, what if we do a |
The osversion in general. It will all be in the same process, so if we fix it there we fix it everywhere. And honestly, Windows processes start so fast anyways.... we wont even notice the change |
I was attempting to rebase containerd/containerd#4419 onto #901 to verify it, and this involves upgrading hcsshim from v0.9.2 to master (+my PR).
I noticed a couple of unrelated test failures in containerd, and so I put together a branch that was just the hcsshim upgrade (specific commit in CI) and it produces the same test failures, which are different failures for the same test in Windows Server 2019 and Windows Server 2022.
The failing test is
pkg/os TestResolvePath
line 153, and on Windows Serve 2019 it reports:while on Windows Server 2022 it reports
My guess is that there's some conflict between the RS5 workaround for disk handle vs VHD handle in containerd and the same workaround in hcsshim master which wasn't present in v0.9.2.
That said, since neither workaround should be active on Windows Server 2022, perhaps there's something else going on.
I haven't tested it (I might quickly do so by removing the workaround from containerd), but perhaps the function should be renamed or have its type changed in some way to highlight to callers that the API behaviour has changed and now always expects the VHD Handle, never the disk handle?
The text was updated successfully, but these errors were encountered: