-
Notifications
You must be signed in to change notification settings - Fork 377
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Windows] Optimize the containerized OVS installation #6383
Conversation
@XinShuYang @antoninbas I have introduced a ps module "OVSHelper.psm1" in this change, which is used to provide general functions independent from the exact versions of OVS. So I place it in the container image directly, rather than put the functions in I am thinking how to co-work with the existing script |
} | ||
} else { | ||
Write-Host "OVS driver is not installed via the host-process containers, please install it manually." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When Test-Path $OVSHelperScript
fails, we can't determine if the driver has been installed on the host. I think please install it manually
might be misleading.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this what you want?
if(-not (Test-Path $OVSHelperScript)) {
$driverStatus = netcftg -q ovsext
if ($driverStatus -like "*not installed*") {
Write-Host "OVS driver is not installed on the Windows Node, please install it manually.
}
}
If there is no $OVSHelperScript provided in the image, we would ask the user to manually install the driver or update it but not operate from container.
Let's merge #6360 first, given that the mentorship program is coming to an end and I don't want to have @shikharish to resolve conflicts. |
Please add proper description on PR summary and commit. |
417e126
to
93a0b0e
Compare
ea32285
to
8e090cf
Compare
} | ||
$OVSInstallScript = "$mountPath\k\antrea\Install-OVS.ps1" | ||
if (Test-Path $OVSInstallScript) { | ||
& $OVSInstallScript -InstallUserspace $false -LocalFile $mountPath\openvswitch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case user want ot bring their own signed driver, User must be able to know if driver is already installed or not or failed to install , it might not be preferable to check log for failure later.
Instead Start-Job can also be used to schedule the script in background asynchronously, while capturing the status of the background script Install-OVS.ps1 installation using wait-job as and when it returns success or error ! and then just validate driver installation success or failure here as it may allow user to know if driver is installed ? and then they might want to uninstall it and use their own signed driver.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it , I can try with command "Start-Job" to run the script.
hack/windows/Install-OVS.ps1
Outdated
if ($ovsFile.BaseName -eq "openvswitch") { | ||
return $ovsFile.FullName | ||
} | ||
Log "Unsupported local directory: $LocalFile. Only directories with a base name 'openvswitch' are supported" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit surprising IMO. Why does the basename matter given that we will copy everything we need to OVSInstallDir
anyway?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I use base name "openvswitch" as a check here becauseI don't have a good idea on how to check if the local OVS directory is valid or not, and both our opensource OVS files (injected in the container image) and the commercial OVS files are using this name. So I hard-code it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you have changed it so that it no longer relies on the name? I think it makes more sense that way, because only the contents matter.
d1b3224
to
514cf85
Compare
/test-windows-all |
hack/windows/Install-OVS.ps1
Outdated
# Copy $OVSDriverPath to host path "c:/openvwitch/driver", and then install ovsext.inf with the host path. | ||
# This is a workaround for error code "0x80070003" with containerd v1.7+. The error happens when Windows | ||
# utitilty netcfg tries to access the container's mount path "C:/hpc/". | ||
$hostDriverPath="${OVSInstallDir}\driver" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rename it to driverStagingPath
. By the way I assume that we can actually delete ${OVSInstallDir}\driver
after running netcfg
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used to try to remove the ${OVSInstallDir}\driver
from the host, but I found that the ovsext driver was shown as "not installed" some time after the Pod is down. This is different from my expectation that OVS driver is supposed to be loaded after installed. So I keep the files on the disk in my final change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh that's interesting. I guess that is one more reason that it needs to be copied to a "host path", as it needs to be somewhat permanent.
hack/windows/Install-OVS.ps1
Outdated
# This is a workaround for error code "0x80070003" with containerd v1.7+. The error happens when Windows | ||
# utitilty netcfg tries to access the container's mount path "C:/hpc/". | ||
$hostDriverPath="${OVSInstallDir}\driver" | ||
if ($OVSDriverPath -ne $hostDriverPath){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should remove this test in all the copy functions.
I cannot think of any remaining legitimate case in which $OVSLocalPath
and $OVSInstallDir
would be the same? In order for this to happen, the user would have to invoke the script with LocalFile == OVSInstallDir
, which IMO does not need to be a supported case (we could exit with an error). Maybe I am missing something though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the use scenario, sometimes we may need to replace OVS driver on the Windows host for test purpose, e.g., adding logs in the ovsext driver to perform troubleshooting sessions, at this time we can directly replace the driver files in $hostDriverPath
, and then call the script.
This is not for production usage, so if you prefer we ignore it, I can remove the compare and this check in the start,
if (($LocalFile -ne "") -and ("$LocalFile" -eq "$OVSInstallDir")) {
Log "LocalFile and OVSInstallDir are the same, exiting"
exit 1
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. But for troubleshooting wouldn't you also need to somehow overwrite the OVS driver version, or CheckAndInstallOVSDriver
would return early?
If this is convenient for debugging and there is no simple alternative, then I think you should keep it, and maybe just log a warning when "$LocalFile" -eq "$OVSInstallDir"
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my last change, I remove the path compare on driver path and usr/bin as you suggested, and add a precheck at the start to reject the input if $LocalFile
equals to $OVSInstallDir
. For the local troubleshooting (not production env), I can call UnInstall-OVS.ps1 then call Install-OVS.ps1 as an alternative. Please help review the patch again. @antoninbas
hack/windows/Install-OVS.ps1
Outdated
$hostUsrBinPath = "C:\openvswitch\usr\bin" | ||
$usrBinPath="${OVSInstallPath}\usr\bin" | ||
if ("$usrBinPath" -ne "$hostUsrBinPath") { | ||
if (-not $global:updateOVS) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That doesn't really sound like an issue for an initContainer?
@@ -208,7 +208,7 @@ your own OVS package with a signed OVS kernel driver, you would run: | |||
|
|||
```powershell | |||
curl.exe -LO https://raw.githubusercontent.com/antrea-io/antrea/main/hack/windows/Install-OVS.ps1 | |||
.\Install-OVS.ps1 -ImportCertificate $false -Local -LocalFile <PathToOVSPackage> | |||
.\Install-OVS.ps1 -InstallUserspace $true -LocalFile <PathToOVSPackage> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since InstallUserspace
now is true by default, I think we don't need to specify it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, should we update the guidance of preparing containerized OVS in line 196-204?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to highlight "-InstallUserspace $true" with the context expecting to install Windows native Services.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to highlight "-InstallUserspace $true" with the context expecting to install Windows native Services.
Got it, I am ok with it.
Also, should we update the guidance of preparing containerized OVS in line 196-204?
@wenyingd Any follow-ups regarding this question?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated, remove the original descriptions and example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall LGTM, thanks for making all these changes
I know that as part of CI, we test the containerized case. Is there any validation as part of CI for the "service" case (.\Install-OVS.ps1 -InstallUserspace $true
run directly on the host), or alternatively can you confirm that you tested the most recent version on the script manually?
if ($InstallUserspace -eq $true) { | ||
$OVSBinPaths="${OVSBinPaths};${OVSLocalPath}\usr\sbin" | ||
} | ||
InstallOpenSSLFiles "$OVSBinPaths" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems that we copy the SSL files to $OVSBinPaths
("${OVSLocalPath}\usr\bin"
), and then we rely on CopyOVSUtilities
to copy them to the OVS install directory? Was that intentional? I guess it will work but shouldn't InstallOpenSSLFiles
copy the files to their "final" installation location directly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current solution is on desired that copy the ssl files to the LocalFile first, then copy the final files to the install dir. This is because the ssl dll files are required by both usr/bin and usr/sbin, and the two directories are copied in different step: usr/bin is copied in step CopyOVSUtilities
, and usr/sbin is copied in step InstallOVSServices
. If we plan to copy ssl dll files directly to the final path, the coy operations are distributed into different steps, and we also need to record the path where the dll files are extracted until the end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, sounds good
I have manually tested the script on my local testbed with both cases to install userspace services and not install services. |
@wenyingd For the OVS host service test, We have jenkins-windows-ovs in CI to verify related code changes. In the |
66be333
to
4647e64
Compare
2e86758
to
a5e9d15
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
/test-windows-all |
} | ||
Write-Host "Completed OVS installation" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, only have some comments about logs.
1. Add logic to check the installed OVSext drivers, if only the desired version of driver is already installed, skip the installation; otherwise, remove the existing the drivers and re-install. 2. Add logic to check the installed VC redistributable files, if the existing installd vc_redist version is equal to or higher than the lowest requirement, skip the installation; otherwise re-install with the provided files. 3. Optimize the logic of maintaining the env paths by removing the duplicated paths. 4. Optimize Uninstall-OVS script by removing OVS bin paths from the system path after it is deleted. Signed-off-by: Wenying Dong <wenyingd@vmware.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, since the latest change is only about log updates, we don't need to rerun windows jenkins jobs.
/test-windows-all |
of driver is already installed, skip the installation; otherwise, remove the
existing the drivers and re-install.
installd vc_redist version is equal to or higher than the lowest requirement,
skip the installation; otherwise re-install the files provided in the image.
path.
after it is deleted.