-
Notifications
You must be signed in to change notification settings - Fork 431
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for confidential VMs #3265
Add support for confidential VMs #3265
Conversation
f1afdb5
to
f0e5485
Compare
/test pull-cluster-api-provider-azure-e2e |
I started looking into this PR, will take sometime in going over the Azure Docs and the new fields being added. |
@nawazkh are you still reviewing this? |
Extremely sorry, It slipped my radar after I was half way done reviewing this PR. I will get started on it again! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all your hard work on this! I just had one comment and a general question: I think it'd be great to have an E2E test for this feature. This may be out of scope for this PR but this seems like a great feature that can be tested E2E.
@willie-yao Thank you for taking the time to review this change :) I agree that having an E2E test for this feature would be quite useful. In order to keep this PR at its current size, would it be an option to open an issue for the E2E test and then tackle it in a future PR? Or would you prefer to have the E2E test as part of this PR? (I would be happy to propose a change for the E2E test either way, I would just need a bit more time) |
Ideally, we would want to have an E2E test be a part of this PR in order to ensure the feature is fully working E2E. Since the next release is planned for May 2, it would be totally okay to take some time. The feature won't be available to users until then anyways. Would this be enough time for you to implement an E2E test? I apologize for the rush on this.. it's on us for not reviewing this PR sooner 😔 |
Absolutely no problem, I'll get started as soon as I can and ping you here when I have something up for review. Thanks again for your time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great effort @mresvanis , thanks for taking the time in putting this together!
Also, I apologize for sharing my review so late.
I would generally share all the review at once, but I am breaking it in chunks here to get the ball rolling. I plan on completing reviewing this PR in couple more hours.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Part 2/2 of the review.
I have skipped reviewing azuremachine_validation(.go and _test.go), azuremachine_validation_test.go since we are expecting for some refactor.
if storageProfile.OsDisk.ManagedDisk != nil && | ||
storageProfile.OsDisk.ManagedDisk.SecurityProfile != nil && | ||
storageProfile.OsDisk.ManagedDisk.SecurityProfile.SecurityEncryptionType != "" { | ||
if s.SecurityProfile.EncryptionAtHost != nil && *s.SecurityProfile.EncryptionAtHost { | ||
return nil, azure.WithTerminalError(errors.New("encryption at host is not supported when SecurityEncrytionType is set")) | ||
} | ||
|
||
securityProfile.UefiSettings = &compute.UefiSettings{} | ||
|
||
if s.SecurityProfile.SecureBoot != "" && s.SecurityProfile.SecureBoot == infrav1.SecureBootPolicyEnabled { | ||
securityProfile.UefiSettings.SecureBootEnabled = pointer.Bool(true) | ||
} else { | ||
securityProfile.UefiSettings.SecureBootEnabled = pointer.Bool(false) | ||
} | ||
|
||
if storageProfile.OsDisk.ManagedDisk.SecurityProfile.SecurityEncryptionType == compute.SecurityEncryptionTypesDiskWithVMGuestState && !*securityProfile.UefiSettings.SecureBootEnabled { | ||
return nil, azure.WithTerminalError(errors.Errorf("secure boot should be enabled when SecurityEncrytionType is set to %s", compute.SecurityEncryptionTypesDiskWithVMGuestState)) | ||
} | ||
|
||
if s.SecurityProfile.VirtualizedTrustedPlatformModule != "" && s.SecurityProfile.VirtualizedTrustedPlatformModule == infrav1.VirtualizedTrustedPlatformModulePolicyDisabled { | ||
return nil, azure.WithTerminalError(errors.New("vTPM should be enabled when SecurityEncrytionType is set")) | ||
} | ||
|
||
securityProfile.UefiSettings.VTpmEnabled = pointer.Bool(true) | ||
|
||
securityProfile.SecurityType = compute.SecurityTypesConfidentialVM | ||
|
||
return securityProfile, nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be better if we enforce these validations at webhook level so that there are lesser chances for a user to run into Terminal Error ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also think, if we were to refactor the securityProfile
as per earlier comments, then we would run into lesser (almost none) validations and have more of assignments at this stage of v1beta1
-> azure
conversion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be better if we enforce these validations at webhook level so that there are lesser chances for a user to run into Terminal Error ?
I absolutely agree, these validations are quite useful at the webhook level and we have them here. The latter is initially called through here. Does this makes sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, thanks!
@CecileRobertMichon do you know why we perform validations in generateSecurityProfile()
func?
I am trying to understand the reasoning behind generateSecurityProfile()
returning azure.WithTerminalError()
.
Can we not perform all the validations at webhook level ? Is it because we need s
of type *VMSpec
populated for these validations, and s
can only be populated later in the reconciliation and not at the webhook level?
if s.SecurityProfile.SecureBoot != "" { | ||
if s.SKU.HasCapability(resourceskus.TrustedLaunchDisabled) && s.SecurityProfile.SecureBoot == infrav1.SecureBootPolicyEnabled { | ||
return nil, azure.WithTerminalError(errors.Errorf("secure boot is not supported for VM type %s", s.Size)) | ||
} | ||
|
||
if s.SecurityProfile.SecureBoot == infrav1.SecureBootPolicyEnabled { | ||
securityProfile.UefiSettings.SecureBootEnabled = pointer.Bool(true) | ||
securityProfile.SecurityType = compute.SecurityTypesTrustedLaunch | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, I am not sure if s.SKU.HasCapability(...)
can be run at webhook level. If we cannot perform s.SKU.HasCapability(...)
validations at webhook, then we might have to leave some of these dependent validations here.
What do you think @CecileRobertMichon @willie-yao ?
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## main #3265 +/- ##
==========================================
- Coverage 53.33% 53.10% -0.24%
==========================================
Files 185 182 -3
Lines 18433 18367 -66
==========================================
- Hits 9832 9754 -78
Misses 8063 8063
- Partials 538 550 +12
☔ View full report in Codecov by Sentry. |
if profile != nil && securityEncryptionType != "" { | ||
if profile.EncryptionAtHost != nil && *profile.EncryptionAtHost && securityEncryptionType == SecurityEncryptionTypeDiskWithVMGuestState { | ||
allErrs = append(allErrs, field.Invalid(fieldPath.Child("EncryptionAtHost"), profile.EncryptionAtHost, | ||
"EncryptionAtHost cannot be set to 'true' when securityEncryptionType is set to DiskWithVMGuestState")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @anujmaheshwari1 I wonder if we do this correctly for AKS...
VirtualizedTrustedPlatformModulePolicyEnabled))) | ||
} | ||
|
||
if securityEncryptionType == SecurityEncryptionTypeDiskWithVMGuestState { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are you sure DiskWithVMGuestState is what you want? I've only used CVM with VMGuestStateOnly
either way, VMGuestStateOnly should be a valid setting for CVM, I think, and works with secure boot.
securityProfile.UefiSettings = &compute.UefiSettings{} | ||
|
||
if s.SecurityProfile.SecureBoot != "" { | ||
if s.SKU.HasCapability(resourceskus.TrustedLaunchDisabled) && s.SecurityProfile.SecureBoot == infrav1.SecureBootPolicyEnabled { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need a gen1/gen2 check here as well. I believe you only get TrustedLaunchDisabled from SKU API on gen2 sizes which do not support it, it's annoyingly inverted from most capabilities.
as-is I believe you allow a gen1 VM size with TL + secure boot which is invalid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could sanity check sku api to confirm --
az vm list-skus -r virtualMachines -s Something_Thats_only_gen1 (Standard_NC6?)
apologies for drive-by review but hope I added some useful comments -- happen to have worked a little on CVM/TL from AKS perspective so was curious :) |
some other random high level thoughts
|
2fc3e0c
to
6e04db1
Compare
This PR looks good to me! |
I still don't see any documentation in the PR, @mresvanis are you still working on adding it? |
@CecileRobertMichon yes, I was waiting for the image-builder PR to get merged first, but I will add the documentation today so that you have time to review it. |
8574389
to
353a109
Compare
@CecileRobertMichon I added documentation for both Confidential VMs and Trusted Launch for VMs, by adding 2 new topics. Please feel free to review. Thank you. |
I looked into the newly added docs and they look good to me! 🚀 /lgtm |
LGTM label has been added. Git tree hash: ab33867f4da255c6876eb53b839ff1231377fc59
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you so much for your work and your patience. This is great!
I only found some typos and had some minor doc comments, feel free to ignore those you disagree with.
353a109
to
3582327
Compare
3582327
to
cbaaaca
Compare
This change adds support for Confidential VMs and Trusted launch for VMs. Azure Confidential VMs are defined by their SecurityProfile.SecurityType ConfidentialVM, which should be defined along with the OSDisk.ManagedDisk.SecurityProfile.SecurityEncryptionType field. Trusted launch for VMs is defined by the SecurityProfile.SecurityType TrustedLaunch, which should be defined along with the SecurityProfile.UefiSettings section, i.e. the SecureBootEnabled and VTpmEnabled fields. Signed-off-by: Michail Resvanis <mresvani@redhat.com>
cbaaaca
to
720566f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
LGTM label has been added. Git tree hash: 4410040d6da82f0cc54a283e907b1a716c33f59d
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: CecileRobertMichon The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What type of PR is this?
/kind feature
What this PR does / why we need it:
This change adds support for Confidential VMs and Trusted launch for VMs.
Azure Confidential VMs are defined by their SecurityProfile.SecurityType
ConfidentialVM
, which should be defined along with the OSDisk.ManagedDisk.SecurityProfile.SecurityEncryptionType field. Trusted launch for VMs is defined by the SecurityProfile.SecurityTypeTrustedLaunch
, which should be defined along with the SecurityProfile.UefiSettings section, i.e. theSecureBootEnabled
andVTpmEnabled
fields.Related image-builder PR.
Which issue(s) this PR fixes:
Fixes #3264
Special notes for your reviewer:
Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.
TODOs:
Release note: