Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix VM creation and update when sshAccess is disabled. #80

Merged

Conversation

AleksandarSavchev
Copy link
Member

@AleksandarSavchev AleksandarSavchev commented Feb 21, 2023

What this PR does / why we need it:
This PR fixes creation and update of azure VMs when sshAccess is disabled by adding a "fake" PublicKey.

Currently, creating an azure shoot with sshAccess disabled is not possible since the shoot would fail in reconciliation. It fails with the following cloud provider validation error: The value of parameter linuxConfiguration.ssh.publicKeys.keyData is invalid. since its value is "".

After removing SSH from LinuxConfiguration it starts failing with this cloud provider error: Authentication using either SSH or by user name and password must be enabled in Linux profile., which would suggest we need to provide either SSH or an adminPassword. We do not include the adminPassword and it is ill-advised to do so. The only option left would be to pass a "fake" PublicKey.

Which issue(s) this PR fixes:
Fixes #

Special notes for your reviewer:

Release note:

Fixed VM creation and update when `sshAccess` is disabled.

@AleksandarSavchev AleksandarSavchev requested review from a team as code owners February 21, 2023 13:44
@gardener-robot gardener-robot added needs/review Needs review size/s Size of pull request is small (see gardener-robot robot/bots/size.py) labels Feb 21, 2023
@gardener-robot-ci-2 gardener-robot-ci-2 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Feb 21, 2023
@gardener-robot-ci-1 gardener-robot-ci-1 added needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Feb 21, 2023
@himanshu-kun himanshu-kun self-assigned this Mar 17, 2023
@himanshu-kun
Copy link
Contributor

@AleksandarSavchev could you mention why an SSH access disablement is required ? Any gardener issue , slack thread ? We would like to know the context , as this will affect all our mcm-providers.
cc @ashwani2k @unmarshall

@AleksandarSavchev
Copy link
Member Author

@himanshu-kun disabling SSH access is needed to make our clusters compliant with DISA STIGs rules 242393 and 242394. You can find the backlog issue for these changes here. For more information you can check the PR that added SSHAccess field to shoot description: ref.

@unmarshall
Copy link
Contributor

As per MicrosoftDocs/azure-docs#14447 it should be possible to block SSH access by configuring network security group. Have you already tried this out?

@kon-angelo
Copy link
Contributor

@unmarshall there are 2 issues:
a) The links that @AleksandarSavchev posted show that the endgoal is to disable the sshd daemon of the nodes - not disable SSH access somehow e.g. with NSGs.
b) If when shoot.spec.sshAccess is disabled, gardener does not create any SSH public key, then the machineclass will be invalid anyway. I think this was the original problem, that the since there isn't a key anymore, machine creation fails anyway.

@himanshu-kun
Copy link
Contributor

himanshu-kun commented Apr 3, 2023

a) The links that @AleksandarSavchev posted show that the endgoal is to disable the sshd daemon of the nodes - not disable SSH access somehow e.g. with NSGs.

But then the current solution of passing a fake ssh key also doesn't solve the problem as the sshd is still enabled. @AleksandarSavchev could you confirm if the requirement is to disable ssh access and not a specific requirement of disable ssh daemon ?

@AleksandarSavchev
Copy link
Member Author

AleksandarSavchev commented Apr 3, 2023

When a shoot is configured with SSHAccess disabled a service called sshd-ensurer runs a script every 30 seconds which disables and stops the sshd service if enabled and/or running, it also kills any ssh connections. With SSHAccess disabled no public keys are created and the creation of bastions is disabled as well(ref: gardener/gardener#7188)

The problem of disabling ssh access has already been solved by the upper mentioned functionality. The problem in provider-azure is that we cannot create shoots with SSHAccess disabled, since there are no SSH public keys.

Copy link
Contributor

@himanshu-kun himanshu-kun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we already have a test for admin name

Entry("#15 Create machine without Admin Username in providerSpec",
&mock.AzureProviderSpecWithoutAdminUserName,
&driver.CreateMachineRequest{
Machine: newMachine("dummy-machine"),
MachineClass: newAzureMachineClass(mock.AzureProviderSpecWithoutAdminUserName),
Secret: newSecret(azureProviderSecret),
},
nil,
nil,
nil,
nil,
nil,
true,
fmt.Errorf(internalErrorPrefix, "properties.osProfile.adminUsername: Required value: AdminUsername is required").Error(),

Similarly , pls also add the following test case:

  • when providerSpec.Properties.OsProfile.LinuxConfiguration.SSH.PublicKeys.KeyData is not supplied , we get the following error

Failure sending request: StatusCode=400 -- Original Error: Code="InvalidParameter" Message="The value of parameter linuxConfiguration.ssh.publicKeys.keyData is invalid." Target="linuxConfiguration.ssh.publicKeys.keyData"

pkg/azure/utils.go Outdated Show resolved Hide resolved
pkg/azure/utils.go Outdated Show resolved Hide resolved
@gardener-robot gardener-robot added the needs/changes Needs (more) changes label Apr 11, 2023
@himanshu-kun
Copy link
Contributor

When a shoot is configured with SSHAccess disabled a service called sshd-ensurer runs a script every 30 seconds which disables and stops the sshd service if enabled and/or running, it also kills any ssh connections. With SSHAccess disabled no public keys are created and the creation of bastions is disabled as well(ref: gardener/gardener#7188)

The problem of disabling ssh access has already been solved by the upper mentioned functionality. The problem in provider-azure is that we cannot create shoots with SSHAccess disabled, since there are no SSH public keys.

thanks for the context and sorry for the delay. I tried this out myself ..
the machineClass looked like this when workersSettings.sshAccess.enabled: false

osProfile:
      adminUsername: core
      linuxConfiguration:
        disablePasswordAuthentication: true
        ssh:
          publicKeys:
            keyData: null
            path: /home/core/.ssh/authorized_keys

the error from provider was

Cloud provider message - machine codes error: code = [Internal] message = [compute.VirtualMachinesClient#CreateOrUpdate: Failure sending request: StatusCode=400 -- Original Error: Code="InvalidParameter" Message="The value of parameter linuxConfiguration.ssh.publicKeys.keyData is invalid." Target="linuxConfiguration.ssh.publicKeys.keyData"]

I have reviewed the PR, kindly address , then we'll merge it.

pkg/azure/utils.go Outdated Show resolved Hide resolved
@gardener-robot gardener-robot added size/m Size of pull request is medium (see gardener-robot robot/bots/size.py) and removed size/s Size of pull request is small (see gardener-robot robot/bots/size.py) labels Apr 13, 2023
@gardener-robot-ci-3 gardener-robot-ci-3 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Apr 13, 2023
@gardener-robot-ci-1 gardener-robot-ci-1 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Apr 13, 2023
@gardener-robot-ci-3 gardener-robot-ci-3 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Apr 13, 2023
Copy link
Contributor

@himanshu-kun himanshu-kun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
Thnx , this looks better!

@gardener-robot gardener-robot added the reviewed/lgtm Has approval for merging label Apr 13, 2023
@gardener-robot gardener-robot removed needs/changes Needs (more) changes needs/review Needs review labels Apr 13, 2023
@gardener-robot-ci-1 gardener-robot-ci-1 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Apr 13, 2023
@himanshu-kun himanshu-kun merged commit 7c79593 into gardener:master Apr 17, 2023
@gardener-robot gardener-robot added the status/closed Issue is closed (either delivered or triaged) label Apr 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) reviewed/lgtm Has approval for merging reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) size/m Size of pull request is medium (see gardener-robot robot/bots/size.py) status/closed Issue is closed (either delivered or triaged)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants