AzureFile NFS network settings may block terraform access once applied #890

Open
sgibson91 opened this issue Dec 10, 2021 · 11 comments

@sgibson91
Member

Description

sgibson91#94, together with #887, represents an effort to get NFS working on AzureFile storage and involved making some network changes in terraform so that the NFS share could be accessed and mounted by the k8s nodes.

While working on the Carbon Plan Azure cluster, I applied this new terraform config and then ran another terraform plan command, mostly to confirm to myself that the infrastructure was up-to-date. However, I ran into this error message:

│ Error: shares.Client#GetProperties: Failure responding to request: StatusCode=403 -- Original Error: autorest/azure: Service returned an error. Status=403 Code="AuthorizationFailure" Message="This request is not authorized to perform this operation.\nRequestId:8150832e-d01a-0012-63ad-edeedb000000\nTime:2021-12-10T10:02:58.0047296Z"
│ 
│   with azurerm_storage_share.homes,
│   on storage.tf line 21, in resource "azurerm_storage_share" "homes":
│   21: resource "azurerm_storage_share" "homes" {

I am now worried that by making the NFS accessible to k8s, we have locked ourselves out from managing the infrastructure via terraform.
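
For reference, the kind of network configuration involved looks roughly like the sketch below. This is a minimal illustration with placeholder names, not the actual config in storage.tf: once default_action is "Deny" and only the k8s node subnet is allowed through the storage account firewall, any request from outside that subnet, including terraform's management calls from a laptop, is rejected with the 403 above.

# Illustrative sketch only -- resource names and values are placeholders
resource "azurerm_storage_account" "homes" {
  name                     = "examplehubstorage"
  resource_group_name      = azurerm_resource_group.jupyterhub.name
  location                 = azurerm_resource_group.jupyterhub.location
  account_tier             = "Premium"
  account_kind             = "FileStorage"
  account_replication_type = "LRS"

  network_rules {
    # Deny all traffic by default...
    default_action = "Deny"
    # ...except traffic arriving from the k8s node subnet, so the nodes
    # can reach and mount the NFS share.
    virtual_network_subnet_ids = [azurerm_subnet.node_subnet.id]
    # Individual client IPs could be allowed here, but that list has to
    # be maintained by hand.
    # ip_rules = ["203.0.113.4"]
  }
}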

Value / benefit

We need to retain access via terraform to sustainably manage infrastructure.

Implementation details

No response

Tasks to complete

No response

Updates

No response

@yuvipanda
Member

This is probably hashicorp/terraform-provider-azurerm#2977. Looks like hashicorp/terraform-provider-azurerm#14220 is supposed to fix it.

In the meantime, can we 'ignore' that particular change somehow in terraform so we can move forward with other changes?

@sgibson91
Member Author

sgibson91 commented Jan 17, 2022

In the meantime, can we 'ignore' that particular change somehow in terraform so we can move forward with other changes?

This is probably my fault for excessively trimming the error message. This error crops up during the "refreshing state" phase of terraform plan; it hasn't even got to the point of calculating the change yet, because it can't check the current state of the file share. So there's nothing to 'ignore'.

@sgibson91
Member Author

Full error message:

$ tf plan -var-file=projects/carbonplan.tfvars -out=carbonplan -refresh-only
azurerm_resource_group.jupyterhub: Refreshing state... [id=/subscriptions/c5e7a734-3dbf-4285-80e5-4c0afb1f65dc/resourceGroups/2i2c-carbonplan-cluster]
azurerm_virtual_network.jupyterhub: Refreshing state... [id=/subscriptions/c5e7a734-3dbf-4285-80e5-4c0afb1f65dc/resourceGroups/2i2c-carbonplan-cluster/providers/Microsoft.Network/virtualNetworks/k8s-network]
azurerm_container_registry.container_registry: Refreshing state... [id=/subscriptions/c5e7a734-3dbf-4285-80e5-4c0afb1f65dc/resourceGroups/2i2c-carbonplan-cluster/providers/Microsoft.ContainerRegistry/registries/2i2ccarbonplanhubregistry]
azurerm_subnet.node_subnet: Refreshing state... [id=/subscriptions/c5e7a734-3dbf-4285-80e5-4c0afb1f65dc/resourceGroups/2i2c-carbonplan-cluster/providers/Microsoft.Network/virtualNetworks/k8s-network/subnets/k8s-nodes-subnet]
azurerm_storage_account.homes: Refreshing state... [id=/subscriptions/c5e7a734-3dbf-4285-80e5-4c0afb1f65dc/resourceGroups/2i2c-carbonplan-cluster/providers/Microsoft.Storage/storageAccounts/2i2ccarbonplanhubstorage]
azurerm_kubernetes_cluster.jupyterhub: Refreshing state... [id=/subscriptions/c5e7a734-3dbf-4285-80e5-4c0afb1f65dc/resourcegroups/2i2c-carbonplan-cluster/providers/Microsoft.ContainerService/managedClusters/hub-cluster]
azurerm_storage_share.homes: Refreshing state... [id=https://2i2ccarbonplanhubstorage.file.core.windows.net/homes]
azurerm_kubernetes_cluster_node_pool.user_pool["small"]: Refreshing state... [id=/subscriptions/c5e7a734-3dbf-4285-80e5-4c0afb1f65dc/resourcegroups/2i2c-carbonplan-cluster/providers/Microsoft.ContainerService/managedClusters/hub-cluster/agentPools/nbsmall]
azurerm_kubernetes_cluster_node_pool.dask_pool["small"]: Refreshing state... [id=/subscriptions/c5e7a734-3dbf-4285-80e5-4c0afb1f65dc/resourcegroups/2i2c-carbonplan-cluster/providers/Microsoft.ContainerService/managedClusters/hub-cluster/agentPools/dasksmall]
azurerm_kubernetes_cluster_node_pool.dask_pool["huge"]: Refreshing state... [id=/subscriptions/c5e7a734-3dbf-4285-80e5-4c0afb1f65dc/resourcegroups/2i2c-carbonplan-cluster/providers/Microsoft.ContainerService/managedClusters/hub-cluster/agentPools/daskhuge]
azurerm_kubernetes_cluster_node_pool.user_pool["vhuge"]: Refreshing state... [id=/subscriptions/c5e7a734-3dbf-4285-80e5-4c0afb1f65dc/resourcegroups/2i2c-carbonplan-cluster/providers/Microsoft.ContainerService/managedClusters/hub-cluster/agentPools/nbvhuge]
azurerm_kubernetes_cluster_node_pool.dask_pool["vvhuge"]: Refreshing state... [id=/subscriptions/c5e7a734-3dbf-4285-80e5-4c0afb1f65dc/resourcegroups/2i2c-carbonplan-cluster/providers/Microsoft.ContainerService/managedClusters/hub-cluster/agentPools/daskvvhuge]
azurerm_kubernetes_cluster_node_pool.dask_pool["medium"]: Refreshing state... [id=/subscriptions/c5e7a734-3dbf-4285-80e5-4c0afb1f65dc/resourcegroups/2i2c-carbonplan-cluster/providers/Microsoft.ContainerService/managedClusters/hub-cluster/agentPools/daskmedium]
azurerm_kubernetes_cluster_node_pool.user_pool["large"]: Refreshing state... [id=/subscriptions/c5e7a734-3dbf-4285-80e5-4c0afb1f65dc/resourcegroups/2i2c-carbonplan-cluster/providers/Microsoft.ContainerService/managedClusters/hub-cluster/agentPools/nblarge]
azurerm_kubernetes_cluster_node_pool.dask_pool["large"]: Refreshing state... [id=/subscriptions/c5e7a734-3dbf-4285-80e5-4c0afb1f65dc/resourcegroups/2i2c-carbonplan-cluster/providers/Microsoft.ContainerService/managedClusters/hub-cluster/agentPools/dasklarge]
azurerm_kubernetes_cluster_node_pool.dask_pool["vhuge"]: Refreshing state... [id=/subscriptions/c5e7a734-3dbf-4285-80e5-4c0afb1f65dc/resourcegroups/2i2c-carbonplan-cluster/providers/Microsoft.ContainerService/managedClusters/hub-cluster/agentPools/daskvhuge]
azurerm_kubernetes_cluster_node_pool.user_pool["vvhuge"]: Refreshing state... [id=/subscriptions/c5e7a734-3dbf-4285-80e5-4c0afb1f65dc/resourcegroups/2i2c-carbonplan-cluster/providers/Microsoft.ContainerService/managedClusters/hub-cluster/agentPools/nbvvhuge]
azurerm_kubernetes_cluster_node_pool.user_pool["medium"]: Refreshing state... [id=/subscriptions/c5e7a734-3dbf-4285-80e5-4c0afb1f65dc/resourcegroups/2i2c-carbonplan-cluster/providers/Microsoft.ContainerService/managedClusters/hub-cluster/agentPools/nbmedium]
azurerm_kubernetes_cluster_node_pool.user_pool["huge"]: Refreshing state... [id=/subscriptions/c5e7a734-3dbf-4285-80e5-4c0afb1f65dc/resourcegroups/2i2c-carbonplan-cluster/providers/Microsoft.ContainerService/managedClusters/hub-cluster/agentPools/nbhuge]
kubernetes_namespace.homes: Refreshing state... [id=azure-file]
kubernetes_secret.homes: Refreshing state... [id=azure-file/access-credentials]

│ Error: shares.Client#GetProperties: Failure responding to request: StatusCode=403 -- Original Error: autorest/azure: Service returned an error. Status=403 Code="AuthorizationFailure" Message="This request is not authorized to perform this operation.\nRequestId:a7c9006d-d01a-004f-5e8d-0be45f000000\nTime:2022-01-17T10:34:16.3962556Z"

│ with azurerm_storage_share.homes,
│ on storage.tf line 21, in resource "azurerm_storage_share" "homes":
│ 21: resource "azurerm_storage_share" "homes" {

@GeorgianaElena
Member

I just ran terraform plan on the toronto cluster and I can confirm the same behavior :(

@yuvipanda
Member

Ah, while terraform doesn't support excluding certain resources from runs (hashicorp/terraform#2253), you can pass -target to apply so it only looks at specific resources. Temporarily, as a way to unblock us, we can use that to explicitly list the cluster-related resources so we can ignore the AzureFile. hashicorp/terraform-provider-azurerm#14220 is the 'real' fix, but we needn't wait for that...

@sgibson91
Member Author

sgibson91 commented Jan 17, 2022

I can confirm that the following command worked (at least to give me access again, haven't attempted to make a change yet!)

$ tf plan -var-file=projects/carbonplan.tfvars -out=carbonplan -refresh-only -target=azurerm_kubernetes_cluster.jupyterhub -target=azurerm_kubernetes_cluster_node_pool.user_pool -target=azurerm_kubernetes_cluster_node_pool.dask_pool

@yuvipanda
Member

Note that this is still a problem, and the cause is hashicorp/terraform-provider-azurerm#2977

yuvipanda added a commit to yuvipanda/pilot-hubs that referenced this issue Jun 7, 2023
- Mark optional parts of node / dask node definition as optional,
  so utoronto.tfvars will actually apply
- Parameterize core node size, and specify it explicitly.
- Remove default for k8s version, specify it explicitly. This
  matches the current k8s version
- Parameterize storage size, and match it to current reality.
  Note that this can't be applied via tf quite yet, due
  to 2i2c-org#890.

Ref 2i2c-org#2594
@consideRatio
Contributor

consideRatio commented Jan 5, 2024

I've not yet understood the details here, but I did a terraform plan and ran into a 403 permissions error while terraform was inspecting the infra. Googling my way around, I concluded that I could temporarily add my computer's public IP to the storage account firewall, under the "Firewall" heading in the UI seen below, to avoid running into the 403.

[Screenshot: the storage account's networking settings in the Azure Portal, showing the "Firewall" section where an IP address range can be added]


I've now tested and concluded that both terraform plan and terraform apply worked after adding my own IP to the firewall.
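
For anyone who prefers the CLI over the portal, the same temporary change can presumably be made (and undone) with the Azure CLI. This is an untested sketch; the resource group and storage account names are placeholders, not values from this repository:

# Allow your current public IP through the storage account firewall
MY_IP=$(curl -s https://ifconfig.me)
az storage account network-rule add \
  --resource-group <resource-group> \
  --account-name <storage-account> \
  --ip-address "$MY_IP"

# ... run terraform plan / apply ...

# Remove the rule again once done, so the firewall isn't left open
az storage account network-rule remove \
  --resource-group <resource-group> \
  --account-name <storage-account> \
  --ip-address "$MY_IP"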

@consideRatio
Contributor

@yuvipanda do you think the proxycommand.py script you've created could be used to let terraform inspect things in the NFS here as well?

Looking at an NFS mount command provided, it says...

sudo mkdir -p /mount/2i2cutorontohubstorage/homes
sudo mount -t nfs 2i2cutorontohubstorage.file.core.windows.net:/2i2cutorontohubstorage/homes /mount/2i2cutorontohubstorage/homes -o vers=4,minorversion=1,sec=sys,nconnect=4

Do you think that, with a few commands, we could route traffic from our local computers to 2i2cutorontohubstorage.file.core.windows.net via a pod created by the proxycommand.py script?

@yuvipanda
Member

@consideRatio oh, yeah it could probably do that! It will need to be some sort of HTTP proxy (rather than an ssh one), which may be a fun project to build. The current setup probably won't work because it's just for ssh, which is in some ways easier.

I think very temporarily adding your own IP and then unadding it is easier for sure :D But must remember to unadd it though.
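
If someone does want to experiment with the proxy idea, a rough, untested sketch is below. It assumes an HTTP CONNECT proxy (e.g. tinyproxy) running in a pod inside the cluster, so that terraform's HTTPS calls to the storage account originate from the allowed node subnet; the image name, pod name, port, and tfvars path are all illustrative assumptions.

# Run a small HTTP proxy pod inside the cluster (tinyproxy listens on 8888)
kubectl run tf-proxy --image=<some-tinyproxy-image> --port=8888

# Forward the proxy port to the local machine
kubectl port-forward pod/tf-proxy 8888:8888 &

# Point terraform's Azure API calls at the in-cluster proxy via the
# standard proxy environment variable handling
HTTPS_PROXY=http://localhost:8888 \
  terraform plan -var-file=projects/<cluster>.tfvars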

@GeorgianaElena
Member

GeorgianaElena commented Jul 23, 2024

I think very temporarily adding your own IP and then unadding it is easier for sure :D But must remember to unadd it though.

I'm not sure what we can do to enforce this and make sure we don't forget to remove the IP.
I've just added my IP to the list as part of #890 and deleted the old entry that was still in the list.
