Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Azure Availability Zones #586

Closed
feiskyer opened this issue Jul 11, 2018 · 36 comments
Closed

Azure Availability Zones #586

feiskyer opened this issue Jul 11, 2018 · 36 comments
Assignees
Labels
area/provider/azure Issues or PRs related to azure provider kind/feature Categorizes issue or PR as related to a new feature. sig/storage Categorizes an issue or PR as relevant to SIG Storage. stage/stable Denotes an issue tracking an enhancement targeted for Stable/GA status
Milestone

Comments

@feiskyer
Copy link
Member

feiskyer commented Jul 11, 2018

Feature Description

  • One-line feature description (can be used as a release note): Add support of Azure Availability Zones.
  • Primary contact (assignee): @feiskyer
  • Responsible SIGs: sig-azure
  • Design proposal link (community repo): KEP for Azure Availability Zones community#2364
  • KEP: 20180711-azure-availability-zones
  • Link to e2e and/or unit tests:
  • Reviewer(s) - (for LGTM) recommend having 2+ reviewers (at least one from code-area OWNERS file) agreed to review. Reviewers from multiple companies preferred: @khenidak
  • Approver (likely from SIG/area to which feature belongs): @brendandburns
  • Feature target (which target equals to which milestone):
    • Alpha release target (1.12)
    • Beta release target (1.13)
    • Stable release target (1.15)
@feiskyer
Copy link
Member Author

feiskyer commented Jul 11, 2018

/sig azure
/sig storage

@k8s-ci-robot k8s-ci-robot added area/provider/azure Issues or PRs related to azure provider sig/storage Categorizes an issue or PR as relevant to SIG Storage. labels Jul 11, 2018
@feiskyer
Copy link
Member Author

/milestone v1.12

@justaugustus
Copy link
Member

@feiskyer --

It looks like this feature is currently in the Kubernetes 1.12 Milestone.

If that is still accurate, please ensure that this issue is up-to-date with ALL of the following information:

  • One-line feature description (can be used as a release note):
  • Primary contact (assignee):
  • Responsible SIGs:
  • Design proposal link (community repo):
  • Link to e2e and/or unit tests:
  • Reviewer(s) - (for LGTM) recommend having 2+ reviewers (at least one from code-area OWNERS file) agreed to review. Reviewers from multiple companies preferred:
  • Approver (likely from SIG/area to which feature belongs):
  • Feature target (which target equals to which milestone):
    • Alpha release target (x.y)
    • Beta release target (x.y)
    • Stable release target (x.y)

Set the following:

  • Description
  • Assignee(s)
  • Labels:
    • stage/{alpha,beta,stable}
    • sig/*
    • kind/feature

Once this feature is appropriately updated, please explicitly ping @justaugustus, @kacole2, @robertsandoval, @rajendar38 to note that it is ready to be included in the Features Tracking Spreadsheet for Kubernetes 1.12.


Please note that the Features Freeze is July 31st, after which any incomplete Feature issues will require an Exception request to be accepted into the milestone.

In addition, please be aware of the following relevant deadlines:

  • Docs deadline (open placeholder PRs): 8/21
  • Test case freeze: 8/28

Please make sure all PRs for features have relevant release notes included as well.

Happy shipping!

/kind feature
/stage alpha

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. stage/alpha Denotes an issue tracking an enhancement targeted for Alpha status labels Jul 18, 2018
k8s-github-robot pushed a commit to kubernetes/kubernetes that referenced this issue Jul 20, 2018
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add initial availability zones support for Azure nodes

**What this PR does / why we need it**:

The first part of [Azure Availability Zone feature](kubernetes/enhancements#586).

This PR adds initial availability zone (AZ) support for Azure nodes. With this PR, Azure nodes with AZ will have label `failure-domain.beta.kubernetes.io/zone=<region>-<zoneID>`, e.g. `southeastasia-1`.

It also updates instance metadata api-version to 2017-12-01, which is required for AZ.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

VirtualMachineScaleSetVM doesn't have AZ info yet. It will be supported later after new Azure Go SDK releases.

**Release note**:

```release-note
Azure nodes with availability zone now will have label `failure-domain.beta.kubernetes.io/zone=<region>-<zoneID>`.
```

/kind feature
/sig azure

/assign @brendandburns @khenidak @andyzhangx
@justaugustus justaugustus added the tracked/yes Denotes an enhancement issue is actively being tracked by the Release Team label Jul 29, 2018
k8s-github-robot pushed a commit to kubernetes/kubernetes that referenced this issue Aug 6, 2018
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add availability zones support to Azure managed disks

**What this PR does / why we need it**:

Continue of [Azure Availability Zone feature](kubernetes/enhancements#586).

This PR adds availability zone support for Azure managed disks and its storage class. Zoned managed disks is enabled by default if there are zoned nodes in the cluster.

The zone could also be customized by `zone` or `zones` parameter, e.g.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
  name: managed-disk-zone-1
parameters:
  zone: "southeastasia-1"
  # zones: "southeastasia-1,"southeastasia-2"
  cachingmode: None
  kind: Managed
  storageaccounttype: Standard_LRS
provisioner: kubernetes.io/azure-disk
reclaimPolicy: Delete
volumeBindingMode: Immediate
```

All zoned AzureDisk PV will also be labeled with its availability zone, e.g.

```sh
$ kubectl get pvc pvc-azuredisk-az-1
NAME                 STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS          AGE
pvc-azuredisk-az-1   Bound     pvc-5ad0c7b8-8f0b-11e8-94f2-000d3a07de8c   5Gi        RWO            managed-disk-zone-1   2h

$ kubectl get pv pvc-5ad0c7b8-8f0b-11e8-94f2-000d3a07de8c --show-labels
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS    CLAIM                        STORAGECLASS          REASON    AGE       LABELS
pvc-5ad0c7b8-8f0b-11e8-94f2-000d3a07de8c   5Gi        RWO            Delete           Bound     default/pvc-azuredisk-az-1   managed-disk-zone-1             2h        failure-domain.beta.kubernetes.io/region=southeastasia,failure-domain.beta.kubernetes.io/zone=southeastasia-1
```

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

See also the [KEP](kubernetes/community#2364).

DynamicProvisioningScheduling feature would be added in a following PR.

**Release note**:

```release-note
Azure managed disks now support availability zones and new parameters `zoned`, `zone` and `zones` are added for AzureDisk storage class.
```

/kind feature
/sig azure
/assign @brendandburns @khenidak @andyzhangx
k8s-github-robot pushed a commit to kubernetes/kubernetes that referenced this issue Aug 8, 2018
Automatic merge from submit-queue (batch tested with PRs 67061, 66589, 67121, 67149). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add DynamicProvisioningScheduling and VolumeScheduling support for Azure managed disks

**What this PR does / why we need it**:

Continue of [Azure Availability Zone feature](kubernetes/enhancements#586).

This PR adds `VolumeScheduling` and `DynamicProvisioningScheduling` support to Azure managed disks.

When feature gate `VolumeScheduling` disabled, no NodeAffinity set for PV:

```yaml
kubectl describe pv
Name:              pvc-d30dad05-9ad8-11e8-94f2-000d3a07de8c
Labels:            failure-domain.beta.kubernetes.io/region=southeastasia
                   failure-domain.beta.kubernetes.io/zone=southeastasia-2
Annotations:       pv.kubernetes.io/bound-by-controller=yes
                   pv.kubernetes.io/provisioned-by=kubernetes.io/azure-disk
                   volumehelper.VolumeDynamicallyCreatedByKey=azure-disk-dynamic-provisioner
Finalizers:        [kubernetes.io/pv-protection]
StorageClass:      default
Status:            Bound
Claim:             default/pvc-azuredisk
Reclaim Policy:    Delete
Access Modes:      RWO
Capacity:          5Gi
Node Affinity:
  Required Terms:
    Term 0:        failure-domain.beta.kubernetes.io/region in [southeastasia]
                   failure-domain.beta.kubernetes.io/zone in [southeastasia-2]
Message:
Source:
    Type:         AzureDisk (an Azure Data Disk mount on the host and bind mount to the pod)
    DiskName:     k8s-5b3d7b8f-dynamic-pvc-d30dad05-9ad8-11e8-94f2-000d3a07de8c
    DiskURI:      /subscriptions/<subscription-id>/resourceGroups/<rg-name>/providers/Microsoft.Compute/disks/k8s-5b3d7b8f-dynamic-pvc-d30dad05-9ad8-11e8-94f2-000d3a07de8c
    Kind:         Managed
    FSType:
    CachingMode:  None
    ReadOnly:     false
Events:           <none>
```

When feature gate `VolumeScheduling` enabled, NodeAffinity will be populated for PV:

```yaml
kubectl describe pv
Name:              pvc-0284337b-9ada-11e8-a7f6-000d3a07de8c
Labels:            failure-domain.beta.kubernetes.io/region=southeastasia
                   failure-domain.beta.kubernetes.io/zone=southeastasia-2
Annotations:       pv.kubernetes.io/bound-by-controller=yes
                   pv.kubernetes.io/provisioned-by=kubernetes.io/azure-disk
                   volumehelper.VolumeDynamicallyCreatedByKey=azure-disk-dynamic-provisioner
Finalizers:        [kubernetes.io/pv-protection]
StorageClass:      default
Status:            Bound
Claim:             default/pvc-azuredisk
Reclaim Policy:    Delete
Access Modes:      RWO
Capacity:          5Gi
Node Affinity:
  Required Terms:
    Term 0:        failure-domain.beta.kubernetes.io/region in [southeastasia]
                   failure-domain.beta.kubernetes.io/zone in [southeastasia-2]
Message:
Source:
    Type:         AzureDisk (an Azure Data Disk mount on the host and bind mount to the pod)
    DiskName:     k8s-5b3d7b8f-dynamic-pvc-0284337b-9ada-11e8-a7f6-000d3a07de8c
    DiskURI:      /subscriptions/<subscription-id>/resourceGroups/<rg-name>/providers/Microsoft.Compute/disks/k8s-5b3d7b8f-dynamic-pvc-0284337b-9ada-11e8-a7f6-000d3a07de8c
    Kind:         Managed
    FSType:
    CachingMode:  None
    ReadOnly:     false
Events:           <none>
```

When both  `VolumeScheduling` and `DynamicProvisioningScheduling` are enabled, storage class also supports `allowedTopologies` and `volumeBindingMode: WaitForFirstConsumer` for volume topology aware dynamic provisioning:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
  name: managed-disk-dynamic
parameters:
  cachingmode: None
  kind: Managed
  storageaccounttype: Standard_LRS
provisioner: kubernetes.io/azure-disk
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowedTopologies:
- matchLabelExpressions:
  - key: failure-domain.beta.kubernetes.io/zone
    values:
    - southeastasia-2
    - southeastasia-1
```

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
DynamicProvisioningScheduling and VolumeScheduling is not supported for Azure managed disks. Feature gates DynamicProvisioningScheduling and VolumeScheduling should be enabled before using this feature.
```

/kind feature
/sig azure
/cc @brendandburns @khenidak @andyzhangx
/cc @ddebroy @msau42 @justaugustus
hh pushed a commit to ii/kubernetes that referenced this issue Aug 15, 2018
Automatic merge from submit-queue (batch tested with PRs 66884, 67410, 67229, 67409). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add node affinity for Azure unzoned managed disks

**What this PR does / why we need it**:

Continue of [Azure Availability Zone feature](kubernetes/enhancements#586).

Add node affinity for Azure unzoned managed disks, so that unzoned disks only scheduled to unzoned nodes.

This is required because Azure doesn't allow attaching unzoned disks to zoned VMs.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

Unzoned nodes would label `failure-domain.beta.kubernetes.io/zone=0` and the value is fault domain ( while availability zone is used for zoned nodes). So fault domain is used to populate unzoned disks.

Since there are at most 3 fault domains in each region, the PR adds 3 terms for them:

```yaml
kubectl describe pv pvc-bdf93a67-9c45-11e8-ba6f-000d3a07de8c
Name:              pvc-bdf93a67-9c45-11e8-ba6f-000d3a07de8c
Labels:            <none>
Annotations:       pv.kubernetes.io/bound-by-controller=yes
                   pv.kubernetes.io/provisioned-by=kubernetes.io/azure-disk
                   volumehelper.VolumeDynamicallyCreatedByKey=azure-disk-dynamic-provisioner
Finalizers:        [kubernetes.io/pv-protection]
StorageClass:      azuredisk-unzoned
Status:            Bound
Claim:             default/unzoned-pvc
Reclaim Policy:    Delete
Access Modes:      RWO
Capacity:          5Gi
Node Affinity:     
  Required Terms:  
    Term 0:        failure-domain.beta.kubernetes.io/region in [southeastasia]
                   failure-domain.beta.kubernetes.io/zone in [0]
    Term 1:        failure-domain.beta.kubernetes.io/region in [southeastasia]
                   failure-domain.beta.kubernetes.io/zone in [1]
    Term 2:        failure-domain.beta.kubernetes.io/region in [southeastasia]
                   failure-domain.beta.kubernetes.io/zone in [2]
Message:           
Source:
    Type:         AzureDisk (an Azure Data Disk mount on the host and bind mount to the pod)
    DiskName:     k8s-5b3d7b8f-dynamic-pvc-bdf93a67-9c45-11e8-ba6f-000d3a07de8c
    DiskURI:      /subscriptions/<subscription>/resourceGroups/<rg-name>/providers/Microsoft.Compute/disks/k8s-5b3d7b8f-dynamic-pvc-bdf93a67-9c45-11e8-ba6f-000d3a07de8c
    Kind:         Managed
    FSType:       
    CachingMode:  None
    ReadOnly:     false
Events:           <none>
```

**Release note**:

```release-note
Add node affinity for Azure unzoned managed disks
```

/sig azure
/kind feature

/cc @brendandburns @khenidak @andyzhangx @msau42
@zparnold
Copy link
Member

Hey there! @feiskyer I'm the wrangler for the Docs this release. Is there any chance I could have you open up a docs PR against the release-1.12 branch as a placeholder? That gives us more confidence in the feature shipping in this release and gives me something to work with when we start doing reviews/edits. Thanks! If this feature does not require docs, could you please update the features tracking spreadsheet to reflect it?

@msau42
Copy link
Member

msau42 commented Aug 20, 2018

Doc placeholder for only the topology-aware dynamic provisioning part: kubernetes/website#9939 (not for the entire feature)

@justaugustus
Copy link
Member

@feiskyer --
Any update on docs status for this feature? Are we still planning to land it for 1.12?
At this point, code freeze is upon us, and docs are due on 9/7 (2 days).
If we don't here anything back regarding this feature ASAP, we'll need to remove it from the milestone.

cc: @zparnold @jimangel @tfogo

@feiskyer
Copy link
Member Author

feiskyer commented Sep 6, 2018

@justaugustus Yep, codes have been merged now and docs will be added to https://github.com/kubernetes/cloud-provider-azure

@justaugustus
Copy link
Member

Thanks for the update!

@claurence
Copy link

Kubernetes 1.13 is going to be a 'stable' release since the cycle is only 10 weeks. We encourage no big alpha features and only consider adding this feature if you have a high level of confidence it will make code slush by 11/09. Are there plans for this enhancement to graduate to alpha/beta/stable within the 1.13 release cycle? If not, can you please remove it from the 1.12 milestone or add it to 1.13?

We are also now encouraging that every new enhancement aligns with a KEP. If a KEP has been created, please link to it in the original post. Please take the opportunity to develop a KEP

@kacole2
Copy link

kacole2 commented Apr 12, 2019

I'm the Enhancement Lead for 1.15. Is this feature going to be graduating alpha/beta/stable stages in 1.15? Please let me know so it can be tracked properly and added to the spreadsheet.

Once coding begins, please list all relevant k/k PRs in this issue so they can be tracked properly.

@kacole2 kacole2 added tracked/no Denotes an enhancement issue is NOT actively being tracked by the Release Team and removed tracked/yes Denotes an enhancement issue is actively being tracked by the Release Team labels Apr 12, 2019
@feiskyer
Copy link
Member Author

yep, this is on tracked.

/milestone v1.15

@k8s-ci-robot k8s-ci-robot added this to the v1.15 milestone Apr 12, 2019
@feiskyer feiskyer added stage/stable Denotes an issue tracking an enhancement targeted for Stable/GA status and removed stage/beta Denotes an issue tracking an enhancement targeted for Beta status labels Apr 12, 2019
@kacole2 kacole2 added tracked/yes Denotes an enhancement issue is actively being tracked by the Release Team and removed tracked/no Denotes an enhancement issue is NOT actively being tracked by the Release Team labels Apr 14, 2019
@kacole2
Copy link

kacole2 commented Apr 29, 2019

@feiskyer, we're doing a KEP review for enhancements to be included in the Kubernetes v1.15 milestone. After reviewing your KEP, it's currently missing test plans and graduation criteria which is required information per the KEP Template. Please update the KEP to include the required information before the Kubernetes 1.15 Enhancement Freeze date of 4/30/2019. Thank you

@kacole2
Copy link

kacole2 commented May 1, 2019

@feiskyer, Enhancement Freeze for Kubernetes 1.15 has passed and this did not meet the deadline. Still missing Test Plans and Graduation Criteria from the KEP. This is now being removed from the 1.15 milestone and the tracking sheet. If there is a need for this to be in 1.15, please file an Enhancement Exception. Thank you. @claurence

/milestone clear

@k8s-ci-robot k8s-ci-robot removed this from the v1.15 milestone May 1, 2019
@kacole2 kacole2 added tracked/no Denotes an enhancement issue is NOT actively being tracked by the Release Team and removed tracked/yes Denotes an enhancement issue is actively being tracked by the Release Team labels May 1, 2019
@feiskyer
Copy link
Member Author

feiskyer commented Jul 5, 2019

/milestone v1.16

@k8s-ci-robot k8s-ci-robot added this to the v1.16 milestone Jul 5, 2019
@feiskyer feiskyer added tracked/yes Denotes an enhancement issue is actively being tracked by the Release Team and removed tracked/no Denotes an enhancement issue is NOT actively being tracked by the Release Team labels Jul 5, 2019
@evillgenius75
Copy link

Hi @feiskyer , I'm the 1.16 Enhancement Shadow. Is this feature going to be graduating alpha/beta/stable stages in 1.16? Please let me know so it can be added to the 1.16 Tracking Spreadsheet. If not's graduating, I will remove it from the milestone and change the tracked label.

Milestone dates are Enhancement Freeze 7/30 and Code Freeze 8/29.

Thank you.

@feiskyer
Copy link
Member Author

feiskyer commented Jul 9, 2019

@evillgenius75 Yep, and it has already added v1.16 milestone.

@simplytunde
Copy link

Hey, @feiskyer I'm the v1.16 docs release lead.

Does this enhancement (or the work planned for v1.16) require any new docs (or modifications)?

Just a friendly reminder we're looking for a PR against k/website (branch dev-1.16) due by Friday,August 23rd. It would be great if it's the start of the full documentation, but even a placeholder PR is acceptable. Let me know if you have any questions!

@feiskyer
Copy link
Member Author

docs are not required for this feature. However, we're working on the e2e tests.

@simplytunde
Copy link

Thanks @feiskyer

@feiskyer
Copy link
Member Author

feiskyer commented Aug 6, 2019

@k8s-ci-robot
Copy link
Contributor

@feiskyer: Closing this issue.

In response to this:

E2e tests are now added via a set of changes:

Documentation has been added to https://github.com/kubernetes/cloud-provider-azure/blob/master/docs/using-availability-zones.md.

So we are ok to close this now.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@mrbobbytables mrbobbytables removed the tracked/yes Denotes an enhancement issue is actively being tracked by the Release Team label Oct 14, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/provider/azure Issues or PRs related to azure provider kind/feature Categorizes issue or PR as related to a new feature. sig/storage Categorizes an issue or PR as relevant to SIG Storage. stage/stable Denotes an issue tracking an enhancement targeted for Stable/GA status
Projects
None yet
Development

No branches or pull requests