Add support for io2 Block Express volumes #1409

Merged: 1 commit, Oct 4, 2022


ConnorJC3
Contributor

Signed-off-by: Connor Catlett <conncatl@amazon.com>

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Sep 30, 2022
@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Sep 30, 2022
@ConnorJC3
Contributor Author

/hold for reviews

@torredil and I tested this manually; it is ready for review.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 30, 2022
@torredil
Member

$ kubectl describe node i-0a8088f85b6d3bedb
Name:               i-0a8088f85b6d3bedb
Roles:              node
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=r5b.large
                    node.kubernetes.io/instance-type=r5b.large

$ kubectl get pvc
NAME        STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
ebs-claim   Bound    pvc-862780bc-9b3b-4abb-8c3f-6f3c303a64d2   2000Gi     RWO            ebs-sc         2m51s

$ cat storageclass.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-sc
parameters:
  type: io2
  iops: "100000"
provisioner: ebs.csi.aws.com
volumeBindingMode: WaitForFirstConsumer

$ cat claim.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ebs-claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ebs-sc
  resources:
    requests:
      storage: 2000Gi


/lgtm
/hold

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 30, 2022
@torredil
Member

cc: @rdpsin

@rdpsin
Contributor

rdpsin commented Sep 30, 2022

/lgtm

Can we also confirm that setting IOPS greater than 64K on non-bx instances will return an error? (Tbf, I don't see why not, but would still like to confirm).

@rdpsin
Contributor

rdpsin commented Sep 30, 2022

FYI, it'd be useful to add some details in the commit message for posterity.

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 30, 2022
@rdpsin
Contributor

rdpsin commented Sep 30, 2022

Thanks.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 30, 2022
@ConnorJC3
Contributor Author

@rdpsin There's no way to know at creation time what type of instance the volume will be attached to, so the volume will be created but will fail to attach to instances that do not support Block Express.

Unfortunately, the error message is somewhat poor (Could not attach volume "vol-006100676decdeaa0" to node "i-01365b727b32e9f74": attachment of disk "vol-006100676decdeaa0" failed, expected device to be attached but was detaching), but I believe that has to be improved on the EBS side.

@rdpsin
Contributor

rdpsin commented Sep 30, 2022

I see. This can potentially lead to leaked resources. As long as we are explicit about it in our docs, I guess it should be fine. Can you add this to the docs?

@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Sep 30, 2022
@ConnorJC3 ConnorJC3 force-pushed the io2-iops branch 4 times, most recently from 175f0f8 to 9a8f104 September 30, 2022 20:09
@k8s-ci-robot k8s-ci-robot removed the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Sep 30, 2022
@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Sep 30, 2022
@ConnorJC3 ConnorJC3 force-pushed the io2-iops branch 5 times, most recently from 03088a5 to 6b7732c September 30, 2022 20:53
@ConnorJC3 ConnorJC3 changed the title from "Bump io2 max IOPs to block express limit" to "Add support for io2 Block Express volumes" Sep 30, 2022
Member

@torredil torredil left a comment

We discussed this offline and decided against validating the instance type using metadata during volume mounting, in order to:

  1. Avoid introducing a new dependency on metadata. Long-term, we want to move to a world where metadata is an entirely optional dependency and not a requirement on either the controller or the node.
  2. Avoid being fragile to changes, since a hard-coded check would require a CSI driver update if and when EBS supports Block Express on other instance types. It is best to rely on the EBS API as the source of truth. The downside is that setting IOPS > 64K on instances that do not support Block Express results in the error "failed, expected device to be attached but was detaching", which is not very helpful.

Manually tested, these changes lgtm.

docs/parameters.md (outdated, resolved)
pkg/driver/controller.go (outdated, resolved)
docs/parameters.md (outdated, resolved)
pkg/cloud/cloud.go (outdated, resolved)
pkg/driver/controller.go (outdated, resolved)
io2 Block Express, which io2 volumes automatically get upgraded to on
certain instance types, has a higher IOPS limit than normal io2 volumes.

Signed-off-by: Connor Catlett <conncatl@amazon.com>
@rdpsin
Contributor

rdpsin commented Oct 4, 2022

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 4, 2022
Member

@torredil torredil left a comment

/unhold

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 4, 2022
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ConnorJC3, torredil

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.