
(elasticloadbalancingv2): #13332 is a breaking change! #13437

Closed
akefirad opened this issue Mar 6, 2021 · 11 comments · Fixed by #13490
Assignees
Labels
@aws-cdk/aws-elasticloadbalancingv2 Related to Amazon Elastic Load Balancing V2 bug This issue is a bug. in-progress This issue is being actively worked on. p1

Comments


akefirad commented Mar 6, 2021

❓ General Issue

#13332 is a breaking change!

The Question

I believe #13332 (fixing #13150) is a breaking change: it adds the new resource (with certificates) and then removes the old one (with the same certificates), which results in the existing certificates being removed.

I can't think of any workaround other than doing the migration in two deployments, or adding the missing certificates manually after the deployment. Both cause some disruption to the service (endpoints are going to be unavailable for a couple of seconds/minutes).

I'd suggest updating the release notes accordingly.

@akefirad akefirad added guidance Question that needs advice or information. needs-triage This issue or PR still needs to be triaged. labels Mar 6, 2021
@github-actions github-actions bot added the @aws-cdk/aws-elasticloadbalancingv2 Related to Amazon Elastic Load Balancing V2 label Mar 6, 2021

akefirad commented Mar 6, 2021

Confirmed. The first certificate in the list will be gone if one upgrades to v1.92.0.

@peterwoodworth peterwoodworth added bug This issue is a bug. p1 and removed guidance Question that needs advice or information. needs-triage This issue or PR still needs to be triaged. labels Mar 9, 2021

njlynch commented Mar 9, 2021

Some findings so far:

It appears that, as-is, the AWS::ElasticLoadBalancingV2::ListenerCertificate resource behaves as expected on creates/deletes, but does odd things on updates.

For the original bugs (#13150 and #13329), the issue was that if the listener was updated from having 2 to 3 certificates (requiring an update of the ListenerCertificate resource), one of the certificates would be dropped. Here, doing an upgrade from v1.91.0 to v1.92.0 means that any users that have successfully deployed with two certs attached to a single ListenerCertificate would see an update to that ListenerCertificate and the creation of a new ListenerCertificate. The creation appears to work, but the update causes one of the certificates to be removed.

-- Deployed successfully (all at once, no updates) on v1.91.0 --

  LBListener49E825B4:
    Type: AWS::ElasticLoadBalancingV2::Listener
    Properties:
      DefaultActions:
        - RedirectConfig:
            Host: example.com
            StatusCode: HTTP_302
          Type: redirect
      LoadBalancerArn:
        Ref: LB8A12904C
      Certificates:
        - CertificateArn: arn:aws:acm:eu-west-1:123456789012:certificate/f53b3399-2092-4fd7-a777-33734a4dad22
      Port: 443
      Protocol: HTTPS
  LBListenerSSLCert177C324A:
    Type: AWS::ElasticLoadBalancingV2::ListenerCertificate
    Properties:
      Certificates:
        - CertificateArn: arn:aws:acm:eu-west-1:123456789012:certificate/b6e4aa65-6d49-4a4e-8321-af2790e4e128
        - CertificateArn: arn:aws:acm:eu-west-1:123456789012:certificate/8f87cdba-816e-4dda-b606-7760e513ca3d
      ListenerArn:
        Ref: LBListener49E825B4

-- v1.92.0 --

  LBListener49E825B4:
    Type: AWS::ElasticLoadBalancingV2::Listener
    Properties:
      DefaultActions:
        - RedirectConfig:
            Host: example.com
            StatusCode: HTTP_302
          Type: redirect
      LoadBalancerArn:
        Ref: LB8A12904C
      Certificates:
        - CertificateArn: arn:aws:acm:eu-west-1:123456789012:certificate/f53b3399-2092-4fd7-a777-33734a4dad22
      Port: 443
      Protocol: HTTPS
  LBListenerSSLCert177C324A:
    Type: AWS::ElasticLoadBalancingV2::ListenerCertificate
    Properties:
      Certificates:
        - CertificateArn: arn:aws:acm:eu-west-1:123456789012:certificate/b6e4aa65-6d49-4a4e-8321-af2790e4e128
      ListenerArn:
        Ref: LBListener49E825B4
  LBListenerSSLCert26015FAF3:
    Type: AWS::ElasticLoadBalancingV2::ListenerCertificate
    Properties:
      Certificates:
        - CertificateArn: arn:aws:acm:eu-west-1:123456789012:certificate/8f87cdba-816e-4dda-b606-7760e513ca3d
      ListenerArn:
        Ref: LBListener49E825B4

-- cdk diff --

Stack Issue13437Stack
Resources
[+] AWS::ElasticLoadBalancingV2::ListenerCertificate LB/Listener/SSLCert2 LBListenerSSLCert26015FAF3
[~] AWS::ElasticLoadBalancingV2::ListenerCertificate LB/Listener/SSLCert LBListenerSSLCert177C324A replace
 └─ [~] Certificates (requires replacement)
     └─ @@ -1,8 +1,5 @@
        [ ] [
        [ ]   {
        [ ]     "CertificateArn": "arn:aws:acm:eu-west-1:123456789012:certificate/b6e4aa65-6d49-4a4e-8321-af2790e4e128"
        [-]   },
        [-]   {
        [-]     "CertificateArn": "arn:aws:acm:eu-west-1:123456789012:certificate/8f87cdba-816e-4dda-b606-7760e513ca3d"
        [ ]   }
        [ ] ]

In the above, the b6e4aa65-6d49-4a4e-8321-af2790e4e128 certificate was the one that was dropped/removed entirely by the update.

The code was written to be backwards-compatible, such that if only two certificates were attached to a listener, no logical IDs would change. I suspect that is what needs to change here; if we create a new logical ID for both certificates, both should get "clean" (single-cert) resources, and both should update appropriately.
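The intended effect of that logical-ID change can be sketched as follows. This is a hypothetical illustration, not the actual aws-cdk source; all function and ID names below are made up:

```typescript
// Sketch (hypothetical, not aws-cdk source): how changing the logical-ID
// scheme turns a problematic update into clean creates + deletes.

interface CertResource {
  logicalId: string;
  certificateArn: string;
}

// v1.92.0-style behavior (problematic): the first resource keeps the legacy
// logical ID, so CloudFormation *updates* it instead of replacing it.
function planKeepFirstId(certArns: string[]): CertResource[] {
  return certArns.map((arn, i) => ({
    logicalId: i === 0 ? 'LBListenerSSLCert' : `LBListenerSSLCert${i + 1}`,
    certificateArn: arn,
  }));
}

// Proposed fix: every certificate gets a brand-new logical ID, so the old
// multi-cert resource is deleted and clean single-cert resources are
// created -- the update code path is never exercised.
function planFreshIds(certArns: string[]): CertResource[] {
  return certArns.map((arn, i) => ({
    logicalId: `LBListenerCertificate${i + 1}`,
    certificateArn: arn,
  }));
}

const arns = ['arn:aws:acm:eu-west-1:111111111111:certificate/aaa',
              'arn:aws:acm:eu-west-1:111111111111:certificate/bbb'];
const oldIds = planKeepFirstId(arns).map(r => r.logicalId);
const newIds = planFreshIds(arns).map(r => r.logicalId);
// No logical ID survives the change, so CloudFormation sees only
// creates and deletes -- no updates.
const overlap = newIds.filter(id => oldIds.includes(id));
console.log(overlap.length); // 0
```

Since no ID from the old scheme appears in the new one, `cdk diff` would show only `[+]` and `[-]` entries for the `ListenerCertificate` resources.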


akefirad commented Mar 9, 2021

🤔 I don't think there's a solution for this. Your suggestion wouldn't solve the issue either (I didn't test it thoroughly). I suspect it removes both certificates at the last step, which is deleting the old resource.


njlynch commented Mar 9, 2021

I just emulated the patch on my test stack, and it appears to work:

-- patch --

    const listenerCert1 = listener.node.findChild('SSLCert') as elbv2.ApplicationListenerCertificate;
    (listenerCert1.node.defaultChild as elbv2.CfnListenerCertificate).overrideLogicalId('LBListenerSSLCertOverride1234');

-- cdk diff --

Stack Issue13437Stack
Resources
[-] AWS::ElasticLoadBalancingV2::ListenerCertificate LBListenerSSLCert177C324A destroy
[+] AWS::ElasticLoadBalancingV2::ListenerCertificate LB/Listener/SSLCert LBListenerSSLCertOverride1234
[+] AWS::ElasticLoadBalancingV2::ListenerCertificate LB/Listener/SSLCert2 LBListenerSSLCert26015FAF3

After deploying the above, all ListenerCertificate operations are creates/deletes, which appear to work properly (all 3 certs on my listener were maintained). This is logically equivalent to us changing the logical ID in the code for the first ListenerCertificate. I'll spin up a fix.


akefirad commented Mar 9, 2021

Interesting! How come the destroy is not removing the certificates? Effectively it's the same as before: adding new certificates, then removing the old resource (which removes the added certificates). (I might be missing something here!)


njlynch commented Mar 9, 2021

Interesting! How come destroy is not removing the certificates?

I am assuming that the CloudFormation resource (ListenerCertificate) is associated with a relationship/ID of some kind on ELB's side. When the resource is deleted, it is removing the relationship by that ID, rather than by the individual certificate ARNs. This is just an educated guess, but it matches the behavior I'm seeing.

Effectively it's the same as before...

I believe the difference is the lack of (and/or undefined) support for an array of certificate ARNs in the ListenerCertificate resource. Despite the resource modeling (Certificates as a list of AWS::ElasticLoadBalancingV2::ListenerCertificate Certificate objects), the docs state that "You can specify one certificate per resource."

I can't say with 100% certainty, but I believe -- between the docs and my testing -- that Updates are not handled well when there are multiple certificates attached to a single ListenerCertificate. That was the cause of the original bug, and that's the cause of this issue as well. By changing the approach so that the existing multi-certificate ListenerCertificate is deleted, rather than updated, the undesirable behavior can be completely side-stepped.
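This hypothesis can be illustrated with a toy simulation. To be clear, this models the educated guess above, not ELB's documented internals: attachments are tracked per ListenerCertificate resource, so CloudFormation's create-before-delete replacement never detaches certificates that another resource still references.

```typescript
// Toy model of the hypothesis: a listener tracks certificate attachments
// keyed by the ListenerCertificate resource, not by individual cert ARNs.

class Listener {
  private attachments = new Map<string, string[]>();

  // Creating a ListenerCertificate resource records its cert list.
  create(resourceId: string, certArns: string[]): void {
    this.attachments.set(resourceId, certArns);
  }

  // Deleting a resource removes the relationship by its ID only.
  delete(resourceId: string): void {
    this.attachments.delete(resourceId);
  }

  // The effective set of certificates attached to the listener.
  certs(): Set<string> {
    const s = new Set<string>();
    for (const arns of this.attachments.values()) {
      arns.forEach(a => s.add(a));
    }
    return s;
  }
}

const listener = new Listener();
// Existing v1.91.0 state: one resource holding two certificates.
listener.create('SSLCert', ['cert-b6e4', 'cert-8f87']);

// Replacement deployment: CloudFormation creates the new resources first...
listener.create('SSLCertOverride', ['cert-b6e4']);
listener.create('SSLCert2', ['cert-8f87']);
// ...and deletes the old resource in the cleanup phase.
listener.delete('SSLCert');

console.log([...listener.certs()].sort()); // [ 'cert-8f87', 'cert-b6e4' ]
```

Under this model, both certificates survive the replacement, matching the observed behavior; an in-place update of a multi-cert resource is the only path that drops certificates.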

njlynch added a commit that referenced this issue Mar 9, 2021
… ALB if more than 2 certificates exist

Support for multiple certificates attached to a single ALB listener was
originally implemented by putting all certificates in an array on a single
`ListenerCertificate` resource. The docs state that only one certificate may be
specified, although multiple certificates do appear to work initially. Initial
resource creation of a `ListenerCertificate` with multiple certificates appears
to succeed, but subsequent updates to this resource (to either add or remove
certificates) yield undefined and undesirable behavior.

The fix in #13332 attempted to fix this by creating a new `ListenerCertificate`
per certificate, and -- at my direction -- maintained partial backwards
compatibility by keeping the original ID for the first `ListenerCertificate`
resource. However, this has the effect of triggering an update to the existing
resource, which does not appear to work correctly.

By forcing a logical ID change for all `ListenerCertificate` resources, we can
force all existing resources to be deleted, and new resources created. This
avoids doing any updates on any `ListenerCertificate` resources with an array
of certificates, which appears to side-step the undefined behavior.

fixes #13437
@njlynch njlynch added the in-progress This issue is being actively worked on. label Mar 9, 2021

akefirad commented Mar 9, 2021

When the resource is deleted, it is removing the relationship by that ID, rather than by the individual certificate ARNs.

Interestingly, I remember the opposite. I've forgotten the exact details of my experience, but this is what I recall:

  1. I had an ALB with two certificates (one main, foo.com and one additional, bar.com), everything created by CDK.
  2. I added an additional certificate, baz.com, to the listener in my CDK code. I deployed the code, and one additional certificate (either bar.com or baz.com, I don't remember which one) was gone.
  3. I added the missing certificate manually to the listener.
  4. Upgraded CDK to v1.92.0 and re-deployed the stack. The additional certificate was gone (the first one in the list), because CFN first adds the new resource and then removes the old one, which effectively means the certificate is gone (seems like a design flaw on the CFN side to me).

Two things are different in your example/suggestion:

  1. One certificate was added manually to the listener.
  2. The resource is not replaced but removed completely. (I can't see how this might be relevant.)

Let me know if I can help with anything.

@mergify mergify bot closed this as completed in #13490 Mar 9, 2021
mergify bot pushed a commit that referenced this issue Mar 9, 2021
… ALB if more than 2 certificates exist (#13490)



akefirad commented Mar 10, 2021

@njlynch I think this is not going to work if the order of execution is not deterministic; if the destroy gets executed after adding the new resources, it will remove the certificates. (This happens often with EKS resources.) I'll try it after the release.


njlynch commented Mar 10, 2021

@akefirad - It is deterministic, and the destroy gets executed after adding the new resources. As I said above, I believe that the ListenerCertificate relationships are created/deleted via a referenced ID, not via the individual certificate ARNs.

Upgraded CDK to v1.92.0 and re-deployed the stack. The additional certificate was gone (the first one in the list), because CFN first adds the new resource and then removes the old one.

I think the additional certificate was removed, not because of the order of operations, but because any update of a ListenerCertificate with multiple certificates will drop certificates. This matches my testing, at least.

Here's a screenshot from my patch-tested update to my stack. You can see here that my new ListenerCertificate resources were created first, and then the existing one was deleted in the cleanup phase. This resulted in the correct configuration on my listener.

[Screenshot, 2021-03-10: CloudFormation stack events showing the new ListenerCertificate resources created before the old one was deleted]

Please give my patch above a try for yourself.

@akefirad

Seems to work. Thanks for the support!

This was referenced Mar 14, 2021