[aws-rds] Minimize downtime during DBCluster updates #10595

hixi-hyi · 2020-09-29T18:04:18Z

Minimize downtime during DB Cluster updates

Current Status

The CfnDBInstances of a DBCluster are currently loosely coupled.
That is, if there are multiple CfnDBInstances, they are all updated at the same time, because CloudFormation has no dependency between them.
Therefore, the cluster is unavailable until the DBInstance updates are complete.
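
To illustrate the point above, here is a minimal sketch under a simplifying assumption: resources are updated in parallel "waves", and a resource is ready as soon as everything it depends on has been updated. This is not CloudFormation's actual scheduler, and the function name updateWaves and the instance IDs are made up for the example.

```typescript
// Simplified model (an assumption, not CloudFormation's real scheduler):
// every resource whose dependencies are already updated joins the next wave,
// and all resources in one wave are updated at the same time.
function updateWaves(dependsOn: Record<string, string[]>): string[][] {
  const done: Record<string, boolean> = {};
  const waves: string[][] = [];
  let pending = Object.keys(dependsOn);
  while (pending.length > 0) {
    const ready = pending.filter((id) =>
      dependsOn[id].every((dep) => done[dep] === true)
    );
    if (ready.length === 0) {
      throw new Error("dependency cycle or unknown dependency");
    }
    ready.forEach((id) => {
      done[id] = true;
    });
    pending = pending.filter((id) => done[id] !== true);
    waves.push(ready);
  }
  return waves;
}

// Today the instances do not depend on each other, so they share one wave.
const currentStatus = updateWaves({ Instance1: [], Instance2: [] });
console.log(JSON.stringify(currentStatus)); // [["Instance1","Instance2"]]
```

With both instances in the same wave, no instance is left to serve traffic while the update runs.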

Proposal

Add a dependency between the CfnDBInstances.
As a result, the instances are updated one by one (a rolling update), and the only downtime is the time it takes to switch the primary.
In other words, when there are two instances, the update costs only two failovers. (A)
If we can create the dependency dynamically, the update costs only one failover. (B)
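
The failover counts for (A) and (B) can be checked with a small model. This is only a sketch: it assumes that updating the current primary triggers exactly one failover, and that the failover promotes an already-updated instance if one exists. The real promotion logic is decided by the database service, and countFailovers is a made-up helper.

```typescript
// Hypothetical model of a rolling update. Assumptions:
//  - updating the current primary causes exactly one failover;
//  - the failover promotes an already-updated instance when there is one,
//    otherwise the first other instance in the list.
function countFailovers(updateOrder: string[], initialPrimary: string): number {
  let primary = initialPrimary;
  const updated: string[] = [];
  let failovers = 0;
  for (const instance of updateOrder) {
    if (instance === primary) {
      const others = updateOrder.filter((name) => name !== instance);
      if (others.length > 0) {
        const alreadyUpdated = others.filter(
          (name) => updated.indexOf(name) !== -1
        );
        primary = alreadyUpdated.length > 0 ? alreadyUpdated[0] : others[0];
        failovers += 1;
      }
    }
    updated.push(instance);
  }
  return failovers;
}

// (A) fixed order, and the primary happens to be updated first.
console.log(countFailovers(["Instance1", "Instance2"], "Instance1")); // 2
// (B) order chosen dynamically so that the primary is updated last.
console.log(countFailovers(["Instance2", "Instance1"], "Instance1")); // 1
```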

I think a primary failover is faster than an instance update, so I think it would be useful to include this feature.
However, the update time for the stack and the maintenance time for offline updates will increase.

What do you think about this proposal?
I'd like to hear your opinion.

Proposal Solution (A)

aws-cdk/packages/@aws-cdk/aws-rds/lib/cluster.ts

Line 734 in d95af00

instance.node.addDependency(internetConnected);

Add instance.node.addDependency(previous_instance);
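
A rough sketch of the chaining idea, kept self-contained so it runs without the CDK. InstanceStub is a hypothetical stand-in that only records dependencies; in cluster.ts the call would be instance.node.addDependency(...) on the real construct, and createInstances is not the actual function name there.

```typescript
// Hypothetical stand-in for a CDK construct: it only records dependencies.
class InstanceStub {
  readonly id: string;
  readonly dependencies: InstanceStub[] = [];

  constructor(id: string) {
    this.id = id;
  }

  // Plays the role of instance.node.addDependency(...) in cluster.ts.
  addDependency(other: InstanceStub): void {
    this.dependencies.push(other);
  }
}

// Create `count` instances and make each one depend on the previous one,
// so that they can only be updated one after another.
function createInstances(count: number): InstanceStub[] {
  const instances: InstanceStub[] = [];
  let previousInstance: InstanceStub | undefined;
  for (let i = 1; i <= count; i++) {
    const instance = new InstanceStub(`Instance${i}`);
    if (previousInstance !== undefined) {
      instance.addDependency(previousInstance);
    }
    instances.push(instance);
    previousInstance = instance;
  }
  return instances;
}

for (const instance of createInstances(3)) {
  const deps = instance.dependencies.map((d) => d.id).join(", ") || "(none)";
  console.log(`${instance.id} depends on: ${deps}`);
}
// Instance1 depends on: (none)
// Instance2 depends on: Instance1
// Instance3 depends on: Instance2
```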

Proposal Solution (B)

I think we need to use aws-sdk to determine if the current Instance is primary or replica, but I haven't thought about it in detail.
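
As a starting point only: the RDS DescribeDBClusters response lists the cluster members with an IsClusterWriter flag, so the update order could be derived from it. The sketch below works on plain data shaped like that list and does not call the SDK; rollingUpdateOrder is a made-up helper, and how the resulting order would be turned into stack dependencies is still open.

```typescript
// Shape of one DBClusterMembers entry in the DescribeDBClusters response
// (only the two fields used here).
interface ClusterMember {
  DBInstanceIdentifier: string;
  IsClusterWriter: boolean;
}

// Hypothetical helper: update the replicas first and the current writer last,
// so that only one failover is needed.
function rollingUpdateOrder(members: ClusterMember[]): string[] {
  const replicas = members.filter((m) => !m.IsClusterWriter);
  const writers = members.filter((m) => m.IsClusterWriter);
  return replicas.concat(writers).map((m) => m.DBInstanceIdentifier);
}

const order = rollingUpdateOrder([
  { DBInstanceIdentifier: "instance1", IsClusterWriter: true },
  { DBInstanceIdentifier: "instance2", IsClusterWriter: false },
]);
console.log(order.join(" -> ")); // instance2 -> instance1
```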

👋 I may be able to implement this feature request
⚠️ This feature might incur a breaking change

This is a 🚀 Feature Request


hixi-hyi · 2020-09-29T21:05:08Z

p.s. I thought it might be a good idea to create an instanceUpdateBehavior argument and change the behavior accordingly. (ROLLING, BULK)

skinny85 · 2020-12-05T01:34:30Z

Hey @hixi-hyi ,

thanks for opening the issue. This is a very interesting proposal. Pinging @jogold as well , for visibility.

> p.s. I thought it might be a good idea to create an instanceUpdateBehavior argument and change the behavior accordingly. (ROLLING, BULK)

Where does this instanceUpdateBehavior live? On the Cluster itself, or somewhere else?

hixi-hyi · 2021-03-08T05:53:51Z

@skinny85

> Where does this instanceUpdateBehavior live? On the Cluster itself, or somewhere else?

I think it would be better to define it in

aws-cdk/packages/@aws-cdk/aws-rds/lib/cluster.ts

Line 22 in 4e03667

interface DatabaseClusterBaseProps {

The developer assigns this attribute when creating a DatabaseCluster.

skinny85 · 2021-03-08T19:17:47Z

@hixi-hyi I think I see where you're going with this. So we would add a property to DatabaseClusterProps, called something like instanceUpdateBehavior, whose type would be an enum with 2 members, with names like BULK (the current behavior, and so the default) and ROLLING (update the instances one-by-one by adding dependencies between them)?

Did I understand your suggestion correctly?
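
To make the question concrete, a sketch of what is being described. All names here (InstanceUpdateBehavior, ClusterPropsSketch, dependencyEdges) are hypothetical and only follow the wording of this thread; they are not an existing API.

```typescript
// Hypothetical names following the discussion above, not a released API.
enum InstanceUpdateBehavior {
  // Update all instances at once (the current behavior, hence the default).
  BULK = "BULK",
  // Update the instances one by one by adding dependencies between them.
  ROLLING = "ROLLING",
}

interface ClusterPropsSketch {
  instances: number;
  instanceUpdateBehavior?: InstanceUpdateBehavior;
}

// Returns [dependent, dependsOn] pairs that the cluster would have to add.
function dependencyEdges(props: ClusterPropsSketch): Array<[string, string]> {
  const behavior =
    props.instanceUpdateBehavior === undefined
      ? InstanceUpdateBehavior.BULK
      : props.instanceUpdateBehavior;
  const edges: Array<[string, string]> = [];
  if (behavior === InstanceUpdateBehavior.ROLLING) {
    for (let i = 2; i <= props.instances; i++) {
      edges.push([`Instance${i}`, `Instance${i - 1}`]);
    }
  }
  return edges;
}

console.log(JSON.stringify(dependencyEdges({ instances: 3 }))); // []
console.log(
  JSON.stringify(
    dependencyEdges({
      instances: 3,
      instanceUpdateBehavior: InstanceUpdateBehavior.ROLLING,
    })
  )
); // [["Instance2","Instance1"],["Instance3","Instance2"]]
```

Leaving the property optional with BULK as the fallback keeps existing stacks unchanged.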

hixi-hyi · 2021-03-11T13:23:21Z

@skinny85 Yes, you understood exactly what I mean.

skinny85 · 2021-03-11T20:34:56Z

I'm glad @hixi-hyi 🙂.

Any chance of opening us a PR implementing this? Should only require adding a property here (or perhaps InstanceProps are a better place for it...?), and adding the DependsOn somewhere here.

Here's our Contributing guide: https://github.com/aws/aws-cdk/blob/master/CONTRIBUTING.md.

Thanks,
Adam

Support defining the instance update behaviour of RDS instances. This allows switching between bulk updates (all instances at once) and rolling updates (one instance after another). While bulk updates are faster, they carry a higher risk of longer downtime, as all instances might be unreachable simultaneously during the update. Rolling updates take longer but ensure that only one instance is updated at a time, so downtime is limited to the (at most two) changes of the primary instance. We keep the current behaviour, namely a bulk update, as the default. This implementation follows proposal A by hixi-hyi in issue aws#10595. Fixes aws#10595

spanierm42 · 2022-05-18T19:59:07Z

@skinny85 I opened a PR for this issue some weeks ago. How long does it usually take for the CDK maintainers to provide feedback on it? Is there anything I can do to speed up the process?

skinny85 · 2022-05-19T03:26:58Z

@mod-enter apologies for the bad experience! Unfortunately, I'm no longer with the CDK team, so I can't review your Pull Request.

Perhaps @TheRealAmazonKendra can help with this one?

spanierm42 · 2022-05-19T06:15:47Z

No worries, thanks for helping me anyway :)

github-actions · 2022-07-12T23:40:15Z

⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.

hixi-hyi added the feature-request (A feature should be added or improved.) and needs-triage (This issue or PR still needs to be triaged.) labels Sep 29, 2020

github-actions bot added the @aws-cdk/aws-rds (Related to Amazon Relational Database) label Sep 29, 2020

github-actions bot assigned skinny85 Sep 29, 2020

skinny85 added the effort/large (Large work item – several weeks of effort) and p2 labels and removed the needs-triage (This issue or PR still needs to be triaged.) label Dec 5, 2020

ericzbeard unassigned skinny85 Jun 17, 2021

ekeyser mentioned this issue Dec 8, 2021

aws-rds: modifying instance_size for DatabaseCluster results in significant downtime #17916

Closed

2 tasks

spanierm42 mentioned this issue Apr 23, 2022

feat(rds): support rolling instance updates to reduce downtime #20054

Merged

mergify bot closed this as completed in #20054 Jul 12, 2022

blimmer mentioned this issue Mar 15, 2024

(rds): instanceUpdateBehaviour is broken with writers/readers configuration #27694

Open
