RFC 470: Aurora Serverless v2 support #472

pahud · 2023-01-03T17:34:38Z

This is a request for comments about Aurora Serverless v2 support. See #470 for
additional details.

APIs will be signed off by @vinayak-kukreja

By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache-2.0 license

corymhall

This feels like a stop-gap approach to adding serverless v2 support. Maybe that's what we want to do, but personally I think we need to redesign the entire construct. From a users perspective I need to know too much outside of how the construct works.

Regardless, I would like to see more discussion around specific use cases and maybe when you should/should not use this construct. For example, I don't think any of the discussion in the original PR has been addressed aws/aws-cdk#22446

corymhall · 2023-01-04T13:58:28Z

text/0470-aurora-serverless-v2.md

+### Using Aurora Serverless v2 for existing provisioned workloads
+
+Aurora Serverless v2 allows you to add one or more Aurora Serverless v2 DB instances to
+the existing cluster as reader DB instances. Make sure the cluster engine is compatible with


Only reader? I thought you can have writer as well?

Hi @corymhall

If user has provisioned the cluster with the existing DatabaseCluster L2 construct with at least 1 provisioned instances, any additional serverless instance added into this cluster would always be the reader.

To create a serverless writer, first we need create the cluster with no provisioned instances with instance: 0 and the first serverless instance added into this cluster would be the writer.

vinayak-kukreja

I believe this is not a great customer experience. I agree with Cory that to use this construct the customers would need to understand a lot of the context of what parts are needed to make a cluster work with serverless v2 configuration. Our API is not clearly stating the intent here.

For instance, if I am a customer who would like to have a database cluster that supports Aurora Serverless V2 configuration, would it not be easier to just know that a construct does this for me and abstracting me from the internals of how that is possible. Mentioning serverlessV2Scaling here seems odd to me since that would already be my intent.

I believe we are trying to solve two different intent here,

Customers who already have a database cluster

For them to utilize serverless v2 configuration, they would need to,

Make sure there database engine version aligns with what is supported by v2
Enabling the serverlessV2Scaling configuration
Make sure the region is supported(Which is common for both)

Things to consider here would be,

Is it easy to switch database engine versions? Does that cause a stack update or block us on using certain features?
Can we provide the customers with validations or defaults around database engine versions and regions?

Customers who would like to create a new database cluster

These customers might not be aware of these recent changes. Would it not make sense to provide them with a default configured cluster that supports v2 configuration.

Things to consider here would be,

The cost the customer would incur. From my understanding, serverless v2 instances have a minimum ACU(Aurora Capacity Unit) of 0.5 which would add to their cost. We should warn the customer when they are doing this or mention it somewhere.
What use cases would be common here? If we are to provide a default configuration what would that look like? Would it be one serverless writer and one serverless reader? What would that cost?

I would like to learn how the customers utilize this new configuration? What could a default scenario look like that can target to solve a commonly used pattern?

For instance,
For an Aurora Serverless v2 cluster, you can choose whether to encrypt the cluster volume. For an Aurora Serverless v1 cluster, the cluster volume is always encrypted. You can choose the encryption key, but you can't disable encryption.

Does default encryption make sense for us? What should be our VPC configuration? Just some things to figure out defaults for. Let me know if this makes sense.

vinayak-kukreja · 2023-01-04T19:09:12Z

text/0470-aurora-serverless-v2.md

+const cluster = new rds.DatabaseCluster(this, 'Database', {
+  ...
+  instances: 2,
+  // make sure the engine is compatible with serverless v2.
+  engine: rds.DatabaseClusterEngine.auroraMysql({ version: rds.AuroraMysqlEngineVersion.VER_3_02_1 }),
+  // required before you are allowed to add serverless instances
+  serverlessV2Scaling: {
+    maxCapacity: 4,
+    minCapacity: 0.5,
+  },
+});
+// add a serverless reader
+cluster.addServerlessInstance('InstanceId')


The serverlessV2Scaling and instances is confusing to me. I believe as a customer this is too much to know and does not reflect intent in the API.

For instance, what does instances mean if I have serverlessV2Scaling mentioned. I would assume those are two serverless v2 instances being created by the construct. But these are provisioned instances right?

And we are also assuming the customer to know that there is no way to provision serverlessInstance: 1 in the API and we are doing it via addServerlessInstance(). Let me know if I am getting this right?

And as you mentioned restrictions on region and database engines that are supported by v2. There should either be validations or defaults around it if we are opting for v2 and this responsibility should not lie with the customer to configure.

Pull request has been modified.

pahud · 2023-01-10T18:11:58Z

HI @corymhall and @vinayak-kukreja

I've updated this RFC to propose a new construct instead. Please check it out again.

vinayak-kukreja

I believe we should add a section about use cases to this. We still need to understand the intent behind the API. That would help us figure out how we can make the user experience better with our abstractions.

I can reach out to Aurora team to setup a conversation thread so that you can gather this information.
Has there been any community insights on the use cases or documented anywhere else?

vinayak-kukreja · 2023-01-17T20:07:48Z

text/0470-aurora-serverless-v2.md

+## Aurora Serverless v2 support
+
+Aurora Serverless v2 is an on-demand, autoscaling configuration for Amazon Aurora.
+Aurora Serverless v2 helps to automate the processes of monitoring the workload and


Would it make sense to have helper methods supporting metrics here?

As DatabaseClusterV2 extends the DatabaseClusterBase abstract class and implement IDatabaseCluster so all helper methods defined in the base class should be available for DatabaseClusterV2. If we need to add additional helper methods for this new construct we can consider either add in the abstract class or this class. As this is a new construct without breaking changes, I think it's always possible to add any helper methods if necessary but doesn't have to be do that in the first PR.

And Yes! I will add a section for all known use cases.

Use Cases added.

Thank you for adding those.

In regards to the helper methods, what I want to convey is to think about how certain functions enable a better customer experience. They can be implemented later but the RFC would be a good place to list them out if any.

vinayak-kukreja · 2023-01-17T20:09:32Z

text/0470-aurora-serverless-v2.md

+
+```ts
+new rds.DatabaseClusterV2(stack, 'cluster', {
+  engine,


Can we have some validations in place for Serverless V2 instances? I believe these do not work with all versions of database engines.

yes we should validate the engine: rds.IClusterEngine to make sure it's compatible with serverless v2 and throw if necessary.

Great, do you believe we can have others? I remember you mentioned this is limited to certain regions as well?

Can these be documented in the RFC?

vinayak-kukreja · 2023-01-17T20:09:57Z

text/0470-aurora-serverless-v2.md

+```ts
+new rds.DatabaseClusterV2(stack, 'cluster', {
+  engine,
+  vpc,


Any defaults for VPC that would make sense for our database here?

Yes ec2.IVpc would suffice.

vinayak-kukreja · 2023-01-17T20:11:43Z

text/0470-aurora-serverless-v2.md

+  engine,
+  vpc,
+  // writer(serverless)
+  writer: { serverless: true },


What does serverless: true mean here? A serverless instance has minimum and maximum ACUs right? What are those defaults for us? How can the customer modify them? Do we plan to add to the documentation about expenses this would add?

The default ACU is defined in the serverlessV2Scaling prop for DatabaseClusterV2 which is optional and the default is min: 0.5 max: 1, and that is the minimal required value. We are not allowed to define ACU for individual serverless instances. The only available configuration is the serverlessV2Scaling for the cluster.

Hmm, could you please tell me why the maximum is 1? I see it can be higher here.

We are not allowed to define ACU for individual serverless instances.

By this you mean the config of min and max that we would set would apply to all serverless v2 instances. Right?

Yes we should discuss what is the best defaults of min and max. The minimal possible values from cost-saving perspective is min:0.5 and max:1

the minimal value of min is 0.5

the value of max should be greater than the value of min hence 1

With those defaults, the default cluster would have minimal required values with the best cost saving while customers can always specify any higher values as they wish. With that being said, I think the defaults make sense. Thoughts?

vinayak-kukreja · 2023-01-17T20:14:15Z

text/0470-aurora-serverless-v2.md

+could be either serverless or privisioned.
+
+```ts
+new rds.DatabaseClusterV2(stack, 'cluster', {


Could you mention the defaults for this API and any validations that would be implemented?

What would happen if I do,

new rds.DatabaseClusterV2(stack, 'cluster');

The minimal required props of this construct would be engine and vpc. The engine prop is rds.IClusterEngine and we can check in the construct if the engine is compatible with serverless v2 and throw if necessary.

vpc is required in the InstanceProps for the old DatabaseCluster but we don't need the whole InstanceProps. We just need the vpc: ec2.IVpc.

check out this POC for more implementation details.

Sorry about the confusion. What I meant to ask was what configuration of cluster gets created if we are just relying on defaults? For instance, do we generate 2 provisioned instances in the default scenario

I was unable to figure this out in the POC. Could you point me to the implementation and also add that information in the RFC?

vinayak-kukreja · 2023-01-17T20:14:26Z

text/0470-aurora-serverless-v2.md

+[Using Aurora Serverless v2](https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-serverless-v2.html)
+for more details.
+
+### Create a new cluster


Can any helper methods make it easier or more clear for the customer to use this resource?

As DatabaseClusterV2 extends the DatabaseClusterBase abstract class and implement IDatabaseCluster so all helper methods defined in the base class should be available for DatabaseClusterV2. If there's new requirement to add any new helper method we can consider add in the base class so the old DatabaseCluster class can inherit as well. As DatabaseClusterV2 aims to create a new ServerlessV2-enabled cluster with a writer and optional readers, which could be declared in the construct property, it is not clear to me to have the requirement to define additional helper methods in the L2 construct class rather than the base class. If there's any, please let me know.

vinayak-kukreja · 2023-01-17T20:16:25Z

text/0470-aurora-serverless-v2.md

+    { instanceType: ec2.InstanceType.of(ec2.InstanceClass.T3, ec2.InstanceSize.LARGE) },
+    // reader 2(serverless)
+    { serverless: true },


This looks odd to me. We have more flexibility provisioning an instance but serverless is just a boolean? Can this experience be modeled to be similar?

We need an interface for writer and readers and they can be either provisioned or serverless and the default is provisioned.

For a serverless instance, the minimal required property would be serverless: true since the default is provisioned while for a provisioned instanced the default could be just {} but you can optionally specify custom instanceType which is not allowed in serverless.

I think this could satisfy both serverless and provisioned here but I am open to different design for this interface.

I believe there could be an issue with large clusters with our current API with list of readers and writers.

For instance, imagine a cluster with 50 readers. Instantiating with current API would look quite large and have redundant information as well.

Also, could you tell me what do you mean by custom instance type is not allowed in serverless? From what I understand, we can have ACU's modified for serverless v2 instances?

From my current design, if you are creating a large cluster with 50 or more readers, you need to pass a list of instance options to the readers prop. And each element in the list could have different instance types or instance options. Do you expect any other user experience for this scenario?

new DatabaseClusterV2(scope, id, { writer, readers: readerList, }

Also, could you tell me what do you mean by custom instance type is not allowed in serverless? From what I understand, we can have ACU's modified for serverless v2 instances?

In serverlessV2, there is no option to specify individual ACU for individual serverless instances. All serverless instances are sharing the same ServerlessV2ScalingConfiguration which is cluster-wise. Behind the scene the serverless instance is literally the CfnDbInstance which does not allow you to specify any specific ACU for it.

vinayak-kukreja · 2023-01-17T20:19:30Z

text/0470-aurora-serverless-v2.md

+Users that opt-in serverless v2 enabled clusters should use this class to create a new cluster
+as well as writer and reader instances, which could be serverless, porivioned or mixed.


How can the user opt-in for this functionality? Is it just by using this construct?

Yes. From my current proposed design, the DatabaseClusterV2 construct is only for serverlessV2-enabled clusters only. When you instantiate this class, a CfnDatabaseCluster would be provisioned with serverlessV2 enabled.

Great, thank you for clarifying. Could you please add this to the RFC too?

Pull request has been modified.

pahud · 2023-01-17T22:41:59Z

Just added the Use Cases section and elaborated more details on the design. FYR @vinayak-kukreja

vinayak-kukreja · 2023-01-23T18:21:01Z

text/0470-aurora-serverless-v2.md

+## Use Cases
+
+1. To create a new Aurora Serverless v2 cluster with one writer only. This writer could be either serverless or provisioned instance:
+
+- serverless writer
+- provisioned writer
+
+2. To create a new Aurora Serverless v2 cluster with one writer and one reader as serverless or provisoined. This could be:
+
+- serverless writer + serverless reader
+- serverless writer + provisioned reader
+- provisioned writer + serverless reader
+- provisioned writer + provisioned reader
+
+3. To create a new Aurora Serverless v2 cluster with one writer and multiple readers. This could be:
+
+- serverless writer + all serverless readders
+- serverless writer + all provisioned readers
+- serverless writer + mixed readers
+- provisioned writer + all serverless readers
+- provisioned writer + all provisioned readers
+- provisioned writer + mixed readers


Thank you for adding these.

Let me make sure the Aurora Serverless Service team also reviews this and if there are any more that they think could provide a better customer experience, then we can add those here as well.

After syncing with the Aurora team, listing here for visibility what this design needs to look out for,

Supporting a customer use case where the cluster could be empty i.e. without any instance and just the storage layer. This helps customers not actively using the database and remove the compute layer which adds costs for them. This could potentially also help in migrating from an existing cluster to a serverless v2 supported cluster.

Decide on a default scenario for provisioning.

Decide on default minimum and maximum ACUs for serverless v2 instance. Should there be different defaults based on what database engine is selected?

Helper methods to add/remove/upgrade instances.

Metrics to detect failover in instances to alarm over and general database cluster metrics that can be helpful for the customers.

During implementation,

Testing that provisioning is working with all supported database engines.

Callouts in documentation about

Unreachable writer instance due to DNS caching if the instance was replaced due to a failover.

Cluster instances sizes are recommended to be similar to better handle a failover scenario.

vinayak-kukreja · 2023-01-24T19:28:07Z

Hey @pahud, replied to some of the comments thread. Let me know if you have any queries.

pahud · 2023-01-26T20:30:55Z

Per discussion with the core team offline, I am closing this RFC in favor of a new one.

pahud added 5 commits December 21, 2022 18:26

initial commit

97a9ea9

minor update

9b4f285

minor

9c72ed2

update README

6d0f944

fix linter errors

ed26c35

corymhall previously requested changes Jan 4, 2023

View reviewed changes

vinayak-kukreja previously requested changes Jan 4, 2023

View reviewed changes

create a new construct

b20a648

vinayak-kukreja previously requested changes Jan 17, 2023

View reviewed changes

add use cases

5d317e6

fix linter

25be10a

pahud added 3 commits January 17, 2023 22:56

add more description on the design

303dfc2

add more description on the design

a503c2f

minor

062eae1

vinayak-kukreja reviewed Jan 23, 2023

View reviewed changes

pahud closed this Jan 26, 2023

evgenyka added the l2-request request for new L2 construct label Oct 11, 2023

		Users that opt-in serverless v2 enabled clusters should use this class to create a new cluster
		as well as writer and reader instances, which could be serverless, porivioned or mixed.

RFC 470: Aurora Serverless v2 support #472

RFC 470: Aurora Serverless v2 support #472

Conversation

pahud commented Jan 3, 2023 • edited Loading

corymhall left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

vinayak-kukreja left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

pahud commented Jan 10, 2023

vinayak-kukreja left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

pahud Jan 25, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

pahud Jan 17, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

pahud Jan 17, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

pahud commented Jan 17, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

vinayak-kukreja commented Jan 24, 2023

pahud commented Jan 26, 2023 • edited Loading

pahud commented Jan 3, 2023 •

edited

Loading

pahud Jan 25, 2023 •

edited

Loading

pahud Jan 17, 2023 •

edited

Loading

pahud Jan 17, 2023 •

edited

Loading

pahud commented Jan 17, 2023 •

edited

Loading

pahud commented Jan 26, 2023 •

edited

Loading