feat: propose new features for the Postgres matrix #4

gbartolini · 2023-04-23T12:51:28Z

Mutual TLS support for authentication of replicas
JSON logging in standard output
Audit logging
Read-only root file system
Transparent change of configuration for PostgreSQL parameters
Fencing

ahachete · 2023-04-27T16:30:49Z

postgres/spec/feature_matrix.yaml

@@ -207,6 +207,11 @@ categories:
        description: |
          Operators may chose by default to generate self-signed SSL certificates.
          They may also offer the option to specify the CA and certificates that users want Postgres clusters to use.
+      - id: mtlsrep


While not crucially important, the convention so far has been four-character ids for categories and five-character ids for features. There's no strong reason to keep it, but if you believe this ID and the others in this commit can be fitted to the convention, that would be better.

gbartolini · 2023-05-18

We can call it sslr
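A minimal sketch of the length convention described in this thread (four characters for category ids, five for feature ids). The function name `follows_id_convention` is hypothetical and not part of the repository; `day2` and `crest` are ids mentioned elsewhere in this review.

```python
# Hypothetical helper: checks only the id length convention discussed in this thread.
EXPECTED_ID_LENGTH = {"category": 4, "feature": 5}


def follows_id_convention(identifier: str, kind: str) -> bool:
    """Return True if `identifier` has the conventional length for `kind`.

    `kind` must be "category" or "feature"; any other value raises KeyError.
    """
    return len(identifier) == EXPECTED_ID_LENGTH[kind]
```

With this check, the ids proposed in the diff (`mtlsrep`, `cfgchg`) would be flagged as longer than the five characters used for features so far.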

postgres/spec/feature_matrix.yaml

ahachete · 2023-04-27T16:40:51Z

postgres/spec/feature_matrix.yaml

@@ -694,6 +716,16 @@ categories:
          The operator must provide proper information to the user as to the status and final result of the operation.
          The operator should provide ongoing status information, and perform the operation with the minimum downtime required.
        main: true
+      - id: cfgchg
+        name: PostgreSQL configuration changes
+        type: boolean


I'd propose to make this feature a bit more generic (broader scope). A restart may be needed for many more reasons than a change to hot-standby parameters (e.g. a frequent case is adding an extension to shared_preload_libraries). For this reason, I'd argue for removing any reference to particular parameters and keeping the description generic, along the lines of "changes to configuration that may require reload or restart" (not a proposed wording, just to illustrate the idea).

The current description also raises some doubts (at least for me) about whether restarts happen as soon as a configuration change is triggered. Whether that is a good thing is debatable, compared with leaving more control to the user. For this reason, I'd argue that the description should be reworded to describe whether the operator provides a fully automated way to proceed with a configuration reload or restart, giving adequate information to the user, without specifying whether it is triggered automatically or by the user. A great place to add that information is the comments field on the vendor submission.

Actually, if these thoughts are taken into account, this feature would be quite close to day2/crest, possibly becoming a duplicate. Maybe all these ideas can be merged into a single feature, potentially improving the current day2/crest?
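To illustrate the generic scope suggested above ("changes to configuration that may require reload or restart"), here is a rough sketch that derives the required action from the `context` column that PostgreSQL exposes in `pg_settings`. The function name and the input shape (a mapping from parameter name to context) are assumptions for illustration; an actual operator may decide this differently.

```python
from typing import Mapping


def required_action(changed: Mapping[str, str]) -> str:
    """Return "restart", "reload" or "none" for a set of changed parameters.

    `changed` maps each changed parameter name to its pg_settings.context
    value (for example "postmaster", "sighup" or "user").
    """
    if not changed:
        return "none"
    # Parameters with context "postmaster" only take effect at server start.
    if any(ctx == "postmaster" for ctx in changed.values()):
        return "restart"
    # Other settings edited in the configuration file are picked up on reload.
    return "reload"
```

For example, a change to shared_preload_libraries (context `postmaster`) yields "restart", while a change limited to work_mem (context `user`) yields "reload".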

gbartolini · 2023-05-29

I don't agree here. Ensuring and coordinating the restart within a cluster is a feature, and an important one, because it removes the need for a human operator. For example, if you want to raise max_connections and you have replicas, you need to ensure that this operation is performed first on the standbys and last on the primary. If you decrease the value, the order is reversed.

Ideally, a Kubernetes operator should simulate what a human being would do in this case, but in an automated and reliable way, without requiring human intervention, in order to prioritize self-healing and high availability. You can, of course, configure it to request human intervention, but I believe these features should be highlighted: configuration changes should be as transparent as possible to the end user, and an operator should handle them as part of the resource lifecycle.
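The ordering rule described in this comment for max_connections can be sketched as follows. This is only an illustration under stated assumptions: the function name `restart_order` and the instance names are hypothetical, and it is not taken from any operator's code.

```python
from typing import List


def restart_order(primary: str, standbys: List[str],
                  old_value: int, new_value: int) -> List[str]:
    """Return the order in which instances apply the new value and restart.

    Raising the value: standbys first, primary last.
    Lowering the value: primary first, standbys afterwards.
    No change: nothing to restart.
    """
    if new_value > old_value:
        return list(standbys) + [primary]
    if new_value < old_value:
        return [primary] + list(standbys)
    return []
```

Usage, with hypothetical instance names: `restart_order("pg-1", ["pg-2", "pg-3"], 100, 200)` puts `pg-1` (the primary) last, while lowering from 200 to 100 puts it first.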

ahachete · 2023-05-29

Oh, I think we agree on the intent (to show that an operator provides capabilities for controlling different aspects of the cluster's lifecycle, like a controlled restart operation), but maybe we're confused by the wording.

In particular, my concern is not that automation is used to perform a careful and correct restart, but that it is triggered "automatically" (whenever a change is requested) rather than giving the human operator control over when to run it (after which it still runs automatically). In other words: I may want to keep a restart operation on hold until a better time (e.g. 3am, in a traffic valley) rather than have it launched as soon as I edit my max_connections parameter (which I may or may not realize will immediately trigger that event).

So I agree it's a good thing, and this feature should reflect that the restart operation is fully automated; but I encourage wording that does not assume the automated operation is triggered immediately, and that allows for immediate triggering being offered as an option or for the default being the opposite.

But I'll leave the final decision to you. Having given my opinion here, I'll just merge the latest patch that you send :) as your criteria should be well represented here.
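The distinction drawn in this comment, between how a restart is executed (automated) and when it is triggered, can be sketched like this. The policy names ("immediate", "window", "manual"), the function name and the default 03:00 to 04:00 window are all hypothetical, chosen only to mirror the "3am" example.

```python
from datetime import time


def should_restart_now(policy: str, now: time,
                       window_start: time = time(3, 0),
                       window_end: time = time(4, 0)) -> bool:
    """Decide whether a pending, fully automated restart may start at `now`."""
    if policy == "immediate":
        # Restart as soon as the configuration change is requested.
        return True
    if policy == "manual":
        # Hold the restart until a user explicitly asks for it.
        return False
    if policy == "window":
        if window_start <= window_end:
            return window_start <= now < window_end
        # The window wraps around midnight (for example 23:00 to 01:00).
        return now >= window_start or now < window_end
    raise ValueError(f"unknown policy: {policy}")
```

In all three cases the restart itself would still be performed by the operator; only the trigger differs.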

gbartolini · 2023-10-24

Ok to soften the wording and suggest that users have the choice.

gbartolini · 2023-10-24

Please check

ahachete

Left some comments; anything not explicitly commented on means agreement from my side with the proposed changes.

ahachete · 2023-06-16T15:35:38Z

@gbartolini Could you kindly review the above comment? :) It would be great to merge your improvements soon. Thanks!

Signed-off-by: Gabriele Bartolini <gabriele.bartolini@enterprisedb.com>

gbartolini · 2024-03-06T06:21:43Z

@ahachete do you have any further comments? Thanks!

gbartolini mentioned this pull request Apr 24, 2023

Review operator feature matrix from DoK cloudnative-pg/cloudnative-pg#1948

Closed

jsilvela approved these changes Apr 25, 2023

ahachete reviewed Apr 27, 2023

ahachete reviewed Apr 27, 2023

ahachete requested changes Apr 27, 2023

gbartolini added 2 commits October 24, 2023 18:16

fix: apply suggestions from Alvaro

1017e05

Signed-off-by: Gabriele Bartolini <gabriele.bartolini@enterprisedb.com>

gbartolini force-pushed the new-features branch from 49ab700 to 1017e05 Compare October 24, 2023 16:18

gbartolini commented Apr 23, 2023

ahachete Apr 27, 2023

gbartolini May 18, 2023

ahachete Apr 27, 2023

ahachete Apr 27, 2023

gbartolini May 29, 2023 (edited)

ahachete May 29, 2023

gbartolini Oct 24, 2023

gbartolini Oct 24, 2023

ahachete left a comment

ahachete commented Jun 16, 2023

gbartolini commented Mar 6, 2024
