4 changes: 3 additions & 1 deletion docs/explanation.md
@@ -9,9 +9,10 @@ This section contains pages with more detailed explanations that provide additio
* [Legacy charm]

## Operational concepts
* [Connection pooling]
* [Units]
* [Users]
* [Logs]
* [Connection pooling]

## Security and hardening
* [Security hardening guide][Security]
@@ -22,6 +23,7 @@ This section contains pages with more detailed explanations that provide additio

[Architecture]: /t/11857
[Interfaces and endpoints]: /t/10251
[Units]: /t/17525
[Users]: /t/10798
[Logs]: /t/12099
[Juju]: /t/11985
93 changes: 93 additions & 0 deletions docs/explanation/e-units.md
@@ -0,0 +1,93 @@
# PostgreSQL units

Each [HA](https://en.wikipedia.org/wiki/High_availability)/[DR](https://en.wikipedia.org/wiki/IT_disaster_recovery) implementation has a primary site and one or more secondary (standby) sites.
A Charmed PostgreSQL cluster can be [easily scaled](/t/11863) from 0 to 10 units ([contact us](/t/11863) for clusters with more than 10 units). A cluster of at least 3 units is recommended in production (due to [Raft consensus](https://en.wikipedia.org/wiki/Raft_(algorithm)) requirements). Each unit has one of the following types:
* **Primary**: the unit which accepts all writes and guarantees [no split brain](https://en.wikipedia.org/wiki/Split-brain_(computing)).
* **Sync Standby** (synchronous copy): designed for fast automatic failover. Used for read-only queries and guarantees that the latest transactions are available.
* **Replica** (asynchronous copy): designed for long-running and resource-consuming queries without affecting Primary performance. Used for read-only queries, with no guarantee that the latest transactions are available.

> **Warning**: every SQL transaction has to be confirmed by all Sync Standby units before the Primary unit confirms the commit to the client. Therefore, the balance between high performance and high availability is a trade-off determined by the number of "Sync Standby" and "Replica" units in the cluster.

> **Note**: starting from revision 561, all Charmed PostgreSQL units are configured as Sync Standby members. This provides better guarantees of data survival when two of three units are lost simultaneously. Users can reconfigure the number of synchronous units using the Juju config option [`synchronous_node_count`](https://charmhub.io/postgresql/configurations?channel=14/edge#synchronous_node_count).
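
The 3+ units recommendation follows from majority-based consensus: a Raft cluster of `n` voting members needs `n / 2 + 1` of them (integer division) to be reachable. The sketch below only illustrates that arithmetic; the function names `raft_quorum` and `raft_tolerated_failures` are hypothetical helpers and not part of the charm or Juju.

```shell
# Majority (quorum) size for a Raft cluster of n voting members.
raft_quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Number of members that can be lost while a majority is still reachable.
raft_tolerated_failures() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 1 2 3 5; do
  echo "units=$n quorum=$(raft_quorum "$n") tolerated_failures=$(raft_tolerated_failures "$n")"
done
```

With 1 or 2 units no failure can be tolerated, while 3 units tolerate the loss of one unit. This concerns consensus only; data durability additionally depends on the Sync Standby configuration described above.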

![PostgreSQL Units types|690x253, 100%](upload://pY5kzxO9ELJGEqEe1F1RQjOG6SS.png)

## Primary

The simplest way to find the Primary unit is to run `juju status`. Be aware that this information can be outdated, as it is only refreshed on the [Juju `update-status` event](https://documentation.ubuntu.com/juju/3.6/reference/hook/#update-status):
```shell
ubuntu@juju360:~$ juju status postgresql
Model Controller Cloud/Region Version SLA Timestamp
postgresql lxd localhost/localhost 3.6.5 unsupported 13:04:15+02:00

App Version Status Scale Charm Channel Rev Exposed Message
postgresql 14.15 active 3 postgresql 14/stable 553 no

Unit Workload Agent Machine Public address Ports Message
postgresql/0* active idle 0 10.189.210.53 5432/tcp Primary <<<<<<<<<<<<<<
postgresql/1 active idle 1 10.189.210.166 5432/tcp
postgresql/2 active idle 2 10.189.210.188 5432/tcp

Machine State Address Inst id Base AZ Message
0 started 10.189.210.53 juju-422c1a-0 ubuntu@22.04 Running
1 started 10.189.210.166 juju-422c1a-1 ubuntu@22.04 Running
2 started 10.189.210.188 juju-422c1a-2 ubuntu@22.04 Running
```

The up-to-date Primary unit name can be retrieved using the Juju action `get-primary`:
```shell
> juju run postgresql/leader get-primary
...
primary: postgresql/0
```
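
For scripting, the unit name can be pulled out of that output. The following is a minimal sketch that assumes the action output contains a `primary: <unit>` line exactly as shown above; `extract_primary` is a hypothetical helper name, and the real output format may differ between Juju versions.

```shell
# Print the unit named on the first "primary:" line read from stdin.
# Leading indentation is ignored by awk's default field splitting.
extract_primary() {
  awk '$1 == "primary:" { print $2; exit }'
}

# Sample input copied from the output above. In practice, pipe the real output:
#   juju run postgresql/leader get-primary | extract_primary
printf '%s\n' '...' 'primary: postgresql/0' | extract_primary
```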

This information can also be retrieved using [patronictl](/t/17406#p-37204-patronictl-3) and the [Patroni REST API](/t/17406#p-37204-patroni-rest-api-8).

## Standby / Replica

At the moment, this information can only be retrieved using [patronictl](/t/17406#p-37204-patronictl-3) and the [Patroni REST API](/t/17406#p-37204-patroni-rest-api-8) (check the linked documentation for access details). Example:
```shell
> ... patronictl ... list
+ Cluster: postgresql (7499430436963402504) ---+-----------+----+-----------+
| Member | Host | Role | State | TL | Lag in MB |
+--------------+----------------+--------------+-----------+----+-----------+
| postgresql-0 | 10.189.210.53 | Leader | running | 1 | |
| postgresql-1 | 10.189.210.166 | Sync Standby | streaming | 1 | 0 |
| postgresql-2 | 10.189.210.188 | Replica | streaming | 1 | 0 |
+--------------+----------------+--------------+-----------+----+-----------+
```
In the example above:
* `postgresql-0` is the PostgreSQL Primary unit (Patroni Leader), which accepts all writes.
* `postgresql-1` is a PostgreSQL/Patroni Sync Standby unit, which can be safely promoted to the new Primary using a manual switchover.
* `postgresql-2` is a PostgreSQL/Patroni Replica unit, which can NOT be directly promoted to a new Primary using a manual switchover. It must first be automatically promoted from Replica to Sync Standby, which guarantees that the latest SQL transactions are available on this unit, before it can be promoted to a new Primary. Otherwise, a manual failover to the Replica unit can be performed, accepting the risk of losing the last transaction(s) that lagged behind the Primary.
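
To pick switchover candidates in a script, the table can be filtered by role. This is a rough sketch that assumes the column layout shown in the example above (Member in the first column, Role in the third); `members_by_role` is a hypothetical helper, and other Patroni versions may print additional or reordered columns.

```shell
# Print the names of members whose Role column equals the given role.
# Reads a `patronictl list` table from stdin; border and title lines
# contain no "|" separators and are skipped by the field-count check.
members_by_role() {
  awk -F'|' -v role="$1" '
    NF >= 8 {
      name = $2; r = $4
      gsub(/^ +| +$/, "", name)   # trim padding around the member name
      gsub(/^ +| +$/, "", r)      # trim padding around the role
      if (r == role) print name
    }'
}

# Sample rows copied from the example above.
members_by_role "Sync Standby" <<'EOF'
| Member | Host | Role | State | TL | Lag in MB |
| postgresql-0 | 10.189.210.53 | Leader | running | 1 | |
| postgresql-1 | 10.189.210.166 | Sync Standby | streaming | 1 | 0 |
| postgresql-2 | 10.189.210.188 | Replica | streaming | 1 | 0 |
EOF
```

For the sample rows this prints `postgresql-1`, the only member that is safe to promote with a manual switchover.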

## Replica lag distance

At the moment, the replica lag can only be retrieved using [patronictl](/t/17406#p-37204-patronictl-3) and the [Patroni REST API](/t/17406#p-37204-patroni-rest-api-8) (check the linked documentation for access details). Example:
```shell
> ... patronictl ... list
+ Cluster: postgresql (7499430436963402504) ---+-----------+----+-----------+
| Member | Host | Role | State | TL | Lag in MB |
+--------------+----------------+--------------+-----------+----+-----------+
| postgresql-0 | 10.189.210.53 | Leader | running | 1 | |
| ...
| postgresql-2 | 10.189.210.188 | Replica | streaming | 1 | 42 | <<<<<
+--------------+----------------+--------------+-----------+----+-----------+

> curl ... x.x.x.x:8008/cluster | jq
"members": [
{
"name": "postgresql-0",
"role": "leader",
"state": "running",
...
},
...
{
"name": "postgresql-2",
"role": "replica",
"state": "streaming",
...
"lag": 42 <<<<<<<<<<<< Lag in MB
}
```
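
A simple lag check can be built on top of the `patronictl list` table. The sketch below assumes the same column layout as the example above, with "Lag in MB" as the sixth column; `lagging_members` and the threshold of 10 MB are hypothetical choices made for illustration only.

```shell
# Print "<member> <lag>" for every member whose "Lag in MB" exceeds the
# threshold given as the first argument. Reads the table from stdin.
lagging_members() {
  awk -F'|' -v max="$1" '
    NF >= 8 {
      name = $2; lag = $7
      gsub(/ /, "", name)
      gsub(/ /, "", lag)
      # The header row and the Leader row have no numeric lag and are skipped.
      if (lag ~ /^[0-9]+$/ && lag + 0 > max + 0) print name, lag
    }'
}

# Sample rows copied from the example above.
lagging_members 10 <<'EOF'
| Member | Host | Role | State | TL | Lag in MB |
| postgresql-0 | 10.189.210.53 | Leader | running | 1 | |
| postgresql-2 | 10.189.210.188 | Replica | streaming | 1 | 42 |
EOF
```

Parsing the table keeps the example free of extra tools; the `lag` field of the REST API `/cluster` response shown above carries the same value and may be more convenient when a JSON parser is available.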
2 changes: 1 addition & 1 deletion docs/explanation/e-users.md
@@ -1,4 +1,4 @@
# Charm Users explanations
# Users

There are three types of users in PostgreSQL:
* Internal users (used by charm operator)
25 changes: 17 additions & 8 deletions docs/how-to.md
@@ -13,10 +13,12 @@ Installation of different cloud services with Juju:
* [Azure]
* [Multi-availability zones (AZ)][Multi-AZ]

Specific deployment scenarios and architectures:
* [Terraform]
* [Air-gapped]
Other deployment scenarios and configurations:
* [TLS VIP access]
* [Juju spaces]
* [Air-gapped]
* [Terraform]
* [Juju storage]

## Usage and maintenance

@@ -25,6 +27,7 @@ Specific deployment scenarios and architectures:
* [Scale replicas]
* [Enable TLS]
* [Enable plugins/extensions]
* [Switchover/failover]

## Backup and restore
* [Configure S3 AWS]
@@ -36,9 +39,10 @@ Specific deployment scenarios and architectures:

## Monitoring (COS)

* [Enable monitoring]
* [Enable alert rules]
* [Enable tracing]
* [Enable monitoring] with Grafana
* [Enable alert rules] with Prometheus
* [Enable tracing] with Tempo
* [Enable profiling] with Parca

## Minor upgrades
* [Perform a minor upgrade]
@@ -69,13 +73,17 @@ This section is for charm developers looking to support PostgreSQL integrations
[GCE]: /t/15722
[Azure]: /t/15733
[Multi-AZ]: /t/15749
[TLS VIP access]: /t/16576
[Juju spaces]: /t/17416
[Terraform]: /t/14916
[Air-gapped]: /t/15746
[TLS VIP access]: /t/16576
[Juju storage]: /t/17529

[Integrate with another application]: /t/9687
[External access]: /t/15802
[Scale replicas]: /t/9689
[Enable TLS]: /t/9685
[Switchover/failover]: /t/17523

[Configure S3 AWS]: /t/9681
[Configure S3 RadosGW]: /t/10313
@@ -87,7 +95,8 @@ This section is for charm developers looking to support PostgreSQL integrations
[Enable monitoring]: /t/10600
[Enable alert rules]: /t/13084
[Enable tracing]: /t/14521

[Enable profiling]: /t/17172

[Perform a minor upgrade]: /t/12089
[Perform a minor rollback]: /t/12090

2 changes: 1 addition & 1 deletion docs/how-to/h-async-set-up.md
@@ -62,7 +62,7 @@ juju run -m rome db1/leader create-replication
To switchover and use `lisbon` as the primary instead, run

```shell
juju run -m lisbon db2/leader promote-to-primary
juju run -m lisbon db2/leader promote-to-primary scope=cluster
```

## Scale a cluster
65 changes: 65 additions & 0 deletions docs/how-to/h-deploy-juju-spaces.md
@@ -0,0 +1,65 @@
# Deploy on Juju spaces

The Charmed PostgreSQL operator supports [Juju spaces](https://documentation.ubuntu.com/juju/latest/reference/space/index.html) to separate network traffic for:
- **Client** - PostgreSQL instance to client data
- **Instance-replication** - cluster instances replication data
- **Cluster-replication** - cluster to cluster replication data
- **Backup** - backup and restore data

## Prerequisites

* **Charmed PostgreSQL 16**
* Configured network spaces
* See [Juju | How to manage network spaces](https://documentation.ubuntu.com/juju/latest/reference/juju-cli/list-of-juju-cli-commands/add-space/)

## Deploy

On application deployment, constraints are required to ensure the unit(s) have address(es) on the specified network space(s), and endpoint binding(s) for the space(s).

For example, with spaces configured for instance replication and client traffic:
```shell
❯ juju spaces
Name Space ID Subnets
alpha 0 10.163.154.0/24
client 1 10.0.0.0/24
peers 2 10.10.10.0/24
```

The space `alpha` is the default space and cannot be removed. To deploy the Charmed PostgreSQL operator using the spaces:
```shell
juju deploy postgresql --channel 16/edge \
--constraints spaces=client,peers \
--bind "database-peers=peers database=client"
```

[note type=caution]
The `juju bind` command is currently not supported. Network space bindings must be defined at deploy time.
[/note]

Consequently, a client application must use the `client` space on the model, or a space for the same subnet in another model, for example:
```shell
juju deploy client-app \
--constraints spaces=client \
--bind database=client
```

The two applications can then be related using:
```shell
juju integrate postgresql:database client-app:database
```

The client application will receive network endpoints on the `10.0.0.0/24` subnet.

The Charmed PostgreSQL operator endpoints are:

| Endpoint | Traffic |
| ------------------------------ | -------------------- |
| database | Client |
| database-peers | Instance-replication |
| replication-offer, replication | Cluster-replication |
| s3-parameters | Backup |


[note]
If using a network space for the backup traffic, the user is responsible for ensuring that the target object storage URL traffic is routed via the specified network space.
[/note]
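
Since every bound space also has to appear in the `spaces` constraint, the constraint can be derived from the bindings to keep the two in sync. The following is a minimal dry-run sketch: `spaces_constraint` is a hypothetical helper, the bindings are the ones used in the deploy example above, and the command is only printed, not executed.

```shell
# Build the comma-separated, de-duplicated space list for --constraints
# from "endpoint=space" pairs passed as arguments.
spaces_constraint() {
  printf '%s\n' "$@" | cut -d= -f2 | sort -u | paste -sd, -
}

bindings="database=client database-peers=peers"

# Dry run: print the deploy command instead of running it.
# $bindings is intentionally unquoted so that each pair becomes one argument.
echo juju deploy postgresql --channel 16/edge \
  --constraints "spaces=$(spaces_constraint $bindings)" \
  --bind "\"$bindings\""
```

Additional pairs for the other endpoints in the table above (for example `s3-parameters` bound to a dedicated backup space) can be appended to `bindings` in the same way.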