From 59f22bdd05f506df733aa07558f6a6639ba37f2f Mon Sep 17 00:00:00 2001
From: Abby <78209557+abby-cyber@users.noreply.github.com>
Date: Thu, 16 Mar 2023 17:25:00 +0800
Subject: [PATCH 1/2] config-rolling-update-strategy-for-operator

---
 .../11.rolling-update-strategy.md | 37 +++++++++++++++++++
 .../9.upgrade-nebula-cluster.md   |  4 ++
 mkdocs.yml                        |  4 +-
 3 files changed, 44 insertions(+), 1 deletion(-)
 create mode 100644 docs-2.0/nebula-operator/11.rolling-update-strategy.md

diff --git a/docs-2.0/nebula-operator/11.rolling-update-strategy.md b/docs-2.0/nebula-operator/11.rolling-update-strategy.md
new file mode 100644
index 00000000000..e248ea86704
--- /dev/null
+++ b/docs-2.0/nebula-operator/11.rolling-update-strategy.md
@@ -0,0 +1,37 @@
+# NebulaGraph cluster rolling update strategy
+
+NebulaGraph clusters use a distributed architecture to divide data into multiple logical partitions, which are typically evenly distributed across different nodes. In distributed systems, there are usually multiple replicas of the same data. To keep the data consistent across these replicas, NebulaGraph clusters use the Raft protocol to synchronize the partition replicas. In the Raft protocol, each partition elects a leader replica, which is responsible for handling write requests, while follower replicas handle read requests.
+
+When a NebulaGraph cluster created by NebulaGraph Operator performs a rolling update, a storage node temporarily stops providing services while it is being updated. For an overview of rolling updates, see [Performing a Rolling Update](https://kubernetes.io/docs/tutorials/kubernetes-basics/update/update-intro/). If the node hosting the leader replica stops providing services, read and write operations for that partition become unavailable. To avoid this situation, Operator by default migrates the leader replicas to other unaffected nodes during the rolling update of a NebulaGraph cluster. This way, when a storage node is being updated, the leader replicas on other nodes can continue processing client requests, ensuring the read and write availability of the cluster.
+
+Migrating all leader replicas from one storage node to other nodes may take a long time. To better control the rolling update duration, Operator provides the `enableForceUpdate` field. Once you have confirmed that there is no external access traffic, you can set this field to `true`. This way, the leader replicas will not be migrated to other nodes, thereby speeding up the rolling update process.
+
+## Rolling update trigger conditions
+
+Operator triggers a rolling update of the NebulaGraph cluster under the following circumstances:
+
+- The version of the NebulaGraph cluster changes.
+- The configuration of the NebulaGraph cluster changes.
+
+## Specify a rolling update strategy
+
+In the YAML file for creating a cluster instance, add the `spec.storaged.enableForceUpdate` field and set it to `true` or `false` to control the rolling update speed.
+
+When `enableForceUpdate` is set to `true`, it means that the partition leader replicas will not be migrated, thus speeding up the rolling update process. Conversely, when set to `false`, the leader replicas are migrated to other nodes to ensure the read and write availability of the cluster during the update. The default value is `false`.
+
+!!! caution
+
+    When setting `enableForceUpdate` to `true`, make sure there is no external access traffic.
+
+Configuration example:
+
+```yaml
+...
+spec:
+...
+  storaged:
+    enableForceUpdate: true # When set to true, it speeds up the rolling update process.
+  ...
+```
+
+
diff --git a/docs-2.0/nebula-operator/9.upgrade-nebula-cluster.md b/docs-2.0/nebula-operator/9.upgrade-nebula-cluster.md
index ac697b3aff1..81c52b76d84 100644
--- a/docs-2.0/nebula-operator/9.upgrade-nebula-cluster.md
+++ b/docs-2.0/nebula-operator/9.upgrade-nebula-cluster.md
@@ -204,3 +204,7 @@ You have created a NebulaGraph cluster with Helm. For details, see [Create a Neb
       1 vesoft/nebula-metad:{{nebula.tag}}
       3 vesoft/nebula-storaged:{{nebula.tag}}
     ```
+
+## Accelerate the upgrade process
+
+Upgrading a cluster is a rolling update process and can be time-consuming because the leader partition replicas in the Storage service are transferred to other nodes during the update. You can configure the `enableForceUpdate` field in the cluster instance's YAML file to skip the leader partition replica transfer operation, thereby accelerating the upgrade process. For more information, see [Specify a rolling update strategy](11.rolling-update-strategy.md).
\ No newline at end of file
diff --git a/mkdocs.yml b/mkdocs.yml
index 8ce2810b6a1..70899cdee0e 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -703,13 +703,15 @@ nav:
     - Deploy clusters:
       - Deploy clusters with Kubectl: nebula-operator/3.deploy-nebula-graph-cluster/3.1create-cluster-with-kubectl.md
      - Deploy clusters with Helm: nebula-operator/3.deploy-nebula-graph-cluster/3.2create-cluster-with-helm.md
+    - Connect to NebulaGraph databases: nebula-operator/4.connect-to-nebula-graph-service.md
     - Configure clusters:
       - Custom configuration parameters for a NebulaGraph cluster: nebula-operator/8.custom-cluster-configurations/8.1.custom-conf-parameter.md
       - Reclaim PVs: nebula-operator/8.custom-cluster-configurations/8.2.pv-reclaim.md
 #ent - Balance storage data after scaling out: nebula-operator/8.custom-cluster-configurations/8.3.balance-data-when-scaling-storage.md
     - Upgrade NebulaGraph clusters: nebula-operator/9.upgrade-nebula-cluster.md
-    - Connect to NebulaGraph databases: nebula-operator/4.connect-to-nebula-graph-service.md
+    - Specify a rolling update strategy: nebula-operator/11.rolling-update-strategy.md
+#ent - Backup and restore: nebula-operator/10.backup-restore-using-operator.md
     - Self-healing: nebula-operator/5.operator-failover.md
     - FAQ: nebula-operator/7.operator-faq.md

From fb182c062a5181dd581c9452ef5fd6662da97ba4 Mon Sep 17 00:00:00 2001
From: Abby <78209557+abby-cyber@users.noreply.github.com>
Date: Fri, 17 Mar 2023 13:48:25 +0800
Subject: [PATCH 2/2] Update 11.rolling-update-strategy.md

---
 docs-2.0/nebula-operator/11.rolling-update-strategy.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs-2.0/nebula-operator/11.rolling-update-strategy.md b/docs-2.0/nebula-operator/11.rolling-update-strategy.md
index e248ea86704..8dea20c348c 100644
--- a/docs-2.0/nebula-operator/11.rolling-update-strategy.md
+++ b/docs-2.0/nebula-operator/11.rolling-update-strategy.md
@@ -21,7 +21,7 @@ When `enableForceUpdate` is set to `true`, it means that the partition leader re
 
 !!! caution
 
-    When setting `enableForceUpdate` to `true`, make sure there is no external access traffic.
+    When setting `enableForceUpdate` to `true`, make sure no read or write traffic is entering the cluster. This is because this setting forces the cluster Pods to be rebuilt, and data loss or client request failures may occur during this process.
 
 Configuration example:
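
For reference, below is a fuller sketch of where `enableForceUpdate` sits in a NebulaCluster manifest. The `apiVersion`, cluster name, and replica counts are illustrative assumptions rather than values taken from the documentation above; keep the values from your own cluster manifest.

```yaml
# A minimal sketch, assuming the NebulaCluster CRD (apps.nebula-graph.io/v1alpha1)
# installed by NebulaGraph Operator. Other required fields (image, version,
# resources, and so on) are omitted for brevity.
apiVersion: apps.nebula-graph.io/v1alpha1
kind: NebulaCluster
metadata:
  name: nebula                # assumed cluster name
spec:
  graphd:
    replicas: 1
  metad:
    replicas: 1
  storaged:
    replicas: 3
    enableForceUpdate: true   # skip leader replica migration to speed up the rolling update
```

Applying the updated manifest with `kubectl apply -f <manifest>.yaml` puts the setting in place before the next version or configuration change triggers a rolling update.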