update ssl & adds-on #2261

Merged
merged 5 commits on Sep 21, 2023
Changes from 3 commits
@@ -41,7 +41,7 @@ NebulaGraph Operator does not support the v1.x version of NebulaGraph. NebulaGra

| NebulaGraph | NebulaGraph Operator |
| ------------- | -------------------- |
| 3.5.x | 1.5.0, 1.6.0 |
| 3.5.x | 1.5.0, 1.6.1 |
| 3.0.0 ~ 3.4.1 | 1.3.0, 1.4.0 ~ 1.4.2 |
| 3.0.0 ~ 3.3.x | 1.0.0, 1.1.0, 1.2.0 |
| 2.5.x ~ 2.6.x | 0.9.0 |
@@ -108,72 +108,6 @@ The following example shows how to create a NebulaGraph cluster by creating a cl
=== "Cluster with Zones"

NebulaGraph Operator supports creating a cluster with [Zones](../../4.deployment-and-installation/5.zone.md).

You must set the following parameters for creating a cluster with Zones. Other parameters can be changed as needed. For more information on other parameters, see the [sample configuration](https://github.com/vesoft-inc/nebula-operator/blob/v{{operator.release}}/config/samples/apps_v1alpha1_nebulacluster.yaml).

| Parameter | Default value | Description |
| :---- | :--- | :--- |
| `spec.metad.licenseManagerURL` | - | Configure the URL that points to the LM, which consists of the access address and port number (default port `9119`) of the LM. For example, `192.168.8.100:9119`. **You must configure this parameter in order to obtain the license information; otherwise, the enterprise edition cluster cannot be used.** |
|`spec.<graphd|metad|storaged>.image`|-|The container image of the Graph, Meta, or Storage service of the enterprise edition.|
|`spec.imagePullSecrets`| - |Specifies the Secret for pulling the NebulaGraph Enterprise service images from a private repository.|
|`spec.alpineImage`|`reg.vesoft-inc.com/nebula-alpine:latest`|The Alpine Linux image, used to obtain the Zone information where nodes are located.|
|`spec.metad.config.zone_list`|-|A comma-separated list of Zone names, for example: `zone1,zone2,zone3`. <br/>**Zone names CANNOT be modified once set.**|
|`spec.graphd.config.prioritize_intra_zone_reading`|`false`|Specifies whether to prioritize sending queries to the storage nodes in the same Zone.<br/>When set to `true`, queries are sent to the storage nodes in the same Zone. If reading fails in that Zone, `stick_to_intra_zone_on_failure` determines whether to read the leader partition replica data from other Zones. |
|`spec.graphd.config.stick_to_intra_zone_on_failure`|`false`|Specifies whether to stick to intra-zone routing if unable to find the requested partitions in the same zone. When set to `true`, if unable to find the partition replica in that Zone, it does not read data from other Zones.|

???+ note "Learn more about Zones in NebulaGraph Operator"

**Understanding NebulaGraph's Zone Feature**

NebulaGraph uses a feature called Zones to efficiently manage its distributed architecture. Each Zone represents a logical grouping of Storage pods and Graph pods, responsible for storing the complete graph space data. The data within NebulaGraph's spaces is partitioned, and replicas of these partitions are evenly distributed across all available Zones. Using Zones can significantly reduce inter-Zone network traffic costs and boost data transfer speeds. Moreover, intra-Zone reading increases availability, because replicas of a partition are spread across different Zones.

**Configuring NebulaGraph Zones**

To make the most of the Zone feature, you first need to determine the actual Zone where your cluster nodes are located. Typically, nodes deployed on cloud platforms are labeled with their respective Zones. Once you have this information, you can configure it in your cluster's configuration file by setting the `spec.metad.config.zone_list` parameter. This parameter should be a list of Zone names, separated by commas, and should match the actual Zone names where your nodes are located. For example, if your nodes are in Zones `az1`, `az2`, and `az3`, your configuration would look like this:

```yaml
spec:
  metad:
    config:
      zone_list: az1,az2,az3
```

**Operator's Use of Zone Information**

NebulaGraph Operator leverages Kubernetes' [TopologySpread](https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/) feature to manage the scheduling of Storage and Graph pods. Once the `zone_list` is configured, Storage services are automatically assigned to their respective Zones based on the `topology.kubernetes.io/zone` label.

For intra-zone data access, the Graph service dynamically assigns itself to a Zone using the `--assigned_zone=$NODE_ZONE` parameter. It identifies the Zone name of the node where the Graph service resides by using an init-container to fetch this information. The Alpine Linux image specified in `spec.alpineImage` (default: `reg.vesoft-inc.com/nebula-alpine:latest`) is used by the init-container to obtain the Zone information.

**Prioritizing Intra-Zone Data Access**

By setting `spec.graphd.config.prioritize_intra_zone_reading` to `true` in the cluster configuration file, you enable the Graph service to prioritize sending queries to Storage services within the same Zone. In the event of a read failure within that Zone, the behavior depends on the value of `spec.graphd.config.stick_to_intra_zone_on_failure`. If set to `true`, the Graph service avoids reading data from other Zones and returns an error. Otherwise, it reads data from leader partition replicas in other Zones.

```yaml
spec:
  alpineImage: reg.vesoft-inc.com/cloud-dev/nebula-alpine:latest
  graphd:
    config:
      prioritize_intra_zone_reading: "true"
      stick_to_intra_zone_on_failure: "false"
```

**Zone Mapping for Resilience**

Once Storage and Graph services are assigned to Zones, the mapping between the pod and its corresponding Zone is stored in a configmap named `<cluster_name>-graphd|storaged-zone`. This mapping facilitates pod scheduling during rolling updates and pod restarts, ensuring that services return to their original Zones as needed.

!!! caution

DO NOT manually modify the configmaps created by NebulaGraph Operator. Doing so may cause unexpected behavior.


Other optional parameters for the enterprise edition are as follows:

| Parameter | Default value | Description |
| :---- | :--- | :--- |
|`spec.storaged.enableAutoBalance`| `false`| Specifies whether to enable automatic data balancing. For more information, see [Balance storage data after scaling out](../8.custom-cluster-configurations/8.3.balance-data-when-scaling-storage.md).|
|`spec.enableBR`|`false`|Specifies whether to enable the BR tool. For more information, see [Backup and restore](../10.backup-restore-using-operator.md).|
|`spec.graphd.enable_graph_ssl`|`false`| Specifies whether to enable SSL for the Graph service. For more details, see [Enable mTLS](../8.custom-cluster-configurations/8.5.enable-ssl.md). |


??? info "Expand to view sample cluster configurations"

@@ -184,90 +118,34 @@ The following example shows how to create a NebulaGraph cluster by creating a cl
  name: nebula
  namespace: default
spec:
  # Used to obtain the Zone information where nodes are located.
  alpineImage: "reg.vesoft-inc.com/cloud-dev/nebula-alpine:latest"
  # Used for backup and recovery as well as log cleanup functions.
  # If you do not customize this configuration,
  # the default configuration will be used.
  agent:
    image: reg.vesoft-inc.com/cloud-dev/nebula-agent
    version: v3.6.0-sc
  exporter:
    image: vesoft/nebula-stats-exporter
    replicas: 1
    maxRequests: 20
  # Used to create a console container,
  # which is used to connect to the NebulaGraph cluster.
  console:
    version: "nightly"
  graphd:
    config:
      # The following parameters are required for creating a cluster with Zones.
      accept_partial_success: "true"
      ca_client_path: certs/root.crt
      ca_path: certs/root.crt
      cert_path: certs/server.crt
      key_path: certs/server.key
      enable_graph_ssl: "true"
      prioritize_intra_zone_reading: "true"
      sync_meta_when_use_space: "true"
      stick_to_intra_zone_on_failure: "false"
      session_reclaim_interval_secs: "300"
      # The following parameters are required for collecting logs.
      logtostderr: "1"
      redirect_stdout: "false"
      stderrthreshold: "0"
    initContainers:
      - name: init-auth-sidecar
        imagePullPolicy: IfNotPresent
        image: 496756745489.dkr.ecr.us-east-1.amazonaws.com/auth-sidecar:v1.60.0
        env:
          - name: AUTH_SIDECAR_CONFIG_FILENAME
            value: sidecar-init
        volumeMounts:
          - name: credentials
            mountPath: /credentials
          - name: auth-sidecar-config
            mountPath: /etc/config
    sidecarContainers:
      - name: auth-sidecar
        image: 496756745489.dkr.ecr.us-east-1.amazonaws.com/auth-sidecar:v1.60.0
        imagePullPolicy: IfNotPresent
        resources:
          requests:
            cpu: 100m
            memory: 500Mi
        env:
          - name: LOCAL_POD_IP
            valueFrom:
              fieldRef:
                fieldPath: status.podIP
          - name: LOCAL_POD_NAME
            valueFrom:
              fieldRef:
                fieldPath: metadata.name
          - name: LOCAL_POD_NAMESPACE
            valueFrom:
              fieldRef:
                fieldPath: metadata.namespace
        readinessProbe:
          httpGet:
            path: /ready
            port: 8086
          initialDelaySeconds: 5
          periodSeconds: 10
          successThreshold: 1
          failureThreshold: 3
        livenessProbe:
          httpGet:
            path: /live
            port: 8086
          initialDelaySeconds: 5
          periodSeconds: 10
          successThreshold: 1
          failureThreshold: 3
        volumeMounts:
          - name: credentials
            mountPath: /credentials
          - name: auth-sidecar-config
            mountPath: /etc/config
    volumes:
      - name: credentials
        emptyDir:
          medium: Memory
    volumeMounts:
      - name: credentials
        mountPath: /usr/local/nebula/certs
    resources:
      requests:
        cpu: "2"
@@ -286,6 +164,8 @@ The following example shows how to create a NebulaGraph cluster by creating a cl
      # Zone names CANNOT be modified once set.
      # It's suggested to set an odd number of Zones.
      zone_list: az1,az2,az3
      validate_session_timestamp: "false"
    # LM access address and port number.
    licenseManagerURL: "192.168.8.xxx:9119"
    resources:
      requests:
@@ -332,11 +212,83 @@ The following example shows how to create a NebulaGraph cluster by creating a cl
  imagePullPolicy: Always
  imagePullSecrets:
    - name: nebula-image
  # Evenly distribute storage Pods across Zones.
  # Must be set when using Zones.
  topologySpreadConstraints:
    - topologyKey: "topology.kubernetes.io/zone"
      whenUnsatisfiable: "DoNotSchedule"
```

!!! caution

Before ingesting data, run `SHOW ZONES` in nebula-console to make sure that storage Pods are evenly distributed across Zones. For Zone-related commands, see [Zones](../../4.deployment-and-installation/5.zone.md).
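
As a complementary check on the Kubernetes side, you can compare the node placement of the storage Pods with the node-to-Zone mapping. The following is a minimal sketch; it assumes the Operator's default Pod naming (Pod names containing `storaged`), which may differ in your deployment:

```bash
# List storage Pods together with the node each one is scheduled on,
# then compare the result with the output of SHOW ZONES in nebula-console.
kubectl get pods -o wide | grep storaged
```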

You must set the following parameters for creating a cluster with Zones. Other parameters can be changed as needed.

| Parameter | Default value | Description |
| :---- | :--- | :--- |
| `spec.metad.licenseManagerURL` | - | Configure the URL that points to the LM, which consists of the access address and port number (default port `9119`) of the LM. For example, `192.168.8.100:9119`. **You must configure this parameter in order to obtain the license information; otherwise, the enterprise edition cluster cannot be used.** |
|`spec.<graphd|metad|storaged>.image`|-|The container image of the Graph, Meta, or Storage service of the enterprise edition.|
|`spec.imagePullSecrets`| - |Specifies the Secret for pulling the NebulaGraph Enterprise service images from a private repository.|
|`spec.alpineImage`|`reg.vesoft-inc.com/nebula-alpine:latest`|The Alpine Linux image, used to obtain the Zone information where nodes are located.|
|`spec.metad.config.zone_list`|-|A comma-separated list of Zone names, for example: `zone1,zone2,zone3`. <br/>**Zone names CANNOT be modified once set.**|
|`spec.graphd.config.prioritize_intra_zone_reading`|`false`|Specifies whether to prioritize sending queries to the storage Pods in the same Zone.<br/>When set to `true`, queries are sent to the storage Pods in the same Zone. If reading fails in that Zone, `stick_to_intra_zone_on_failure` determines whether to read the leader partition replica data from other Zones. |
|`spec.graphd.config.stick_to_intra_zone_on_failure`|`false`|Specifies whether to stick to intra-zone routing if unable to find the requested partitions in the same zone. When set to `true`, if unable to find the partition replica in that Zone, it does not read data from other Zones.|
|`spec.topologySpreadConstraints[0].topologyKey`| - | A Kubernetes field that controls how storage Pods are distributed, ensuring that they are evenly spread across Zones. <br/>To use the Zone feature, you must set the value to `topology.kubernetes.io/zone`. Run `kubectl get node --show-labels` to check whether your nodes carry this key. For more information, see [TopologySpread](https://kubernetes.io/docs/concepts/scheduling-eviction/topology-spread-constraints/#example-multiple-topologyspreadconstraints).|
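
To verify that your nodes carry the `topology.kubernetes.io/zone` label, and to collect the Zone names to use in `spec.metad.config.zone_list`, a quick check could look like the following sketch (the label values depend on your cloud provider):

```bash
# Print every node with its Zone label value in an extra column.
kubectl get nodes -L topology.kubernetes.io/zone

# Alternatively, dump all labels and filter for the zone key.
kubectl get node --show-labels | grep "topology.kubernetes.io/zone"
```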

???+ note "Learn more about Zones in NebulaGraph Operator"

**Understanding NebulaGraph's Zone Feature**

NebulaGraph uses a feature called Zones to efficiently manage its distributed architecture. Each Zone represents a logical grouping of Storage pods and Graph pods, responsible for storing the complete graph space data. The data within NebulaGraph's spaces is partitioned, and replicas of these partitions are evenly distributed across all available Zones. Using Zones can significantly reduce inter-Zone network traffic costs and boost data transfer speeds. Moreover, intra-Zone reading increases availability, because replicas of a partition are spread across different Zones.

**Configuring NebulaGraph Zones**

To make the most of the Zone feature, you first need to determine the actual Zone where your cluster nodes are located. Typically, nodes deployed on cloud platforms are labeled with their respective Zones. Once you have this information, you can configure it in your cluster's configuration file by setting the `spec.metad.config.zone_list` parameter. This parameter should be a list of Zone names, separated by commas, and should match the actual Zone names where your nodes are located. For example, if your nodes are in Zones `az1`, `az2`, and `az3`, your configuration would look like this:

```yaml
spec:
  metad:
    config:
      zone_list: az1,az2,az3
```

**Operator's Use of Zone Information**

NebulaGraph Operator leverages Kubernetes' [TopologySpread](https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/) feature to manage the scheduling of Storage and Graph pods. Once the `zone_list` is configured, Storage services are automatically assigned to their respective Zones based on the `topology.kubernetes.io/zone` label.

For intra-zone data access, the Graph service dynamically assigns itself to a Zone using the `--assigned_zone=$NODE_ZONE` parameter. It identifies the Zone name of the node where the Graph service resides by using an init-container to fetch this information. The Alpine Linux image specified in `spec.alpineImage` (default: `reg.vesoft-inc.com/nebula-alpine:latest`) is used by the init-container to obtain the Zone information.

**Prioritizing Intra-Zone Data Access**

By setting `spec.graphd.config.prioritize_intra_zone_reading` to `true` in the cluster configuration file, you enable the Graph service to prioritize sending queries to Storage services within the same Zone. In the event of a read failure within that Zone, the behavior depends on the value of `spec.graphd.config.stick_to_intra_zone_on_failure`. If set to `true`, the Graph service avoids reading data from other Zones and returns an error. Otherwise, it reads data from leader partition replicas in other Zones.

```yaml
spec:
  alpineImage: reg.vesoft-inc.com/cloud-dev/nebula-alpine:latest
  graphd:
    config:
      prioritize_intra_zone_reading: "true"
      stick_to_intra_zone_on_failure: "false"
```

**Zone Mapping for Resilience**

Once Storage and Graph services are assigned to Zones, the mapping between the pod and its corresponding Zone is stored in a configmap named `<cluster_name>-graphd|storaged-zone`. This mapping facilitates pod scheduling during rolling updates and pod restarts, ensuring that services return to their original Zones as needed.
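
If you want to view this mapping, the ConfigMaps can be inspected without modifying them. The following sketch assumes the sample cluster above, named `nebula` in the `default` namespace:

```bash
# List the zone-mapping ConfigMaps created by the Operator.
kubectl -n default get configmaps | grep zone

# Inspect the pod-to-Zone mapping for the Graph service.
kubectl -n default get configmap nebula-graphd-zone -o yaml
```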

!!! caution

DO NOT manually modify the configmaps created by NebulaGraph Operator. Doing so may cause unexpected behavior.


Other optional parameters for the enterprise edition are as follows:

| Parameter | Default value | Description |
| :---- | :--- | :--- |
|`spec.storaged.enableAutoBalance`| `false`| Specifies whether to enable automatic data balancing. For more information, see [Balance storage data after scaling out](../8.custom-cluster-configurations/8.3.balance-data-when-scaling-storage.md).|
|`spec.enableBR`|`false`|Specifies whether to enable the BR tool. For more information, see [Backup and restore](../10.backup-restore-using-operator.md).|
|`spec.graphd.enable_graph_ssl`|`false`| Specifies whether to enable SSL for the Graph service. For more details, see [Enable mTLS](../8.custom-cluster-configurations/8.5.enable-ssl.md). |

{{ ent.ent_end }}

1. Create a NebulaGraph cluster.
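
    For example, a minimal sketch, assuming the cluster manifest above is saved as `nebulacluster.yaml` (the file name is an arbitrary choice):

    ```bash
    # Apply the NebulaCluster manifest to create the cluster.
    kubectl create -f nebulacluster.yaml
    ```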