
Operator Horizontal Scale (#139)
* Add horizontal scale feature
* New design and architecture to support horizontal scale and recovery flows
NataliAharoniPayu authored Jul 10, 2022
1 parent a1a57c9 commit 7c6427c
Showing 39 changed files with 4,074 additions and 1,347 deletions.
2 changes: 2 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -25,6 +25,8 @@ bin
*.swp
*.swo
*~
.metals*
.vscode*
.vscode/*
!.vscode/settings.json
!.vscode/tasks.json
11 changes: 7 additions & 4 deletions Dockerfile
@@ -1,5 +1,5 @@
# Build the manager binary
FROM golang:1.15 as builder
FROM golang:1.16 as builder

ARG DEBIAN_FRONTEND=noninteractive

@@ -10,6 +10,7 @@ RUN apt-get update \
&& apt-get install -y curl

# install redis cli

RUN cd /tmp &&\
curl http://download.redis.io/redis-stable.tar.gz | tar xz &&\
make -C redis-stable &&\
@@ -28,18 +29,20 @@ COPY main.go main.go
COPY api/ api/
COPY controllers/ controllers/
COPY server/ server/
COPY data/ data/

# Build
RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 GO111MODULE=on go build -a -o manager main.go

# Use distroless as minimal base image to package the manager binary
# Refer to https://github.com/GoogleContainerTools/distroless for more details
FROM gcr.io/distroless/base-debian10
FROM gcr.io/distroless/base-debian11
WORKDIR /
COPY --from=builder /workspace/manager .
COPY --from=builder /bin/redis-cli .
COPY --from=builder /bin ./bin
COPY --from=builder /lib ./lib
COPY --from=builder /usr/bin/yes .
USER nonroot:nonroot
ENV PATH="./:${PATH}"

ENTRYPOINT ["/manager"]
ENTRYPOINT ["/manager"]
38 changes: 38 additions & 0 deletions README.md
@@ -14,11 +14,22 @@ Requirements:
* `kustomize` >= 4.0
* `docker`: latest version, at least 6.25 GB of memory limit

**Quick Start**

```bash
sh ./hack/install.sh # you might need to run this as sudo if a regular user can't use docker
cd ./hack/redis-bin && sh ./run.sh # only needed on the first run on the local machine
tilt up
```

**Set up on a non-local environment**

**1. Setting up a cluster**

```bash
cd hack
sh ./install.sh # you might need to run this as sudo if a regular user can't use docker
cd ./redis-bin && sh ./run.sh # only needed on the first run on the local machine
```

If the `.kube/config` file was not updated it can be populated using
@@ -82,6 +93,33 @@ Run non cache `go test` command on specific path. for example:
go test -count=1 ./controllers/rediscli/
```

### Use the test cluster feature

The test cluster feature is a set of tests that run asynchronously to the operator manager loop. They simulate:
* Loss of a random follower in the cluster
* Loss of a random leader in the cluster
* Loss of a random follower and a random leader (ones that own different slot ranges)
* Loss of all followers
* Loss of all nodes except one randomly chosen replica per slot range - sometimes the survivor is a follower and sometimes a leader (an actual example scenario is the loss of all AZs except one)
* Loss of a leader and all of its followers

The test can fill the cluster nodes with mock data during the simulated outage, keeping track of which and how many of the inserted keys were written successfully. At the end of the recovery process it attempts to read the keys back from the cluster and match each value against the value recorded during the tracked write process.

The report reflects:
* Whether the recovery process succeeded, with a healthy and ready cluster, before the test timeout expired (a configurable estimated value)
* How many writes succeeded, relative to the number of keys the test attempted to insert
* How many reads succeeded, relative to the number of successful writes
* Both values are given as an actual count and as a success rate
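
The reporting code itself is not shown in this diff; as a rough sketch (the type and field names below are hypothetical illustrations, not taken from the repository), the two success rates could be computed like this:

```go
package main

import "fmt"

// report is a hypothetical summary of one test-cluster run: how many keys
// the test attempted to write, how many writes succeeded, and how many of
// those keys were read back with the expected value after recovery.
type report struct {
	attempted int
	written   int
	read      int
}

// rates computes the two ratios described above: writes relative to
// attempted keys, and reads relative to successful writes.
func (r report) rates() (writeRate, readRate float64) {
	if r.attempted > 0 {
		writeRate = float64(r.written) / float64(r.attempted)
	}
	if r.written > 0 {
		readRate = float64(r.read) / float64(r.written)
	}
	return writeRate, readRate
}

func main() {
	r := report{attempted: 1000, written: 950, read: 931}
	w, rd := r.rates()
	// prints both the actual amounts and the success rates
	fmt.Printf("writes: %d/%d (%.1f%%), reads: %d/%d (%.1f%%)\n",
		r.written, r.attempted, w*100, r.read, r.written, rd*100)
}
```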

Run the test:
* Port-forward the manager to a local port (8080 for example)
* ```curl -X POST localhost:<forwarded port, e.g. 8080>/test``` (no mock data)
* ```curl -X POST localhost:<forwarded port, e.g. 8080>/testData``` (with mock data)

Note:
Running the test lab with mock data is considered a sensitive operation and is not allowed by default.
To enable it, the config param 'ExposeSensitiveEntryPoints' needs to be set to 'true' (please follow the config file documentation regarding this param before doing so).
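
For illustration, enabling the sensitive entry points in a development environment amounts to flipping the flag inside the `setters` block of `config/configfiles/operator.conf` added in this commit (keep it `false` anywhere near production data):

```yaml
# config/configfiles/operator.conf - development environments only
setters:
  ExposeSensitiveEntryPoints: true   # default is false; sensitive endpoints stay hidden
```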

### Development using Tilt

The recommended development flow is based on [Tilt](https://tilt.dev/) - it is used for quick iteration on code running in live containers.
134 changes: 81 additions & 53 deletions config/configfiles/operator.conf
@@ -1,31 +1,44 @@


# The setters value defines a set of indicators that tell the operator whether to apply or avoid a behaviour that would otherwise be practiced by default, based on the boolean value of the related indicator

# The operator exposes a set of entry points that can serve the user in case of need. Some of them are sensitive and hold the potential to harm existing data if the operator is deployed in production.
# The same entry points can serve the user for debugging, validation and testing if the operator is deployed in a development environment.
# The following indicator serves as a feature bit that tells the operator to hide those sensitive entry points in order to avoid harm in sensitive environments; by default it is set to 'false' (recommended).
# ExposeSensitiveEntryPoints

# The thresholds value sets hard boundaries for the operator when it runs concurrent operations
# and when it makes decisions based on the stated values

# During new node initialization, a request for data replication is sent, and each new node is sampled and watched until the sync match threshold is reached,
# in order to make sure the sync process is performed properly
# SyncMatchThreshold

# During the recovery process, missing pods are recreated asynchronously.
# This value sets the maximum number of unhealthy nodes the operator will recover at once per reconcile loop
# MaxToleratedPodsRecoverAtOnce

# During the update process, pods are failed over, removed, and recreated so that the new ones
# match the newly requested spec.
# This value sets the maximum number of nodes to be deleted at once per update loop
# MaxToleratedPodsUpdateAtOnce

# The wait times are defined by an interval value - how often the check is done
# and a timeout value, total amount of time to wait before considering the
# operation failed.

# Wait duration for the SYNC operation start. After a new node connects to a leader
# there can be a delay before the sync operation starts.
# SyncStartCheckInterval
# SyncStartCheckTimeout

# Wait duration for the SYNC operation.
# Wait values for the SYNC operation.
# SyncCheckInterval
# SyncCheckTimeout

# Wait duration for the LOAD operation start.
# LoadStartCheckInterval
# LoadStartCheckTimeout

# Wait duration for the LOAD operation. This time should be set reasonably high
# because it depends on the size of the DB shards and network latency. Make sure
# the time is high enough to allow for the data transfer between two nodes.
# The LOAD and SYNC operations are important during the recreation of a lost
# node, when the data from a leader is loaded on a replica.
# https://redis.io/topics/replication
# The operator uses the INFO message from Redis to get information about the
# status of SYNC (master_sync_in_progress) and LOAD (loading_eta_seconds)
# https://redis.io/commands/info
# LoadCheckInterval
# LoadCheckTimeout
# Wait values for redis cluster configuration alignment.
# SleepDuringTablesAlignProcess
# RedisNodesAgreeAboutSlotsConfigCheckInterval
# RedisNodesAgreeAboutSlotsConfigTimeout

# Wait duration of the '--cluster create' command.
# ClusterCreateInterval
# ClusterCreateTimeout

# The estimated time it takes for volume mounted configmaps to be updated on the
# pods. After a configmap is changed, the configmap controller will update a
@@ -35,10 +48,6 @@
# The estimated time it takes for Redis to load the new config map from disk.
# ACLFileLoadDuration

# Wait duration of the '--cluster create' command.
# ClusterCreateInterval
# ClusterCreateTimeout

# Wait duration for a pod to be in ready state - pod is in Ready state and
# the containers passed all conditions.
# PodReadyCheckInterval
@@ -52,6 +61,10 @@
# PodDeleteCheckInterval
# PodDeleteCheckTimeout

# Wait duration for the removal of a node id from the other nodes' tables
# RedisRemoveNodeCheckInterval
# RedisRemoveNodeTimeout

# Duration of the PING command.
# RedisPingCheckInterval
# RedisPingCheckTimeout
@@ -60,6 +73,10 @@
# RedisClusterReplicationCheckInterval
# RedisClusterReplicationCheckTimeout

# Wait duration for nodes to load the dataset into their memory
# WaitForRedisLoadDataSetInMemoryCheckInterval
# WaitForRedisLoadDataSetInMemoryTimeout

# Wait duration of the MEET command.
# RedisClusterMeetCheckInterval
# RedisClusterMeetCheckTimeout
@@ -74,32 +91,43 @@
# RedisAutoFailoverCheckInterval
# RedisAutoFailoverCheckTimeout

# If the forget-node operation fails, sleep before taking any deletion or other irreversible action
# SleepIfForgetNodeFails

setters:
ExposeSensitiveEntryPoints: false
thresholds:
SyncMatchThreshold: 90
MaxToleratedPodsRecoverAtOnce: 15
MaxToleratedPodsUpdateAtOnce: 5
times:
syncStartCheckInterval: 500ms
syncStartCheckTimeout: 15000ms
syncCheckInterval: 500ms
syncCheckTimeout: 15000ms
loadCheckInterval: 500ms
loadCheckTimeout: 180000ms
loadStartCheckInterval: 500ms
loadStartCheckTimeout: 180000ms
clusterCreateInterval: 5000ms
clusterCreateTimeout: 30000ms
aclFilePropagationDuration: 5000ms
aclFileLoadDuration: 500ms
podReadyCheckInterval: 2000ms
podReadyCheckTimeout: 30000ms
podNetworkCheckInterval: 2000ms
podNetworkCheckTimeout: 30000ms
podDeleteCheckInterval: 2000ms
podDeleteCheckTimeout: 30000ms
redisPingCheckInterval: 2000ms
redisPingCheckTimeout: 30000ms
redisClusterReplicationCheckInterval: 2000ms
redisClusterReplicationCheckTimeout: 30000ms
redisClusterMeetCheckInterval: 2000ms
redisClusterMeetCheckTimeout: 30000ms
redisManualFailoverCheckInterval: 2000ms
redisManualFailoverCheckTimeout: 30000ms
redisAutoFailoverCheckInterval: 2000ms
redisAutoFailoverCheckTimeout: 30000ms
SyncCheckInterval: 5000ms
SyncCheckTimeout: 30000ms
SleepDuringTablesAlignProcess: 12000ms
ClusterCreateInterval: 5000ms
ClusterCreateTimeout: 90000ms
ACLFilePropagationDuration: 5000ms
ACLFileLoadDuration: 5000ms
PodReadyCheckInterval: 3000ms
PodReadyCheckTimeout: 30000ms
PodNetworkCheckInterval: 3000ms
PodNetworkCheckTimeout: 60000ms
PodDeleteCheckInterval: 3000ms
PodDeleteCheckTimeout: 60000ms
RedisPingCheckInterval: 2000ms
RedisPingCheckTimeout: 20000ms
RedisClusterReplicationCheckInterval: 2000ms
RedisClusterReplicationCheckTimeout: 30000ms
RedisClusterMeetCheckInterval: 2000ms
RedisClusterMeetCheckTimeout: 10000ms
RedisManualFailoverCheckInterval: 5000ms
RedisManualFailoverCheckTimeout: 40000ms
RedisAutoFailoverCheckInterval: 5000ms
RedisAutoFailoverCheckTimeout: 40000ms
RedisNodesAgreeAboutSlotsConfigCheckInterval: 3000ms
RedisNodesAgreeAboutSlotsConfigTimeout: 12000ms
RedisRemoveNodeCheckInterval: 2000ms
RedisRemoveNodeTimeout: 20000ms
WaitForRedisLoadDataSetInMemoryCheckInterval: 2000ms
WaitForRedisLoadDataSetInMemoryTimeout: 10000ms
SleepIfForgetNodeFails: 20000ms
4 changes: 3 additions & 1 deletion config/configfiles/redis.conf
@@ -1513,6 +1513,8 @@ cluster-require-full-coverage no
# In order to setup your cluster make sure to read the documentation
# available at https://redis.io web site.

enable-debug-command yes

########################## CLUSTER DOCKER/NAT support ########################

# In certain deployments, Redis Cluster nodes address discovery fails, because
@@ -2048,4 +2050,4 @@ jemalloc-bg-thread yes
# by setting the following config which takes a space delimited list of warnings
# to suppress
#
# ignore-warnings ARM64-COW-BUG
# ignore-warnings ARM64-COW-BUG
2 changes: 1 addition & 1 deletion config/configfiles/users.acl
@@ -1,4 +1,4 @@
user default off nopass -@all
user admin on #713bfda78870bf9d1b261f565286f85e97ee614efe5f0faf7c34e7ca4f65baca ~* &* +@all
user testuser on #13d249f2cb4127b40cfa757866850278793f814ded3c587fe5889e889a7a9f6c ~testkey:* &* -@all +get +set
user rdcuser on #400f9f96b4a343f4766d29dbe7bee178d7de6e186464d22378214c0232fb38ca &* -@all +replconf +ping +psync
user rdcuser on #400f9f96b4a343f4766d29dbe7bee178d7de6e186464d22378214c0232fb38ca &* -@all +replconf +ping +psync
6 changes: 3 additions & 3 deletions config/configmap/kustomization.yaml
@@ -7,16 +7,16 @@ configMapgenerator:
- ../configfiles/redis.conf
options:
labels:
redis-cluster: rdc-test
redis-cluster: dev-rdc
- name: users-acl
files:
- ../configfiles/users.acl
options:
labels:
redis-cluster: rdc-test
redis-cluster: dev-rdc
- name: operator-config
files:
- ../configfiles/operator.conf
options:
labels:
redis-operator: rdc-test
redis-operator: dev-rdc
2 changes: 1 addition & 1 deletion config/manager/base/manager.yaml
@@ -5,7 +5,7 @@ metadata:
namespace: system
labels:
control-plane: controller-manager
redis-operator: rdc-test
redis-operator: dev-rdc
spec:
selector:
matchLabels:
2 changes: 1 addition & 1 deletion config/samples/local_cluster.yaml
@@ -1,7 +1,7 @@
apiVersion: db.payu.com/v1
kind: RedisCluster
metadata:
name: rdc-test
name: dev-rdc
namespace: default
spec:
leaderCount: 3
2 changes: 1 addition & 1 deletion config/samples/updated_cluster.yaml
@@ -1,7 +1,7 @@
apiVersion: db.payu.com/v1
kind: RedisCluster
metadata:
name: rdc-test
name: dev-rdc
namespace: default
spec:
leaderCount: 3
