Adding RedisCluster client to support Redis Cluster Mode #1660

Merged on Nov 25, 2021 (39 commits)

@barshaul barshaul commented Oct 28, 2021

Pull Request check-list

Please make sure to review and check all of these items:

  • [x] Does $ tox pass with this change (including linting)?
  • [x] Do the CI tests pass with this change (enable it first in your forked repo and wait for the github action build to finish)?
  • [x] Is the new or changed code fully tested?
  • [x] Is a documentation update included (if this change modifies existing APIs, or introduces new ones)?

NOTE: these things are not required to open a PR and can be done
afterwards / while the PR is open.

Description of change

closes #931

Adds support for cluster mode and provides a client for
Redis Cluster.

The cluster client is based on redis-py-cluster
by Grokzen, with a lot of added and
changed functionality.

Create RedisCluster:

Connecting redis-py to a Redis Cluster is easy.
RedisCluster requires at least one node in order to discover all of the cluster's nodes,
and there are multiple ways of creating a RedisCluster instance:

  • Use the 'host' and 'port' arguments:
    >>> from redis.cluster import RedisCluster as Redis
    >>> rc = Redis(host='localhost', port=6379)
    >>> print(rc.get_nodes())
    [[host=127.0.0.1,port=6379,name=127.0.0.1:6379,server_type=primary,redis_connection=Redis<ConnectionPool<Connection<host=127.0.0.1,port=6379,db=0>>>], [host=127.0.0.1,port=6378,name=127.0.0.1:6378,server_type=primary,redis_connection=Redis<ConnectionPool<Connection<host=127.0.0.1,port=6378,db=0>>>], [host=127.0.0.1,port=6377,name=127.0.0.1:6377,server_type=replica,redis_connection=Redis<ConnectionPool<Connection<host=127.0.0.1,port=6377,db=0>>>]]
  • Use Redis URL:
    >>> from redis.cluster import RedisCluster as Redis
    >>> rc = Redis.from_url("redis://localhost:6379/0")
  • Use ClusterNode(s):
    >>> from redis.cluster import RedisCluster as Redis
    >>> from redis.cluster import ClusterNode
    >>> nodes = [ClusterNode('localhost', 6379), ClusterNode('localhost', 6378)]
    >>> rc = Redis(startup_nodes=nodes)

When a RedisCluster instance is created, it first attempts to establish a
connection to one of the provided startup nodes. If none of the startup nodes
are reachable, a 'RedisClusterException' will be raised.
After a connection to one of the cluster's nodes is established, the
RedisCluster instance will be initialized with 3 caches:
a slots cache that maps each of the 16384 slots to the node/s handling them,
a nodes cache that contains ClusterNode objects (name, host, port, redis connection)
for all of the cluster's nodes, and a commands cache that contains all the
server-supported commands, retrieved from the output of the Redis 'COMMAND' command.
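
As a rough illustration of the slots cache described above, the sketch below builds a slot-to-nodes mapping from data loosely shaped like a CLUSTER SLOTS reply. The function name `build_slots_cache`, the tuple layout, and the sample topology are assumptions made for this example; this is not the code added in this PR.

```python
REDIS_CLUSTER_HASH_SLOTS = 16384


def build_slots_cache(cluster_slots):
    """Map every slot to a list of node names: [primary, replica, ...].

    `cluster_slots` is a list of (start, end, primary, replicas) tuples,
    a simplified stand-in for the CLUSTER SLOTS reply (assumption).
    """
    slots_cache = {}
    for start, end, primary, replicas in cluster_slots:
        for slot in range(start, end + 1):
            slots_cache[slot] = [primary, *replicas]
    return slots_cache


# Hypothetical two-primary topology, one of them with a replica.
topology = [
    (0, 8191, "127.0.0.1:6379", ["127.0.0.1:6377"]),
    (8192, 16383, "127.0.0.1:6378", []),
]
slots_cache = build_slots_cache(topology)

# Look up the nodes that handle a given slot.
print(slots_cache[100])
# Every one of the 16384 slots has an entry.
print(len(slots_cache) == REDIS_CLUSTER_HASH_SLOTS)
```

A flat dictionary keeps the lookup for a key-based command to a single step once the key's slot is known.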

A RedisCluster instance can be used directly to execute Redis commands. When a
command is executed through the cluster instance, the target node(s) are
determined internally. When using a key-based command, the target node will
be the node that holds the key's slot.
Cluster management commands and other cluster commands have predefined node
group targets (all-primaries, all-nodes, random-node, all-replicas), which are
outlined in the command’s function documentation.
For example, the ‘KEYS’ command will be sent to all primaries and return all keys
in the cluster, and the ‘CLUSTER NODES’ command will be sent to a random node.
Other management commands require you to pass the target node/s to execute
the command on.

    >>> # target-nodes: the node that holds 'foo1's key slot
    >>> rc.set('foo1', 'bar1')
    >>> # target-nodes: the node that holds 'foo2's key slot
    >>> rc.set('foo2', 'bar2')
    >>> # target-nodes: the node that holds 'foo1's key slot
    >>> print(rc.get('foo1'))
    b'bar1'
    >>> # target-nodes: all-primaries
    >>> print(rc.keys())
    [b'foo1', b'foo2']
    >>> # target-nodes: all-nodes
    >>> rc.flushall()

Specifying Target Nodes:

As mentioned above, some RedisCluster commands will require you to provide the
target node/s that you want to execute the command on, and in other cases, the
target node will be determined by the client itself. That being said, ALL
RedisCluster commands can be executed against a specific node or a group of
nodes by passing the command kwarg target_nodes.
The best practice is to specify target nodes using RedisCluster class's node
flags: PRIMARIES, REPLICAS, ALL_NODES, RANDOM. When a nodes flag is passed
along with a command, it will be internally resolved to the relevant node/s.
If the nodes topology of the cluster changes during the execution of a command,
the client will be able to resolve the nodes flag again with the new topology
and attempt to retry executing the command.

    >>> from redis.cluster import RedisCluster as Redis
    >>> # run cluster-meet command on all of the cluster's nodes
    >>> rc.cluster_meet(Redis.ALL_NODES, '127.0.0.1', 6379)
    >>> # ping all replicas
    >>> rc.ping(Redis.REPLICAS)
    >>> # ping a random node
    >>> rc.ping(Redis.RANDOM)
    >>> # ping all nodes in the cluster, default command behavior
    >>> rc.ping()
    >>> # execute bgsave in all primaries
    >>> rc.bgsave(Redis.PRIMARIES)
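
To show why node flags survive a topology change while concrete node lists do not, here is a minimal sketch of flag resolution. The flag values, the `resolve_target_nodes` helper, and the `nodes` mapping are hypothetical stand-ins for illustration, not the client's internals.

```python
import random

# Hypothetical flag values standing in for the RedisCluster node flags.
PRIMARIES, REPLICAS, ALL_NODES, RANDOM = "primaries", "replicas", "all", "random"


def resolve_target_nodes(flag, nodes):
    """Resolve a node flag against the current topology.

    `nodes` maps a node name to its server type ('primary' or 'replica').
    """
    if flag == PRIMARIES:
        return [name for name, kind in nodes.items() if kind == "primary"]
    if flag == REPLICAS:
        return [name for name, kind in nodes.items() if kind == "replica"]
    if flag == ALL_NODES:
        return list(nodes)
    if flag == RANDOM:
        return [random.choice(list(nodes))]
    raise ValueError(f"unknown node flag: {flag!r}")


nodes = {
    "127.0.0.1:6379": "primary",
    "127.0.0.1:6378": "primary",
    "127.0.0.1:6377": "replica",
}
before = resolve_target_nodes(REPLICAS, nodes)

# Simulate a failover: the replica is promoted and an old primary is demoted.
nodes["127.0.0.1:6377"] = "primary"
nodes["127.0.0.1:6379"] = "replica"
after = resolve_target_nodes(REPLICAS, nodes)

print(before, after)
```

Because the flag is resolved at execution time, resolving it again after the simulated failover yields the new replica set, which is what makes a retry meaningful.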

You could also pass ClusterNodes directly if you want to execute a command on a
specific node / node group that isn't addressed by the nodes flag. However, if
the command execution fails due to cluster topology changes, a retry attempt
will not be made, since the passed target node/s may no longer be valid, and
the relevant cluster or connection error will be returned.

    >>> node = rc.get_node('localhost', 6379)
    >>> # Get the keys only for that specific node
    >>> rc.keys(node)
    >>> # get Redis info from a subset of primaries
    >>> subset_primaries = [node for node in rc.get_primaries() if node.port > 6378]
    >>> rc.info(subset_primaries)

In addition, you can use the RedisCluster instance to obtain the Redis instance
of a specific node and execute commands on that node directly. The Redis client,
however, cannot handle cluster failures and retries.

    >>> cluster_node = rc.get_node(host='localhost', port=6379)
    >>> print(cluster_node)
    [host=127.0.0.1,port=6379,name=127.0.0.1:6379,server_type=primary,redis_connection=Redis<ConnectionPool<Connection<host=127.0.0.1,port=6379,db=0>>>]
    >>> r = cluster_node.redis_connection
    >>> r.client_list()
    [{'id': '276', 'addr': '127.0.0.1:64108', 'fd': '16', 'name': '', 'age': '0', 'idle': '0', 'flags': 'N', 'db': '0', 'sub': '0', 'psub': '0', 'multi': '-1', 'qbuf': '26', 'qbuf-free': '32742', 'argv-mem': '10', 'obl': '0', 'oll': '0', 'omem': '0', 'tot-mem': '54298', 'events': 'r', 'cmd': 'client', 'user': 'default'}]
    >>> # Get the keys only for that specific node
    >>> r.keys()
    [b'foo1']

Multi-key commands:

Redis supports multi-key commands in Cluster Mode, such as Set type unions or
intersections, mset and mget, as long as the keys all hash to the same slot.
With the RedisCluster client, you can use the known functions (e.g. mget, mset)
to perform an atomic multi-key operation. However, you must ensure all keys are
mapped to the same slot, otherwise a RedisClusterException will be raised.
Redis Cluster implements a concept called hash tags that can be used to
force certain keys to be stored in the same hash slot; see
Keys hash tag.
You can also use the non-atomic variants of some multi-key operations
(e.g. mget_nonatomic, mset_nonatomic) and pass keys that aren't mapped to the
same slot. The client will then batch the keys by hash slot and send each
batch separately to the node that owns that slot.

    #  Atomic operations can be used when all keys are mapped to the same slot
    >>> rc.mset({'{foo}1': 'bar1', '{foo}2': 'bar2'})
    >>> rc.mget('{foo}1', '{foo}2')
    [b'bar1', b'bar2']
    # Non-atomic multi-key operations split the keys into different slots
    >>> rc.mset_nonatomic({'foo': 'value1', 'bar': 'value2', 'zzz': 'value3'})
    >>> rc.mget_nonatomic('foo', 'bar', 'zzz')
    [b'value1', b'value2', b'value3']
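
The hash-slot and batching behaviour described above can be sketched in a few lines. Redis Cluster maps a key to a slot as CRC16 (XMODEM variant) of the key modulo 16384, hashing only the hash tag when one is present. The helper names `key_slot` and `partition_by_slot` are chosen for this example and are not necessarily the names used in this PR.

```python
from binascii import crc_hqx
from collections import defaultdict

REDIS_CLUSTER_HASH_SLOTS = 16384


def key_slot(key: str) -> int:
    """Return the hash slot for a key, honouring {hash tags}."""
    data = key.encode()
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # Only a non-empty tag between the first '{' and the next '}' counts.
        if end > start + 1:
            data = data[start + 1:end]
    # crc_hqx with an initial value of 0 is the CRC16/XMODEM checksum.
    return crc_hqx(data, 0) % REDIS_CLUSTER_HASH_SLOTS


def partition_by_slot(keys):
    """Group keys into per-slot batches, as a non-atomic operation would."""
    batches = defaultdict(list)
    for key in keys:
        batches[key_slot(key)].append(key)
    return dict(batches)


# Keys sharing the {foo} tag land in the same slot, so mset/mget can be atomic.
print(key_slot("{foo}1") == key_slot("{foo}2"))

# Keys in different slots end up in separate batches.
print(partition_by_slot(["{foo}1", "{foo}2", "bar"]))
```

Each batch can then be sent as one command to the node that owns the slot, which is why the overall operation is no longer atomic.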

Cluster PubSub:

When a ClusterPubSub instance is created without specifying a node, a single
node will be transparently chosen for the pubsub connection on the
first command execution. The node will be determined by:

  1. Hashing the channel name in the request to find its keyslot
  2. Selecting a node that handles the keyslot: If read_from_replicas is
    set to true, a replica can be selected.
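
The two steps above can be sketched as follows. The function `pick_pubsub_node`, the layout of `slots_cache`, and the use of a random choice among the slot's nodes when read_from_replicas is set are assumptions for illustration; the hashing here also ignores hash tags for brevity.

```python
import random
from binascii import crc_hqx


def pick_pubsub_node(channel, slots_cache, read_from_replicas=False):
    """Pick the node for a pubsub connection from the channel name.

    `slots_cache` maps slot -> [primary, replica, ...] (assumed layout).
    """
    # Step 1: hash the channel name to find its keyslot.
    slot = crc_hqx(channel.encode(), 0) % 16384
    nodes = slots_cache[slot]
    # Step 2: select a node that handles the keyslot. With
    # read_from_replicas, any of the slot's nodes may be chosen.
    if read_from_replicas:
        return random.choice(nodes)
    return nodes[0]


# Hypothetical cache in which one primary and one replica own every slot.
slots_cache = {
    slot: ["127.0.0.1:6378", "127.0.0.1:6377"] for slot in range(16384)
}
print(pick_pubsub_node("foo", slots_cache))
```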

Known limitations with pubsub:

Pattern subscribe and publish do not work properly: if we hash a pattern
like fo*, we get a keyslot for that string, but there are endless
possible channel names matching that pattern that we can’t know in
advance. These commands are not blocked, but using them is not recommended
right now.
See the redis-py-cluster documentation
for more.

    >>> p1 = rc.pubsub()
    # p1 connection will be set to the node that holds 'foo' keyslot
    >>> p1.subscribe('foo')
    # p2 connection will be set to node 'localhost:6379'
    >>> p2 = rc.pubsub(rc.get_node('localhost', 6379))

Read Only Mode:

By default, Redis Cluster always returns a MOVED redirection response when a
replica node is accessed. You can overcome this limitation and scale read
commands with READONLY mode.

To enable READONLY mode, pass read_from_replicas=True to the RedisCluster
constructor. When set to true, read commands will be distributed between the
primary and its replicas in a round-robin manner.

You can also enable READONLY mode at runtime by calling the readonly() method,
or disable it with readwrite().

    >>> from redis.cluster import RedisCluster as Redis
    >>> from redis.cluster import ClusterNode
    >>> startup_nodes = [ClusterNode('localhost', 6379)]
    >>> # Use 'debug' mode to print the node that the command is executed on
    >>> rc_readonly = Redis(startup_nodes=startup_nodes,
    ...                     read_from_replicas=True, debug=True)
    >>> rc_readonly.set('{foo}1', 'bar1')
    >>> for i in range(0, 4):
    ...     # Assigns read command to the slot's hosts in a Round-Robin manner
    ...     rc_readonly.get('{foo}1')
    >>> # set command would be directed only to the slot's primary node
    >>> rc_readonly.set('{foo}2', 'bar2')
    >>> # reset READONLY flag
    >>> rc_readonly.readwrite()
    >>> # now the get command would be directed only to the slot's primary node
    >>> rc_readonly.get('{foo}1')
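
The round-robin behaviour described above can be pictured with a small stand-alone sketch. The `RoundRobinReader` class and its attribute names are hypothetical and only model the routing decision for a single slot; they are not part of the client.

```python
from itertools import cycle


class RoundRobinReader:
    """Route reads across a slot's primary and replicas; writes go to the primary."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.read_from_replicas = True
        # Reads rotate over the primary followed by each replica.
        self._read_targets = cycle([primary, *replicas])

    def node_for(self, is_read_command):
        if is_read_command and self.read_from_replicas:
            return next(self._read_targets)
        return self.primary


router = RoundRobinReader("127.0.0.1:6379", ["127.0.0.1:6377"])

# Four reads alternate between the primary and its replica.
reads = [router.node_for(is_read_command=True) for _ in range(4)]
print(reads)

# Writes always go to the primary.
print(router.node_for(is_read_command=False))

# Equivalent of readwrite(): reads go back to the primary only.
router.read_from_replicas = False
print(router.node_for(is_read_command=True))
```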

See Redis Cluster tutorial and
Redis Cluster specifications
to learn more about Redis Cluster.

codecov-commenter commented Oct 28, 2021

Codecov Report

Merging #1660 (d5c14c8) into master (d2b2333) will increase coverage by 1.08%.
The diff coverage is 93.73%.

@@            Coverage Diff             @@
##           master    #1660      +/-   ##
==========================================
+ Coverage   89.04%   90.13%   +1.08%     
==========================================
  Files          53       59       +6     
  Lines       11052    14085    +3033     
==========================================
+ Hits         9841    12695    +2854     
- Misses       1211     1390     +179     
Impacted Files Coverage Δ
redis/commands/cluster.py 76.97% <76.97%> (ø)
tests/conftest.py 91.83% <88.67%> (-0.43%) ⬇️
redis/commands/parser.py 88.70% <88.70%> (ø)
redis/cluster.py 89.86% <89.86%> (ø)
redis/commands/core.py 83.94% <91.66%> (+0.06%) ⬆️
redis/exceptions.py 98.46% <96.29%> (-1.54%) ⬇️
tests/test_cluster.py 98.46% <98.46%> (ø)
redis/__init__.py 85.71% <100.00%> (+1.09%) ⬆️
redis/client.py 83.25% <100.00%> (+1.16%) ⬆️
redis/commands/__init__.py 100.00% <100.00%> (ø)
... and 17 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@chayim chayim self-assigned this Oct 28, 2021

chayim commented Oct 28, 2021

@barshaul Thanks for the submission - this is so exciting! We'll start to review it over the next short while - with the goal of getting this into the next beta, as discussed. Just one ask as we go through this - mind having a look, to see why codecov coverage dropped from ~89% down to ~77% with this commit? High test coverage is a big goal.


# SERVER INFORMATION

class AclCommands:

barshaul (author) commented

OK, so within the scope of this PR I'll rename the classes and add docstrings, but I can leave everything in the same file, right?

    def check_slots_coverage(self, slots_cache):
        # Validate if all slots are covered or if we should try next
        # startup node
        for i in range(0, REDIS_CLUSTER_HASH_SLOTS):

barshaul (author) commented

That will not give us the required result. We want to check that every slot from 0 to REDIS_CLUSTER_HASH_SLOTS is found in slots_cache. What you suggest means that if slots_cache contains slots outside that range we'll get false, but if slots_cache contains only one slot it will still return true...
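
To make the distinction concrete, here is a small sketch. The suggested check is not shown in this thread, so `only_valid_slots` is only a guess at its shape, and both function names are hypothetical.

```python
REDIS_CLUSTER_HASH_SLOTS = 16384


def all_slots_covered(slots_cache):
    """True only if every slot in 0..16383 has an entry in the cache."""
    return all(slot in slots_cache for slot in range(REDIS_CLUSTER_HASH_SLOTS))


def only_valid_slots(slots_cache):
    """True if every cached slot is in range; says nothing about coverage."""
    return all(0 <= slot < REDIS_CLUSTER_HASH_SLOTS for slot in slots_cache)


# A cache that knows about a single slot.
partial = {0: ["127.0.0.1:6379"]}

print(all_slots_covered(partial))
print(only_valid_slots(partial))
```

The two checks disagree on a cache that holds a single slot, which is the case the comment is pointing at.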


    class NodesManager:
        def __init__(self, startup_nodes, from_url=False,

barshaul (author) commented

If the user creates an instance with RedisCluster.from_url(), then the URL is parsed and converted to a startup node in the RedisCluster init method, which then passes the startup node to the NodesManager.
So, no matter what, the nodes manager expects to get startup_nodes. The from_url bool is only there to indicate whether we need to create all of the cluster's other Redis nodes with a custom connection pool. See NodesManager::create_redis_node

barshaul and others added 20 commits November 15, 2021 10:37
…on key-based commands to be executed against the default node, if target_nodes were not passed. All tests were adjusted.
…etting the port to the default value (6379). e.g. "redis://localhost" will be parsed to host=localhost, port=6379
…he codecov ignore section. removed the 'non redismod' marker from the pytest run in install_and_test.sh
@chayim chayim merged commit 9db1eec into redis:master Nov 25, 2021
Andrew-Chen-Wang added a commit to aio-libs-abandoned/aioredis-py that referenced this pull request Jan 4, 2022
* redis/redis-py#1660

Signed-off-by: Andrew-Chen-Wang <acwangpython@gmail.com>