KEYS should be possible #148

Open
sgohl opened this issue May 15, 2019 · 4 comments

Comments

@sgohl

sgohl commented May 15, 2019

From my perspective, the limitation of not being able to get all KEYS is a total dealbreaker.

I assume this is a deliberate design limitation, but I want to ask politely whether you really see no chance of implementing it somehow.

Technically, corvus would just have to SCAN all cluster nodes and return the combined result to the client as if it came from a single instance. This may be slower, but that is better than not being available at all.

Since SCAN is also not supported, there seems to be no alternative for getting all keys in a cluster.

Is there any technical reason why this can't be done?

@sgohl sgohl changed the title KEYS must be possible KEYS should be possible May 15, 2019
@jasonjoo2010
Contributor

SCAN requires an iterator (cursor), and each group of nodes has its own, different iterator. So there are mainly two ways to implement it:

Altogether.

You send a simple command such as scan 2 to the proxy and receive a list of results like:

10.10.99.21:6379
1) 29173
2) 1) key1 => val1
   ...
   N) keyN => valN

10.10.99.25:6393
1) 33123
2) 1) key1 => val1
   ...
   N) keyN => valN

So you must pass all the cursors back to the proxy as parameters to fetch the next page, with a command like scan 10.10.99.21:6379:29173 10.10.99.25:6393:33123 ...

Oh it's complicated.
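
To make the combined-cursor idea concrete, here is a minimal sketch of how a client could parse and rebuild such a multi-node cursor. The `host:port:cursor` token layout follows the example command above; the function names `parse_cursors` and `format_cursors` are hypothetical, and nothing like this exists in corvus as far as this thread shows.

```python
from typing import Dict, List


def parse_cursors(tokens: List[str]) -> Dict[str, int]:
    """Turn ['10.10.99.21:6379:29173', ...] into {'10.10.99.21:6379': 29173, ...}."""
    cursors = {}
    for token in tokens:
        # The cursor is the part after the last colon; the rest is host:port.
        node, _, cursor = token.rpartition(":")
        cursors[node] = int(cursor)
    return cursors


def format_cursors(returned: Dict[str, int]) -> List[str]:
    """Build the tokens for the next call from the cursors each node returned."""
    # A node that returns cursor 0 has finished its iteration, so it is dropped.
    return [f"{node}:{cursor}" for node, cursor in returned.items() if cursor != 0]
```

The client would keep calling until `format_cursors` returns an empty list, which is exactly the bookkeeping that makes this variant awkward.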

One Backend Each Time

That would be better, but the client would still need to specify the backend and the cursor together for each scan.
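
As a rough illustration of scanning one backend at a time, the sketch below walks each node to completion before moving to the next, so the resumable position is just a (backend, cursor) pair. The `scan_node` callable is a stand-in for sending `SCAN cursor` to one node; it is an assumption for this example, not a corvus or Redis API.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

# Hypothetical signature: (node address, cursor) -> (next cursor, keys on this page).
ScanFn = Callable[[str, int], Tuple[int, List[str]]]


def walk_backends(
    nodes: Sequence[str], scan_node: ScanFn
) -> Iterator[Tuple[str, int, List[str]]]:
    """Yield (node, next_cursor, keys) pages, finishing one node before the next."""
    for node in nodes:
        cursor = 0  # SCAN starts at cursor 0 ...
        while True:
            cursor, keys = scan_node(node, cursor)
            yield node, cursor, keys
            if cursor == 0:  # ... and reports completion by returning 0.
                break
```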

Scenario

In most scenarios, the need for SCAN comes from operations work. But there are other ways to do that, including fetching the cluster map and scanning the nodes from a script.
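
For the script approach, the cluster map can be read from the text reply of `CLUSTER NODES` sent to any node. As a minimal sketch (the function name `master_addresses` is hypothetical, and parsing is limited to the address and flags fields), this extracts the address of every master so that a script can then SCAN each one directly:

```python
from typing import List


def master_addresses(cluster_nodes_reply: str) -> List[str]:
    """Return 'host:port' for each master listed in a CLUSTER NODES text reply."""
    masters = []
    for line in cluster_nodes_reply.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        address, flags = fields[1], fields[2].split(",")
        # Keep masters only, and skip nodes flagged as failed.
        if "master" in flags and "fail" not in flags:
            # Newer Redis versions append '@cluster-bus-port' to the address.
            masters.append(address.split("@")[0])
    return masters
```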

So what's your scenario here?

@sgohl
Author

sgohl commented May 20, 2019

Hey, thanks for your time. There's a good chance that I am doing or thinking something wrong because I don't know the details or inner workings well enough.

But my understanding is that the point of a proxy in this case is to simulate the responses of a single Redis instance.
The actual need for KEYS comes from a simple use case: you want to get all keys because there is no way to know them in advance; the client does not know the names of the keys it wants, so to speak.

Maybe I am missing something completely, or there's a better way of doing it.

@jasonjoo2010
Contributor

That depends.

KEYS

The KEYS command is flagged as dangerous on the official website redis.io because it is expensive, especially when there are many keys (and that is already true for a single instance).
Because Redis executes commands serially, all following commands are blocked while KEYS runs. So the official recommendation is to replace it with SCAN.

SCAN

First, let's look at how it differs from KEYS:

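  1. It cannot simply list by prefix: the MATCH option filters keys only after each page has been fetched, so a call can return few or no keys even though matches exist.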
  2. It does not guarantee that a key is returned only once. A key is unique within one call but may show up again in a later call.
  3. It does not produce the whole collection at once. Compared to KEYS, it is more accurate to call it a walk than a listing.
  4. Each call returns a number to use in the next call, but the number of calls is not bounded (the walk can go on forever, for example while keys keep being added).
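
The points above shape how a client has to consume SCAN. The sketch below is an illustration only: it keeps a set to drop keys already seen in earlier calls, loops until the returned cursor is 0, and caps the number of calls. `scan_page` and `max_calls` are hypothetical names standing in for one `SCAN` round trip and a safety limit.

```python
from typing import Callable, List, Set, Tuple


def collect_keys(
    scan_page: Callable[[int], Tuple[int, List[str]]], max_calls: int = 10_000
) -> Set[str]:
    """Walk a single node with SCAN semantics and return the distinct keys."""
    seen: Set[str] = set()  # a key may be returned by more than one call
    cursor = 0
    for _ in range(max_calls):
        cursor, keys = scan_page(cursor)
        seen.update(keys)
        if cursor == 0:  # the server signals the end of the walk with cursor 0
            return seen
    # Assumed policy for this sketch: give up instead of walking forever.
    raise RuntimeError("SCAN did not finish within max_calls")
```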

So implementing it is complicated.

Scenario

So we need to go back to the scenario at hand. If this is part of the regular application logic, we should look for a different implementation (change the logic).

But if it is for an operational reason (e.g. deleting keys that were set incorrectly, as a repair after a logic bug), you can write a script to walk the cluster manually, bypassing the proxy.

Does it help?

@doyoubi
Contributor

doyoubi commented Jun 21, 2019

The KEYS command is dangerous. SCAN is not that easy to implement, since it requires storing two indices inside one number. Anyway, I recommend using a replication tool (redis-migrate-tool) to get all the data, especially for analysing the stored data.
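
One way to read "two indices inside one number" is that the proxy would pack the index of the node being scanned and that node's own cursor into the single cursor value the client sees. A minimal sketch of such packing, assuming a fixed number of low bits for the node index (`NODE_BITS`, `pack_cursor`, and `unpack_cursor` are hypothetical and not taken from corvus):

```python
from typing import Tuple

NODE_BITS = 10  # assumed width: room for up to 1024 nodes
NODE_MASK = (1 << NODE_BITS) - 1


def pack_cursor(node_index: int, node_cursor: int) -> int:
    """Combine a node index and that node's SCAN cursor into one integer."""
    if not 0 <= node_index <= NODE_MASK:
        raise ValueError("node_index does not fit in NODE_BITS")
    return (node_cursor << NODE_BITS) | node_index


def unpack_cursor(packed: int) -> Tuple[int, int]:
    """Split a packed value back into (node_index, node_cursor)."""
    return packed & NODE_MASK, packed >> NODE_BITS
```

Python integers are unbounded, so the sketch ignores a real difficulty: a node's cursor may already use the high bits of a 64-bit value, which leaves no spare room for the node index.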
