Add PFADD, PFCOUNT, PFMERGE commands #211

ilbaktin · 2021-05-30T22:25:11Z

Hi there! We're using your awesome miniredis library for redis-related unit tests in the company I'm working at. Once we started using HLL-related commands of redis, we also definitely wanted to cover the corresponding logic with unit tests but unfortunately found out that HLL-related commands are not supported in miniredis.

I'm adding HLL-related commands to miniredis with this PR using one quite popular implementation of HLL in golang https://github.com/axiomhq/hyperloglog but not reimplementing the actual C++ implementation of HLL https://download.redis.io/redis-stable/src/hyperloglog.c. Why I did this choice - in fact the miniredis is mainly used for testing purposes and hence it's not necessary to have the 100% compatibility of all the internal details, it's enough to have the compatibility of the protocol and the functionality. Basically the implementation of HLL used in this PR behaves almost the same as redis' variant of HLL. One potential difference I could think of is that the set cardinality estimation accuracy could slightly differ on big sets (containing over 1M of elements), but anyway I'm not sure that somebody plans to have sets of over 1M cardinality in unit tests. Please let me know what you think about it.

Also one arguable question - not sure if it's OK for you that I'm adding a new dependency. I could have added the implementation of HLL explicitly as it's been done for geohash library.

Thanks, please let's discuss all the things. If you have any concern about what might be improved, please let me know.

…eature/hll

alicebob · 2021-05-31T06:31:35Z

On Sun, May 30, 2021 at 03:25:22PM -0700, Konstantin Ilbaktin wrote: Hi there! We're using your awesome miniredis library for redis-related unit tests in the company I'm working at. Once we started using HLL-related commands of redis, we also definitely wanted to cover the corresponding logic with unit tests but unfortunately found out that HLL-related commands are not supported in miniredis.

Hi! Thanks for the PR! I'll have a proper look in a few days.

alicebob · 2021-06-02T07:08:55Z

Thanks, I had a look and the code looks nice and simple. I never used the PF* redis commands, but they are one of the few which do not have a ton of options and arguments and special cases :)

slightly differ on big sets

That's absolutely totally fine :)

About the included hyperloglog library: it's MIT licensed, so how about we add the relevant files to a subdir in miniredis? (hyperloglog.go, registers.go, LICENSE ?). It's straightforward code which I don't expect to change a lot, and saves a dependency.

I assume we can't avoid the murmur dependency. What would happen if we use another hashing function?

Can you add some tests to the ./integration/ subdir? If the numbers don't match up 100% you can use c.DoLoosly() which still compares structures (or maybe c.DoRounded()). I'm happy to help there if you want.

ilbaktin · 2021-06-03T08:58:42Z

Thanks for looking into the PR!

No problem, I can add relevant files from hyperloglog lib to miniredis subdir.

Regarding a hash function - before using Redis HLL commands we tried using some golang HLL lib (several options) and store these HLLs in a database. Trying golang HLL libs I found out that the most of them have better accuracy if I use murmur as a hash function. But again in tests we don't need better accuracy (and moreover HLL lib that I used in this PR has a default hash func - metro hash which is still not bad) so we can get rid of murmur I think.

And regarding integration tests - no problem, I will take a look and try to handle.

Will handle these changes when I have a chance, presumably on the weekend.

alicebob · 2021-06-03T09:03:17Z

Will handle these changes when I have a chance, presumably on the weekend.

Thanks! The comment in the readme about Hyperloglog is from 2015, so there doesn't seem to be much urgency ;)

ilbaktin · 2021-06-03T09:05:09Z

Update: metro lib is a dependency of hyperloglog lib, but metro lib is under MIT license, so we probably can take the necessary hash function right from metro lib to miniredis.

…eature/hll

ilbaktin · 2021-06-12T20:14:33Z

@alicebob Hi again!

Added a few changes as discussed before:

Got rid of hyperloglog and murmur dependencies. Instead I added hyperloglog's and metro's hash source code directly into miniredis repo since they're under MIT license.
Added integration tests for HLL related commands (and found a small bug in response message meanwhile). I handled unstable approximate nature of HLLs in integration tests with a new function DoApprox() comparing ints with a threshold.

Again, let me know what you think!

alicebob · 2021-06-13T14:18:27Z

integration/test.go

@@ -481,6 +482,48 @@ func (c *client) DoRounded(rounded int, cmd string, args ...string) {
 	}
 }

+// result must match, with floats rounded


That's a bit outdated :)

alicebob · 2021-06-13T14:21:48Z

Looks great! I'll have a more proper look in a few days, but I don't expect any issues. Really nice it worked out without adding dependencies.

alicebob · 2021-06-16T08:51:28Z

Thanks for the very easy PR :)

I merged it to master, will make it a proper release in a few days or so.

ilbaktin added 6 commits May 21, 2021 16:48

implement PFADD, PFCOUNT, PFMERGE

1b00e53

fixes for hll related commands, add tests

00e3e94

Merge branch 'master' of https://github.com/alicebob/miniredis into f…

db1a664

…eature/hll

fix broken spop test

fa19783

add more comments to hll-related methods

b47de9a

rm excessive comment

c7c62d2

ilbaktin added 4 commits June 12, 2021 20:55

copy hyperloglog related code to miniredis

f934bf2

add integration tests for hll, fix small bug in hll response

ec032c2

Merge branch 'master' of https://github.com/alicebob/miniredis into f…

cfdeaa5

…eature/hll

added DoApprox func to compare ints with threshold

69ba7de

alicebob reviewed Jun 13, 2021

View reviewed changes

alicebob merged commit 1cd723a into alicebob:master Jun 16, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add PFADD, PFCOUNT, PFMERGE commands #211

Add PFADD, PFCOUNT, PFMERGE commands #211

ilbaktin commented May 30, 2021

alicebob commented May 31, 2021 via email

alicebob commented Jun 2, 2021

ilbaktin commented Jun 3, 2021

alicebob commented Jun 3, 2021 via email

ilbaktin commented Jun 3, 2021

ilbaktin commented Jun 12, 2021

alicebob Jun 13, 2021

alicebob commented Jun 13, 2021

alicebob commented Jun 16, 2021

Add PFADD, PFCOUNT, PFMERGE commands #211

Add PFADD, PFCOUNT, PFMERGE commands #211

Conversation

ilbaktin commented May 30, 2021

alicebob commented May 31, 2021 via email

alicebob commented Jun 2, 2021

ilbaktin commented Jun 3, 2021

alicebob commented Jun 3, 2021 via email

ilbaktin commented Jun 3, 2021

ilbaktin commented Jun 12, 2021

alicebob Jun 13, 2021

Choose a reason for hiding this comment

alicebob commented Jun 13, 2021

alicebob commented Jun 16, 2021