
Question: Efficiently using a dict of lists #182

Open
fadams opened this issue Apr 16, 2020 · 0 comments
fadams commented Apr 16, 2020

Hi, thanks for the great project.
From what I've observed, objects stored in, say, a RedisDict are effectively "immutable", in the sense that storing a dict in a RedisDict (e.g. to implement a Redis-backed shared dictionary of objects) is fine, but if I wanted to change a field of a stored object I'd have to do something like:

tmp = redis_dict[key]
tmp['field'] = "value"
redis_dict[key] = tmp

For modest-sized objects that workaround seems fine, but as objects get larger I wonder about the efficiency.

Things get more awkward because one of my use cases is what amounts to a dict of lists, and I'd like it to be backed by Redis so that multiple instances of my application can see the same dict of lists.

Now, I know that I could keep writing the list back to the redis_dict, and from a functional perspective I think it'd work, but my suspicion is that it'd get horribly inefficient pretty quickly: as the number of items in the list grows, my guess is that what ends up happening is an increasingly expensive JSON serialisation of the entire list on every write.
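To make that concrete, the pattern I mean is roughly this (just a sketch, assuming a local Redis and pottery's RedisDict; append_item is a helper name I've made up):

from redis import Redis
from pottery import RedisDict

redis = Redis.from_url('redis://localhost:6379/')
shared = RedisDict(redis=redis, key='dict-of-lists')

def append_item(key, item):
    # Read the whole list out of Redis, append locally, then write the
    # whole list back - every call re-serialises the entire list as JSON.
    tmp = shared.get(key, [])
    tmp.append(item)
    shared[key] = tmp

append_item('a', 'first')
append_item('a', 'second')  # re-serialises ["first", "second"] in full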

As far as I can tell it's not possible for a RedisDict to contain a RedisList (when I tried, it barfed with a JSON error, IIRC).

Do you have any thoughts on how to do this sort of thing efficiently? I'm no Redis expert, and one of the reasons I looked at pottery was to avoid having to get too far into the weeds of Redis, but now I'm wondering whether I might as well keep digging :-)

I'm guessing RedisList is implemented as, well, a Redis list with the list's values JSON-serialised, so appending an item directly to a RedisList should only be as expensive as the serialisation cost for that item; but how to model a dict of those efficiently (and intuitively) rather escapes me.
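So a direct append would presumably just be something like this (again only a sketch, assuming RedisList supports the usual Python list API):

from redis import Redis
from pottery import RedisList

redis = Redis.from_url('redis://localhost:6379/')
lyrics = RedisList(redis=redis, key='lyrics')
lyrics.append('another line')  # should only pay the JSON cost for this one item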

I guess from the RedisList example
lyrics = RedisList(redis=redis, key='lyrics')

the key is, well, key - so to model something like
{"a": [], "b": [], "c": [], "d": [] .......}

Do I simply have

{"a": RedisList(redis=redis, key="a"), "b": RedisList(redis=redis, key="b", "c": RedisList(redis=redis, key="c", "d": RedisList(redis=redis, key="d" .......}

Though as the containing dict isn't itself stored, how would other instances know about the keys?

So I'm thinking the only way to do this is by dereferencing: I keep a RedisSet holding the keys, and when I want to "look up" a list I first check whether the key is in my local dict; if it isn't, I look it up in the RedisSet (because it might have been set by another instance in the cluster), and if the key is in the RedisSet I create a RedisList instance using that key and add it to my dict.
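Roughly like this sketch (the RedisSet/RedisList usage and the "list:" key prefix are just my own guesses at a convention, and I've made it get-or-create rather than lookup-only):

from redis import Redis
from pottery import RedisList, RedisSet

redis = Redis.from_url('redis://localhost:6379/')

# Shared set of known keys, visible to every instance in the cluster.
known_keys = RedisSet(redis=redis, key='dict-of-lists-keys')
local_lists = {}  # key -> RedisList instance, local to this process

def get_list(key):
    # Lazily create (or attach to) the RedisList that backs this key.
    if key not in local_lists:
        if key not in known_keys:
            known_keys.add(key)  # advertise the key to the other instances
        # The 'list:' prefix is just a naming convention I've invented here.
        local_lists[key] = RedisList(redis=redis, key='list:' + key)
    return local_lists[key]

get_list('a').append('first item')  # a single append, not a whole-list rewrite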

I think that'd work, but it's rather less transparent than the dict-of-lists idiom it's trying to model, and it loses some of the benefit of what is supposed to be a more Pythonic container abstraction.

Do you have any thoughts? Is my observation about right or have I missed something?

To be clear, I'm not being critical; I'm just thinking out loud about how to model a shared dict of lists more efficiently than JSON-serialising the entire list on every insert, and I'd really appreciate any thoughts on this problem.

MTIA

brainix self-assigned this Dec 8, 2020