Limit by memory instead of number of items #163
Not the author of the crate, just interested.
May I ask what would happen if, as part of the mutation, the cache's limit is exceeded?
Hey @felipou 👋
Definitely! This sounds like a broadly useful feature and, in fact, similar to #160, which is about allowing the user to specify the "cost" of an element inserted into the cache.
I would also probably lean towards creating a wrapper at first, since it would provide more flexibility by not having to worry about breaking compatibility.
Great, I'll try to have something done by next week! About PR #160: I think it's very similar indeed, and I'd argue that in that case too we should implement it as a wrapper. I'll see if it makes sense to combine the two ideas and create two layers of wrappers: one with a generic cost, and another on top of it with the memory cost. Answering @d-e-s-o:
I think it should behave the same as if we had popped the value and then inserted a larger one: evict older entries until we are back under the desired threshold.
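A minimal sketch of that eviction behavior, using only std containers rather than this crate's types (all names here are hypothetical; a value's "cost" is just its byte length):

```rust
use std::collections::{HashMap, VecDeque};

// Hypothetical sketch: a byte-bounded cache. Inserting charges the value's
// length against `limit`; least recently used entries are popped until the
// total fits again. A value larger than the limit evicts everything,
// including itself.
struct MemBoundedCache {
    map: HashMap<String, Vec<u8>>,
    order: VecDeque<String>, // front = least recently used
    used: usize,
    limit: usize,
}

impl MemBoundedCache {
    fn new(limit: usize) -> Self {
        Self { map: HashMap::new(), order: VecDeque::new(), used: 0, limit }
    }

    fn put(&mut self, key: String, value: Vec<u8>) {
        // Remove any existing entry first so its cost is released.
        if let Some(old) = self.map.remove(&key) {
            self.used -= old.len();
            self.order.retain(|k| k != &key);
        }
        self.used += value.len();
        self.map.insert(key.clone(), value);
        self.order.push_back(key);
        // Evict from the LRU end until we are back under the limit.
        while self.used > self.limit {
            match self.order.pop_front() {
                Some(lru_key) => {
                    if let Some(evicted) = self.map.remove(&lru_key) {
                        self.used -= evicted.len();
                    }
                }
                None => break,
            }
        }
    }
}
```

The same loop would run after an in-place mutation grows a value, which is exactly the "popped and re-inserted larger" semantics described above.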
Sorry for the shameless drive-by plug; I just happened to see this issue, so I thought I'd mention it. @felipou FYI, the other day I needed an LRU crate that can do this, so I wrote my own that can. One thing worth mentioning is that there are two sources of memory usage when inserting into a map: the space the entry occupies inline in the map's own table, and whatever the value itself allocates on the heap.
If you want to strictly and precisely limit memory usage, you need to take both into account. It's also worth remembering that the hash map grows in powers of two (so you can't simply multiply the inline item size by the number of items, nor by the capacity, to get its inline usage), and that it also has some control bytes which take up extra space per entry. (Although, to be fair, considering that…
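As a rough illustration of why a naive per-item count underestimates, here is a back-of-the-envelope estimator assuming a hashbrown-style layout (one control byte per bucket, inline key/value pairs, bucket count rounded up to a power of two); real table layouts and load factors differ in the details, so treat this as a sketch, not an exact formula:

```rust
use std::mem::size_of;

// Rough estimate of a map's footprint under the stated assumptions:
// inline storage scales with the bucket count (a power of two), not with
// `len`, plus ~1 control byte per bucket, plus whatever the values
// themselves allocate on the heap (passed in by the caller).
fn approx_map_bytes<K, V>(len: usize, heap_bytes_of_values: usize) -> usize {
    let buckets = len.next_power_of_two();
    let inline = buckets * (size_of::<K>() + size_of::<V>());
    let control = buckets; // ~1 control byte per bucket
    inline + control + heap_bytes_of_values
}
```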
I need an LRU cache that is bounded by memory usage rather than by the number of items. Since I'm storing just Vec items, I'm planning a simple wrapper around LruCache to solve my problem in the short term.
But then I started thinking about a more generic solution, built around a special trait to calculate the size of an object. Then I thought "this may already exist", and voilà, I found this crate: https://crates.io/crates/memory-lru
Reading their code gave me the insight that we must decide what to do when a value in the cache is modified, since its size may change. They solved this by providing a method that takes a closure in which the value is mutated, so they know when the value has changed and its size needs to be recalculated.
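The closure-based approach can be sketched roughly like this (std-only, all names hypothetical, not memory-lru's actual API): mutation is only possible inside the closure, so the cache can re-measure the value once the closure returns.

```rust
use std::collections::HashMap;

// Hypothetical sketch of closure-gated mutation with a running byte count.
struct SizedCache {
    map: HashMap<String, Vec<u8>>,
    used_bytes: usize,
}

impl SizedCache {
    fn new() -> Self {
        SizedCache { map: HashMap::new(), used_bytes: 0 }
    }

    fn insert(&mut self, key: String, value: Vec<u8>) {
        self.used_bytes += value.len();
        self.map.insert(key, value);
    }

    // The only way to mutate a cached value: the cache measures the value
    // before and after the closure runs and adjusts the total accordingly.
    fn with_mut<F>(&mut self, key: &str, f: F)
    where
        F: FnOnce(&mut Vec<u8>),
    {
        if let Some(value) = self.map.get_mut(key) {
            let before = value.len();
            f(value); // caller mutates the value here
            self.used_bytes = self.used_bytes - before + value.len();
        }
    }
}
```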
I also found this issue discussing something similar: #32
But I think we can do better. I thought about some sort of "Lock" object returned by the "get" function which would trigger a recalculation of the value's size when dropped; that way we can safely access and mutate values in the cache.
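A sketch of what such a guard could look like (hypothetical names; for brevity it updates a bare byte counter rather than a full cache, and a real version would also trigger eviction if the new total exceeds the limit):

```rust
use std::ops::{Deref, DerefMut};

// Hypothetical "lock" object: dereferences to the cached value and applies
// the size delta to the cache's byte count when dropped, so mutations made
// through it can never leave the count stale.
struct SizeGuard<'a> {
    value: &'a mut Vec<u8>,
    used_bytes: &'a mut usize,
    size_before: usize,
}

impl<'a> SizeGuard<'a> {
    fn new(value: &'a mut Vec<u8>, used_bytes: &'a mut usize) -> Self {
        let size_before = value.len();
        SizeGuard { value, used_bytes, size_before }
    }
}

impl Deref for SizeGuard<'_> {
    type Target = Vec<u8>;
    fn deref(&self) -> &Vec<u8> { self.value }
}

impl DerefMut for SizeGuard<'_> {
    fn deref_mut(&mut self) -> &mut Vec<u8> { self.value }
}

impl Drop for SizeGuard<'_> {
    fn drop(&mut self) {
        // Apply the delta between the size before and after mutation.
        *self.used_bytes = *self.used_bytes - self.size_before + self.value.len();
    }
}
```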
I also thought that the "MemorySize" trait could be implemented for primitives and common std structs (like Vec), and we could also provide a macro to derive this trait for structs containing values that already have the MemorySize trait.
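For illustration, the trait could look something like this (hypothetical sketch: it measures len rather than spare capacity for brevity, and the derive macro mentioned above would generate a sum-of-fields impl for user structs):

```rust
use std::mem::size_of;

// Hypothetical trait: each value reports its own footprint in bytes,
// counting both its inline size and anything it owns on the heap.
trait MemorySize {
    fn memory_size(&self) -> usize;
}

// Primitives occupy only their inline size.
impl MemorySize for u64 {
    fn memory_size(&self) -> usize {
        size_of::<u64>()
    }
}

// A Vec adds its own header plus each element's footprint. (Elements'
// inline storage lives in the Vec's heap buffer, so summing their
// memory_size counts it exactly once; spare capacity is ignored here.)
impl<T: MemorySize> MemorySize for Vec<T> {
    fn memory_size(&self) -> usize {
        size_of::<Self>() + self.iter().map(|t| t.memory_size()).sum::<usize>()
    }
}
```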
Anyway, I'm thinking about creating a pull request implementing these features; would that be a welcome addition?
One thing I'm still not sure about: should we create a wrapper around LruCache, or try to implement this directly in LruCache (which I'm not even sure is possible without breaking compatibility with how it works right now)? I'm leaning toward a wrapper (like the crate I mentioned above).