Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add HEXPIRE #857

Open
badrishc opened this issue Dec 6, 2024 · 5 comments · May be fixed by #864
Open

Add HEXPIRE #857

badrishc opened this issue Dec 6, 2024 · 5 comments · May be fixed by #864
Assignees
Labels
API enhancement New feature or request help wanted Extra attention is needed

Comments

@badrishc
Copy link
Contributor

badrishc commented Dec 6, 2024

Feature request type

API addition

Is your feature request related to a problem? Please describe

We would like to support HEXPIRE and its variants in Garnet. The key goal is to support expiration of individual keys in a HASH object.

Describe the solution you'd like

The proposal is to keep two dictionaries:

  • Dictionary<byte[], byte[]> as it currently exists
  • Dictionary<byte[], long>, a secondary map that defaults to null, and only has entries for items that have expiration times.

That way, in the common case where there is no expiration for a hash object, we only incur one extra comparison (obj == null). Keeping it separate also allows the no-expiration case to have no memory overheads.

On a read operation into the hash object, we have to check this table to prune out the results that are sent to the user. We might have wrapper methods for accessing the hash object to perform this pruning (gated by the null check on the object, for performance). Note that under read lock, we cannot update the data structure to remove the expired elements, we can only ignore the expired entries.

On any write operation such as HSET, we have the opportunity to clean up the data structure by deleting the expired entries. To support this, we keep a priority queue:

  • PriorityQueue<byte[], long>

The priority queue orders by expiration time, so that we can efficiently Peek and determine whether keys in the hash have expired (i.e., their expiration timestamp is less than DateTime.Now). This is similar to CheckExpiry in the main store. The expired keys are removed from the main dictionary and the expiration dictionary. The priority queue is popped.

We add a HCOLLECT key operation that will prune the expirations for the hash with that key. There might be a HCOLLECT * as well which will scan and collect all the hash keys.

Finally, need some background prune thread that will handle expiration pruning for the whole system. Perhaps it can be similar to compaction with a ExpirationPruneFrequency periodicity. It could be a periodic task similar to this: https://github.com/microsoft/garnet/blob/main/libs/server/StoreWrapper.cs#L393C22-L393C26.

Describe alternatives you've considered

See #674 where the expirations are stored in the same dictionary that holds values. This is problematic due to the additional allocation of the wrapper and the memory/cpu overhead even in the common case where expirations are not being used for individual keys in the HASH object.

Additional context

No response

@Vijay-Nirmal
Copy link
Contributor

You can assign it to me

@badrishc
Copy link
Contributor Author

badrishc commented Dec 6, 2024

Thanks @Vijay-Nirmal, look forward to it!

@Vijay-Nirmal
Copy link
Contributor

There are problems using PriorityQueue for this implementation.

  1. As of .Net 8, there is no remove method to delete an item when using HPERSIST operation. Remove method was added to PriorityQueue as part of .Net 9 which has O(n) time complexity
  2. As of .Net 8, there is no way to update the priority of an already existing item. In .Net 9, we can use Remove and Enqueue it as a new item but removing an item is O(n). So it won't be an efficient operation.

If we are fine with taking this performance compromise of using Remove method then we can only finish this implementation after we move Garnet to .Net 9 (Not sure if we are planning to do that or skip .Net 9 version)

Alternatively, I think we can use SortedDictionary<long, ushort> instead of PriorityQueue<byte[], long>, where the key is the expiration time in ticks and the value is the number of keys with the same expiration time. Below is the implementation plan with SortedDictionary

  • To set the expiration: We can add new entry with value as 1 if the expiration does not exist, or update with value + 1 if the expiration already exists.
  • To update the expiration: Use the old expiration to find the entry and remove it if the value is 1 else update the value with value - 1 and do the same as the above point.
  • To find a check for expired items: Use First() method similar to Peek method in PriorityQueue implementation
  • To persist an expiring item (HPERSIST): Use the old expiration to find the entry and remove it if the value is 1 else update the value with value - 1

@badrishc Let me know your opinion on this alternative design

@badrishc
Copy link
Contributor Author

Can we just keep priority queue for now, and not actually remove the item? The point of truth is anyway the expirationTimes dictionary that we can verify expiration, for items being peeled out via priority queue.

If not, it is fine to use SortedDictionary for now. Concern is that adding/removing etc. is very slow, and First() is expensive O(log N).

@Vijay-Nirmal
Copy link
Contributor

I managed to do this with PriorityQueue itself

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API enhancement New feature or request help wanted Extra attention is needed
Projects
None yet
3 participants