-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: ManagedRangeCache
's Last-access-time Datastructure - RangeLru
#27
Conversation
ManagedRangeCache
's Last-access-time Oracle - RangeLru
ManagedRangeCache
's Last-access-time Oracle - RangeLru
ManagedRangeCache
's Last-access-time Oracle - RangeLru
ManagedRangeCache
's Last-access-time Datastructure - RangeLru
|
||
At any point in time, the `RangeCache` owns a contiguous range. | ||
|
||
In-line with our non-OOM and memory-efficiency requirements, we require a way to evict ranges that have not been requested by the operator for a long time. We choose to do so in a similar fashion to `ManagedLru` - based on the last-accessed time of a given subrange (the subranges being a partition of the total range owned by `RangeCache`). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How the subranges are divided? Considering the rows kept in DynamicFilter can be inserted/deleted continuously, it may be non-trivial 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Btw, why not maintain the epoch per row?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Btw, why not maintain the epoch per row?
This would violate the integrity of range cache since it guarantees to cache all rows in a certain range.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How the subranges are divided?
It's quite straightforward. Everytime some range is accessed, it would be considered a new subrange with a given last-accessed time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Btw, why not maintain the epoch per row?
Actually, on further thought, perhaps ManagedLru cache can work since it will update all the rows in a range with new timestamp on every read. So it will achieve the same thing but be less efficient since it has to track each row rather than the access of logical range.
# The Design of `ManagedRangeCache`'s Last-access-time Oracle - `RangeLru` | ||
|
||
## Summary | ||
The design of a mechanism for evicting ranges last accessed before a certain epoch for the `DynamicFilter` operator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the "access" include both writes and reads?
Note that for DynamicFilter the writes are from outer side and the reads are from inner side (i.e. the 1-row side)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After thinking for a while, I suppose that the "access" should only include reads from inner-side (moving the "row pointer" up or down), because the cache is designed used for reads only. The outer side is basically identical to a MaterializeExecutor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, access only indicates reads.
fn evict_before(&self, epoch: u64) -> Vec<ScalarRange> | ||
``` | ||
|
||
1. The main data structure (main index) is a `BTreeMap { epoch -> Vec<ScalarRange> } ` to facilitate the `evict_before` operation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since there should be only one access for each epoch (right?), Vec
is not needed here
|
||
Notes: | ||
1. `evict_before` needs to delete all ranges below the watermark from both the main index and the secondary index | ||
2. `access` needs to overwrite any range that overlaps with the accessed range in both the main and secondary indexes. In the former, it will modify the endpoints of the ranges that it partially overlaps with, and delete ranges that it fully overlaps with. In the latter, it will delete the left bounds that it fully overlaps with. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What will happen if the last-access time of subranges are not ordered? For example
1. 0 ~ 10 last access epoch = 100
2. 10 ~ 20 last access epoch = 200
3. 20 ~ 30 last access epoch = 150
4. 30 ~ 40 last access epoch = 300
We want to evict_before(160)
and 1 & 3 will be evicted, but 2 & 4 is not continuous subrange.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Every contiguous range of epochs maps onto a contiguous range in the cache. So this scenario will never happen.
In the impl, we include many sanity checks regarding this continuity property.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Every contiguous range of epochs maps onto a contiguous range in the cache
Wow, I suddenly understand what you mean!
For DynamicFilter, the ranges accessed must be always continuous - the end of the range at epoch n
must be the start of the range at epoch n+1
. Your algorithm is based on this fact.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm having another case:
1. 10~30 epoch = 1
2. 20~30 epoch = 2
3. 20~25 epoch = 3
If we evict epoch 1 and 2, then only (20~25
) will be kept and (10~20
, 25~30
) will be evicted. (Not sure if I got this correctly)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm having another case
Oh, indeed. The range that is evicted is not a continuous range in this case, so I made an error.
So evict before can indeed return two contiguous ranges (when the remaining range splits up the old range).
However, the property that the remaining range is always contiguous is still correct.
Please allow me also to propose my idea here. It's too simple to be written into an RFC. First of all, caching is helpful but not critical to DynamicFilter. I'll show this by discussion in 2 cases:
In summary, for major cases of Second, based on that conclusion, I would prefer either no caching or a very simple caching implementation.
After thinking for a while, I felt that even a fixed-size caching is not trivial to implement, so I would like to propose no caching at all. |
Indeed, it may be ok to rely on the block cache. Operator-side cache incurs memory (plus code and runtime) overhead that can only be justified by a random read scenario, whereas ranges would probably display a good block-level cache-locality. (Although I'm not sure if the range interface to storage incurs a network latency cost to scan S3 even if the relevant blocks are in cache.) Furthermore, the cacheing mechanism here is designed for reads caused by a pointer that moves repeatedly up and down. It is completely ineffective for a |
No description provided.