Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(executor): Managed DynamicFilter cache eviction based on epoch #6540

Closed
jon-chuang opened this issue Nov 23, 2022 · 2 comments
Closed
Assignees

Comments

@jon-chuang
Copy link
Contributor

jon-chuang commented Nov 23, 2022

The range cache's logical range can be partitioned into a finite number of subranges that have an associated last-accessed epoch. Then, it is easy to evict those old parts of the range. We can prove that this will never result in a non-contiguous range (due to fixed point theorem). We will add some sanity checks to ensure that requested ranges are always contiguous with the last requested.

Since we will modify only the logical range, this operation should not be affected by scaling.

I will post a more detailed design below.

Edit: RFC here risingwavelabs/rfcs#27

@fuyufjh
Copy link
Member

fuyufjh commented Nov 28, 2022

I will post a more detailed design below.

Looking forward

@jon-chuang
Copy link
Contributor Author

jon-chuang commented Nov 28, 2022

We will need a data structure with two operations.

/// This updates the last accessed time for a given range. 
/// `epoch` must be greater than any previously seen epoch.
fn access(&mut self, range: ScalarRange, epoch: u64)

/// This obtains the ranges subject to deletion in a given epoch
fn evict_before(&self, epoch: u64) -> Vec<ScalarRange>
  1. The main data structure is a BTreeMap { epoch -> Vec<ScalarRange> } to facilitate the evict_before operation.
  2. The secondary index is a modified BTreeMap, storing the data BTreeMap { ScalarImpl -> (is_inclusive: bool, epoch) }, allowing one to look up the epochs intersecting with a given range, with an additional field lower_unbounded_range_epoch: Option<u64> as lower-unbounded range's existence and epoch cannot be represented. Alternately to using abstract range bounds, we can use the byte encoding of the range bounds.

Impl details:
access will update the secondary index, by replacing the given range with new bounds, and deleting any existing bounds that are contained within it. Then, the main index will be updated to reflect the change in last accessed time of the corresponding ranges (by adding new range and modifying or deleting old ones).

As for the proof of contiguous range, it is actually simpler. The path of the dynamic filter pointer f(t) is continuous. Hence, any restriction of the path to a continuous range of t is also continuous. This includes both epoch till now, and any past range of epochs (not so relevant, but this makes the time complexity of access and evict_before linear in the number of distinct epochs, i.e. amortized constant time if we access every epoch).

@fuyufjh fuyufjh removed this from the release-0.1.15 milestone Dec 19, 2022
@jon-chuang jon-chuang closed this as not planned Won't fix, can't repro, duplicate, stale Jan 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants