DBCol should implement its own GC method #2732
Comments
FYI for |
It should be explicit. |
Is my understanding correct that the sheer existence of |
@SkidanovAlex my understanding is that we iterate through all columns and call the method to do gc so that we don't forget some columns. |
@bowenwang1996 that doesn't work. Order in which columns are GCed is important, and sometimes to GC one column we need to iterate another. E.g. to GC state, we can't just iterate over the state collections. |
We can count the number of calls and then check it in Store Validator. |
Does a simple attribute-like trait work, something like
or does each column's GC trait include column-specific logic to determine whether a key should be GCed, so that we're reorganizing the GC logic into per-column GC traits? |
@frol WDYT? ^^^
Ideally, it would be awesome to call something like |
@Kouprin why is it not possible? |
@bowenwang1996 I didn't say it's not possible. My doubts are about mutating data a lot which may be complicated to handle in Rust. |
I see. Yes, I found this pattern a lot, and in the internal Notion GC spec I put it together in pseudo-Python code :) So your proposal looks like:
and for columns that want to skip GC, it's a noop or conditional noop in |
Before:
After:
For noop columns they still should be executed somewhere in GC to make sure we don't skip any columns:
Then I think we'll decide what to do with |
Why do you need to do that? Couldn't you check it right after the gc is done in the current iteration? |
@Kouprin told me offline that it's checked in store-validator. My understanding is that it's checked offline for performance reasons? |
No guarantees that all columns must be cleared at each iteration. For example, Chunks are cleared after Blocks. (#2716) |
Worked on this today. It seems Rust doesn't let me
error: expected type, found variant
would work. And match has the better property that it forces you to handle every column. |
@ailisp I thought that we had independent constants, that is why I suggested using a trait. Enum is even better exactly for the enforcement rule of covering all the cases. |
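To make the enum-plus-match idea concrete, here is a minimal sketch. It is not the project's actual code: the `StoreUpdate` stand-in, the `gc_col` function, the short list of variants, and the choice of which column is a noop are all assumptions made for illustration. The point it shows is that a `match` without a wildcard arm stops compiling as soon as a new column is added and not handled.

```rust
/// Hypothetical subset of columns; the real enum has many more variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DBCol {
    ColBlock,
    ColChunks,
    ColTransactionResult,
    ColBlockMerkleTree,
    ColBlockMisc,
}

/// Stand-in for the real store update (assumption): it only records
/// which (column, key) pairs were scheduled for deletion.
#[derive(Default)]
struct StoreUpdate {
    deletions: Vec<(DBCol, Vec<u8>)>,
}

impl StoreUpdate {
    fn delete(&mut self, col: DBCol, key: &[u8]) {
        self.deletions.push((col, key.to_vec()));
    }
}

/// Hypothetical GC entry point. There is deliberately no `_ =>` arm, so
/// adding a new `DBCol` variant is a compile error until it is handled here.
fn gc_col(store_update: &mut StoreUpdate, col: DBCol, key: &[u8]) {
    match col {
        DBCol::ColBlock
        | DBCol::ColChunks
        | DBCol::ColTransactionResult
        | DBCol::ColBlockMerkleTree => store_update.delete(col, key),
        // Explicit noop (illustrative choice): the column is still listed,
        // so skipping it is a visible decision rather than an omission.
        DBCol::ColBlockMisc => {}
    }
}
```

Calling `gc_col(&mut update, DBCol::ColTransactionResult, key)` schedules one deletion, while the noop column leaves the update untouched. Ordering between columns, which matters as noted earlier in the thread, is outside the scope of this sketch.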
Implement #2732: remove cache and update the GC count. Because the GC count is part of the batch update, it is cached in ChainStoreUpdate to implement the semantics column.gc_count += 1.
Test Plan
---------
Test that the existing GC tests still pass and that the GC count of each column is updated correctly when remove_old_data runs. This PR also introduces the first storage migration, tested by running near init and near run with the master binary, then running this branch's binary and observing that it does not crash and that the new column is created.
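The "cache it in ChainStoreUpdate" idea can be sketched roughly as follows. This is an assumption-laden illustration, not the code from the PR: `Store`, the `pending_gc_count` field, `inc_gc_count`, and `commit` are hypothetical names, and the real store persists counts in the database rather than in a `HashMap`. It shows increments accumulating inside the update and reaching the store only when the batch is committed.

```rust
use std::collections::HashMap;

/// Hypothetical subset of columns.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum DBCol {
    ColBlock,
    ColChunks,
}

/// Stand-in for persistent storage (assumption): committed GC counts.
#[derive(Default)]
struct Store {
    gc_count: HashMap<DBCol, u64>,
}

/// Stand-in for a batched update: increments are cached here and only
/// reach the store when the batch is committed.
struct ChainStoreUpdate<'a> {
    store: &'a mut Store,
    pending_gc_count: HashMap<DBCol, u64>,
}

impl<'a> ChainStoreUpdate<'a> {
    fn new(store: &'a mut Store) -> Self {
        ChainStoreUpdate { store, pending_gc_count: HashMap::new() }
    }

    /// Implements the `column.gc_count += 1` semantics against the cache.
    fn inc_gc_count(&mut self, col: DBCol) {
        *self.pending_gc_count.entry(col).or_insert(0) += 1;
    }

    /// Applies all cached increments to the store in one step.
    fn commit(self) {
        let ChainStoreUpdate { store, pending_gc_count } = self;
        for (col, n) in pending_gc_count {
            *store.gc_count.entry(col).or_insert(0) += n;
        }
    }
}
```

If the update is dropped without `commit`, the store's counts stay unchanged, which is the property a batched write is meant to give.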
We missed at least two columns: ColTransactionResult and ColBlockMerkleTree. Each DBCol must implement its own GC method to avoid forgetting about GCing data.
Discussed with @frol; it seems we need to define a trait DBCol with a method gc(store_update: StoreUpdate, key: &[u8]) and implement it for each column, where it executes store_update.delete(column, key).
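A minimal sketch of the trait proposed above might look like this. It is an illustration under assumptions, not the project's implementation: `StoreUpdate` is a stand-in that only records deletions, columns are modeled as unit structs with a hypothetical `NAME` constant, and the update is passed as `&mut StoreUpdate` rather than by value so that one update can be reused across calls.

```rust
/// Stand-in for the real store update (assumption): records deletions
/// as (column name, key) pairs.
#[derive(Default)]
struct StoreUpdate {
    deletions: Vec<(&'static str, Vec<u8>)>,
}

impl StoreUpdate {
    fn delete(&mut self, column: &'static str, key: &[u8]) {
        self.deletions.push((column, key.to_vec()));
    }
}

/// Every column type must provide its own `gc`; a column without an
/// impl cannot be used where `DBCol` is required.
trait DBCol {
    /// Hypothetical identifier for the column.
    const NAME: &'static str;

    fn gc(store_update: &mut StoreUpdate, key: &[u8]);
}

struct ColTransactionResult;
struct ColBlockMerkleTree;

impl DBCol for ColTransactionResult {
    const NAME: &'static str = "ColTransactionResult";

    fn gc(store_update: &mut StoreUpdate, key: &[u8]) {
        store_update.delete(Self::NAME, key);
    }
}

impl DBCol for ColBlockMerkleTree {
    const NAME: &'static str = "ColBlockMerkleTree";

    fn gc(store_update: &mut StoreUpdate, key: &[u8]) {
        store_update.delete(Self::NAME, key);
    }
}
```

With this shape, `ColTransactionResult::gc(&mut update, key)` schedules the deletion for that column. One limitation of the sketch is that the trait alone does not guarantee that GC actually calls every column's method.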