chain: gc partial chunks on archival nodes #6439
Absence of tests is concerning. I guess it's the usual "we don't have infra to test this conveniently", but the logic here looks quite finicky...
Yeah, I’ve tested this manually, and pytest/tests/sanity/block_sync_archival.py checks that the node is able to fill in those requests, but there is currently no test that checks that the gc is actually happening. I am contemplating a few approaches and plan to add something.
4cc84bb to 60e4960
@mzhangmzz, PTAL
Looks great!
I am not very comfortable that we don't have a unit test for this. It feels to me that it should be relatively easy to do with TestEnv.
So I’ve been thinking about rolling this feature out and started wondering how third-party operators will deal with it. Given the need to run recompress and then set this option, I concluded that, in the end, adding the option just leaves a window open for someone to end up with a messy archival node. On top of that, my current thinking is to cut 1.25.1, which would act as an opt-in step.
It’s kinda tested a bit in the Python test at the moment, so at least there’s that.
This is commit 6be2e0e upstream.
Start garbage collecting ColPartialChunks and ColInvalidChunks on
archival nodes.  The former is quite a sizeable column and its data can
be recovered from ColChunks. The latter is only needed when operating
at head.
Note that this is likely insufficient for the garbage collection to
happen in reasonable time (since with current default options we’re
garbage collecting only two heights at a time). It’s best to clean
out the two columns.
Issue: #6242
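
To make the "two heights at a time" concern concrete, here is a minimal toy sketch of a bounded gc step. It is not the code from this PR: `ArchivalStore`, `gc_step`, `keep_heights`, `gc_limit` and `steps_to_catch_up` are hypothetical names invented for illustration, and the dictionaries merely stand in for ColChunks, ColPartialChunks and ColInvalidChunks.

```python
from dataclasses import dataclass, field


@dataclass
class ArchivalStore:
    """Toy stand-in for the column store (hypothetical, not nearcore's API)."""
    chunks: dict = field(default_factory=dict)          # stands in for ColChunks (never gc'd)
    partial_chunks: dict = field(default_factory=dict)  # stands in for ColPartialChunks
    invalid_chunks: dict = field(default_factory=dict)  # stands in for ColInvalidChunks
    chunk_tail: int = 0                                 # lowest height not yet cleaned


def gc_step(store, head, keep_heights, gc_limit=2):
    """Clean at most `gc_limit` heights older than `head - keep_heights`.

    Only the partial and invalid chunk columns are touched; full chunks stay,
    since an archival node must keep them. Returns the heights processed.
    """
    stop = max(head - keep_heights, 0)
    processed = 0
    while store.chunk_tail < stop and processed < gc_limit:
        height = store.chunk_tail
        store.partial_chunks.pop(height, None)
        store.invalid_chunks.pop(height, None)
        store.chunk_tail += 1
        processed += 1
    return processed


def steps_to_catch_up(backlog_heights, gc_limit=2):
    """Number of gc steps needed to clear a backlog (ceiling division)."""
    return -(-backlog_heights // gc_limit)


store = ArchivalStore(
    chunks={h: f"chunk-{h}" for h in range(10)},
    partial_chunks={h: f"partial-{h}" for h in range(10)},
    invalid_chunks={3: "invalid-3"},
)
gc_step(store, head=10, keep_heights=5)  # cleans heights 0 and 1 only
```

With a limit of two heights per step, a backlog of N heights needs about N / 2 steps under this model, which is why cleaning out the two columns in one go is the more practical route for an existing archival node.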