-
Notifications
You must be signed in to change notification settings - Fork 53
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore: vacuum in time based retention policy #2165
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks for that @ABresting !
Would it be possible to test that this actually reduces the sqlite file size?
Cheers
You can find the image built from this PR at
Built from 9d78ab5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved BUT like Ivan said. Are we sure it works?
Now that's actually a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving, although I think we have to monitor how long this vacuuming can stall nodes with large DBs (e.g. Status with 30 days of history), or do we already have data on how long a vacuum can take?
Would be good to test this PR with a copy of a database like status.test or status.prod. I'm worried that this will end up impacting negatively the node. |
The go-waku prod fleet is currently connected to status.prod and keeps 7d of history. VACUUM took 11mins to execute, (because of this, this PR was introduced to indicate the user that something is happening waku-org/go-waku#838) |
I agree with @richard-ramos here. Approved previously, but with some hard numbers of time it takes to vacuum, I don't think this is a good idea. @ABresting @Ivansete-status I'm not sure that periodic vacuuming should be part of retention policy at all - this seems to be something specifically required by SQLite and not any other type of backend. In this case a better approach is indeed to have infra perform the vacuum "out of band" from time to time so as not to stall production nodes. In our case, this is only necessary until the migration to postgresql, which should be imminent. |
Good point indeed! How long does it take if you perform VACUUM regularly? For example, daily, or two consecutive VACUUM calls? I have the impression that it took 11' because the vacuum operation needed to handle a lot of data. I believe that the VACUUM concept is widely used in other backends. In Postgres, the "standard" VACUUM operation doesn't necessarily block other operations (see this.) And we are performing it asynchronously. Finally, if we decide that the VACUUM operation should be made manually, then we will need to adapt the other retention policies (capacity and size.) as well. |
@yakimant executed VACUUM twice in a node with a small db with 1.8GB of data and this is the result:
2m the first run, 1m51s the second. Not much time difference between both attempts. Some items to take into account when deciding whether to proceed with automated vacuums:
I'd suggest downloading a copy of one of those databases and running VACUUM locally first to see the impact
Ah! i'm only focusing on sqlite in here. I have not read yet on how postgresql handles vacuum |
Thanks for checking @richard-ramos , @yakimant ! In this case, even though this PR is approved we can't merge it @ABresting. I think we need to close this PR and submit a different one where we remove the On the other hand, we need to create a separate chron script to perform the vacuum task separately under certain conditions. WDYT @richard-ramos , @yakimant ? |
I have simmilar feelings, we need to address sqlite vacuuming differently. |
I'm actually thinking of removing the vacuum functionality of go-waku because seems to me that having it as part of the node is too problematic. |
closing this PR since due to SQLite it is better to not use automatic vacuuming. |
Description
Enables the vacuum of DB in Time-based retention policy.
Changes
Issue
#2164