feat: postgres vacuum enabled with test case #2313
Conversation
This PR may contain changes to the database schema of one of the drivers. If you are introducing any changes to the schema, make sure the upgrade from the latest release to this change passes without any errors/issues.
You can find the image built from this PR at: built from 6164530
LGTM
Thanks for this!
Nevertheless, the behaviour isn't 100% valid, because the database size is not changed at all.
I think we need to properly configure autovacuum in the database (as suggested by @yakimant) and simplify this logic. I'd remove the while.
On the other hand, I've tested locally with the following settings in order to configure a quite intense vacuum service, and the database size gets reduced properly:
autovacuum = on # Enable autovacuum subprocess? 'on'
# requires track_counts to also be on.
# Aggressive settings for frequent autovacuum operations
autovacuum_vacuum_scale_factor = 0.01 # Trigger vacuum at 1% of dead tuples
autovacuum_vacuum_threshold = 50 # Minimum number of updated or deleted tuples before vacuum
autovacuum_analyze_scale_factor = 0.01 # Trigger analyze at 1% of changed tuples
autovacuum_analyze_threshold = 50 # Minimum number of inserted, updated, or deleted tuples before analyze
autovacuum_freeze_max_age = 200000000 # Maximum age before forced vacuum freeze
# Optionally adjust vacuum cost delay to control vacuuming speed
autovacuum_vacuum_cost_delay = 10 # Vacuuming cost delay in milliseconds
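The effect of these settings can be reasoned about with the trigger formula from the PostgreSQL docs: autovacuum considers a table once dead tuples exceed `threshold + scale_factor * reltuples`. A small sketch of that formula (the table sizes below are hypothetical, just to compare the aggressive settings above against the defaults):

```python
# PostgreSQL autovacuum trigger condition (from the official docs):
#   dead_tuples > autovacuum_vacuum_threshold
#                 + autovacuum_vacuum_scale_factor * reltuples
def autovacuum_triggers(dead_tuples: int, reltuples: int,
                        threshold: int = 50,
                        scale_factor: float = 0.01) -> bool:
    """True if autovacuum would consider this table for vacuuming."""
    return dead_tuples > threshold + scale_factor * reltuples

# With the aggressive settings above, a 1M-row table qualifies once
# ~10,050 dead tuples accumulate; with the stock defaults
# (threshold=50, scale_factor=0.2) it takes ~200,050.
print(autovacuum_triggers(10_051, 1_000_000))                    # aggressive
print(autovacuum_triggers(10_051, 1_000_000, scale_factor=0.2))  # defaults
```

This is why lowering `autovacuum_vacuum_scale_factor` matters far more than the threshold for large tables.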
@Ivansete-status, just to mention that I've never used it myself. Thanks for trying it! It would be great to know how well it works for us.
I've read a bit about it. It looks like it should help, but I don't think this will actually make the space available to the filesystem.
Thanks @ABresting !
It looks nice. I've added some comments. Ping me when done and I'll double-check again.
Regarding the disk space consumption, I've noticed that VACUUM doesn't work well in a normal scenario. In other words, it only worked well when I forced VACUUM every two seconds.
I've also tried autovacuum and it didn't work well either; I couldn't manage to reduce the database size.
IMO, the only working solution is to use the pg_repack tool, even though at first I was quite reluctant to use it (it requires installing a non-standard extension and having the pg_repack utility installed in the system as well).
As you properly mentioned, this PR is not enough and we need to perform additional actions to keep the database size bounded.
# sleep to give it some time to complete vacuuming
await sleepAsync(350)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sleep shouldn't be needed. If we need to perform that sleep to make tests work properly, I suggest applying that sleep in the tests directly.
That applies to the other two retention policies :)
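Moving the wait into the tests also allows replacing the fixed sleep with a bounded poll, which is both faster and less flaky. A minimal sketch of that pattern (the `condition` callable is hypothetical; in the real tests it would check the database size):

```python
import time

def wait_until(condition, timeout: float = 5.0, interval: float = 0.05) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse.

    Keeps fixed sleeps out of production code: the test owns the waiting,
    and returns as soon as the condition holds instead of always sleeping
    the full duration.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)
    return condition()  # one last check at the deadline

# Hypothetical usage in a retention-policy test:
#   assert wait_until(lambda: db_size_bytes() < size_before_vacuum)
```

The equivalent in the Nim test suite would be a small `waitFor`-style loop around `sleepAsync`, but the shape of the idea is the same.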
# NOTE: Using SQLite vacuuming is done manually, we delete a percentage of rows
# if vacumming is done automatically then we aim to check DB size periodially for efficient
# retention policy implementation.
# to shread/delete messsges, get the total row/message count
tiny typo:
- # to shread/delete messsges, get the total row/message count
+ # to shread/delete messages, get the total row/message count
Do you mind reviewing all the comments within this execute proc? There is another tiny typo in "periodially", and some lines seem outdated.
let dbEngine = driver.getDbType()
if dbEngine == "sqlite":
  return ok()
That doesn't seem correct, as it prevents applying the "size" retention policy in SQLite.
This logic might be better suited to run from within driver.performVacuum().
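Pushing the engine check down into the driver keeps the retention policy engine-agnostic. A rough sketch of that dispatch (class and method names are hypothetical, mirroring the getDbType/performVacuum idea from the PR):

```python
class SqliteDriver:
    def get_db_type(self) -> str:
        return "sqlite"

    def perform_vacuum(self) -> str:
        # SQLite's VACUUM blocks reads/writes, so this driver declines
        # automatic vacuuming; operators run it manually instead.
        return "skipped"

class PostgresDriver:
    def get_db_type(self) -> str:
        return "postgres"

    def perform_vacuum(self) -> str:
        # Postgres VACUUM does not block normal reads/writes, so it is
        # safe to run as part of the retention policy.
        return "vacuumed"

def apply_size_retention(driver) -> str:
    # The policy no longer needs `if dbEngine == "sqlite": return ok()`;
    # each driver decides what "vacuum" means for its own engine.
    return driver.perform_vacuum()
```

The size-based deletion itself still runs for every engine; only the vacuum step becomes driver-specific.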
Thanks for the nice review, Ivan!
I am well aware of the situation you are mentioning. That's a Postgres async issue: when you delete a large number of rows, they take time to delete, and vacuuming on top of that takes time too. The good part is that all those operations will eventually take place, if not instantly, since Postgres operations are best effort. There is not a lot we can do there. I'm open to hearing about the pg_repack tool's trade-offs in our case, since I have a hunch that it comes with some baggage: performance degradation and data corruption, to name some issues I could find.
@Ivansete-status, VACUUM and autovacuum are not expected to reduce disk usage; they make the freed space reusable internally. But that's probably fine in many cases.
Thanks for the comment @yakimant ! You are absolutely right. In order to have the "size" retention policy working correctly in Postgres, we need to take additional action to reduce the disk space occupied by the database. If not, every time the retention policy is applied, it will drop rows until the table gets empty. If we cannot work on such an external tool (i.e. …), we will need another approach.
Thanks for the comment Ivan 💯 About the disk space: I believe that from a user perspective, even if the cleaned space is not reclaimed by the filesystem but is still reused by the DB to insert new messages, it is a win anyway. I do not think we have run infra tests in a scenario where the retention policy works this aggressively (with the risk of emptying the DB). I think table insertions were not the major issue; it was the ever-growing log size. BTW, running/checking the retention policy after some amount of time is how it currently works in the app code, I believe.
Thanks for the comment @ABresting ! You are right that the database won't get empty in the happy scenario where the node receives messages continuously. However, we cannot always guarantee that, and there will be periods of inactivity. We cannot deliver something that won't work well in 100% of the cases. The only solution I see, if we want to support the "size" retention policy for Postgres, is to start using the pg_repack tool.
Thanks so much for the PR @ABresting !
Description
With this change, the Waku Store protocol now supports the size retention policy on PostgreSQL. Outdated messages are deleted and the DB enters a non-blocking VACUUM state. Effectively, it reduces the size of the DB on disk while allowing parallel read/write operations on the database.
Changes
Apart from the vacuum functionality in the PostgreSQL database, I have extended the ArchiveDriver so that one can query which type of database driver is currently in use, i.e. SQLite, Postgres, or in-memory (Queue driver). This was required to cleanly disable SQLite-based vacuuming.
Retention policy test cases have also been updated to use Postgres instead of SQLite. We make this choice since the SQLite VACUUM process blocks read/write operations, so it was decided to run VACUUM manually on Waku nodes/clients using SQLite as the store archive.
Changed the size-based retention policy test case so that it verifies that, after performing the vacuum, the DB size is reduced.
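The delete-then-vacuum flow described above can be reproduced end-to-end with Python's stdlib sqlite3 (the schema here is hypothetical, not Waku's actual one). It also illustrates the point debated in the review: deleting rows alone leaves the file size essentially unchanged, and only VACUUM returns the space to the filesystem.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "store.db")
db = sqlite3.connect(path)
db.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, payload BLOB)")
db.executemany("INSERT INTO messages (payload) VALUES (?)",
               [(os.urandom(1024),) for _ in range(2000)])
db.commit()
size_full = os.path.getsize(path)

# Retention policy step 1: drop the oldest messages.
db.execute("DELETE FROM messages WHERE id <= 1900")
db.commit()
size_after_delete = os.path.getsize(path)  # freed pages stay allocated

# Step 2: VACUUM rewrites the file, returning space to the filesystem.
# (In SQLite this blocks other reads/writes, which is why the PR leaves
# SQLite vacuuming as a manual operation.)
db.execute("VACUUM")
size_after_vacuum = os.path.getsize(path)
db.close()

print(size_after_delete >= size_full * 0.9)  # deletion alone barely shrinks it
print(size_after_vacuum < size_full * 0.5)   # vacuum reclaims the space
```

The same two-step shape (bulk DELETE, then an engine-appropriate vacuum) is what the retention policy performs against Postgres, where VACUUM is non-blocking.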
Issue
closes #1885