Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include wal_name in list_bucket prefix to improve performance of download_wal #876

Merged
merged 3 commits into from
Jan 17, 2024

Conversation

sjuls
Copy link
Contributor

@sjuls sjuls commented Nov 30, 2023

The current implementation of download_wal lists all files in the wal directory and filters client-side for the actual wal being requested. However since we're querying blob storage which filters by prefix we can include the wal_name in the prefix to execute this filtering on the cloud provider while still retrieving both compressed and non-compressed files.

The list_bucket api in the cloud_interface currently specifies prefix for filtering without any restrictions on the prefix pointing to a folder, and I believe all cloud providers supported by barman support filtering by any prefix as well.

This change should provide a noticeable performance improvement on wal directories containing a large amount of archived wal segments.

@sjuls
Copy link
Contributor Author

sjuls commented Nov 30, 2023

Hi again @mikewallace1979,

So as you might have guessed we're struggling a bit with a huge amount wal segments in blob storage 😅. So we're looking for opportunities to improve wal retrieval to reduce recovery times.

Let me know what you think of this proposal 🙏

@mikewallace1979
Copy link
Contributor

This looks like a good change - I've tested it with the three supported cloud providers using a combination both with and without compression and it is able to find history files and WAL files as expected.

The unit test test_fails_if_wal_not_found will need updating because it currently mocks a list_bucket response containing a path other than the one requested. That worked with the old code but with this patch I think the test should just be mocking an empty list as the list_bucket response.

@sjuls
Copy link
Contributor Author

sjuls commented Nov 30, 2023

Great 👍 I've pushed a correction to the test. Thanks!

Copy link
Contributor

@gcalacoci gcalacoci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me

@martinmarques martinmarques merged commit d8bb862 into EnterpriseDB:master Jan 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants