[Backport to 5.16] fix for list_objects #8105

Merged 1 commit on Jun 3, 2024

dannyzaken

Signed-off-by: Danny Zaken <dannyzaken@gmail.com>
(cherry picked from commit 219699d)

Explain the changes

  • For Postgres queries, there is no need to make an extra DB query after processing the results of the first query.
  • In Mongo map-reduce, a query could return fewer entries than the limit even though relevant entries remained.
  • In Postgres this does not happen, since map_common_prefixes iterates until it finds enough records (or until it reaches the end).
  • Performing the extra DB query in Postgres can cause a very long query that eventually returns 0 entries. One scenario is a directory that is the last entry returned by the previous query and has a large number of objects under it (e.g. /folder/file0 .. /folder/file9999999). For the extra query, the marker used is 'folder/', so map_common_prefixes starts iterating from 'folder/', looking for all objects that do not start with 'folder/' but end with a '/' (the delimiter). Postgres has no smart way to do this, so it goes over all objects under 'folder/' and filters them out. This can take several minutes to complete.
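
The cost of that extra query can be illustrated with a small sketch. It is a toy model only, not NooBaa's code: an in-memory sorted list stands in for the Postgres table, and the names `build_keys` and `scan_common_prefixes` are hypothetical helpers invented for this example. The assumption is that the scan must visit each row after the marker one by one and filter out keys under an already-returned prefix.

```python
from bisect import bisect_right


def build_keys(n):
    """Hypothetical data set: one top-level object plus n objects under 'folder/'."""
    return sorted(["a.txt"] + [f"folder/file{i}" for i in range(n)])


def scan_common_prefixes(sorted_keys, marker, delimiter="/", limit=1000):
    """Toy stand-in for a delimiter listing scan (not NooBaa's actual code).

    Returns (entries, rows_scanned).
    """
    entries = []
    rows_scanned = 0
    # A marker ending with the delimiter is a common prefix that was already returned.
    skip_prefix = marker if marker.endswith(delimiter) else None
    start = bisect_right(sorted_keys, marker)
    for key in sorted_keys[start:]:
        rows_scanned += 1
        if skip_prefix is not None and key.startswith(skip_prefix):
            continue  # filtered out, but the row was still visited
        pos = key.find(delimiter)
        entry = key[: pos + 1] if pos >= 0 else key
        if entries and entries[-1] == entry:
            continue  # same common prefix as the previous entry
        entries.append(entry)
        if len(entries) >= limit:
            break
    return entries, rows_scanned


if __name__ == "__main__":
    keys = build_keys(100_000)
    # First query: fills the page after visiting only two rows.
    print(scan_common_prefixes(keys, marker="", limit=2))
    # Extra query: resumes at 'folder/' and visits every row under it.
    print(scan_common_prefixes(keys, marker="folder/", limit=2))
```

In this model the first call fills its page after two rows, while the follow-up call with marker 'folder/' visits every key under 'folder/' and returns an empty list. That is the pattern the change avoids by skipping the extra query on Postgres.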

Issues: Fixed #xxx / Gap #xxx

  1. https://bugzilla.redhat.com/show_bug.cgi?id=2277990

Testing Instructions:

  • Doc added/updated
  • Tests added

@dannyzaken dannyzaken merged commit 1d8127c into noobaa:5.16 Jun 3, 2024
10 checks passed