Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NSFS | Improve list objects performance on top of NS FS (PR 1/3) #7912

Merged
merged 1 commit into from
Apr 15, 2024
Merged

NSFS | Improve list objects performance on top of NS FS (PR 1/3) #7912

merged 1 commit into from
Apr 15, 2024

Conversation

naveenpaul1
Copy link
Contributor

Explain the changes

  1. This is first PR for listing unordered objects.
  2. This Pr contains implementation of telldir and seekdir methods in napi layer
  3. Global flag is added in config for controlling bucket list flow

Issues: Fixed #xxx / Gap #xxx

  1. Improve list objects performance on top of NS FS #6615

Testing Instructions:

  1. Not possible with current changes.
  • Doc added/updated
  • Tests added

src/native/fs/fs_napi.cpp Show resolved Hide resolved
src/native/fs/fs_napi.cpp Outdated Show resolved Hide resolved
@romayalon
Copy link
Contributor

@naveenpaul1 let's add some tests? :)

config.js Outdated Show resolved Hide resolved
src/native/fs/fs_napi.cpp Outdated Show resolved Hide resolved
Copy link
Member

@guymguym guymguym left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@naveenpaul1 This is great!
It would be great to see the performance improvement from the full impl.
See few comments.

src/native/fs/fs_napi.cpp Outdated Show resolved Hide resolved
@naveenpaul1 naveenpaul1 requested a review from guymguym April 10, 2024 12:59
src/native/fs/fs_napi.cpp Outdated Show resolved Hide resolved
src/native/fs/fs_napi.cpp Outdated Show resolved Hide resolved
Copy link
Member

@guymguym guymguym left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just two small comments but otherwise looks good.

src/native/fs/fs_napi.cpp Outdated Show resolved Hide resolved
Copy link
Member

@guymguym guymguym left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First - we need to switch to BigInt - see #7912 (comment)

Second - I see that readdir also returns the same position that telldir returns as d_off -
https://man7.org/linux/man-pages/man3/readdir.3.html

So we should also return this number (as bigint) from our dir.read() call so that we don't have to call telldir again if we just read the last entry where we want to pause. This means we should add it to our struct Entry (in this file).

@naveenpaul1 naveenpaul1 requested a review from guymguym April 12, 2024 11:47
src/native/fs/fs_napi.cpp Outdated Show resolved Hide resolved
src/native/fs/fs_napi.cpp Outdated Show resolved Hide resolved
src/test/unit_tests/jest_tests/test_list_object.test.js Outdated Show resolved Hide resolved
@naveenpaul1 naveenpaul1 requested a review from guymguym April 15, 2024 08:30
@naveenpaul1 naveenpaul1 requested a review from guymguym April 15, 2024 09:26
@guymguym
Copy link
Member

I am copying here some findings that I want to keep in mind when implementing list objects using this.

It seems that on BSD the docs for telldir/seekdir say that the position is only valid for the lifetime of the dir handle and once it closes the positions are no longer valid to use:

https://man.freebsd.org/cgi/man.cgi?query=seekdir

       The telldir() function returns a	token representing the	current	 loca-
       tion  associated	 with  the named directory stream.  Values returned by
       telldir() are good only for the lifetime	of the DIR pointer, dirp, from
       which they are derived.	If the directory is closed and then  reopened,
       prior values returned by	telldir() will no longer be valid.  Values re-
       turned by telldir() are also invalidated	by a call to rewinddir().

       The  seekdir()  function	sets the position of the next readdir()	opera-
       tion on the directory stream.  The new position reverts to the one  as-
       sociated	 with  the  directory  stream when the telldir() operation was
       performed.

This article describes a bug that was hidden in BSD implementation of telldir for 25 years until it was found by samba users - https://marcbalmer.ch/when-seekdir-wont-seek-to-the-right-position-a9e2b1986203

In Linux the docs do not specifically say for how long the positions returned by telldir are valid, so it might be that different filesystems will interpret those docs differently.

Overall, I think we should be careful here, and enable the optimization only when we know for sure that the filesystem and the kernel supports stable dir positions even using unrelated dir handles.

Signed-off-by: naveenpaul1 <napaul@redhat.com>
@naveenpaul1 naveenpaul1 requested a review from guymguym April 15, 2024 10:19
Copy link
Member

@guymguym guymguym left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Lets get it merged.

@naveenpaul1 naveenpaul1 merged commit dbe81f9 into noobaa:master Apr 15, 2024
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants