Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow repeated readdir offsets #581

Merged
merged 3 commits into from
Oct 27, 2023

Conversation

jamesbornholt
Copy link
Member

@jamesbornholt jamesbornholt commented Oct 27, 2023

Description of change

POSIX allows seeking an open directory handle, which in FUSE means the
offset can be any offset we've previously returned. This is pretty
annoying for us to implement since we're streaming directory entries
from S3 with ListObjects, which can't resume from an arbitrary index,
and can't fit its continuation tokens into a 64-bit offset anyway. So
we're probably never going to truly support seeking a directory handle.

But there's a special case we've seen come up a couple of times (#477, #520):
some applications read one page of directory entries and then seek back
to 0 and do it again. I don't fully understand why they do this, but
it's common enough that it's worth special casing.

This change makes open directory handles remember their most recent
response so that they can repeat it if asked for the same offset again.
It's not too complicated other than needing to make sure we do
readdirplus correctly (managing the lookup counts for entries that are
being returned a second time).

I've tested this by running the PHP example from #477, which now works.

Related issues: fixes #477, fixes #520.

Does this change impact existing behavior?

No, it allows a case to succeed that previously failed.


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and I agree to the terms of the Developer Certificate of Origin (DCO).

@jamesbornholt jamesbornholt temporarily deployed to PR integration tests October 27, 2023 03:35 — with GitHub Actions Inactive
@jamesbornholt jamesbornholt temporarily deployed to PR integration tests October 27, 2023 03:35 — with GitHub Actions Inactive
@jamesbornholt jamesbornholt temporarily deployed to PR integration tests October 27, 2023 03:35 — with GitHub Actions Inactive
@jamesbornholt jamesbornholt temporarily deployed to PR integration tests October 27, 2023 03:35 — with GitHub Actions Inactive
POSIX allows seeking an open directory handle, which in FUSE means the
`offset` can be any offset we've previously returned. This is pretty
annoying for us to implement since we're streaming directory entries
from S3 with ListObjects, which can't resume from an arbitrary index,
and can't fit its continuation tokens into a 64-bit offset anyway. So
we're probably never going to truly support seeking a directory handle.

But there's a special case we've seen come up a couple of times (awslabs#477, awslabs#520):
some applications read one page of directory entries and then seek back
to 0 and do it again. I don't fully understand _why_ they do this, but
it's common enough that it's worth special casing.

This change makes open directory handles remember their most recent
response so that they can repeat it if asked for the same offset again.
It's not too complicated other than needing to make sure we do
readdirplus correctly (managing the lookup counts for entries that are
being returned a second time).

I've tested this by running the PHP example from awslabs#477, which now works.

Signed-off-by: James Bornholt <bornholt@amazon.com>
@jamesbornholt jamesbornholt temporarily deployed to PR integration tests October 27, 2023 03:38 — with GitHub Actions Inactive
@jamesbornholt jamesbornholt temporarily deployed to PR integration tests October 27, 2023 03:38 — with GitHub Actions Inactive
@jamesbornholt jamesbornholt temporarily deployed to PR integration tests October 27, 2023 03:38 — with GitHub Actions Inactive
@jamesbornholt jamesbornholt temporarily deployed to PR integration tests October 27, 2023 03:38 — with GitHub Actions Inactive
Copy link
Contributor

@dannycjones dannycjones left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, I was hoping we might add support for this scenario soon.

I have just one concern on the repeated reply sizes matching, see comments.

mountpoint-s3/src/fs.rs Outdated Show resolved Hide resolved
mountpoint-s3/src/fs.rs Show resolved Hide resolved
mountpoint-s3/tests/fs.rs Show resolved Hide resolved
Copy link
Contributor

@dannycjones dannycjones left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a CHANGELOG.md entry would be worth adding since it'll be a 'bug fix' / 'new feature' to be able to support languages which use readdir in this way!

Signed-off-by: James Bornholt <bornholt@amazon.com>
Signed-off-by: James Bornholt <bornholt@amazon.com>
@jamesbornholt jamesbornholt temporarily deployed to PR integration tests October 27, 2023 13:54 — with GitHub Actions Inactive
@jamesbornholt jamesbornholt temporarily deployed to PR integration tests October 27, 2023 13:54 — with GitHub Actions Inactive
@jamesbornholt jamesbornholt temporarily deployed to PR integration tests October 27, 2023 13:54 — with GitHub Actions Inactive
@jamesbornholt jamesbornholt temporarily deployed to PR integration tests October 27, 2023 13:54 — with GitHub Actions Inactive
@jamesbornholt jamesbornholt marked this pull request as ready for review October 27, 2023 13:57
Copy link
Contributor

@dannycjones dannycjones left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

mountpoint-s3/src/fs.rs Show resolved Hide resolved
@jamesbornholt jamesbornholt added this pull request to the merge queue Oct 27, 2023
Merged via the queue into awslabs:main with commit 8e5688d Oct 27, 2023
18 checks passed
@jamesbornholt jamesbornholt deleted the readdir-repeat branch October 27, 2023 15:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants