Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement O_DIRECT for open to bypass metadata cache #614

Merged
merged 4 commits into from
Nov 21, 2023

Conversation

dannycjones
Copy link
Contributor

Description of change

This is part of metadata and data caching feature currently in implementation: #581.

Some applications may want to opt-out of caching, maybe aware or not aware of Mountpoint itself. Applications use the O_DIRECT flag when opening a file to indicate that file content should be read directly, bypassing the Kernel caches.

We want to also skip Mountpoint's metadata cache when this flag is provided, so that open calls using it will see the latest object in S3 at open - as it works today without opting into caching.

Relevant issues: #255

Does this change impact existing behavior?

No breaking changes, changes only to a feature has not been released yet.


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and I agree to the terms of the Developer Certificate of Origin (DCO).

Signed-off-by: Daniel Carl Jones <djonesoa@amazon.com>
@dannycjones dannycjones requested a review from passaro November 20, 2023 13:58
@dannycjones dannycjones temporarily deployed to PR integration tests November 20, 2023 13:58 — with GitHub Actions Inactive
@dannycjones dannycjones temporarily deployed to PR integration tests November 20, 2023 13:58 — with GitHub Actions Inactive
@dannycjones dannycjones temporarily deployed to PR integration tests November 20, 2023 13:58 — with GitHub Actions Inactive
}

// Open the file w/ O_DIRECT, which should see the new file on S3 despite the old file being cached
let mut buf = [0u8; OBJECT_SIZE];
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I use vec![0u8; OBJECT_SIZE] here, I sometimes find that the buffer hasn't been written to (even though read_exact_at has returned Ok). This makes me super uncomfortable (which is why this PR is still draft, not ready to merge).

Copy link
Member

@jamesbornholt jamesbornholt Nov 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

O_DIRECT is really hard to test under the Rust test runner because it has undefined behavior if run concurrently with a fork, and all of our FUSE tests involve a fork to spawn fusermount. We've seen it before in #114. Do you still see the issue if you filter to run only one of these tests?

You probably don't see it with this version because buf is on the stack, so less likely to be immediately scribbled over by malloc in the forked process.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I saw those limitations with O_DIRECT and fork but wasn't expecting us to be using it in the tests. That sounds like a probable cause!

I don't see the issue when running the test on its own.

@dannycjones dannycjones changed the title Implement O_DIRECT for open to bypass metadata cache if enabled Implement O_DIRECT for open to bypass metadata cache Nov 20, 2023
@dannycjones dannycjones temporarily deployed to PR integration tests November 20, 2023 14:33 — with GitHub Actions Inactive
@dannycjones dannycjones temporarily deployed to PR integration tests November 20, 2023 14:33 — with GitHub Actions Inactive
@dannycjones dannycjones temporarily deployed to PR integration tests November 20, 2023 14:33 — with GitHub Actions Inactive
@dannycjones dannycjones temporarily deployed to PR integration tests November 20, 2023 14:33 — with GitHub Actions Inactive
@dannycjones dannycjones added the performance PRs to run benchmarks on label Nov 21, 2023
… serially

Signed-off-by: Alessandro Passaro <alexpax@amazon.co.uk>
passaro
passaro previously approved these changes Nov 21, 2023
Copy link
Contributor

@passaro passaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change in open looks good to me.

I've re-organized the integration tests so that we can force your O_DIRECT tests to execute serially.

mountpoint-s3/tests/common/mod.rs Show resolved Hide resolved
mountpoint-s3/tests/direct_io.rs Outdated Show resolved Hide resolved
@dannycjones dannycjones marked this pull request as ready for review November 21, 2023 18:04
Signed-off-by: Alessandro Passaro <alexpax@amazon.co.uk>
Signed-off-by: Alessandro Passaro <alexpax@amazon.co.uk>
@passaro passaro temporarily deployed to PR integration tests November 21, 2023 18:11 — with GitHub Actions Inactive
@passaro passaro temporarily deployed to PR integration tests November 21, 2023 18:11 — with GitHub Actions Inactive
@passaro passaro temporarily deployed to PR integration tests November 21, 2023 18:11 — with GitHub Actions Inactive
@passaro passaro temporarily deployed to PR integration tests November 21, 2023 18:11 — with GitHub Actions Inactive
@dannycjones dannycjones removed the performance PRs to run benchmarks on label Nov 21, 2023
@dannycjones
Copy link
Contributor Author

LGTM, but its my PR so also needs your approval @passaro 😄

@dannycjones dannycjones added this pull request to the merge queue Nov 21, 2023
Merged via the queue into awslabs:main with commit 3c5f93d Nov 21, 2023
18 of 19 checks passed
@dannycjones dannycjones deleted the caching-directio branch November 21, 2023 19:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants