-
Notifications
You must be signed in to change notification settings - Fork 180
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add zero-compromise directory iteration #457
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this! I still need to read through the main implementation code another time, but here are a few initial review comments.
src/backend/linux_raw/fs/raw_dir.rs
Outdated
/// | ||
/// let fd = openat(cwd(), ".", OFlags::RDONLY | OFlags::DIRECTORY, Mode::empty()).unwrap(); | ||
/// | ||
/// let mut buf = [MaybeUninit::uninit(); 2048]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this example use DIR_BUF_LEN
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before making docs changes, I want to make sure we're on the same page since I think we're coming at this from different angles. I was trying to demo the edge cases while assuming that people would know to use Vec::spare_capacity_mut
for everyday life, but that's probably a bad assumption.
Here's how I'd like people to use the API:
- When using recursion, people should either use a vec or the stack. If using the stack, I'd lean towards a smaller value to minimize wasted space and skip stack probes, hence the 2048. If using the heap, then we can afford to waste more space, hence using 8192. So maybe this suggests having a constant is actually a bad idea? Instead we could tell people to "Use a buffer size of at least NAME_MAX+24 bytes. We suggest 2048 for stack allocated buffers and 8192 for heap allocated buffers." In practice, almost all file systems use 255 or smaller as their limit: https://en.wikipedia.org/wiki/Comparison_of_file_systems#Limits. Reiser appears to be the only deviant.
- When not using recursion, people would ideally have a cached vec somewhere that they re-use.
Can we guarantee that DIR_BUF_LEN is long enough to support any path that the host OS supports (considering NAME_MAX)?
I don't think so unfortunately, which is why I'm leaning towards removing the constant. There are basically no guarantees on NAME_MAX: https://www.gnu.org/software/libc/manual/html_node/Limits-for-Files.html. Wikipedia has the current file system limits, but nothing prevents a future file system from removing the limit entirely for example.
Going back to the docs, what about having a simple heap example, a simple stack example, and then a production-ready buffer resizing example?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going back to the docs, what about having a simple heap example, a simple stack example, and then a production-ready buffer resizing example?
That sounds good. I imagine you could even skip the simple heap example if you wanted to. I imagine the vast majority of Rust code will continue using std::fs::ReadDir
to read directories, or walkdir or so, so I imagine the main audience for using this API directly will be people doing low-level optimization work.
For the stack example, I'm not comfortable encouraging people to use fixed-sized buffers if OS's don't have a limit. If we're not confident enough to assume a NAME_MAX
exists, it feels awkward to make suggestions to users that they bake in numbers like 2048. Could we instead give some guidance like, "only use this approach for reading directories with known layouts where all entries have names less than X" or so?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we instead give some guidance like, "only use this approach for reading directories with known layouts where all entries have names less than X" or so?
I think that's reasonable, I added clarifications for the simple stack + heap examples. Happy to remove either if you'd rather not have the example at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, thanks for working on this!
I notice this is implemented only for the linux_raw
backend. I think that's fine for now, though in case you're interested, a next step here is implementing it for the libc backend, using libc::syscall
to call getdents
.
src/backend/linux_raw/fs/raw_dir.rs
Outdated
/// buf.reserve(new_capacity); | ||
/// } | ||
/// ``` | ||
pub fn new(fd: Fd, buf: &'buf mut [MaybeUninit<u8>]) -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you know if getdents
requires the buffer to be aligned at all? If so, we should document that, and either change the type here or potentially make this unsafe
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on some experimentation, no:
slice::from_raw_parts_mut(
buf.as_mut_ptr().add(3) as *mut MaybeUninit<u8>,
buf.capacity() - 3,
)
That still works.
And this popped up in a search: microsoft/WSL#1769
Anyway it isn't an ABI requirement the pointer be 8 byte aligned
So I'm going to say I'm fairly confident the buffer doesn't have to be aligned. I would also expect the syscall to fail if it needs to be aligned which means it isn't a safety issue.
Done, I wasn't sure if using |
Looks like I'm going to need help figuring out the right cfgs. Are there more |
The main other |
Thanks! |
Hmmm, looks like |
Oh, hrm. The Maybe what we could do here is just special-case x32 here. When |
That's a little sad, but done. |
Thinking about this more, I believe I now have a better solution for x32: sunfishcode/linux-raw-sys#36 It's just a matter of putting special cases in the right place 😅. |
Ok, that patch is now in linux-raw-sys 0.1.3. Could you try updating that and trying this patch without the special case for x86? |
Signed-off-by: Alex Saveau <saveau.alexandre@gmail.com>
Signed-off-by: Alex Saveau <saveau.alexandre@gmail.com>
… BS, yielding a 2x instruction count reduction Signed-off-by: Alex Saveau <saveau.alexandre@gmail.com>
Signed-off-by: Alex Saveau <saveau.alexandre@gmail.com>
Signed-off-by: Alex Saveau <saveau.alexandre@gmail.com>
Signed-off-by: Alex Saveau <saveau.alexandre@gmail.com>
Signed-off-by: Alex Saveau <saveau.alexandre@gmail.com>
Signed-off-by: Alex Saveau <saveau.alexandre@gmail.com>
Signed-off-by: Alex Saveau <saveau.alexandre@gmail.com>
Sweet, that's much nicer. Thanks! |
Looks good, thanks! |
Woohoo, thanks! |
Closes #451
Background reading:
Notes
Based on the above background reading, the following assumptions appear to be sound:
d_off
is actually a cookie that you can use to seek to the next dirent. It has no relation to byte offsets.Benchmarks
Optimal buf size analysis
I set up benchmarks that used power-of-two sized buffers. 2^13 seemed optimal and it's also what the stdlib uses in its BufWriter and BufReader implementations.