Add read, read_string, and write functions to std::fs #45837
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @dtolnay (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. Please see the contribution instructions for more information. |
Before:
```rust
use std::fs::File;
use std::io::Read;

let mut bytes = Vec::new();
File::open(filename)?.read_to_end(&mut bytes)?;
do_something_with(bytes)
```
After:
```rust
use std::fs::File;

do_something_with(File::read_contents(filename)?)
```
6c00a5c to fd518ac
How about a version which reads into a |
src/libstd/fs.rs
Outdated
/// ```
#[unstable(feature = "file_read_write_contents", issue = /* FIXME */ "0")]
pub fn read_contents<P: AsRef<Path>>(path: P) -> io::Result<Vec<u8>> {
    let mut bytes = Vec::new();
This could look up the file size first and use Vec::with_capacity. It might mean an extra system call when reading a tiny file, but it is a huge speedup (up to 2× on my system) when reading large files.
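A minimal sketch of the suggestion above, assuming the file length reported by the filesystem is a usable capacity hint. The helper name `read_with_size_hint` is hypothetical and this is not the code that landed in the PR; it only illustrates the idea of trading one metadata call for fewer reallocations.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// Hypothetical helper (not the PR's implementation): read a whole file,
/// using the size reported by the filesystem as a capacity hint.
pub fn read_with_size_hint<P: AsRef<Path>>(path: P) -> io::Result<Vec<u8>> {
    let mut file = File::open(path)?;
    // One extra system call. The value is only a hint, so fall back to 0
    // if the metadata is unavailable.
    let hint = file.metadata().map(|m| m.len() as usize).unwrap_or(0);
    let mut bytes = Vec::with_capacity(hint);
    // read_to_end still grows the buffer if the file is larger than the hint.
    file.read_to_end(&mut bytes)?;
    Ok(bytes)
}
```

Because the length is treated as a hint rather than a guarantee, the result is still correct when the file changes size between the metadata call and the read.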
Maybe read_to_end should be responsible for that?
Through specialization on Seek.
Specialization of read_to_end is a good idea. Filed #45851.
Note that callers of read_to_end can already avoid quadratic behavior by passing in a pre-allocated buffer. That workaround isn't available with the functions in this PR.
I’ve opened a PR with a prototype: #45928
I’m unsure about the naming. @mbrubeck suggests on IRC |
I've wanted these functions in Rust for a while. In .NET, there is |
Alright, I pushed a couple new commits to switch to free functions and add reading to |
Why |
@euclio Because we’re reading bytes and decoding/validating them as UTF-8. Which happens to be the internal encoding of I my opinion it was a mistake for |
I'm not so sure this is a good idea. It combines the errors of two distinct operations. Also, if doing something like For I assume the rationale is just making things easier? If so, the write case is already pretty short (and explicit): File::create(path)?.write_all(contents) For the read cases, maybe we can add instance methods instead? e.g. File::open(path)?.slurp()
File::open(path)?.slurp_utf8() This also seems short to me, not that far from Python's |
The entire point of these functions is for the 90% use case where you don't care about distinguishing between those distinct operations. |
To me, given we have
then
Seems easier to remember (do I want to reuse a buffer or just read and get something) than
which is kind of "stop thinking of files first, now think of some other concept instead".
Like @jminer, I'm a big fan of these in .Net, and would love to have them in Rust. They're rarely the absolute most optimal way to do whatever, but they're so convenient that I end up using them more than the "proper" way overall, since they're super-helpful in tests, adhoc tools, etc. I like the name saying |
@bluetech As @sfackler said often you don’t care about distinguishing these two errors, and when you do the separate functions/methods are still available. I agree that As to |
Ping from triage @dtolnay — will you be able to spend some time on this? |
I like this and I agree there is value in providing simple conveniences for the 90% use case where you don't care why reading from a file failed, whether you failed to open the file or failed to read bytes or the bytes weren't UTF-8. This resolves the idea Brian expressed in #34857 -- "be more aggressive about identifying and adding simple ergonomic improvements." This is a significant enough new API that I would appreciate a few more eyes. @rfcbot fcp merge |
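For readers who do want to tell the combined failure modes apart, a hedged sketch follows. It uses `std::fs::read_to_string`, the name under which this function is available in current stable Rust, rather than the `read_string` spelling used in this PR. The helper `describe_read_failure` and its message strings are hypothetical.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: classify why a one-shot text read failed.
/// Returns None when the read succeeded.
pub fn describe_read_failure<P: AsRef<Path>>(path: P) -> Option<&'static str> {
    match fs::read_to_string(path) {
        Ok(_) => None,
        Err(e) => Some(match e.kind() {
            // Opening the file failed because the path does not exist.
            io::ErrorKind::NotFound => "file not found",
            // The bytes were read but are not valid UTF-8.
            io::ErrorKind::InvalidData => "contents are not valid UTF-8",
            // Permission problems, read errors, and so on.
            _ => "other I/O error",
        }),
    }
}
```

The single `io::Result` therefore still carries enough information through `ErrorKind` for the common cases, and the separate `File` methods remain available when finer control is needed.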
Team member @dtolnay has proposed to merge this. The next step is review by the rest of the tagged teams: Concerns:
Once these reviewers reach consensus, this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me. |
@rfcbot concern write I'm definitely on board with read and read_utf8, but I'm a bit worried about write due to the ambiguity around how it would behave with respect to file creation and truncation.
I like these methods as well. I would be OK with |
Regarding having a separate Regarding |
@sfackler to confirm, has your concern with |
Yep, I think it's okay as-is. |
@rfcbot resolved write |
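The creation and truncation behavior discussed above can be checked with a small sketch. It relies on the documented behavior of `std::fs::write` in current stable Rust (create the file if it does not exist, replace its contents entirely if it does). The helper name `overwrite_demo` is hypothetical.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical demo: write twice to the same path and return what is left.
pub fn overwrite_demo<P: AsRef<Path>>(path: P) -> io::Result<Vec<u8>> {
    let path = path.as_ref();
    // First call creates the file if it is missing.
    fs::write(path, b"a longer first payload")?;
    // Second call replaces the contents; it does not append, and no bytes
    // of the longer first payload remain.
    fs::write(path, b"short")?;
    fs::read(path)
}
```

Callers who need append or create-only semantics would still reach for `OpenOptions`, which is the explicit alternative the convenience function leaves in place.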
I see all the check boxes so let's get this merged. @SimonSapin please file a tracking issue and update the number in the unstable attributes. r=me
@bors r+ |
📌 Commit c5eff54 has been approved by |
Done: #46588 |
⌛ Testing commit c5eff54 with merge 6b31b1ab1f3a8f4fb4bbeb53c19bbe4d974458e6... |
💔 Test failed - status-travis |
Add read, read_string, and write functions to std::fs

New APIs in `std::fs`:
```rust
pub fn read<P: AsRef<Path>>(path: P) -> io::Result<Vec<u8>> { … }
pub fn read_string<P: AsRef<Path>>(path: P) -> io::Result<String> { … }
pub fn write<P: AsRef<Path>, C: AsRef<[u8]>>(path: P, contents: C) -> io::Result<()> { ... }
```
(`read_string` is based on `read_to_string` and so returns an error on non-UTF-8 content.)

Before:
```rust
use std::fs::File;
use std::io::Read;

let mut bytes = Vec::new();
File::open(filename)?.read_to_end(&mut bytes)?;
do_something_with(bytes)
```
After:
```rust
use std::fs;

do_something_with(fs::read(filename)?)
```
☀️ Test successful - status-appveyor, status-travis |
What's missing is a lazy iterator on text lines?
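For context on the question above, a lazy line iterator can already be assembled from existing standard-library pieces. The sketch below wraps `BufReader` and `BufRead::lines` in a hypothetical helper named `lines_of`; it is not part of this PR.

```rust
use std::fs::File;
use std::io::{self, BufRead, BufReader, Lines};
use std::path::Path;

/// Hypothetical helper: open a file and return an iterator that reads
/// one line at a time instead of loading the whole file up front.
pub fn lines_of<P: AsRef<Path>>(path: P) -> io::Result<Lines<BufReader<File>>> {
    let file = File::open(path)?;
    // Each item is an io::Result<String> without the trailing newline.
    Ok(BufReader::new(file).lines())
}
```

Each item is itself an `io::Result`, so read and UTF-8 errors surface per line rather than once for the whole file.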
I'm really happy about these additions, but I'm just wondering if there was an RFC which could be added as a reference to this issue?
@SimonSapin Thanks, I thought I was just really bad at searching :) I'll subscribe on there. |
Use the new fs_read_write functions in rustc internals Uses `fs::read` and `fs::write` (added by rust-lang#45837) where appropriate, to simplify code and dog-food these new APIs. This also improves performance, when combined with rust-lang#47324.