Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add read, read_string, and write functions to std::fs #45837

Merged
merged 6 commits into from
Dec 9, 2017

Conversation

SimonSapin
Copy link
Contributor

@SimonSapin SimonSapin commented Nov 7, 2017

New APIs in std::fs:

pub fn read<P: AsRef<Path>>(path: P) -> io::Result<Vec<u8>> {}
pub fn read_string<P: AsRef<Path>>(path: P) -> io::Result<String> {}
pub fn write<P: AsRef<Path>, C: AsRef<[u8]>>(path: P, contents: C) -> io::Result<()> { ... }

(read_string is based on read_to_string and so returns an error on non-UTF-8 content.)

Before:

use std::fs::File;
use std::io::Read;

let mut bytes = Vec::new();
File::open(filename)?.read_to_end(&mut bytes)?;
do_something_with(bytes)

After:

use std::fs;

do_something_with(fs::read(filename)?)

@rust-highfive
Copy link
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @dtolnay (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@kennytm kennytm added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Nov 7, 2017
Before:

```rust
use std::fs::File;
use std::io::Read;

let mut bytes = Vec::new();
File::open(filename)?.read_to_end(&mut bytes)?;
do_something_with(bytes)
```

After:

```rust
use std::fs::File;

do_something_with(File::read_contents(filename)?)
```
@ollie27
Copy link
Member

ollie27 commented Nov 7, 2017

How about a version which reads into a String like #34857?

src/libstd/fs.rs Outdated
/// ```
#[unstable(feature = "file_read_write_contents", issue = /* FIXME */ "0")]
pub fn read_contents<P: AsRef<Path>>(path: P) -> io::Result<Vec<u8>> {
let mut bytes = Vec::new();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could look up the file size first and use Vec::with_capacity. It might mean as extra system call when reading a tiny file, but it is a huge speedup (up to 2× on my system) when reading large files.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe read_to_end should be responsible for that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Through specialization on Seek.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specialization of read_to_end is a good idea. Filed #45851.

Note that callers of read_to_end can already avoid quadratic behavior by passing in a pre-allocated buffer. That workaround isn't available with the functions in this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’ve opened a PR with a prototype: #45928

@SimonSapin
Copy link
Contributor Author

I’m unsure about the naming. @mbrubeck suggests on IRC read and write free functions in the std::fs module.

@jminer
Copy link

jminer commented Nov 7, 2017

I've wanted these functions in Rust for a while. In .NET, there is File.ReadAllBytes, File.WriteAllBytes, File.ReadAllText, File.WriteAllText, and File.AppendAllText and . The .NET names have always seemed clear to me. I like that .NET uses "all" in the names to signal they read or write the whole file, compared to other read and write functions that don't.

@SimonSapin SimonSapin changed the title Add File::read_contents and File::write_contents convenience functions. Add read, read_utf8, and write functions to std::fs Nov 7, 2017
@SimonSapin
Copy link
Contributor Author

Alright, I pushed a couple new commits to switch to free functions and add reading to String. See updated PR message.

@euclio
Copy link
Contributor

euclio commented Nov 7, 2017

Why read_utf8 over read_string?

@SimonSapin
Copy link
Contributor Author

@euclio Because we’re reading bytes and decoding/validating them as UTF-8. Which happens to be the internal encoding of String, but that’s incidental as far as this function is concerned.

I my opinion it was a mistake for Read::read_to_string to have a name that does not include utf8. We do include it in str::from_utf8, String::from_utf8, String::from_utf8_lossy, char::decode_utf8.

@bluetech
Copy link

bluetech commented Nov 7, 2017

I'm not so sure this is a good idea. It combines the errors of two distinct operations. Also, if doing something like fs::write(mypath) + fs::read(mypath), the path is opened two times, but in the meantime the file might have been deleted, etc.

For fs::write, to me at least, it is not entirely clear from the name alone whether it truncates or overwrites (or even appends).

I assume the rationale is just making things easier? If so, the write case is already pretty short (and explicit):

File::create(path)?.write_all(contents)

For the read cases, maybe we can add instance methods instead? e.g.

File::open(path)?.slurp()
File::open(path)?.slurp_utf8()

This also seems short to me, not that far from Python's open(path).read().

@sfackler
Copy link
Member

sfackler commented Nov 7, 2017

I'm not so sure this is a good idea. It combines the errors of two distinct operations.

The entire point of these functions is for the 90% use case where you don't care about distinguishing between those distinct operations.

@JordiPolo
Copy link
Contributor

To me, given we have

File::open(filename)?.read_to_end(&mut bytes)?;

then

File::open(filename)?.read()?;

Seems easier to remember (do I want to reuse a buffer or just read and get something) than

fs::read(filename)?

which is kind of stop thinking of files first, now thing of some other concept instead.

@scottmcm
Copy link
Member

scottmcm commented Nov 8, 2017

Like @jminer, I'm a big fan of these in .Net, and would love to have them in Rust. They're rarely the absolute most optimal way to do whatever, but they're so convenient that I end up using them more than the "proper" way overall, since they're super-helpful in tests, adhoc tools, etc.

I like the name saying utf8 explicitly, leaving space for a crate to make an encoding-adaptive version.

@SimonSapin
Copy link
Contributor Author

@bluetech As @sfackler said often you don’t care about distinguishing these two errors, and when you do the separate functions/methods are still available.

I agree that File::create(path)?.write_all(contents)? is less bad than the read case that requires an intermediate variable, but it still requires importing the std::io::Write trait separately. These functions can be useful for example for quick prototyping, where even that can be additional friction.

As to fs::write followed immediately by fs::read, yes you wouldn’t do that in real code. You only see it in this diff because there’s a test for these functions themselves.

@shepmaster
Copy link
Member

Ping from triage @dtolnay — will you be able to spend some time on this?

@dtolnay
Copy link
Member

dtolnay commented Nov 18, 2017

I like this and I agree there is value in providing simple conveniences for the 90% use case where you don't care why reading from a file failed, whether you failed to open the file or failed to read bytes or the bytes weren't UTF-8. This resolves the idea Brian expressed in #34857 -- "be more aggressive about identifying and adding simple ergonomic improvements."

This is a significant enough new API that I would appreciate a few more eyes.

@rfcbot fcp merge

@rfcbot rfcbot added the proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. label Nov 18, 2017
@rfcbot
Copy link

rfcbot commented Nov 18, 2017

Team member @dtolnay has proposed to merge this. The next step is review by the rest of the tagged teams:

Concerns:

Once these reviewers reach consensus, this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@kennytm kennytm added S-waiting-on-team Status: Awaiting decision from the relevant subteam (see the T-<team> label). and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 18, 2017
@sfackler
Copy link
Member

@rfcbot concern write

I'm definitely on board with read and read_utf8, but I'm a worried about write due to the ambiguity around how it would behave with respect to file creation and truncation.

@BurntSushi
Copy link
Member

I like these methods as well. I would be OK with write as-is, since I think the documented behavior is exactly what I'd expect. I do not feel strongly though!

@SimonSapin
Copy link
Contributor Author

Regarding having a separate write_string function, I think that reducing the “API surface” is preferable when it doesn’t reduce functionality.

Regarding read_utf8 vs read_string, it’s a trade-off between describing what the function does v.s. std-internal consistency with the existing Read::read_to_string method. I happen to put less value on internal consistency, but that’s really an opinion. The libs team’s consensus seems to be going the other way, so I’ll rename to read_string.

@alexcrichton
Copy link
Member

@sfackler to confirm, has your concern with write been resolved?

@sfackler
Copy link
Member

sfackler commented Dec 6, 2017

Yep, I think it's okay as-is.

@dtolnay
Copy link
Member

dtolnay commented Dec 6, 2017

@rfcbot resolved write

Copy link
Member

@dtolnay dtolnay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see all the check boxes so let's get this merged. @SimonSapin please file a tracking issue and update the number in the unstable attributes. r=me

@SimonSapin SimonSapin changed the title Add read, read_utf8, and write functions to std::fs Add read, read_string, and write functions to std::fs Dec 8, 2017
@dtolnay
Copy link
Member

dtolnay commented Dec 8, 2017

@bors r+

@bors
Copy link
Contributor

bors commented Dec 8, 2017

📌 Commit c5eff54 has been approved by dtolnay

@SimonSapin
Copy link
Contributor Author

Done: #46588

@bors
Copy link
Contributor

bors commented Dec 8, 2017

⌛ Testing commit c5eff54 with merge 6b31b1ab1f3a8f4fb4bbeb53c19bbe4d974458e6...

@bors
Copy link
Contributor

bors commented Dec 8, 2017

💔 Test failed - status-travis

@kennytm
Copy link
Member

kennytm commented Dec 8, 2017

@bors retry #42117

@bors
Copy link
Contributor

bors commented Dec 8, 2017

⌛ Testing commit c5eff54 with merge c7b6d82...

bors added a commit that referenced this pull request Dec 8, 2017
Add read, read_string, and write functions to std::fs

New APIs in `std::fs`:

```rust
pub fn read<P: AsRef<Path>>(path: P) -> io::Result<Vec<u8>> { … }
pub fn read_string<P: AsRef<Path>>(path: P) -> io::Result<String> { … }
pub fn write<P: AsRef<Path>, C: AsRef<[u8]>>(path: P, contents: C) -> io::Result<()> { ... }
```

(`read_string` is based on `read_to_string` and so returns an error on non-UTF-8 content.)

Before:

```rust
use std::fs::File;
use std::io::Read;

let mut bytes = Vec::new();
File::open(filename)?.read_to_end(&mut bytes)?;
do_something_with(bytes)
```

After:

```rust
use std::fs;

do_something_with(fs::read(filename)?)
```
@bors
Copy link
Contributor

bors commented Dec 9, 2017

☀️ Test successful - status-appveyor, status-travis
Approved by: dtolnay
Pushing c7b6d82 to master...

@bors bors merged commit c5eff54 into rust-lang:master Dec 9, 2017
@rfcbot rfcbot removed the proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. label Dec 9, 2017
@leonardo-m
Copy link

What's missing is an lazy iterator on text lines?

@SimonSapin SimonSapin deleted the file_read_write branch December 9, 2017 22:55
@Ryman
Copy link
Contributor

Ryman commented Dec 23, 2017

I'm really happy about these additions but I'm just wondering if there is was an RFC which could be added as a reference to this issue?

@SimonSapin
Copy link
Contributor Author

@Ryman This PR is closed (merged), further discussion should go into the tracking issue: #46588

There was not RFC for this, but there will be a FCP before it is stabilized.

@Ryman
Copy link
Contributor

Ryman commented Dec 23, 2017

@SimonSapin Thanks, I thought I was just really bad at searching :) I'll subscribe on there.

kennytm added a commit to kennytm/rust that referenced this pull request Jan 11, 2018
Use the new fs_read_write functions in rustc internals

Uses `fs::read` and `fs::write` (added by rust-lang#45837) where appropriate, to simplify code and dog-food these new APIs.  This also improves performance, when combined with rust-lang#47324.
kennytm added a commit to kennytm/rust that referenced this pull request Jan 11, 2018
Use the new fs_read_write functions in rustc internals

Uses `fs::read` and `fs::write` (added by rust-lang#45837) where appropriate, to simplify code and dog-food these new APIs.  This also improves performance, when combined with rust-lang#47324.
kennytm added a commit to kennytm/rust that referenced this pull request Jan 12, 2018
Use the new fs_read_write functions in rustc internals

Uses `fs::read` and `fs::write` (added by rust-lang#45837) where appropriate, to simplify code and dog-food these new APIs.  This also improves performance, when combined with rust-lang#47324.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-team Status: Awaiting decision from the relevant subteam (see the T-<team> label). T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.