User-friendly input macros. #67

nrc · 2014-05-04T22:44:46Z

Add scan!, scanln!, etc. macros which mirror print!, println!, etc. for doing straightforward input from stdin, a file, or other Reader.

I really miss having something like this. I think it would be really good for increasing uptake of Rust to have a good story here.

I was hacking on a prototype implementation over the weekend. It only supports {} holes (although some of the infrastructure for fixed widths is there) and doesn't work yet, but it's getting there. I'd like to land something like that and then extend the mini-language gradually, letting the design evolve.


nrc · 2014-05-04T22:46:16Z

See also issue 6220.

huonw · 2014-05-04T22:51:58Z

cc @lifthrasiir (since you wrote a (non-procedural!) macro for this.)

lifthrasiir · 2014-05-05T08:39:47Z

@huonw Actually I have my own experimental syntax extension named read.rs, and an associated draft design document.

In my opinion, the most problematic aspect of such a syntax extension is how to return the read values, if any. For example, if scanln!("{} {} {}", a, b, c) doesn't silently fail on a parsing error, then a, b and c have to be assigned before the scanln! call, because scanln! may not assign to them in some cases. (My lex! macro also suffers from this problem, which I consider a bug.)

milibopp · 2014-05-05T09:18:21Z

I find it very weird to have this pseudo-initialization code before the call to scanln!. I think it makes a lot more sense to have the macro expand to code that returns a tuple of the input values. It should probably be an option tuple like Option<(str, u64)> to handle invalid input.

nrc · 2014-05-05T10:03:05Z

@aepsil0n this is pretty much what scanf and streaming IO do, but now that you mention it, it is a little bit weird, or at least not very Rust-y. It does have the benefit of mirroring println! too. I'm not sure how you would know what types to use for the variables if you returned them rather than taking out-params.

I think you could probably get away without the initialisation too, e.g.,

let x: int;
let y: int;
scanln!("{} {}", x, y);

I want to avoid returning a value wrapped in an Option or Result so that this is a very lightweight, easy to use API. There might need to be a recoverable form too which returns a Result. I kind of feel that if your software quality bar is high enough to check the result of io operations, you should be using a proper IO library and not scanln.
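A minimal sketch of the declare-then-scan style above, written against current Rust rather than the 2014 language. The macro name `scan_from!` is hypothetical, it reads from a string instead of stdin so it can be run anywhere, and it panics on bad input, which is an assumption about what the "lightweight" form would do:

```rust
// Hypothetical macro: assigns whitespace-separated tokens of `$line`
// to variables that were declared but not yet initialised.
macro_rules! scan_from {
    ($line:expr, $($var:ident),+ $(,)?) => {{
        let mut tokens = $line.split_whitespace();
        $(
            // The target type is inferred from the variable's declaration.
            $var = tokens
                .next()
                .expect("missing input token")
                .parse()
                .expect("could not parse input token");
        )+
    }};
}

fn main() {
    let x: i64;
    let y: i64;
    scan_from!("3 4", x, y);
    println!("{}", x + y);
}
```

Because every assignment is unconditional, the compiler accepts the deferred initialisation; the cost is that a malformed token aborts the program instead of being reported.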

milibopp · 2014-05-05T11:15:36Z

Would this work?

let (x, y): (f64, str) = scanln!("{} {}");

It's clear from the context, but I don't know how powerful the macro system is.
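One way the types could be recovered from context, sketched as a plain generic function in current Rust rather than a macro. `scan_pair` is a hypothetical name, and `String` stands in for `str` because an unsized `str` cannot be returned by value:

```rust
use std::str::FromStr;

// Hypothetical helper: parse two whitespace-separated tokens.
// The caller's type annotation picks the `FromStr` impls to use.
fn scan_pair<A: FromStr, B: FromStr>(line: &str) -> Option<(A, B)> {
    let mut tokens = line.split_whitespace();
    let a: A = tokens.next()?.parse().ok()?;
    let b: B = tokens.next()?.parse().ok()?;
    Some((a, b))
}

fn main() {
    let parsed: Option<(f64, String)> = scan_pair("1.5 hello");
    println!("{:?}", parsed);
}
```

The `Option` wrapper follows the earlier suggestion in this thread; a panicking variant would simply unwrap inside the helper.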

pnkfelix · 2014-05-05T11:36:26Z

> There might need to be a recoverable form too which returns a Result. I kind of feel that if your software quality bar is high enough to check the result of io operations, you should be using a proper IO library and not scanln.

This comment IMO ties into questions raised about the println! API's handling of errors like EPIPE on rust-lang/rust#13824.

More specifically: there is a viewpoint put forward in the comments for that ticket that if you want to handle errors printing to stdout, you should be using io::stdout() rather than println!.

Note that "errors" here includes broken pipes that one can get when piping output to a unix tool like head.
I do not necessarily disagree with that viewpoint, but I do suspect that we need to provide an easier middle ground between the nice-but-fail-sy println! macro versus a series of method calls on stdout's LineBufferedWriter<StdWriter>.
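As a rough illustration of what such a middle ground can look like in current Rust (not the 2014 API discussed here): `writeln!` on any `Write` value returns an `io::Result`, so a caller can treat a broken pipe as a normal stop. The helper name `try_println` is hypothetical:

```rust
use std::io::{self, Write};

// Hypothetical helper: write one line and report I/O errors
// to the caller instead of panicking.
fn try_println<W: Write>(out: &mut W, msg: &str) -> io::Result<()> {
    writeln!(out, "{}", msg)
}

fn main() {
    let stdout = io::stdout();
    let mut out = stdout.lock();
    for i in 0..3 {
        match try_println(&mut out, &format!("line {}", i)) {
            Ok(()) => {}
            // A closed pipe (for example `| head`) is treated as a normal stop.
            Err(e) if e.kind() == io::ErrorKind::BrokenPipe => break,
            Err(e) => panic!("unexpected I/O error: {}", e),
        }
    }
}
```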

nrc · 2014-05-05T22:03:32Z

@aepsil0n it might. I'm also not clear on whether the macro system is powerful enough to do that. It works for vec!, but I'm not sure if the cases are comparable.

@pnkfelix yeah, I agree with that viewpoint on println!. I also think we need a nice middle ground. I'm not sure if that is a recoverable version of println!/scanln! or if we just need a better API for 'serious' IO.

o11c · 2014-05-17T22:42:16Z

I disagree with several of the assumptions made in this RFC. scan should not be perfectly symmetrical with the way print! works (though perhaps a variant of print! could be improved to follow scan!)

In my experience, there are only two kinds of input:

token-based input, where you only ever want one token at a time, then you switch on the type of token in your parser
line-based input, where you read a line at a time, then perform splitting operations on that line. In some cases, that may lead to requesting another line.

... and I've found that it makes for better error messages if I implement token-based input on top of line-based input.
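A small sketch of token-based input layered on line-based input, assuming current Rust. The type `Tokens` and its method `token` are hypothetical names; the point is only that reading whole lines first lets every error carry a line number:

```rust
use std::io::BufRead;
use std::str::FromStr;

// Hypothetical token reader built on top of line-based reads.
struct Tokens<R: BufRead> {
    reader: R,
    line_no: usize,       // number of lines read so far
    pending: Vec<String>, // tokens of the current line, in reverse order
}

impl<R: BufRead> Tokens<R> {
    fn new(reader: R) -> Self {
        Tokens { reader, line_no: 0, pending: Vec::new() }
    }

    // Return the next token parsed as `T`, pulling in new lines as needed.
    fn token<T: FromStr>(&mut self) -> Result<T, String> {
        loop {
            if let Some(tok) = self.pending.pop() {
                return tok
                    .parse::<T>()
                    .map_err(|_| format!("line {}: cannot parse {:?}", self.line_no, tok));
            }
            let mut line = String::new();
            let n = self.reader.read_line(&mut line).map_err(|e| e.to_string())?;
            if n == 0 {
                return Err(format!("after line {}: unexpected end of input", self.line_no));
            }
            self.line_no += 1;
            self.pending = line.split_whitespace().rev().map(String::from).collect();
        }
    }
}

fn main() {
    let mut input = Tokens::new("1 2\nx\n".as_bytes());
    println!("{:?}", input.token::<i32>());
    println!("{:?}", input.token::<i32>());
    println!("{:?}", input.token::<i32>());
}
```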

In my C++ codebase, I currently have the following kinds of splitters:

an extract family that is recursive and intended for machine-produced data such as CSV (or any other separator ... space is confusing to humans since each space is its own split), but also contains the terminals for things like "extract just an int" (but Rust has FromStr for the latter), which are reused by the other families. Since Rust doesn't have variadic templates yet, it would have to use an array-of-trait for the recursive case, which is quite awkward, but still better than other approaches IMO.
an asplit family that splits on any run of whitespace, like a shell (I have a couple of variants of this actually, to handle different quoting styles). This family also supports only parsing the head part of a string and returning the unparsed portion.
a config_parse method that just splits a leading key: and then converts the value to the type of the variable (this is an interesting exercise in "which is faster, linear scan or multiple allocation + virtual function", though since I only use it at program startup it doesn't matter ... currently I'm using a linear scan in code. Because of external vtables, Rust could avoid the per-variable allocation by doing a map of &mut Trait, but still has to pay the cost of allocating the map data itself).
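To make the asplit idea concrete, here is a rough Rust sketch of "split the head on runs of whitespace and hand back the unparsed rest". It is not a port of the C++ code, which is not shown in this thread; the name `split_head` and its signature are assumptions, and quoting is ignored:

```rust
// Hypothetical asplit-style helper: take `n` whitespace-separated fields
// from the front of `input` and return them with the unparsed remainder.
// Returns None if fewer than `n` fields are present.
fn split_head(input: &str, n: usize) -> Option<(Vec<&str>, &str)> {
    let mut rest = input.trim_start();
    let mut fields = Vec::with_capacity(n);
    for _ in 0..n {
        if rest.is_empty() {
            return None;
        }
        // A field ends at the next whitespace character or at end of input.
        let end = rest.find(char::is_whitespace).unwrap_or(rest.len());
        fields.push(&rest[..end]);
        rest = rest[end..].trim_start();
    }
    Some((fields, rest))
}

fn main() {
    println!("{:?}", split_head("say   hello there world", 2));
}
```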

nrc · 2014-05-17T23:20:02Z

@o11c I think you are right for a general purpose IO library. But I am proposing scan as a 'toy' IO library. The kind of thing which is suited for programming exercises (in the tutorials, or for a university course, for example) or programming competitions. We just want ease of use, really. Robustness, extensibility, and efficiency are not primary as they would be for a real-world IO library.

o11c · 2014-05-17T23:27:56Z

@nick29581 I completely disagree that a toy library is a good idea - and that is certainly not something that parallels print!. Tutorials and university courses should not teach you to ignore sanity. All too often, they never get around to telling you that all the code you learned to use is a completely wrong approach - and that's not nearly as good as just teaching the right approach in the first place.

It's no harder to use my libraries than to use scanf, and it's much safer.

nrc · 2014-05-18T00:01:28Z

@o11c I guess we just have to disagree. I can see the merit in your approach though. My feeling is that when teaching, you should teach one thing at a time, so when teaching about IO, you should teach the good stuff that matters, but when teaching something else where IO is peripheral and you just want to get some input to make a fun exercise, then you just want something that is as simple as possible. Any boilerplate at all is a distraction.

There are certainly pros and cons for matching scan with print: they are not doing the same thing, but the symmetry is appealing.

I'm not sure from your library description how they are used, but it seems more complex than scanf, which is just a single function call.

In the same way that println! exists just to do output in exercises/prototyping/debugging, I think there should be something similarly simple for input.

o11c · 2014-05-18T05:56:02Z

@nick29581

> I'm not sure from your library description how they are used, but it seems more complex than scanf, which is just a single function call.

Nope. All of the below linked code is replacing calls to sscanf, and the new code is much more robust, shorter, and faster.

asplit is just as simple as sscanf, except there's no format string (it always splits on whitespace).

For extract, it depends on what you're doing:

if you're only extracting a single item, it's simpler than sscanf because there is no format string
If you're extracting non-nested csv-like data of known or unknown length, you specify the character to split on as a template argument (would work as a normal argument, since Rust doesn't have value template arguments yet) to the record or vrec factory functions, and pass the resulting object as the argument to extract (this is much simpler to do than to explain, see links below)
If you're extracting nested csv-like data, you just nest the record calls (but usually you don't need to nest them, because the inner objects have their own extract implementation)

(obviously in Rust these would be a trait implementation rather than overloaded functions)
(XString is basically &str, ZString is the same but with a guaranteed '\0', and LString (which is new) is &'static str. All my other string classes (excluding FormatString, which doesn't count) are owned.)
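A loose sketch of how the extract family might look as a Rust trait, as suggested above. This is an illustration only: the trait `Extract` and the functions `record2` and `vrec` are hypothetical stand-ins, the separator is an ordinary argument, and the real C++ API may differ in every detail:

```rust
// Hypothetical trait standing in for the overloaded C++ `extract` functions.
trait Extract: Sized {
    fn extract(s: &str) -> Option<Self>;
}

impl Extract for i32 {
    fn extract(s: &str) -> Option<Self> {
        s.parse().ok()
    }
}

impl Extract for String {
    fn extract(s: &str) -> Option<Self> {
        Some(s.to_string())
    }
}

// Fixed-length record: exactly two fields separated by `sep`.
fn record2<A: Extract, B: Extract>(sep: char, s: &str) -> Option<(A, B)> {
    let mut parts = s.split(sep);
    let a = A::extract(parts.next()?)?;
    let b = B::extract(parts.next()?)?;
    if parts.next().is_some() {
        return None; // too many fields
    }
    Some((a, b))
}

// Variable-length record: any number of fields of one type.
fn vrec<T: Extract>(sep: char, s: &str) -> Option<Vec<T>> {
    s.split(sep).map(T::extract).collect()
}

fn main() {
    println!("{:?}", record2::<i32, String>(',', "42,abc"));
    println!("{:?}", vrec::<i32>(',', "1,2,3"));
}
```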

Links to how simple or complicated extract is for various purposes:

simple extractor implementations
more complex extractor for IPv4 addresses and masks
nested use and nesteder use
standalone example that loads a simple savefile and also has one variable of conditional type (this should use the optional-fields thing to simplify the first if)
one and another motivating example for optional fields
using vrec to work around a badly-designed data format
extractor that just checks whether two strings are equal and one of the few uses of it (this is new, could be used in more places)
hacky splitting on multi-char :: (this could use the new literal extractor for the middle, or implement a new version of record that takes a string instead of a char. For Rust I wouldn't bother with the char one)
extractor for enum from string name (template stuff because I originally decided to extract enums as their underlying ints, I think now that this was a mistake - instead if an enum wants to be that, just let them write bool extract(XString str, Enum *e) { return extract_as_int(str, e); })

Also, the human-facing functions which are only called in the same file they're defined in:

asplit, which splits on runs of whitespace and returns the unparsed string in the last argument (a few of its callers are still badly-behaved. In the long run, I plan to make the actual extracted things into function parameters, and do some template magic to automatically extract whatever needs to be extracted in the function dispatcher, but this will interact in strange ways with optional arguments)
qsplit, which deals with quoted arguments from a shell

I haven't linked config_parse functions since they don't yet do everything I want them to do - particularly, they are currently only robust against key errors, not value errors.

nielsle · 2014-05-19T16:15:43Z

Perhaps:

let (x, y) = try!(scanln!("{:f} {:s}"));
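A sketch of the recoverable form in current Rust, where the `?` operator has replaced `try!`. The function `scan_f64_str` and the `ScanError` type are hypothetical, and the input is a string rather than stdin so the example stays self-contained:

```rust
// Hypothetical error type: which field (0-based) was missing or malformed.
#[derive(Debug, PartialEq)]
enum ScanError {
    MissingField(usize),
    BadField(usize),
}

// Hypothetical stand-in for scanln!("{:f} {:s}") over a given line.
fn scan_f64_str(line: &str) -> Result<(f64, String), ScanError> {
    let mut tokens = line.split_whitespace();
    let x = tokens
        .next()
        .ok_or(ScanError::MissingField(0))?
        .parse::<f64>()
        .map_err(|_| ScanError::BadField(0))?;
    let y = tokens.next().ok_or(ScanError::MissingField(1))?.to_string();
    Ok((x, y))
}

// The caller propagates failures with `?` and uses the values directly.
fn describe(line: &str) -> Result<String, ScanError> {
    let (x, y) = scan_f64_str(line)?;
    Ok(format!("{} {}", y, x * 2.0))
}

fn main() {
    println!("{:?}", describe("1.25 abc"));
    println!("{:?}", describe("abc"));
}
```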

alexcrichton · 2014-06-17T22:54:49Z

This was discussed in today's meeting and it was decided that a feature such as this should bake in a library before being accepted into the main repo, so I'm going to close this for now.

This would be a nice feature to have though!

uazu · 2014-06-18T14:05:34Z

I don't know whether you've considered the possibility of matching/parsing either all or nothing (but nothing in between). If the whole thing matches, then all variables are assigned. Otherwise all variables are unset (or nil'd/zero'd) and all the characters read are 'ungot'. This means you can 'try' various patterns against the input 100% safely. This approach worked well for one parsing library I wrote.
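The all-or-nothing idea can be sketched over an in-memory string as follows. This is an assumption-laden illustration, not the parsing library mentioned above: `Input`, `try_int_word` and `take_token` are hypothetical names, and "ungetting" is modelled by only committing the cursor on full success:

```rust
// Hypothetical input cursor over a string slice.
struct Input<'a> {
    rest: &'a str,
}

impl<'a> Input<'a> {
    // Try to read an integer followed by a word.
    // On success the cursor advances; on any failure it is left untouched.
    fn try_int_word(&mut self) -> Option<(i64, &'a str)> {
        let mut cursor = self.rest; // work on a copy of the position
        let n: i64 = take_token(&mut cursor)?.parse().ok()?;
        let w = take_token(&mut cursor)?;
        self.rest = cursor; // commit only after the whole pattern matched
        Some((n, w))
    }
}

// Pop one whitespace-delimited token off the front of `cursor`.
fn take_token<'a>(cursor: &mut &'a str) -> Option<&'a str> {
    let whole: &'a str = *cursor;
    let s = whole.trim_start();
    if s.is_empty() {
        return None;
    }
    let end = s.find(char::is_whitespace).unwrap_or(s.len());
    *cursor = &s[end..];
    Some(&s[..end])
}

fn main() {
    let mut input = Input { rest: "12 abc tail" };
    println!("{:?} / {:?}", input.try_int_word(), input.rest);
}
```

A failed attempt leaves `rest` unchanged, so a second pattern can be tried against the same input.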

gsingh93 · 2015-01-20T06:47:58Z

Looks like there's a library that's doing this: https://github.com/mahkoh/scan

Also, should this be closed or closed as postponed?


User-friendly input macros.

2154120


nrc mentioned this pull request May 4, 2014

Number to/from string API rust-lang/rust#6220

Closed

alexcrichton closed this Jun 17, 2014

sinistersnare mentioned this pull request Jan 29, 2015

Simplify Standard Input #768

Open

withoutboats pushed a commit to withoutboats/rfcs that referenced this pull request Jan 15, 2017

Merge pull request rust-lang#67 from killercup/patch-5

6dfd2db

Fix typo in zip documentation

kennytm mentioned this pull request Oct 18, 2021

[Draft] RFC: Console Input Simplified #3183

Closed
