RFC: Split Show into String and Show #504

alexcrichton · 2014-12-08T02:08:28Z

Today's Show trait will be tasked with the purpose of providing the ability to
inspect the representation of implementors of the trait. A new trait, String,
will be introduced to the std::fmt module to in order to represent data that
can essentially be serialized to a string, typically representing the precise
internal state of the implementor.

The String trait will take over the {} format specifier and the Show trait
will move to the now-open {:?} specifier.

Rendered

Today's `Show` trait will be tasked with the purpose of providing the ability to inspect the representation of implementors of the trait. A new trait, `String`, will be introduced to the `std::fmt` module to in order to represent data that can essentially be serialized to a string, typically representing the precise internal state of the implementor. The `String` trait will take over the `{}` format specifier and the `Show` trait will move to the now-open `{:?}` specifier.

alexcrichton · 2014-12-08T02:08:49Z

cc @nick29581, @wycats, @seanmonstar, @aturon, I believe this is what we discussed during the work week.

wycats · 2014-12-08T02:16:23Z

Looks reasonable to me. I think we should also add that if a type implements both String and parse-from-string APIs, parsing an emitted string should likely produce a semantically equivalent value.

alexcrichton · 2014-12-08T02:17:32Z

@wycats Is this what you were thinking?

assert_eq!(foo, from_str(format!("{}", foo).as_slice()).unwrap());

(see here)

sfackler · 2014-12-08T03:11:43Z

This seems pretty reasonable in general.

I think there may be a bit of a distinction that should be made in the intended behavior of String implementations. For types that have reasonable round trips to and from strings (e.g. Duration and the ISO 8601 format) the proposal makes sense. There are other types, however, where you would want a String implementation but making it fully describe the type is undesirable or maybe even impossible.

For example, rust-postgres's DbError type contains a ton of information, but the current implementation of Show (that would become an implementation of String) only prints <severity>: <message>. That format's descriptive enough to generally tell you what's going on, and printing everything would be totally overwhelming, especially when fields like routine and file are only useful if you're trying to dig into some bizarre Postgres bug.

Thoughts? It seems like the intent for String implementations is "something reasonable", which for types with FromStr impls is almost certainly round-trippable output but for other types may be less descriptive.

alexcrichton · 2014-12-08T03:21:43Z

That is a good point @sfackler! I added a drawback at the last second about how the guarantees provided by String may be a bit too strict and I'm worried about it motivating a new trait on the side in the future. I'd be fine wording it like:

An implementation of the String trait is an assertion that the type can be faithfully represented as a UTF-8 string at all times. If the type can be reconstructed from a string, then the following relation must be true:
assert_eq!(foo, from_str(format!("{}", foo).as_slice()).unwrap());
If the type cannot necessarily be reconstructed from a string, then the output may be less descriptive than the type can provide, but it is guaranteed to be human readable for all users.

wycats · 2014-12-08T03:22:17Z

I didn't mean to suggest that round-tripping would be mandated; just that if a from-string implementation existed, and String was implemented, it should roundtrip.

huonw · 2014-12-08T03:27:09Z

I vaguely recall a suggestion that we would implement Show for u8 etc as 1u8; requiring round-tripping would mean that from_str needs to handle the suffixes, or that value would just be printed as 1.

(Just something to consider.)

seanmonstar · 2014-12-08T04:12:56Z

I specifically would never mention round-tripping, because it causes confusion. If you want round-tripping, use the serialize crate. I would never suggesting that the fmt::Show output should be used to save an object, and later recreate it.

@huonw yes, fmt::Show should have 3u8 output as 3u8, and the String trait would output 3.

fwiw, the RFC I wrote about this on Friday: https://gist.github.com/seanmonstar/94208ff568d1b9f71914

alexcrichton · 2014-12-08T04:30:16Z

@huonw yes as @seanmonstar mentioned the String impl for u8 wouldn't append a suffix, whereas the Show impl would. I tried to call this out in the RFC, but do you think I should place more emphasis on it?

@seanmonstar I do think it's useful to have a distinction where if FromStr is implemented then it should be equivalent, otherwise it needs to be an accurate representation as much as possible. I'll add some specific wording about how types should specifically never be constructed from the Show output, which I think is a good idea!

ftxqxd · 2014-12-08T04:33:52Z

The name String seems like it would be too easily confused with std::string::String. Perhaps something like Fmt would be more appropriate? A trait named String sounds like it represents any string type (&str, String, a rope, etc.) rather than anything that can be represented as a human-friendly string. Also, I think that the required method in either the String or Show trait should perhaps be renamed to avoid name conflicts.

I wonder if it would be useful to enforce an invariant that anything implementing Show should be evaluable as a Rust expression with the same value (perhaps excepting lifetime/ownership issues). That might make output for String a bit too noisy, and things like HashMap don’t have any literal-like construction method (other than perhaps vec![(key, val), ...].collect()), so maybe that’s not practical. (Incidentally, Python recommends that eval(repr(foo)) == foo, which is the same idea.)

seanmonstar · 2014-12-08T04:36:11Z

@alexcrichton I use FromStr for basic parsing in hyper. For example, from_str::<Mime>("text/html") == Some(Mime(Text, Html, vec![]). I would implement fmt::String to output text/html, and I would use #[deriving(Show), which would mean it would output Mime(Text, Plain, []).

seanmonstar · 2014-12-08T04:38:13Z

@P1start it's easier to see if you view it as fmt::String. Also, while the Python docs claim that, the Python community all agree that it is a docs bug.

reem · 2014-12-08T04:45:43Z

I agree with @seanmonstar that we should not use either of these traits for round-trip parsing (or at least we shouldn't grow to expect that behavior) - you should use a more principled system like encodable/decodable.

alexcrichton · 2014-12-08T04:52:31Z

@P1start RFC 356 has outlined that module prefixes should not be included in type names while also emphasizing that the name of a type or trait should be scoped to the module that it is defined in. For example the IoError type will be renamed to io::Error and it can be locally renamed. This is distinct from the Error trait, however, in the std::error module. For this reason I do not think that string::String and fmt::String will be confused all that often in practice because this will already be a convention in the standard library. Can you elaborate more on the renaming of the methods? The methods align with the current design of the fmt module and its formatting traits.

I also have added some specific text that the Show output can never be used to reconstruct the type. For example a Path can never be constructed from UTF-8 output.

@seanmonstar I'm not sure I understand your example, that sounds like it fits into the design outlined in this RFC?

@reem note that these traits are not necessarily connected to round-trip parsing at all. The Show trait is explicitly stated as not being used for this purpose, and String is only used for this if FromStr is implemented. I believe it would be quite odd if a FromStr impl were unable to parse the default output of the type represented as a string.

ftxqxd · 2014-12-08T05:10:18Z

@alexcrichton Quite often I do something like this in a Show impl:

struct Foo {
    x: Vec<int>,
    y: Vec<f64>,
}

impl fmt::Show for Foo {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        self.x.fmt(f);
        self.y.fmt(f);
    }
}

With the name fmt being overloaded for both String and Show, I don’t think that would compile, and would require UFCS or some other way of specifying which trait to use. It’s only a mild inconvenience, but it would still be nice for Show to perhaps have its method called fmt_repr or something. I see now that all the traits in std::fmt today use fmt anyway (not just Show), but my concern is that two traits that are both commonly used will have conflicting names. (Show is the only trait in std::fmt that is commonly used today, so the name conflict isn’t a problem.)

Regarding String’s naming, I can’t quite find where I read it, but I recall a guideline that trait names should preferably be verbs, then nouns, and at worst adjectives (or something like that) that describe what the trait provides/requires. String is not only a noun, but also to me doesn’t sound like a trait for ‘something that can be represented as a (user-friendly) string’, but rather a trait for ‘something that is a string’ (much like how the old collections traits were things like Deque for anything that is a deque). Something like Stringify I suppose would be ideal, but it sounds a little silly. Even in context, fmt::String sounds like a formatting trait for strings only (like how fmt::Pointer is only implemented for pointers). I understand your point about confusion with string::String, though—I now realise that such a confusion probably won’t happen (except perhaps in rustdoc, although that problem applies to many things and is a separate issue).

SimonSapin · 2014-12-08T05:11:43Z

An implementation of the String trait is an assertion that the type can be faithfully represented as a UTF-8 string at all times.

…

The output format will not change from what it is today (no extra escaping or debugging will occur).

You can’t have both. If &str is not quoted and escaped, it takes all of the output space and there is no way to tell where it ends when it is part of a larger type. For example, ["a", "b"] and ["a, b"] would print to the same, and so &[&str] can not be round-tripped.

I was confused when first reading the RFC: I expected the round-trippable trait to be the debugging one, both separate from the "default" one.

SimonSapin · 2014-12-08T05:15:37Z

This RFC establishes a convention that Show and String produce valid UTF-8 data, but no static guarantee of this requirement is provided. Statically guaranteeing this invariant would likely involve adding some form of TextWriter which we are currently not willing to stabilize for the 1.0 release.

I find this very unfortunate, since that static guarantee can likely not be added backward-compatibly. Can you expand on why you’re not willing to stabilize TextWriter for 1.0? A minimal variation of it only needs fn write_str(&mut self, data: &str) -> Result<(), ()>, more general-purpose Writer-like functionality can be added later.

alexcrichton · 2014-12-08T05:22:13Z

@P1start that should compile today and use the Show impl for the member types. Disambiguation will require UFCS, yes. The naming for this trait should be thought of as fmt::String which translates to "format string" which is generally what the trait is performing. In general though, it follows the current design of the std::fmt traits.

@SimonSapin that is correct about string slices, but &[&str] does not implement FromStr for precisely this reason. We can't attach round-tripping to the debugging trait because it's defined for all Rust types as not all types can be represented (examples given in the RFC).

I also still believe that we cannot stabilize TextWriter for 1.0 at this time. We're cutting it real thin stabilizing std::io as is at this time. We can of course add a simple trait with one write_str method, but it's not quite as simple as we need to consider the ramifications of how TextWriter and Writer interact. For example do we now need to add BufferedTextWriter and BufferedWriter? For most I/O objects that are generic over the inner type, do we need to add duplicate types for all of these? Questions like this are why I'm not confident that we can deliver a quality experience with a TextWriter trait.

seanmonstar · 2014-12-08T05:39:34Z

Ah, I was confused just as Simon was: I misread and thought fmt::Show <=> FromStr.

Still, I think the round-trip promises should be left to the serialize crate. Decodable <=> Encodable.

seanmonstar · 2014-12-08T05:44:57Z

As for @P1start concerns, if you imported both into the same module, you would need to use UFCS to disambiguate. See http://is.gd/gjCmby

Instead, I'd need to write LowerHex::fmt(self.0, f) or Show::fmt(self.0, f).

quantheory · 2014-12-08T05:48:21Z

I'm also just slightly against the String name:

I'm still not sure I actually agree with the RFC 356 reasoning about prelude names in general. I think that we can get away with io::Result not being too confusing because it actually is a partially specialized result::Result (which I think is also true of all the Results in other std modules?). string::String and fmt::String don't have that close relationship.
Even assuming that string::String and fmt::String don't get mixed up too much in the code itself (though I think this will trip up beginners), it opens the door to more confusion in conversation and documentation. As a special case of this, the docs might get a bit stranger in places where both Strings are being mentioned (though maybe the real issue is that rustdoc should not leave off the fmt:: or string:: where there could be ambiguity).

SimonSapin · 2014-12-08T06:07:22Z

@alexcrichton

I also still believe that we cannot stabilize TextWriter for 1.0 at this time. We're cutting it real thin stabilizing std::io as is at this time.

Did that decision about std::io explicitly include consciously giving up forever on a static UTF-8 guarantee for std::fmt? I don’t think deciding to freeze a module means you can absolutely not change it when new information comes up. I’m willing to do the work to make this happen (though I understand that someone else still need to take time to review it.)

We can of course add a simple trait with one write_str method, but it's not quite as simple as we need to consider the ramifications of how TextWriter and Writer interact. For example do we now need to add BufferedTextWriter and BufferedWriter? For most I/O objects that are generic over the inner type, do we need to add duplicate types for all of these?

All interesting questions, that I think can be resolved backward-compatibly after 1.0.

By the way, from the RFC:

The formatting traits today largely provide clear guidance to what they are intended for. For example the Binary trait is intended for printing the binary representation of a data type.

It’s not clear to me at all what the Binary trait is for. Its documentation is a bit tautological, it only says “Format trait for the b character”. Is it for writing raw, non-UTF-8, binary bytes?

sfackler · 2014-12-08T06:26:51Z

It’s not clear to me at all what the Binary trait is for. Its documentation is a bit tautological, it only says “Format trait for the b character”. Is it for writing raw, non-UTF-8, binary bytes?

No, it prints numbers in binary, just like Hex prints them in hexadecimal.

wycats · 2014-12-08T06:33:34Z

I actually got my traits mixed up ;)

For what it's worth, what I was trying to get across actually wasn't "round tripping", but rather the idea that unlike fmt::String, which may be lossy (for example, format!("{}", path)), the new Show trait should try to include enough semantically important aspects of the value so that a human can re-create it.

An example of a possible problem for Show would be serializing a Vec as [ foo, bar, baz ], because it would be ambiguous between [ "foo", "bar", "baz" ] and something like [ "foo, bar", "baz" ].

Essentially, what I'm trying to say is that Show should try to let a human understand enough of the representation to understand its public semantics, but not necessarily the internal guts of the value.

seanmonstar · 2014-12-08T06:50:45Z

That makes sense. fmt::Show for a String perhaps would be String("foo bar"), not String { vec: *mut Vec }

alexcrichton · 2014-12-08T22:39:44Z

@SimonSapin

Did that decision about std::io explicitly include consciously giving up forever on a static UTF-8 guarantee for std::fmt?

Yes. When @aturon and I were discussing the traits the questions we raised (some I pasted above) were why we avoided attempting to add a TextWriter at this time.

All interesting questions, that I think can be resolved backward-compatibly after 1.0.

Maybe, but I'm not personally convinced that they will have answers in a reasonable time frame (e.g. taking into account possible language features coming down the pike).

@seanmonstar, @wycats I intend for the following impl to exist:

impl fmt::String for String {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { self.as_slice().fmt(f) }
}

impl fmt::String for str {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { f.write(self.as_bytes()) }
}

impl fmt::Show for String {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { self.as_slice().fmt(f) }
}

impl fmt::Show for str {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        try!(write!(f, "\""));
        for c in self.chars().flat_map(|c| c.escape_default()) {
            try!(write!(f, "{}", c));
        }
        write!(f, "\"") 
    }
}

Does that help clear up what String and Show are intended for?

barosl · 2014-12-09T04:49:47Z

Personally I think Inspect is less confusing than Show for newcomers, but I also agree that the churn required to change the name does not worth it.

So, in a Ruby sense, this proposal suggests Show for .inspect, String for .to_s, but not .to_str because it should be handled through the serialize crate, as Show is not guaranteed to be lossless (or round-trippable). Am I right?

aturon · 2014-12-10T07:59:48Z

Note: this may be a good time to reconsider rust-lang/rust#16544

SimonSapin · 2014-12-10T08:35:02Z

@aturon That sounds familiar… #46 :)

aturon · 2014-12-11T06:01:45Z

@SimonSapin Wow, indeed! Looks like you posted that just before I started on Rust :-)

chris-morgan · 2014-12-11T13:36:54Z

Demonstrating my Python background (where these things are the magic methods __str__ and__repr__), I’d rather like String and Repr as the names for these things. Show is rather a vague name. That would also to me suggest {:r} for raw rather than {:?} (in Python it’s {!r}, and Python uses ! where we use :), though either will do.

I also feel that the semantics of these traits needs to be tightened.

alexcrichton · 2014-12-11T15:21:43Z

@aturon That's a good point, I had forgotten about that issue! I think we may want to focus this RFC on the semantic meanings of Show and String rather than the implementation details though, I suspect we can have a separate RFC (if necessary) for size hints.

@chris-morgan could you elaborate on how you think the semantics need to be tightened? Do you feel that the Show and String definitions here are still ambiguous?

Ericson2314 · 2014-12-13T04:05:54Z

+1, Show has similar problems in Haskell, where I first became familiar with those problems. Because Show for debugging purposes was more likely to be derived, I thought of the debugging usage as the "more isomorphic" one. But based on the other comments, and the fact that deriving a String isomorphism is certainly less possible in Rust than it would be in Haskell, I agree that only (de)serializable should make a such a claim.

I like Repr for Show, and I like making Show part of the default bound, though as I mentioned in another thread I think making the list of traits entirely customizable would be neat. I also wonder if the deriving should be opt-out, or even mandatory. My only fear to making Show mandatory is it does subvert parametricity, and evil or misguided souls could exploit that.

seanmonstar · 2014-12-13T07:40:50Z

I also wonder if the deriving should be opt-out, or even mandatory. My
only fear to making String mandatory is it does subvert parametricity, and
evil or misguided souls could exploit that.

So, String couldn't be mandatory, because not all types have semantic
string values. But Show could. Maybe it can be added to the lint that
notices when a type forgets to optin to Copy?

Ericson2314 · 2014-12-13T10:34:17Z

Ah, I meant Show, edited for posterity. Yeah that names confuses me because "Show" sounds, well theatrical and thus user-facing.

seanmonstar · 2014-12-15T05:25:29Z

@alexcrichton what do you think of including a lint that checks all things implement Show (or allow the lint), similar to Copy? It'd greatly help deriving Show on new structs.

sfackler · 2014-12-15T05:39:52Z

A lint seems like a good idea to me.

alexcrichton · 2014-12-15T16:08:56Z

I'd be ok with adding a lint so long as it was allow-by-default. I personally also believe that the Copy lint should be allow-by-default as it's too noisy as warn-by-default.

SimonSapin · 2014-12-15T16:41:01Z

allow-by-default lints are much less useful for discoverability: by definition you have to already know about them to see them.

Would it make sense to have another level between allow and warn, where a given lint will print at most one message per crate? Something like "Some types don’t implement Copy. Add #![warn(missing_copy_impl)] to your crate for more details.". Maybe including the name of the offending types, but kept on one (logical) line.

sfackler · 2014-12-15T17:09:36Z

I'm more okay with a Show lint being warn-by-default than the Copy lint, since there is really no reason for any public type to not implement it, while there are valid reasons for a public type to not opt in to Copy.

alexcrichton · 2014-12-19T07:57:59Z

I've pushed a commit which tones down the wording about "round tripping" through String and FromStr. It is now recommended, but not required for this property to hold.

aturon · 2014-12-19T08:09:06Z

This RFC has been accepted.

Thank @alexcrichton for writing this RFC (and @seanmonstar for helping kick it off). It's great to finally resolve this thorny question!

There's broad support for the proposed design here, especially now with the toned-down guideline about roundtripping. We'll leave open the question of size hints for formatters (which we intend to address).

Tracking issue

Per this RFC: rust-lang/rfcs#504

Add a drawback of the String trait as defined

4782557

alexcrichton mentioned this pull request Dec 8, 2014

Add enum namespacing to the Guide. rust-lang/rust#19616

Merged

Clarify FromStr and String

48f778e

Add a sentence about not using Show output for reconstruction

c380963

seanmonstar mentioned this pull request Dec 10, 2014

Allow header hyperium/hyper#188

Merged

nrc assigned alexcrichton Dec 10, 2014

aturon mentioned this pull request Dec 13, 2014

Stabilization metabug: 1.0-alpha rust-lang/rust#19260

Closed

alexcrichton mentioned this pull request Dec 15, 2014

RFC: Statically enforce Unicode in std::fmt #526

Merged

Tweak wording to recommend, but not require round-tripping

fb4f25c

aturon mentioned this pull request Dec 19, 2014

Tracking issue for splitting Show into String and Show (RFC 504) rust-lang/rust#20013

Closed

aturon merged commit fb4f25c into rust-lang:master Dec 19, 2014

aturon mentioned this pull request Dec 27, 2014

RFC: Error Reform and Failure Box #492

Closed

huonw mentioned this pull request Jan 9, 2015

Remove integer suffixes from Show implementation rust-lang/rust#20792

Closed

aturon mentioned this pull request Jan 9, 2015

RFC: fmt::Show and fmt::String guidelines #565

Merged

sampsyo added a commit to sampsyo/fake1 that referenced this pull request Jan 13, 2015

Track Rust's String/Show split

2587557

Per this RFC: rust-lang/rfcs#504

lifthrasiir mentioned this pull request Sep 5, 2015

DelayedFormat error handling is not as optimal chronotope/chrono#47

Closed

Centril added the A-fmt Proposals relating to std::fmt and formatting macros. label Nov 23, 2018

RFC: Split Show into String and Show #504

RFC: Split Show into String and Show #504

Conversation

alexcrichton commented Dec 8, 2014

alexcrichton commented Dec 8, 2014

wycats commented Dec 8, 2014

alexcrichton commented Dec 8, 2014

sfackler commented Dec 8, 2014

alexcrichton commented Dec 8, 2014

wycats commented Dec 8, 2014

huonw commented Dec 8, 2014

seanmonstar commented Dec 8, 2014

alexcrichton commented Dec 8, 2014

ftxqxd commented Dec 8, 2014

seanmonstar commented Dec 8, 2014

seanmonstar commented Dec 8, 2014

reem commented Dec 8, 2014

alexcrichton commented Dec 8, 2014

ftxqxd commented Dec 8, 2014

SimonSapin commented Dec 8, 2014

SimonSapin commented Dec 8, 2014

alexcrichton commented Dec 8, 2014

seanmonstar commented Dec 8, 2014

seanmonstar commented Dec 8, 2014

quantheory commented Dec 8, 2014

SimonSapin commented Dec 8, 2014

sfackler commented Dec 8, 2014

wycats commented Dec 8, 2014

seanmonstar commented Dec 8, 2014

alexcrichton commented Dec 8, 2014

barosl commented Dec 9, 2014

aturon commented Dec 10, 2014

SimonSapin commented Dec 10, 2014

aturon commented Dec 11, 2014

chris-morgan commented Dec 11, 2014

alexcrichton commented Dec 11, 2014

Ericson2314 commented Dec 13, 2014

seanmonstar commented Dec 13, 2014

Ericson2314 commented Dec 13, 2014

seanmonstar commented Dec 15, 2014

sfackler commented Dec 15, 2014

alexcrichton commented Dec 15, 2014

SimonSapin commented Dec 15, 2014

sfackler commented Dec 15, 2014

alexcrichton commented Dec 19, 2014

aturon commented Dec 19, 2014