Skip to content

Conversation

@JackStouffer
Copy link
Contributor

This proposal

  • Modifies std.format to recognize any toString that accepts a character accepting output range as having a valid toString overload
  • Modifies std.format to prioritize and use the output range method when formatting aggregate types if it exists
  • De-emphasizes the scope delegate toString methods in the docs

Rationale

Firstly, speed. According to Walter, the delegate passed as a parameter can never be in-lined because of how DMD does data flow analysis. Using instead the defined put method of an output range would solve that problem.

Secondly, the vision of Phobos. Ranges are the present and the future. Output ranges really need more support in Phobos (std.conv).

Thirdly, safety. The vast majority of the code I've seen doesn't mark the delegate as @safe to allow non-safe sinks to work with it. This means that the toString method itself can never be marked @safe either. Making toString a template who's safety relies on the put method of the output range side-steps this problem entirely.

@dlang-bot
Copy link
Contributor

Thanks for your pull request, @JackStouffer!

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

@JackStouffer
Copy link
Contributor Author

Ho man, this is a fun bug. Turns out, if you don't define your FormatSpec as const ref in the parameters of your toString (and why would you when the docs don't tell you to do that), it wasn't recognized as a valid overload by formatObject before this change.

Case in point, BigInt's toString(scope void delegate(const(char)[]) sink, ref FormatSpec!char f) overload wasn't ever called by std.format. It called the string format option every time.

Fix incoming.

JackStouffer added a commit that referenced this pull request Jan 3, 2018
@JackStouffer
Copy link
Contributor Author

#5992

std/format.d Outdated
* ---
*
* Where `W` allows a character accepting
* $(REF_ALTTEXT ouput range, isOutputRange, std,_range,primitives) with its template constraints.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: ouput -> output

std/format.d Outdated
* string toString();
* ---
*
* Where `W` allows a character accepting
Copy link
Member

@quickfur quickfur Jan 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about "Where W is an output range that accepts ...".

That word order seems a little more clear. IMO.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, it really should be ref W. Accepting an output range by value is begging for trouble.

std/format.d Outdated
* The template type does not have to be called `W`.
*
* The following overloads are also accepted for legacy reasons. It's recommended that
* any new code forgo these overloads for speed and attribute acceptance reasons.
Copy link
Member

@quickfur quickfur Jan 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about also accepting a toString overload that takes an output range that can accept any range of characters? If efficiency is a concern here, it would make sense for toString to be able to take an output range that accepts a range of char/wchar/dchar, instead of needing a function call roundtrip per character.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it really just for legacy reasons though? What about classes? As soon as you templatize toString, it can't be virtual.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, that's true. Virtual toString methods could be problematic because format may not know how to construct an output range of a type that will be accepted by toString.

Perhaps one way out is use std.range.interfaces.OutputRangeObject. Since such an object is an output range, it can be recognized by format, and since it gives an opaque interface to arbitrary underlying output range implementations, it will also give us the flexibility to choose the best implementation used by format.

OTOH, since scope void delegate(const(char)[]) is already established in current code, and already allows arbitrary implementations, perhaps we should just retain support for it. Code that can afford to have toString templatized can still take advantage of a more flexible implementation of format.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@safety is still an issue for virtual toString functions, though. Well, as well as other attributes. I've no idea how to work around that.

@quickfur
Copy link
Member

quickfur commented Jan 3, 2018

In general I like the direction this is going. The main concern I have is that we should support toString taking an output range that accepts ranges of characters, in order to minimize roundtrips when outputting strings.

@JackStouffer
Copy link
Contributor Author

Changed the requirements to look for a string accepting output range instead of a char accepting one. Also added checks to make sure the output range is being taken by ref.

As for virtualizing toString, IMO the best option would be to use the string toString() overload in that case, as it's probably going to be faster with the inlining issue I mentioned. Plus, it can also be @safe in that case.

@JackStouffer
Copy link
Contributor Author

#5995

@JackStouffer
Copy link
Contributor Author

Thought about the round-trip per character issue some more, and I remembered that put solves this problem automatically for the user. Emphasized it in the docs.

@quickfur
Copy link
Member

quickfur commented Jan 4, 2018

@JackStouffer It took me a few minutes to understand why and how exactly std.range.put solves the problem of round-trip per character. You might want to put in a slightly more detailed explanation than just pointing the reader to put. :-) Also, it would be nice if there's a hyperlink to the documentation for put somewhere.

Copy link
Member

@quickfur quickfur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

The only remaining question is what to do with class hierarchies that might wish to define a toString overload that takes a FormatSpec parameter, since the only ways that work with virtual functions will now be deprecated.

@JackStouffer
Copy link
Contributor Author

@quickfur I'm not planning on deprecating the other overloads anytime soon, if ever. There's just too much code out there that uses it, and this is such a common function, that I'm not comfortable removing the delegate overloads from std.format.

std/format.d Outdated
* which accepts characters. The template type does not have to be called `W`.
*
* The following overloads are also accepted for legacy reasons. It's recommended that
* any new code forgo these overloads for speed and attribute acceptance reasons.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might want to reword this, if the older overloads are still recommended for class hierarchies where toString needs to be virtual.

@JackStouffer
Copy link
Contributor Author

@quickfur Addressed comments

Copy link
Member

@quickfur quickfur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than that, LGTM.

}
// not using the overload enum to not break badly defined toString overloads
// e.g. defining the FormatSpec as ref and not const ref led this function
// to ignore that toString overload
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a thought: to prevent similar breakages in the future, what about if the new unittest added above is changed to have non-trivial return values for the toString overloads, and the unittest actually checks for these values? The idea being that if a particular toString overload is ignored or not called correctly for whatever reason, the unittest would catch it.

Copy link
Contributor

@wilzbach wilzbach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should enforce that the FormatSpec parameter is const ref. It will otherwise result in very confusing error messages as formatObject and formatValue already do so.

std/format.d Outdated
* string toString();
* ---
*
* Where `W` is an $(REF_ALTTEXT output range, isOutputRange, std,_range,primitives)
Copy link
Contributor

@wilzbach wilzbach Jan 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There shouldn't be any need for the underscore at _range here. No?

std/format.d Outdated
* Formatting a struct by defining a method `toString`, which takes an output
* range.
*
* It's recommended that any `toString` using $(REF_ALTTEXT output _rangew, isOutputRange, std,_range,primitives)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

output ranges?

std/format.d Outdated
* range.
*
* It's recommended that any `toString` using $(REF_ALTTEXT output _rangew, isOutputRange, std,_range,primitives)
* use $(REF put, std,_range,primitives) rather than use the `put` method of the range
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again no need for _range

std/format.d Outdated

void toString(W)(ref W writer, const ref FormatSpec!char f) if (isOutputRange!(W, char))
{
// make sure to use std.range.primitives.put and not writer.put directly
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mentioning this in the summary page.

std/format.d Outdated
* this time using the `scope delegate` method.
*
* `formatValue` also allows to reuse existing format specifiers:
* This method is now discouraged. Please use the output range method
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

$(RED ..) the first sentence. Or maybe just bold?

std/format.d Outdated
{T val = void; FormatSpec!Char f; static struct S {void put(Char s){}}
S s; val.toString(s, f);
// force toString to take output range by ref
static assert(ParameterStorageClassTuple!(T.toString!(S))[0] == ParameterStorageClass.ref_);}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about enforcing that the second parameter (FormatSpec) is const and ref?

* const void toString(scope void delegate(const(char)[]) sink);
* const string toString();
* void toString(W)(ref W w, const ref FormatSpec fmt)
* void toString(W)(ref W w)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't have a public unittest showing this to the user. I know it's trivial, but I still would add one to avoid guaranteed questions on the NG.

@wilzbach wilzbach added the Review:Needs Changelog A changelog entry needs to be added to /changelog label Jan 9, 2018
@JackStouffer JackStouffer removed the Review:Needs Changelog A changelog entry needs to be added to /changelog label Jan 9, 2018

formatValue(w, p, spec);
assert(w.data == "(16,11)");
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This public tests is failing, run with:

make -f posix.mak std/format.publictests

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The line number on the error message is the one from the generated file.
(it probably makes sense to insert empty lines in it, s.t. the line numbers match up)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not really sure how to debug this. This is green on the autotester

Copy link
Contributor

@wilzbach wilzbach Jan 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Compile Phobos
  2. Run the extracted test with Phobos-dev:
unittest
{
    import std.array : appender;
    import std.format;

    static struct Point
    {
        int x, y;

        void toString(W)(ref W writer, const ref FormatSpec!char f) if (isOutputRange!(W, char))
        {
            // std.range.primitives.put
            put(writer, "(");
            formatValue(writer, x, f);
            put(writer, ",");
            formatValue(writer, y, f);
            put(writer, ")");
        }
    }

    auto w = appender!string();
    auto spec = singleSpec("%s");
    auto p = Point(16, 11);

    formatValue(w, p, spec);
    assert(w.data == "(16,11)");
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is that w.data is Point(16, 11) here.

std/format.d Outdated
{
int x, y;

void toString(W)(ref W writer, const ref FormatSpec!char f) if (isOutputRange!(W, char))
Copy link
Contributor

@wilzbach wilzbach Jan 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If constraints should have the same indentation as their declaration:

void toString(W)(ref W writer, const ref FormatSpec!char f)
if (isOutputRange!(W, char))

@JackStouffer
Copy link
Contributor Author

@wilzbach Fixed. The public tests didn't have access to std.range.primitives.

@JackStouffer
Copy link
Contributor Author

Ping @andralex

Copy link
Member

@andralex andralex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great idea, thx @JackStouffer and sorry for the long wait. I thought it's a lot more complicated :)

writeln(t); // writes "Custom toString"
-------

This has several benefits for the user. Firstly, this design is much friendlier
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Firstly -> First, Secondly -> Second...

@schveiguy
Copy link
Member

This is cool, I like it.

@JackStouffer JackStouffer changed the title [RFC] Make Output Ranges The Preferred toString Method [RFC] Make Output Ranges Using Methods The Preferred toString Option Jan 24, 2018
@CyberShadow
Copy link
Member

This pull request seems to have exposed an -allinst DMD bug, thus causing a regression:
https://issues.dlang.org/show_bug.cgi?id=19406

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants