Skip to content

Add shift(): wrapper for front and popFront#4010

Closed
wilzbach wants to merge 1 commit intodlang:masterfrom
wilzbach:next_fn
Closed

Add shift(): wrapper for front and popFront#4010
wilzbach wants to merge 1 commit intodlang:masterfrom
wilzbach:next_fn

Conversation

@wilzbach
Copy link
Contributor

Hey all,

I am quite new to D and I was wondering why there is no convenience wrapper which combines front and popFront. I know there is takeOne (which doesn't modify the source) and dropOne (which will return the resulting array after dropping one element).

I think the best example of such a method in other languages is the next method in Python, thus the naming. However feel free to suggest other names. Other ideas would be shift, takeFirst, removeFirst, shiftFirst.
If it makes sense to you I am happy to add the opposite method (takeLast,removeLast,shiftLast).

Pros:

  • syntactic sugar (see below for an example)
  • supported natively in most languages
  • I learned stack syntax with return at university (it's even on wikipedia
auto myFile = ["<crazy header>","1","2"];
// cut off the header
myFile.next.writeln;
// do sth with the body
myFile.map!(to!int).map!"a + 1".map!(to!string).joiner(",").writeln;

Cons:

  • yet another method
  • possible side effects?

As mentioned above I am just starting with D, so let me know if there is a better way to write this PR or you would add more checks (afaik front and popFront will throw an Assertion Error if the range is empty, so this behavior should be induced).

I initially asked this question at the DLang forum, which lead me to submit this PR.
https://forum.dlang.org/post/orcjkabkcffueduqkyml@forum.dlang.org

@wilzbach
Copy link
Contributor Author

Update: there's another argument to add this method - there seems to be justified confusion that moveFront is the convenience wrapper (example 1, example 2)

@JakobOvrum
Copy link
Contributor

This comes up from time to time, but the main issue is that popFront can often invalidate the return value of front, that is; the value of front is often only valid until the next call to popFront. We've sometimes called these ranges "transient ranges". A prominent example is File.byLine.

I think it's also a small issue in practice, as higher level range algorithms can take care of calling the range primitives correctly.

@e10s
Copy link
Contributor

e10s commented Feb 19, 2016

@quickfur
Copy link
Member

I think I prefer the names frontPop from issue 14598. next sounds a bit too generic for my liking, and seems too prone to name-coliisions with local variables and aggregate members.

@wilzbach
Copy link
Contributor Author

that is; the value of front is often only valid until the next call to popFront.

@JakobOvrum: Can't we then duplicate the front value as this method is really about convenience.

I think I prefer the names frontPop

Wouldn't it be a bit confusing to have frontPop and popFront? Do you prefer it over shiftFirst or takeFirst?

Any opinion from @MartinNowak?

@quickfur
Copy link
Member

@greenify Of course we can duplicate .front, if we knew it was a transient range... the problem is, how to know this in generic code? In general, you can't tell.

Having said that, though, transient ranges aren't very common, so most of the time this should work. And it does indeed simplify certain kinds of code, e.g., recursive descent parsers, where the pattern if (r.front.match(...)) { r.popFront(); doSomething; } occurs frequently.

@jmdavis
Copy link
Member

jmdavis commented Feb 20, 2016

The reality of the matter is that ranges with a transient front tend to break if you do much more than use foreach on them. They don't even work with std.array.array. So, I would not consider the fact that this won't work with transient front to be an issue at all, particularly since it's not like you can really fix that. This fundamentally doesn't work with transient front, just like plenty of other range-based stuff.

As for the name, I think that frontPop and backPop are very descriptive of what they're doing and definitely better than next, which will conflict with plenty of code. take has a very different meaning in std.range, so it would not make any sense to use something with that in its name, and shift isn't a concept that we have with ranges, so I expect that plenty of folks would be confused by it. So, while there might be some risk of confusion between frontPop and popFront, it seems like the best choice out of the currently suggested options at least.

@wilzbach
Copy link
Contributor Author

As for the name, I think that frontPop and backPop are very descriptive of what they're doing and definitely better than next

I renamed next to frontPop and added backPop too - any opinions on the code or tests?

r.popFront();
return e;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Examples should come from ddoc-ed unit tests, not in documentation. All you have to do is add /// to the previous line

@wilzbach
Copy link
Contributor Author

thanks a lot @JackStouffer :)

Examples should come from ddoc-ed unit tests, not in documentation. All you have to do is add /// to the previous line

Thanks! I didn't know that - I added the /// magic to all four test or should it be only be added to first test of a function (I found both cases in primitives.d)?

Why ref?

Because otherwise (as far as I understood it) the pointers to the range will be copied and popping will only modify the copy - see also e.g. popFrontN

Params and Returns sections needed here. See other functions in this file for examples

I could only find one place in primitives.d (the module documentation) that defines Params - is this a new convention?

-> Added & squashed.

@JackStouffer
Copy link
Contributor

Thanks! I didn't know that - I added the /// magic to all four test or should it be only be added to first test of a function (I found both cases in primitives.d)?

The secondary unit tests don't add any extra value, so I don't think that they should be ddoc-ed.

I could only find one place in primitives.d (the module documentation) that defines Params - is this a new convention?

They were added a while ago. All of the functions in Phobos should have them, but a lot haven't been updated yet.

These functions could also use the following contract

auto frontPop(Range)(ref Range r)
if (isInputRange!Range)
in
{
    assert(!r.empty, "r cannot be empty");
}
body
{
    auto e = r.front;
    r.popFront();
    return e;
}

@wilzbach
Copy link
Contributor Author

The secondary unit tests don't add any extra value, so I don't think that they should be ddoc-ed.

Okay.

These functions could also use the following contract

I thought that popFront/popBack has this contract too and will throw it up throw the stack. However this makes it more obvious to the user. Thanks again :)

@JackStouffer
Copy link
Contributor

LGTM

@quickfur
Copy link
Member

Please add links to frontPop and backPop in the comment headers in std/range/primitives.d as well as std/range/package.d.

@wilzbach
Copy link
Contributor Author

@quickfur added to header in std/range/primitives.d, but I couldn't find a place in std/range/package.d. Could you give a more explicit point there? Thanks!

@quickfur
Copy link
Member

Hmm, I guess std/range/package.d doesn't have that table anymore. I guess that means you don't have to worry about it. :-)

@quickfur
Copy link
Member

LGTM

auto frontPop(Range)(ref Range r)
if (isInputRange!Range)
in
{
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assertion message is not useful (the r). Either it must be removed or changed for more value added.
I suggest assert(!r.empty, "empty range passed in " ~ __PRETTY_FUNCTION__);�. With pretty function, the type of the argument is also written.

same remark for backPop input contract.

@wilzbach
Copy link
Contributor Author

@bbasile wow thanks - I didn't know that special keyword existed!

auto backPop(Range)(ref Range r)
if (isBidirectionalRange!Range)
in
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the message be "... passed to" rather than "passed in"?

@wilzbach wilzbach force-pushed the next_fn branch 2 times, most recently from 038774b to 01abe82 Compare April 18, 2016 20:23
Next element of the input range
*/
auto shift(Range)(ref Range r)
if (isInputRange!Range
Copy link
Contributor Author

@wilzbach wilzbach Apr 18, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a typo - popFrontN does the attribute inference for us - just popFront wouldn't work due to the difference between attribute and function.
We could use a private alias if that's too cryptic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could use a private alias if that's too cryptic.

Updated to use one - it's more readable for the user this way ;-)

@wilzbach
Copy link
Contributor Author

I have found use for such a function many times, generally for arrays.

Wohoo - apparently I am not the only one missing this function.

@wilzbach I suggest:

done.

Good point, I think r.popFront should have to be nothrow but I think it doesn't matter if front throws.

done.

don't see that restricting this to forward ranges is worthwhile

Removed and added a warning about transient ranges.

Maybe call this consumeFront.

Good idea, what do I other people think?
(I personally think that we should try to have a similar naming to other languages).

@wilzbach wilzbach changed the title Add next(): wrapper for front and popFront Add shift(): wrapper for front and popFront Apr 18, 2016
@CyberShadow
Copy link
Member

(I personally think that we should try to have a similar naming to other languages).

Considering that it is very unlikely that D would be the first language where a user encounters such a function, I think it would be more useful to use the names existing in other languages, barring any strong reasons such as conflicts (which I don't see).

}

///
@safe pure unittest
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add @nogc and nothrow. For @nogc you need to make the following change:

int[3] arr = [1, 2, 3];
int r[] = arr[];

Copy link
Contributor Author

@wilzbach wilzbach Apr 19, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks & added.
Even though I think this makes the first test for the user a bit uglier to read.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.
Yes, but it shows an interesting (for newbies) effect: only the slice gets modified and not the original array, which IIRC is not the case in other languages like JS.
Maybe it's worth adding assert (arr.equals(4.iota)) at the end.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's worth adding assert (arr.equals(4.iota)) at the end.

done ;-)

Next element of the input range
*/
auto shift(Range)(ref Range r)
if (isInputRange!Range && hasNothrowPopFront!Range)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: please indent the if with 4 spaces. Ditto for shiftBack.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm after having a short look at the other template constraints in primitives , they also don't seem to use indention

Copy link
Member

@PetarKirov PetarKirov Apr 19, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure where you looked. The only exception is putChar:

$ pcregrep -MHn "\)\nif" range/primitives.d
range/primitives.d:252:correct primitive: $(D r.put(e)) if $(D R) defines $(D put), $(D r.front = e)
if $(D r) is an input range (followed by $(D r.popFront())), or $(D r(e))
range/primitives.d:348:private void putChar(R, E)(ref R r, E e)
if (isSomeChar!E)
$ pcregrep -MHn "\)\n    if" range/primitives.d
range/primitives.d:1596:auto walkLength(Range)(Range range)
    if (isInputRange!Range && !isInfinite!Range)
range/primitives.d:1610:auto walkLength(Range)(Range range, const size_t upTo)
    if (isInputRange!Range)
range/primitives.d:1667:size_t popFrontN(Range)(ref Range r, size_t n)
    if (isInputRange!Range)
range/primitives.d:1703:size_t popBackN(Range)(ref Range r, size_t n)
    if (isBidirectionalRange!Range)
range/primitives.d:1798:void popFrontExactly(Range)(ref Range r, size_t n)
    if (isInputRange!Range)
range/primitives.d:1814:void popBackExactly(Range)(ref Range r, size_t n)
    if (isBidirectionalRange!Range)
$ pcregrep -MHn "\)\n    if" range/package.d | wc -l
86
$ pcregrep -MHn "\)\n    if" algorithm/* | wc -l
170
~/dev/repos/dlang/phobos/std
$ find -type f -name "*.d" -exec pcregrep -MHn "\)\nif" {} \; | wc -l
342

~/dev/repos/dlang/phobos/std
$ find -type f -name "*.d" -exec pcregrep -MHn "\)\n    if" {} \; | wc -l
1278

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Andrei seems to prefer the current style.

#2011 (comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean Andrei prefers the less common style. OK, can't argue with that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only exception is putChar

> grep "^if (" primitives.d
if (r.empty) {}   // can test for empty
if (isSomeChar!E)
if (isInputRange!Range && hasNothrowPopFront!Range)
if (isBidirectionalRange!Range && hasNothrowPopBack!Range)
if (!isNarrowString!(T[]) && !is(T[] == void[]))
if (isNarrowString!(C[]))
if (!isNarrowString!(T[]) && !is(T[] == void[]))
if (isNarrowString!(T[]))
if (!isNarrowString!(T[]) && !is(T[] == void[]))
if (!isNarrowString!(T[]))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean Andrei prefers the less common style. OK, can't argue with that.

Guys, I don't really care about this - the main point here is to get shift into Phobos :)
Btw we should just enable some code style tool like DScanner for Phobos and agree once on one of both ;-)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I agree.

@wilzbach wilzbach force-pushed the next_fn branch 3 times, most recently from cc51776 to 2c8cd84 Compare April 19, 2016 07:01
import std.algorithm.comparison: equal;
import std.range: iota;
// original array stays unmodified
assert((cast(int[]) arr).equal(iota(1, 4)));
Copy link
Member

@PetarKirov PetarKirov Apr 19, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about: assert(arr[].equal(iota(1, 4))); ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks - that's a lot nicer :)

@PetarKirov
Copy link
Member

Overall LGTM, but I think we have to do something about ranges with non-copyable elelements to improve the API symmetry:

function action wrapper around
shift copy front + advance auto e = r.front(); r.popFront(); return e;
? move front + advance auto e = r.moveFront(); r.popFront(); return e;
dropOne [void] + advance r.popFront(); return r;

@wilzbach
Copy link
Contributor Author

Overall LGTM, but I think we have to do something about ranges with non-copyable elelements to improve the API symmetry:

Couldn't we call moveFront in shift if it's not copyable?

@PetarKirov
Copy link
Member

PetarKirov commented Apr 19, 2016

This won't work with elements can be both moved and copied - e.g. copying a RefCounted element will increase it's ref count, moving it will not. Users should be able to choose the semantics that fit their use case.

  • Copying is sharing ownership.
  • Moving is transferring ownership.

@PetarKirov
Copy link
Member

PetarKirov commented Apr 19, 2016

If we don't come up with a good name, we can always add a template parameter:

auto shift(bool moveInsteadOfCopying = false, Range)(ref Range r)
{
    static if (!isCopyable!(ElementType!Range) && !moveInsteadOfCopying)
        static assert (0, "Can't copy elements with disabled postblit! Use `shift!true` instead!");
    // ...
}

But using shift!true and shift!false is confusing for the reader and using a flag would be too verbose - shift!(Yes.moveInsteadOfCopying).

@ntrel
Copy link
Contributor

ntrel commented Apr 20, 2016

move front + advance

I think this is quite a bit less common so it's less clear a wrapper is needed for that.

@PetarKirov
Copy link
Member

PetarKirov commented Apr 21, 2016

Yes, maybe for scripts you are right. However when your application is main() @nogc, Phobos' lacking support for move-only types (like smart pointers) becomes a huge problem. You start hitting problems every step of the way and the only solution is to reimplement everything yourself, because even simple things, such as OutputRanges, don't work.

D is a great systems programming language, but unfortunately Phobos was not written and is not actively developed with @nogc in mind. (By @nogc I mean use in constraint environments in general, version (FreeStanding), etc., not just avoiding the GC). Phobos shortcomings look like D shortcomings to people new to D developing for constrained environments, which is bad and untrue.

Ranges are great in helping you avoid allocations and unnecessary work, but that works until your application grows large enough that resource management becomes unavoidable. Then suddenly the lacking move support becomes a deal-breaker.

We can't leave things as they are. We need to start somewhere.

@andralex
Copy link
Member

andralex commented Apr 26, 2016

I'm 75/25 against this.

  • + Other languages have it, some with the same name
  • + Helps eliminate possible confusion over moveFront
  • - Enshrines wrong semantics for transient input ranges. That alone carries a lot of weight.
  • - Forces return by value even when front yields a reference.
  • - It can be done in two lines of code, and does not save the user from learning the range API.
  • - Loss of information: if popFront throws, the popped value is lost so there's no shift for those ranges. Another view on this matter is the abstraction is just not that good.
  • - Although other languages have it, their range abstraction is considerably less flexible so they can afford a more rigid semantics.

The way I look at it there's just too much cons to it. There is real value to the function, but it's weighed against powerful issues.

I do see a few positives coming out of this:

  • Reopened the discussion of transient ranges.
  • Good learning experience for the submitter - this is one pass through the whole gamut of writing/documenting/unittesting/community discussion.

I'll close this, but don't take it as standoffish. If anyone finds a pro argument that they think is killer, please reopen.

@andralex andralex closed this Apr 26, 2016
@nordlow
Copy link
Contributor

nordlow commented May 9, 2016

What about enabling {front,back}Pop only for ranges whose ElementType is guaranteed to not be vulnerable to transient ranges and add a note about transient ranges in the documentation for these new primitives? Then we can start benefit from this right now for the ElementTypes that can be safely used and leave the rest until we get traits for transient ranges. Value types and immutable arrays/slices are the only safe ElementTypes who come to my mind right now. If the front returns an l-value we should make a copy of it before it's returned from frontPop so we don't risk exposing references to value-type elements in transient ranges.

Possible alternative namings {steal,take,pull}{Front,Back}.

Somewhat related: http://forum.dlang.org/post/dafmzroxvaeejyxrkbon@forum.dlang.org

@andralex
Copy link
Member

@nordlow thanks for your input. I'd say by the point the explanation is 5x the size of the two-liner, it's better to do without. Never mind {steal,take,pull}{Front,Back} etc. Go and do great things.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.