Fix Issue 15798 - std.algorithm.mutation.copy takes target by value#6039
Fix Issue 15798 - std.algorithm.mutation.copy takes target by value#6039JackStouffer wants to merge 2 commits intodlang:masterfrom
Conversation
|
Thanks for your pull request, @JackStouffer! Bugzilla references
|
99e9d6f to
43f097f
Compare
std/stdio.d
Outdated
| .sort() // sort the lines | ||
| .copy( // copy output of .sort to an OutputRange | ||
| stdout.lockingTextWriter()); // the OutputRange | ||
| put( |
There was a problem hiding this comment.
This is a rare case where copy works as expected, the only reason being stdout.lockingTextWriter represents an output range which has only one global output destination, and therefore can be passed by value.
But generally it's a very bad idea to accept output ranges by value
There was a problem hiding this comment.
Boo, this breaks the UFCS chain. Is there as UFCS-able put in Phobos?
43f097f to
764fb77
Compare
std/stdio.d
Outdated
| stdin // read from stdin | ||
| .byLineCopy(Yes.keepTerminator) // copying each line | ||
| .array() // convert to array of lines | ||
| .sort() // sort the lines |
There was a problem hiding this comment.
I'm going to repeat @CyberShadow's comment as it got hidden with the last rebase:
Boo, this breaks the UFCS chain. Is there as UFCS-able
putin Phobos?
There was a problem hiding this comment.
This is an existential problem with put, because the member hook is also called put.
764fb77 to
8c13f74
Compare
|
@CyberShadow In this specific instance you can call |
|
I get that using |
|
But it's not UFCS-able :( |
Wasn't this example one of the few prime examples of how elegant D code can look like? 😞 |
8c13f74 to
ad3f074
Compare
|
UFCS is one of the more beautiful parts of D. Even though the code worked by chance, it saddens me that code like this now has to become uglier. I often try to structure my code to improve its UFCS-ability, e.g. [1] [2]. I remember using this very pattern (for copying a range to a file) some time recently as well. |
|
There is a solution here: define some function |
|
Yep, + invert args. I guess |
ad3f074 to
9210e90
Compare
|
I'm not 100% convinced this is the right path. "it should only work for arrays" seems short-sighted. It should only work for output ranges that don't copy their state by value. However, we have no mechanism to test for this. Even if we made it a rule that output ranges must be reference types, there's no static way to check for this. Is there no other path forward here? |
9210e90 to
3a78b6e
Compare
|
@schveiguy it also works for ranges with assignable values a-la the |
3a78b6e to
b3b8311
Compare
That's not necessarily a requirement: struct HasAssignableElements
{
int front;
void popFront() {front = 0;}
enum empty = false;
} |
|
@schveiguy I don't understand what that's supposed to illustrate. This would be treated as an output range by |
I don't think there are many output ranges that have this problem. Most reference the data anyway, or are lvalues. I am on the side of not breaking working, valid, reasonable code like the example |
andralex
left a comment
There was a problem hiding this comment.
This doesn't look too good. @JackStouffer have you considered changing copy such that it takes its target by auto ref?
std/stdio.d
Outdated
| output, // writing directly to stdout | ||
| stdin // read from stdin | ||
| .byLineCopy(Yes.keepTerminator) // copying each line | ||
| .array() // convert to array of lines |
There was a problem hiding this comment.
as an aside, the parens here must go
We can't do that completely, because that changes the semantics of auto buffer = new int[100];
iota(15).copy(buffer);
writeln(buffer[0 .. 15]);Currently, this will print the numbers from 0 to 14. If I think we can use @JackStouffer's identification of the types of ranges that copy has problems with in this PR and just make THAT overload auto-ref. See #6039 (comment) |
|
As no one pointed this out before - Jenkins failure is at tools: https://github.com/dlang/tools/blob/master/changed.d#L122 |
b3b8311 to
4296776
Compare
|
@schveiguy @andralex @CyberShadow Removed the deprecation and made the new overload take the output range by |
4296776 to
bda9ff7
Compare
|
I like this direction better. I have considered further that it's possible an output range accepted by reference may break some code. Namely an output range that does put elements into an external location, but isn't expected to affect the lvalue that was used. It might be possible to deprecate this functionality, but it might be too clever. If you pass in an lvalue by reference, you can |
schveiguy
left a comment
There was a problem hiding this comment.
I think this is solid, but we don't need to ban usage of copy now that it should work for lvalues.
std/digest/package.d
Outdated
| File file = File(filename); | ||
|
|
||
| //As digests imlement OutputRange, we could use std.algorithm.copy | ||
| //As digests imlement OutputRange, we could use std.range.put |
There was a problem hiding this comment.
Can we revert this? copy should work for this now. Alternatively, the comment looks odd since you are calling hash.put, which I first thought was a UFCS call of std.range.put. If we leave the comment as calling std.range.put, maybe include the full call to show how it's different.
std/digest/package.d
Outdated
| File file = File(filename); | ||
|
|
||
| //As digests implement OutputRange, we could use std.algorithm.copy | ||
| //As digests implement OutputRange, we could use std.range.put |
std/digest/package.d
Outdated
| auto oneMillionRange = repeat!ubyte(cast(ubyte)'a', 1000000); | ||
| auto ctx = makeDigest!MD5(); | ||
| copy(oneMillionRange, &ctx); //Note: You must pass a pointer to copy! | ||
| put(ctx, oneMillionRange); |
There was a problem hiding this comment.
Can revert this, and remove the &. Leave the note, but just say how ctx must be an lvalue for this to work.
| Hash hash; | ||
| hash.start(); | ||
| copy(range, &hash); | ||
| put(hash, range); |
There was a problem hiding this comment.
Can revert, and also switch to UFCS if you want.
There was a problem hiding this comment.
Since this one isn't a public example, and std.range is more likely to have already been imported, I'll leave this
There was a problem hiding this comment.
OK, I didn't notice that it wasn't an example.
std/digest/package.d
Outdated
| auto oneMillionRange = repeat!ubyte(cast(ubyte)'a', 1000000); | ||
| auto ctx = new MD5Digest(); | ||
| copy(oneMillionRange, ctx); | ||
| put(ctx, oneMillionRange); |
| { | ||
| copy(data, File(fileName, "wb").lockingBinaryWriter); | ||
| auto w = File(fileName, "wb").lockingBinaryWriter; | ||
| put(w, data); |
There was a problem hiding this comment.
I'd prefer reverting this, but I'm fine with the new version as well.
There was a problem hiding this comment.
I left this one to remove the global std.algorithm.mutation import.
bda9ff7 to
67c4512
Compare
changelog/copy-auto-ref.dd
Outdated
|
|
||
| Before this release, $(REF copy, std, algorithm, mutation) took all | ||
| $(REF_ALTTEXT output ranges, isOutputRange, std, range, primitives) by value. | ||
| Leads to some issues when output ranges aren't designed around a reference to |
changelog/copy-auto-ref.dd
Outdated
| Before this release, $(REF copy, std, algorithm, mutation) took all | ||
| $(REF_ALTTEXT output ranges, isOutputRange, std, range, primitives) by value. | ||
| Leads to some issues when output ranges aren't designed around a reference to | ||
| heap memory, e.g. the hash ranges in $(MREF std,digest,package): |
There was a problem hiding this comment.
I think you need to drop the package
There was a problem hiding this comment.
Is there a place to see what this actually produces?
There was a problem hiding this comment.
Yes, DAutoTest will show a preview, but it's failing atm due to trailing whitespace:
Searching for trailing whitespace
./changelog/pending.dd:180:With this release, when the target output range in a call to
./changelog/pending.dd:188:Before this release, $(REF copy, std, algorithm, mutation) took all
posix.mak:875: recipe for target 'test' failed
changelog/copy-auto-ref.dd
Outdated
| assert(h2.finish().toHexString == "D41D8CD98F00B204E9800998ECF8427E"); | ||
| ------- | ||
|
|
||
| This issue no longer occurs as long as `target` is an lvalue. |
There was a problem hiding this comment.
Just for clarity I would copy the same test with the behavior in >= 2.079
There was a problem hiding this comment.
You can just note that in 2.079 going forward, h1 and h2's results will be identical.
| isInputRange!SourceRange && | ||
| isOutputRange!(TargetRange, ElementType!SourceRange) && | ||
| !isArray!TargetRange && | ||
| !hasAssignableElements!TargetRange) |
There was a problem hiding this comment.
Only the last two overloads differ, so static if could simply things and provide better error messages here.
I assume you wanted to make it explicit that void is returned? The docs already do so.
There was a problem hiding this comment.
Not sure what you are meaning here, but I don't know how you do auto ref correctly without an overload.
std/algorithm/mutation.d
Outdated
| Returns: | ||
| The unfilled part of target | ||
| The unfilled part of `target` if `target` is not a `put` defining output | ||
| range. |
std/algorithm/mutation.d
Outdated
|
|
||
| import std.range.primitives; | ||
| import std.traits : isArray, isBlitAssignable, isNarrowString, Unqual, isSomeChar; | ||
| import std.traits : isArray, isBlitAssignable, isNarrowString, Unqual, isSomeChar, isDynamicArray; |
There was a problem hiding this comment.
Why is this import needed?
There was a problem hiding this comment.
I think this is what the test originally was (not isArray). It may be leftover.
| isInputRange!SourceRange && | ||
| isOutputRange!(TargetRange, ElementType!SourceRange)) | ||
| isOutputRange!(TargetRange, ElementType!SourceRange) && | ||
| (isArray!TargetRange || hasAssignableElements!TargetRange)) |
There was a problem hiding this comment.
Shouldn't this be just hasAssignableElements!TargetRange?
string a;
isArray!(typeof(a)).writeln; // true
isOutputRange!(typeof(a), char).writeln; // false
a[0] = "b"; // cannot modify immutable expressionThere was a problem hiding this comment.
Hm.. I think hasAssignableElements would fail on a char[], but not sure if that's a valid output range anyway.
There was a problem hiding this comment.
tested, char[] is not an output range for char, which is... strange. It looks like there's special handling for characters in put, so I would have expected this to work.
| $(REF copy, std, algorithm, mutation) is any `put` defining output range, the | ||
| target is taken as `auto ref`. This new overload is also `void` returning | ||
| vs the other overloads which return the unfilled portion of the given range | ||
| or array. |
There was a problem hiding this comment.
This isn't exact enough. It's really if the output range is not an input range with assignable elements. By deduction, this means it defines put, which is a necessary condition, but not a sufficient condition to trigger the auto-refness.
There was a problem hiding this comment.
e.g. one could define an input range with assignable elements that ALSO defines put for some optimization. In this case, it won't be auto-ref.
There was a problem hiding this comment.
Gah! If someone defines an output range which defines put and also has assignable elements, logically one would assume that any library function would prioritize put because that's supposedly the fundamental operation of output ranges. But copy prioritizes the r.front assignment, which according to the principal of least surprise is wrong. But "fixing" it would be a breaking change!
Thankfully, an output range which is also an input range with assignable elements is really rare.
There was a problem hiding this comment.
Well, I would expect that if it does define assignable elements, either interface will properly work in this scenario (copy by value). So I think it's OK.
The real problem would be if you somehow had an assignable element input range of T, which was also an output range of U ;) But I won't worry about that case...
|
I would like to hear from @andralex about the implications of using auto ref here, particularly surrounding this scenario: #6039 (comment) But other than that, LGTM. |
bcd57cd to
e890252
Compare
|
Fixed |
changelog/copy-auto-ref.dd
Outdated
| assert(h1.finish().toHexString == "E134CED312B3511D88943D57CCD70C83"); | ||
| assert(h2.finish().toHexString == "D41D8CD98F00B204E9800998ECF8427E"); | ||
| // this fails in 2.079 and later | ||
| assert(h1.finish().toHexString != h2.finish().toHexString) |
There was a problem hiding this comment.
Are you allowed to call finish more than once? I'm not familiar with the API of digests. Also the assert is trivially obvious (the two strings are different per the above asserts).
I think in the suggestion I was saying, you could just put a note after the code saying the two digests will be the same in 2.079 and later.
Alternatively, you could leave out the 3rd assert and put as a comment: "In 2.079 and later, both digests will have the same value".
e890252 to
17a0715
Compare
|
Ping @andralex |
|
Just wanted to say I am good with the rewording, thanks. |
andralex
left a comment
There was a problem hiding this comment.
Let's integrate the two versions into one that takes auto ref. Then distinguish cases with static if inside.
This will simplify the constraints. For the arrays, the auto ref won't matter anyway. Thanks!
|
@andralex sorry, I'm not sure if you saw my previous comment. Switching to auto-ref does matter for arrays -- now you will advance the array along. I guarantee there is code out there that does something like this: auto buffer = new int[100];
iota(15).copy(buffer);
writeln(buffer[0 .. 15]);Which will be semantically different if copy takes it's target via ref. |
|
@schveiguy just create a local copy of the array inside the |
|
Closing this as I'm no longer interested. If someone else wants to shepherd these changes, feel free. |
copyis designed in such a way that passing in a generic output range as the target value makes no sense, ascopyis supposed to return the "remainder" of target. As such, its inputs should be restricted to inputs that are assignable input ranges or slices.Consider the following from the stackoverflow example
This outputs
Because copy takes the target by value. What the user should have done is use
std.range.primitives.putto copy an input range to an output range which is taken by reference.This change adds a deprecation message that alerts the user. Any usage of
copyin this way is very likely deprecating broken code.