-
Notifications
You must be signed in to change notification settings - Fork 4.9k
Fix LargeArrayBuilder.CopyTo returning incorrect end-of-copy position #23817
Conversation
src/System.Linq/tests/ConcatTests.cs
Outdated
@@ -421,5 +421,18 @@ public void GetEnumerableOfConcatCollectionChainFollowedByEnumerableNodeShouldBe | |||
Assert.Equal(0xf00, en.Current); | |||
} | |||
} | |||
|
|||
[Fact] | |||
public void CollectionInterleavedWithLazyEnumerables_ToArray_Regression() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Do we need the "_Regression" suffix on these tests? In a year it won't matter to someone reading the code whether we added a test proactively or in response to a bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@stephentoub I personally added _Regression
because someone might think "what a weird test. What is this supposed to cover?" / etc. in the future. Admittedly, it might be more helpful to remove the suffix and just link to the github issue in a comment:
public void CollectionInterleavedWithLazyEnumerables_ToArray()
{
// This is a regression test for https://github.com/dotnet/corefx/issues/23680
...
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what a weird test. What is this supposed to cover?
That's why test naming is important, to qualify what the test is for. Comments are good, too, of course. But "_Regression" as a suffix is so moment-in-time. If you were to follow true TDD, for example, everything would need to be suffixed with "_Regression" because the tests are all written before any code is and thus all fail initially.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll add a comment with the bug URL.
{ | ||
buffer = GetBuffer(++row); | ||
copied = CopyToCore(buffer, 0); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have any tests that enter this loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added tests with larger data sets
// end of the current buffer. | ||
int copyCount = Math.Min(buffer.Length, count); | ||
// Copy until we satisfy `count` or reach the end of the current buffer. | ||
int copyCount = Math.Min(sourceBuffer.Length - sourceIndex, count); | ||
|
||
if (copyCount > 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have any tests where copyCount == 0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@OmarTawfik I don't think this can happen anymore. Add these asserts and remove the conditional:
+Debug.Assert(count > 0);
Debug.Assert(sourceBuffer.Length > sourceIndex);
// Copy until we satisfy `count` or reach the end of the current buffer.
int copyCount = Math.Min(sourceBuffer.Length - sourceIndex, count);
+Debug.Assert(copyCount > 0);
public void CollectionInterleavedWithLazyEnumerables_ToArray_Regression() | ||
{ | ||
var results = new[] { 4, 5, 6 } | ||
.SelectMany(i => i == 5 ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like we should have tests with other operators and with varying sizes, not just SelectMany/Concat and not just with one or a few items.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, since various operators specialize on the source collection type, seems like we need tests on various source types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not just SelectMany/Concat
I explained in my last PR's description that since Append/Prepend never use the incorrect returned CopyPosition
, they shouldn't be affected by this bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I explained in my last PR's description that since Append/Prepend never use the incorrect returned CopyPosition, they shouldn't be affected by this bug.
Even if that's true of the current implementation, you're assuming the current implementation will never change. Tests are important both to verify existing behavior and to help prevent future regressions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The bug is affecting only SelectMany() and Concat(). I've added tests for small and large data sets.
@@ -149,8 +149,11 @@ public void CopyTo(T[] array, int arrayIndex, int count) | |||
count -= reservedCount; | |||
} | |||
|
|||
// Finish copying after the final marker. | |||
_builder.CopyTo(position, array, arrayIndex, count); | |||
if (count > 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is beneficial for scenarios when we insert a marker at the very end, e.g. nonCollection.Append().Append()./* a bunch of times */
or nonCollection.Concat(collection)
, so 👍
? new CopyPosition(row + 1, 0) | ||
: new CopyPosition(row, copied); | ||
|
||
int CopyToCore(T[] sourceBuffer, int sourceIndex) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Is it customary in corefx to put a local func at the very end of the containing method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have very few local methods; there's no convention yet. But unless there's an obviously better place, end of the method makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what is the custom for this repo. I've not seen a lot of local functions used here yet. But that pattern is common.
{ | ||
T[] buffer = GetBuffer(index: row); | ||
Debug.Assert(sourceBuffer.Length > sourceIndex); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mentioned previously this assert failed. It's good to see that's no longer the case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, because of the if (count > 0)
check added to the calling method.
// end of the current buffer. | ||
int copyCount = Math.Min(buffer.Length, count); | ||
// Copy until we satisfy `count` or reach the end of the current buffer. | ||
int copyCount = Math.Min(sourceBuffer.Length - sourceIndex, count); | ||
|
||
if (copyCount > 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@OmarTawfik I don't think this can happen anymore. Add these asserts and remove the conditional:
+Debug.Assert(count > 0);
Debug.Assert(sourceBuffer.Length > sourceIndex);
// Copy until we satisfy `count` or reach the end of the current buffer.
int copyCount = Math.Min(sourceBuffer.Length - sourceIndex, count);
+Debug.Assert(copyCount > 0);
src/System.Linq/tests/ConcatTests.cs
Outdated
@@ -421,5 +421,18 @@ public void GetEnumerableOfConcatCollectionChainFollowedByEnumerableNodeShouldBe | |||
Assert.Equal(0xf00, en.Current); | |||
} | |||
} | |||
|
|||
[Fact] | |||
public void CollectionInterleavedWithLazyEnumerables_ToArray_Regression() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@stephentoub I personally added _Regression
because someone might think "what a weird test. What is this supposed to cover?" / etc. in the future. Admittedly, it might be more helpful to remove the suffix and just link to the github issue in a comment:
public void CollectionInterleavedWithLazyEnumerables_ToArray()
{
// This is a regression test for https://github.com/dotnet/corefx/issues/23680
...
}
public void CollectionInterleavedWithLazyEnumerables_ToArray_Regression() | ||
{ | ||
var results = new[] { 4, 5, 6 } | ||
.SelectMany(i => i == 5 ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not just SelectMany/Concat
I explained in my last PR's description that since Append/Prepend never use the incorrect returned CopyPosition
, they shouldn't be affected by this bug.
|
||
return copied == buffer.Length | ||
? new CopyPosition(row + 1, 0) | ||
: new CopyPosition(row, copied); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I believe the previous setup I had in place which looked like
/* first copy */
if (count == 0)
{
// Note it's `column + copied` instead of `copied`
return new CopyPosition(row, column + copied).Normalize(buffer.Length);
}
do
{
/* subsequent copies */
}
while (count > 0);
return new CopyPosition(row, copied).Normalize(buffer.Length);
was correct. With the current code, something like
new TE<int>(new[] { 1 }).Concat(new TE<int>(new[] { 2 })).Concat(new TE<int>(new[] { 3 })).Concat(new[] { 4 }).ToArray()
or
new[] { 1, 2, 3, 4 }.SelectMany(i => i == 4 ? new[] { i } : new TE<int>(new[] { i })).ToArray()
should result in 1 2 2 4
. Can you verify that's true?
If it is, you can add a regression test for them named NonCollectionThatDoesNotBeginAtStartOfRow_FollowedByOtherNonCollections
. What happens is the LAB the non-collections are buffered into looks like this:
C0 C1 C2
R0: 1 2 3
The first LAB.CopyTo takes (row = 0, column = 0) as input, copies 1 item (1
), and returns (0, 1). The 2nd LAB.CopyTo takes (0, 1) as input, copies 1 item (2
) and returns (0, 1) when it should return (0, 2). The 3rd LAB.CopyTo then takes (0, 1) as input and copies 2
instead of 3
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The previous code is not needed because we now assert that count
is never zero. However, I'll add the tests mentioned.
@stephentoub I can see where you're coming from, and the practical part of me agrees with you. However, a part of me would be sad to see the code go... 😢 If you want, I can try cleaning up the code and renaming things after this PR to make it easier for other people to grasp what it does, to make any future bugs intuitive/obvious to fix for others. So far, although I've tried explaining it a lot, nobody seems to understand my code but me. I am waiting to hear what @VSadov has to say though. |
@OmarTawfik OK, I was wrong in my earlier comment when I said that new TE<int> { 1 }.Concat(new TE<int> { 2 }).Concat(new TE<int> { 3 }).Concat(new[] { 4 }).ToArray() would fail. I've debugged a little to see why. Here's a test case that actually does fail: new TE<int> { 1 }.Concat(new[] { 2 }).Concat(new TE<int> { 3 }).Concat(new[] { 4 }).Concat(new TE<int> { 5 }).ToArray()
// Expected: 1 2 3 4 5
// Actual: 1 2 3 4 3 The fix is to go back to the previous setup in LAB.CopyTo which looked like /* first copy */
if (count == 0)
{
// Note it's `column + copied` instead of `copied`.
return new CopyPosition(row, column + copied).Normalize(buffer.Length);
}
do
{
/* subsequent copies */
}
while (count > 0);
return new CopyPosition(row, copied).Normalize(buffer.Length); |
@OmarTawfik After you apply the above change to avoid the bug I pointed out, this PR looks fine to me. I have some minor nits but those can wait until a later PR. |
if (copyCount > 0) | ||
{ | ||
Array.Copy(buffer, column, array, arrayIndex, copyCount); | ||
int CopyToCore(T[] sourceBuffer, int sourceIndex) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this could be simpler without a local function, but this will work too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@jamesqo thanks for the additional test case! |
T[] buffer = GetBuffer(row); | ||
int copied = CopyToCore(buffer, column); | ||
|
||
if(count == 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: in case you end up making other edits, there's a missing space between if and (
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, FWIW, I find it very confusing that CopyToCore is capturing count and modifying it. There's no indication at the call site that's happening, which means it looks like the only way count could be 0 is if it was passed in as 0, and yet there's a debug assert for that above.
src/System.Linq/tests/ConcatTests.cs
Outdated
{ | ||
// See https://github.com/dotnet/corefx/issues/23680 | ||
|
||
var concats = arrays[0]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: we try to use var only when the type is obvious from the right hand side, which it isn't here or below
If you believe this fully addresses the bug, with all corner cases covers, then LGTM (other than my comments). |
@stephentoub addressed the nit and added a couple more cases. I believe this is ready. I'll merge this as soon as it is green and prepare a PR for the release branch. |
@dotnet-bot test Tizen armel Debug Build |
…dotnet#23817) * Fix #23680 * PR Feedback * More tests * More tests
…dotnet/corefx#23817) * Fix dotnet/corefx#23680 * PR Feedback * More tests * More tests Commit migrated from dotnet/corefx@9b88997
Fixes #23680
Replaces #23730
cc @stephentoub @VSadov @jamesqo