This repository has been archived by the owner on Jan 23, 2023. It is now read-only.
[WIP] Fix LargeArrayBuilder.CopyTo returning incorrect end-of-copy position #23730
Note: This only affects the `LAB.CopyTo` overload that returns a `CopyPosition`. The other void overload isn't affected. The only type that uses this method is `SparseArrayBuilder` (SAB), so really only consumers of SAB are affected. Only LINQ uses SAB even though it's in Common, specifically in the functions `Concat`, `SelectMany`, `Append`, and `Prepend`. `Append` and `Prepend` weren't affected by this bug for reasons that will be explained later.

The job of `LAB.CopyTo` is to accept a `CopyPosition` to start copying from, and to return a `CopyPosition` representing the position copied up to. Previously, it always returned an incorrect `CopyPosition` because of a single line. The loop statement on that line, `row++, column = 0`, is the cause of the bug; it should not run on the very last iteration of the loop.

To give a better understanding of what it causes, consider the following repro from @OmarTawfik:

First, the SAB buffers the contents of all non-collections into a LAB, and stores references to the collections in an array (without copying the contents of those collections). It also maintains a list of "markers", which are indices at which the contents of the collections will be inserted in the future. So the above calls will create this:
Then in SAB.ToArray comes the bug. When a marker is sandwiched between the contents of two non-collections, this bug occurs.
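The buffering stage described above can be modeled in a few lines. This is a minimal sketch, not the actual SparseArrayBuilder source: the type `SparseSketch` and all of its members are hypothetical names chosen for illustration, the element type is fixed to `int` for brevity, and storing each marker's index as an output-array index is an assumption.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified model of the buffering stage.
// It is NOT the real SparseArrayBuilder; all names are illustrative.
public sealed class SparseSketch
{
    // Items taken from non-collections, stored contiguously.
    public readonly List<int> Buffer = new List<int>();

    // Collections are only referenced here; their items are copied later.
    public readonly List<ICollection<int>> Collections = new List<ICollection<int>>();

    // One marker per collection: where its gap starts in the output
    // (assumed to be an output-array index) and how many slots it needs.
    public readonly List<(int Index, int Count)> Markers = new List<(int Index, int Count)>();

    private int _reserved;

    // Length of the final array: buffered items plus reserved slots.
    public int Count => Buffer.Count + _reserved;

    // Buffer the contents of a non-collection source.
    public void AddNonCollection(IEnumerable<int> source) => Buffer.AddRange(source);

    // Remember a collection and reserve a gap for it without copying its items.
    public void AddCollection(ICollection<int> collection)
    {
        Markers.Add((Count, collection.Count));
        Collections.Add(collection);
        _reserved += collection.Count;
    }
}
```

Adding a non-collection yielding 1, then a collection holding 2, then a non-collection yielding 3 (values chosen to match the walkthrough) leaves `Buffer` as `[ 1 3 ]` with a single marker at index 1 of length 1. That marker is sandwiched between the contents of two non-collections, which is the shape that triggers the bug.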
We start out with `[ 0 0 0 ]` for the output array.

What happens in the above example is SAB.ToArray (correctly) calls LAB.CopyTo to copy the contents of the first non-collection, with `position = (0, 0)` and `count = 1`. Now the output array is `[ 1 0 0 ]`. However, SAB.ToArray receives `(1, 0)` as the returned position due to the aforementioned bug, when it should really receive `(0, 1)`.

Then we (correctly) reserve space for the collection, skipping over the 2nd slot.

Then we continue copying, starting from the position we received from the last LAB.CopyTo call. Copying from `(1, 0)` (which means index 0 in the buffer at index 1) should throw an exception, raise an assert, etc. However, due to the way it's currently written, `GetBuffer(1)` returns the same as `GetBuffer(0)`, the first buffer, which is `[ 1 3 ]`. So we copy the first item of `[ 1 3 ]` to the output array, resulting in `[ 1 0 1 ]`.

Then we fill in the collections' contents, finally resulting in `[ 1 2 1 ]`.

This bug doesn't affect `Append` or `Prepend` because they don't use the faulty position from `CopyTo`; to put it another way, there's no way they can sandwich a marker between 2 non-collections. If we call `nonCollection.Append(item).Prepend(item).ToArray()`, we get this during the first stage above:

This PR fixes the implementation, refactors some loop logic into a `CopyToCore` method and adds regression tests.
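
To make the difference between the old and the corrected end-of-copy position concrete, here is a minimal sketch that replays the walkthrough. It is not the code in this PR: `CopySketch`, its `CopyTo` signature, the `advanceOnLastIteration` switch, and `Walkthrough` are all hypothetical, positions are plain `(Row, Column)` tuples, and the fallback for an out-of-range row is an assumption that models `GetBuffer(1)` returning the same buffer as `GetBuffer(0)`.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified model of the copy loop; not the actual corefx source.
public static class CopySketch
{
    // Copies `count` items from `buffers`, starting at `start`, into `dest`
    // at `destIndex`, and returns the position copied up to.
    // advanceOnLastIteration = true mimics the old behavior, where the loop
    // statement `row++, column = 0` also ran after the final copy.
    public static (int Row, int Column) CopyTo(
        IReadOnlyList<int[]> buffers, (int Row, int Column) start,
        int[] dest, int destIndex, int count, bool advanceOnLastIteration)
    {
        int row = start.Row, column = start.Column;
        for (;;)
        {
            // Assumption: an out-of-range row silently falls back to the last
            // buffer instead of throwing (with one buffer, that is the first).
            int[] buffer = buffers[Math.Min(row, buffers.Count - 1)];
            int copied = Math.Min(buffer.Length - column, count);
            Array.Copy(buffer, column, dest, destIndex, copied);
            destIndex += copied;
            count -= copied;

            if (count == 0)
            {
                return advanceOnLastIteration
                    ? (row + 1, 0)            // old: skips the rest of this buffer
                    : (row, column + copied); // corrected: stays where copying stopped
            }

            // More to copy: move on to the start of the next buffer.
            row++;
            column = 0;
        }
    }

    // Replays the walkthrough: buffer [ 1 3 ], one reserved slot at index 1.
    public static int[] Walkthrough(bool advanceOnLastIteration)
    {
        var buffers = new List<int[]> { new[] { 1, 3 } };
        var output = new int[3];

        // Copy the single item that precedes the reserved slot.
        var position = CopyTo(buffers, (0, 0), output, 0, 1, advanceOnLastIteration);

        // Skip the reserved slot and resume from the returned position.
        CopyTo(buffers, position, output, 2, 1, advanceOnLastIteration);

        // Finally fill in the collection's contents.
        output[1] = 2;
        return output;
    }
}
```

Under these assumptions, `Walkthrough(true)` yields `[ 1 2 1 ]` because the first call returns `(1, 0)`, while `Walkthrough(false)` yields `[ 1 2 3 ]` because the first call returns `(0, 1)`.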