-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARROW-7040: [C#] - System.Memory Span.CopyTo - Crashes #6122
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the fix @abbotware.
[MethodImpl(MethodImplOptions.AggressiveInlining)] | ||
public static void CopyToFix<T>(ReadOnlySpan<T> source, int sourceOffset, Span<T> target, int targetOffset, int length) | ||
{ | ||
for (int i = 0; i < length; ++i) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a slower way of copying, and the underlying bug doesn't affect .NET Core.
We already multi-target for .NET Core. So can we only do the slow/workaround way when targeting netstandard1.3
? And when we are on .NET Core, we can continue to use Span.CopyTo
, which will perform faster than doing it one element at a time.
See
arrow/csharp/src/Apache.Arrow/Apache.Arrow.csproj
Lines 35 to 40 in e24825a
<ItemGroup Condition="'$(TargetFramework)' == 'netstandard1.3'"> | |
<Compile Remove="Extensions\StreamExtensions.netcoreapp2.1.cs" /> | |
</ItemGroup> | |
<ItemGroup Condition="'$(TargetFramework)' == 'netcoreapp2.1'"> | |
<Compile Remove="Extensions\StreamExtensions.netstandard.cs" /> | |
</ItemGroup> |
SpanExtensions.cs
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@eerhardt - targeting netcoreapp2.1
makes no sense (in my opinion) it should just be various netstandardX.Ys and Net Framworks if it were to be multi-target
I did mention in the Jira I would make it multi-target, but the above code makes me think the multi-targeting is not setup correctly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We multi-target to netcoreapp2.1
so we can take advantage of newer APIs that don't exist in netstandard. That way when this library is used in an application running on .NET Core, it can use the newer APIs.
Your new code is a perfect example. Using the built-in CopyTo
method is faster because it eventually calls memcpy
, which will copy data faster than doing it one element at a time. Since there is no issue using the built-in CopyTo
on .NET Core, it doesn't make sense to take the slower one-at-a-time route when running on .NET Core.
but the above code makes me think the multi-targeting is not setup correctly
Can you explain what you mean here? Multi-targeting is set up correctly in this project. I followed the same pattern that we do in https://github.com/dotnet/runtime, which is to not use #if
but instead split the different code into separate files, which makes it easier to reason about which code runs where. And it also aids in maintainability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is an example of an API in NetCore that is not in NetStandard 2.0 or 2.1 ?
Maybe I am taking a purist approach - But i thought the generally accepted practice is to make libraries target NetStandard... if you build an app you target NetCore or Framework and then implementation specific api's are use by the run-time.
I agree with mult-targeting and not using #if / #def.
as you described.. I just don't understand the value of Targeting NetCore
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is an example of an API in NetCore that is not in NetStandard 2.0 or 2.1 ?
All the methods on Stream that accept Span / Memory parameters instead of arrays.
ex.
arrow/csharp/src/Apache.Arrow/Extensions/StreamExtensions.netcoreapp2.1.cs
Lines 24 to 27 in 249a12f
public static int Read(this Stream stream, Memory<byte> buffer) | |
{ | |
return stream.Read(buffer.Span); | |
} |
These methods are not in netstandard2.0
and below. They were added in netstandard2.1
, but the only executable frameworks that supports netstandard2.1
is .NET Core 3.0 and 3.1. So if we don't target netcoreapp2.1
, then someone running on .NET Core 2.1 will get the slower implementation.
But i thought the generally accepted practice is to make libraries target NetStandard
That is correct, and this library DOES target netstandard. However, there are places where we want to take advantage of faster APIs (like the Stream.Read method above accepting a Span). In those situations, we compile multiple times - once for netstandard
and once for the lowest framework that has the faster API (in this case netcoreapp2.1
).
Consumers don't see anything different. They can still run on .NET Framework or Xamarin just fine. But when they run on .NET Core 2.1+, the code built specifically for that framework is used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I am proposing is to reuse the existing targets(netstandard1.3
and netcoreapp2.1
), and only doing the one-element-at-a-time copy on the netstandard1.3
target, which is the one used when the consuming application is running on the .NET Framework. When an application is running on .NET Core (where there is no issue), it will use the Apache.Arrow assembly that is compiled for netcoreapp2.1
. When compiling for netcoreapp2.1
, we can use the built in CopyTo
method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok - i thought you were arguing for minimal surface area of the affected target frameworks (since netstandard1.3 is much wider range of run times.. mono, Xamarin, etc) and the bug seems to only be in net framework. However... I am happy to put into netstandard1.3 as I am currently using a fork of arrow and this bug is the only reason I am not using the official package.
what do you think of file globing for the separate files? (see here:)
https://github.com/abbotware/abbotware/blob/master/Common/Abbotware.Core/Abbotware.Core.csproj
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what do you think of file globing for the separate files?
As long as the Solution Explorer in VS works correctly with it, I would be fine with that solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in either case - review the PR now as I did the split and added the globbing
{ | ||
for (int i = 0; i < length; ++i) | ||
{ | ||
target[targetOffset + i] = source[sourceOffset + i]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens when target
and source
overlap in memory? Does this handle that case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jkotas suggested pinning the spans before calling CopyTo
, which should avoid the access violation on .NET Framework. This should avoid issues when the memory overlaps and be more efficient than one-element-at-a-time. Can you try doing that as an alternative fix?
CopyToFix()
{
fixed (void* pFrom = MemoryMarshal.GetReference(spanFrom))
fixed (void* pTo = MemoryMarshal.GetReference(spanTo))
{
spanFrom.CopyTo(spanTo);
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll do some tests - I know the current implementation works. Without the fix it would crash randomly with in the Arrow library - but almost immediately in the sample code i provided in: dotnet/coreclr#27590
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know the current implementation works
It may work for the cases you tested. But it doesn't work when target
and source
overlap in memory. If the beginning of target
and the end of source
are the same memory location, by the time the loop gets to the end of source
, you have overwritten the original values in source
. Thus the wrong information gets copied to target
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if they overlap I would expect that to be a program bug (although I can see where that would be useful) if Span.CopyTo handles overlaps, then yes my simple approach is not correct as a temp copy is needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(this is very clear)
https://docs.microsoft.com/en-us/dotnet/api/system.readonlyspan-1.copyto?view=netcore-3.1
(the remarks should probably match the the above wording)
https://docs.microsoft.com/en-us/dotnet/api/system.span-1.copyto?view=netcore-3.1
I'll test out @jkotas suggestion and change the patch... as that seems like it solves a few problems at once
made it internal add todo
@@ -71,3 +71,10 @@ site/ | |||
**/*.Rcheck/ | |||
**/.Rhistory | |||
.Rproj.user | |||
/.vs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you need to change the .gitignore
file with this change. Can you explain why this is being proposed here?
If there is a C# specific pattern that needs to be ignored (like .vs
), there is a csharp/.gitignore
file for the C# specific patterns - but it already has .vs
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you click "Open in Visual Studio" on github - the root folder is opened and folders are created automatically... Any one else that does that will probably see those folders get created as well - so its just annoying to see those all the time..
@@ -32,10 +32,11 @@ | |||
</EmbeddedResource> | |||
</ItemGroup> | |||
|
|||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(nit) inserting this blank line is unnecessary.
using System.Runtime.InteropServices; | ||
|
||
namespace Apache.Arrow | ||
{ | ||
public static class SpanExtensions | ||
public static partial class SpanExtensions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(nit) double spaces between public
and static
@@ -41,6 +41,7 @@ | |||
using System.Runtime.CompilerServices; | |||
using System.Runtime.InteropServices; | |||
using System.Text; | |||
using Apache.Arrow; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you revert this change as well?
{ | ||
for (int i = 0; i < length; ++i) | ||
{ | ||
target[targetOffset + i] = source[sourceOffset + i]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we guaranteed that target
and source
will never overlap in memory?
If they do, this is not going to work correctly. If the beginning of target
is at the end of source
, this code will mutate the end of source
before it reads the last elements of source
. Thus the wrong data will be copied to target
.
} | ||
|
||
[MethodImpl(MethodImplOptions.AggressiveInlining)] | ||
internal static void CopyToFix<T>(ReadOnlySpan<T> source, int sourceOffset, T[] target, int targetOffset, int length) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this method? Won't the above method that takes Span<T> target
be enough, since T[]
is implicitly convertible to Span<T>
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it might no longer be needed (I removed the usage in flat buffers) - I created different overloads to get the compilation to work when I did a global replace of Span.CopyTo - If i created it, it was being used at some point for compliation
What is the status of this? Needs a rebase |
Following up again on this since 10 weeks have passed |
I'm closing this since it has gone stale but maybe someone can pick it up in the future |
Just a follow up here - a fix has been checked into .NET Framework 4.7 and 4.8 that makes this change no longer necessary. According to dotnet/runtime#13701 (comment) the fix is scheduled to be deployed in September. |
Work around for random crashes due to bug in Jitter / System.Memory
https://github.com/dotnet/coreclr/issues/27590