Equality operator causes boxing on value types #526
FYI I'm attacking some boxing issues in #513, but this case is still kind of an issue (your second example no longer causes boxing with that change applied). As I've been playing around in the equality space for a while, I don't think there is a particularly easy answer to this (i.e. one that doesn't change existing code functionality), although I do offer some possibilities at the end of this posting. The reason is that there are three classes of equality: "PER", "ER" and "Other" (basically how to deal with floating-point NaNs, as well as any other equality you want to throw at it). IStructuralEquatable handles any of these via the IEqualityComparer that is passed in. The problem is that by default "=" uses PER equality, and changing that would be a breaking change. So you can't just have any old struct deferring its equality to the Equals method without checking all its members. Here is an example of the issue:
With the results
So anyway, I said that your second version now doesn't cause boxing, and this is because it doesn't implement IStructuralEquatable, so it doesn't have to worry about this ER/PER split, and it does have IEquatable<'a>, so that is used. A potential solution to this problem could be to recursively check a value type to ensure that it has no floating-point numbers in it or in its subtypes, which is a bit dirty. Alternatively, a better solution, although a bit bloaty, might be to create a new interface IEquatablePER<'a>, have the compiler generate it for value types (and records), and then defer to it. Now I'm not a Microsoft employee, or on any committee that makes such decisions, so I can't tell you if this is likely to be done.
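The PER/ER split described above can be illustrated with a short script. This is only a minimal sketch: the `Sample` record is an illustrative type that does not appear in the thread, and the record lines show the behavior one would expect from compiler-generated structural equality rather than a measured result.

```fsharp
// Illustrative type, not from the thread. F# records get compiler-generated
// structural equality.
type Sample = { Value : float }

let x = nan

// The `=` operator applies PER semantics: NaN is not equal to itself.
let perFloat = (x = x)        // false
// Double.Equals applies ER semantics: NaN is equal to NaN.
let erFloat = x.Equals(x)     // true

// A record containing a float is expected to show the same split.
let r1 = { Value = nan }
let r2 = { Value = nan }
let perRecord = (r1 = r2)
let erRecord = r1.Equals(r2)

printfn "float:  = gives %b, Equals gives %b" perFloat erFloat
printfn "record: = gives %b, Equals gives %b" perRecord erRecord
```

Because the two answers differ, `=` cannot simply be forwarded to `Equals` for every value type without changing results for data that contains NaN.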
Or take the breaking change and fix the NaN inconsistency... Having to box structs in simple equality comparisons is a high price to pay to maintain what is anyway a bug. You think you're saving on GC pressure and you're actually causing a ton more; this makes structs almost useless. Of course with your PR one will be able to work around the issue by defining custom equality, but it's an obscure fix and it feels like writing C#.
@asik -- scenarios like this are exactly why the

[<Struct>]
type MyVal =
    val X : int
    new(x) = { X = x }
    static member op_Equality(this : MyVal, other : MyVal) =
        this.X = other.X

module Default =
    let test () =
        for i in 0 .. 10000000 do
            (MyVal(i) = MyVal(i + 1)) |> ignore

module NonStructural =
    open NonStructuralComparison
    let test () =
        for i in 0 .. 10000000 do
            (MyVal(i) = MyVal(i + 1)) |> ignore

// Real: 00:00:00.089, CPU: 00:00:00.093, GC gen0: 29, gen1: 1, gen2: 0
Default.test()
// Real: 00:00:00.003, CPU: 00:00:00.000, GC gen0: 0, gen1: 0, gen2: 0
NonStructural.test()

That's not to say we should ignore potential perf improvements to the default operators, certainly not. But at least there is a fairly straightforward workaround available with 4.0. You just need to define
@latkin Good to know! I'll update my Stack Overflow answer with this information.
Yes, simply opening
An alternative to NonStructuralComparison could be
Which would mean you wouldn't need to implement op_Equality, but could rather just use the C#-esque operators (not sure if that is a good choice or not! Name them what you will...)
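A minimal sketch of what such C#-style operators might look like, assuming they are constrained to `IEquatable<'T>` (the follow-up comment mentions compilation errors for types that do not implement it). The operator definitions and the `Point` type here are hypothetical illustrations, not code recovered from the thread.

```fsharp
open System

// Hypothetical operators: the IEquatable<'T> constraint lets the call bind to
// the strongly typed Equals, and `inline` specializes it at each call site.
let inline (==) (a : 'T when 'T :> IEquatable<'T>) (b : 'T) = a.Equals(b)
let inline (!=) (a : 'T when 'T :> IEquatable<'T>) (b : 'T) = not (a.Equals(b))

// Illustrative struct record; F# records implement IEquatable<'T> automatically.
[<Struct>]
type Point = { X : int; Y : int }

let same = { X = 1; Y = 2 } == { X = 1; Y = 2 }
let different = { X = 1; Y = 2 } != { X = 3; Y = 4 }

printfn "%b %b" same different
```

One design consequence worth keeping in mind: operators written this way follow `Equals` semantics, so values containing NaN would compare differently than they do under the built-in `=`.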
Oh, and you're still going to run into lots of trouble using value types anyway: using them as keys to containers, embedding them in other objects, in the 64-bit JIT when they are greater than 64 bits and used as parameters in a function that uses the "tail" IL instruction (this is due to calling convention rules), using things like List.find on a list of structs... #513 resolves many of these issues, but they should still be used knowing that there are potential pitfalls all over the shop...
@manofstick Looks like your (==) does the right thing for everything that's structurally equatable (records, DUs) as well as for structs. Everything is fast and correct. Might as well rename it (=) and ignore the compiler warning. The only issue I can see is that this will create compilation errors for types that don't implement IEquatable(T), but that's actually very nice; I'd like to be warned about that.
Well, I wouldn't really recommend it, but each to their own, and whoever has to support the code in the future! Overriding = would mean that they couldn't use it for structural equality on containers, including arrays. Plus the subtle change of meaning for floating-point numbers doesn't help. And most of the time the performance just doesn't matter. I agree that when it does, it certainly does, but that can just be profiler guided. Anyway, this is really a Stack Overflow discussion rather than an issue here, so I will end it here.
I disagree that this should be resolved as by design. This is surprising behavior and a performance trap. It affects all library code that uses this operator or structural equality, like Array.contains, Array.except, Array.distinct (and the same in Seq, List); causing N allocations for an input of size N is not a small performance problem, especially for larger value types (e.g. a 4x4 matrix of floats). There is no way to know which functions to avoid except through benchmarking. This reduces F#'s appeal in low-latency or high-performance applications.
OK, reopening. As mentioned above, much of the impact is reduced by the combination of #513 and the diligent use of NonStructuralComparison.
I encountered this issue while doing some research on F# equality. To me, this is surprisingly frightening. Any love for this?
I'm getting the same performance numbers on the current Visual F# 2017 nightly.
I think the correct solution here is to define a new equals operator dedicated to structs, and the compiler should give a warning for the old usage. This way nothing breaks and people can gradually fix their code. And yes, the code for FSharp.Core also has to change for things like List.contains, which would also box by default for user-defined value types.
Here's a script file demonstrating the problem for
and the output is:
And none of the above fixes will prevent this problem!
It's not quite that simple. Internally, in library functions such as groupBy, or in list element comparisons, tuples, etc., what you have suggested just doesn't work. This was the reason why #513 existed.
@manofstick Sorry but I fail to see how #513 is relevant to this issue at all. Correct me if I am wrong but #513 is about an optimization regarding value types with generic type parameters. Here we don't have that. Finally I don't get why it is not simple to fix the below code for List.contains in F# Core:
There is a single e = h1 expression leading to boxing. We just need to replace it with a proper equality check.
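As a rough illustration of what "a proper equality check" could mean, here is a sketch that swaps the `=` comparison for `EqualityComparer<'T>.Default`, which dispatches to `IEquatable<'T>.Equals` without boxing when the element type implements it. `containsEq` and `Vec2` are hypothetical names, this is not the FSharp.Core implementation, and it assumes the change in NaN semantics is acceptable.

```fsharp
open System.Collections.Generic

// Hypothetical helper, not part of FSharp.Core: a List.contains variant that
// compares with EqualityComparer<'T>.Default instead of the `=` operator.
let containsEq (value : 'T) (source : 'T list) =
    let comparer = EqualityComparer<'T>.Default
    let rec loop xs =
        match xs with
        | [] -> false
        | h :: t -> comparer.Equals(value, h) || loop t
    loop source

// Illustrative struct record; it implements IEquatable<Vec2> automatically.
[<Struct>]
type Vec2 = { X : int; Y : int }

let points = [ { X = 0; Y = 0 }; { X = 1; Y = 2 } ]
let found = containsEq { X = 1; Y = 2 } points
let missing = containsEq { X = 9; Y = 9 } points

printfn "%b %b" found missing
```

A helper like this uses `Equals` semantics rather than those of `=`, so it is not a drop-in replacement for the library function when the elements may contain NaN.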
The main issue with that is still backwards compatibility: we don't exactly have a plan for how to solve it, and we don't have approval from @dsyme for an opt-out approach. I'm also not entirely sure how we can make all library functions work with that opt-out approach, since we can't really do any shadowing in DLLs.
I'm not sure how widespread knowledge is of This approach doesn't solve the issue generally:
Despite that it can certainly help root out problems and I think deserves to be much better known |
@brianrourkeboll It looks to me like the huge difference you notice is really due to something else: the reduction of the integer sequence to a fast for-loop is not occurring in the first example. This has the flow-on effect of boxing in equality, but solving the second problem wouldn't solve the first problem for this particular example. I think (not sure, it's been a while) this is because this particular "optimization" is done during type checking, and full type information is not known to allow it to proceed. I believe (not sure) that this can be "fixed" by also doing the reduction during optimization if necessary (it still has to be done in type checking as well, because the optimization is visible in quotations when it applies).
Hmm, I'm not sure I follow. The

#time "on"

let fast (xs : int array) =
    let genericFSharpEquality (value : 'T) (data : 'T[]) =
        let mutable found = 0
        for i = 0 to data.Length - 1 do
            if data.[i] = value then
                found <- found + 1
        found
    let _ = genericFSharpEquality 1 xs
    ()

let slow (xs : int array) =
    let genericFSharpEquality (value : 'T) (data : 'T[]) =
        let mutable found = 0
        for i = 0 to data.Length - 1 do
            if data.[i] = value then
                found <- found + 1
        found
    ignore (genericFSharpEquality 1 xs)

let xs = [|1..1_000_000|]

then:

fast xs // Real: 00:00:00.002, CPU: 00:00:00.000, GC gen0: 0, gen1: 0, gen2: 0
slow xs // Real: 00:00:00.024, CPU: 00:00:00.015, GC gen0: 0, gen1: 0, gen2: 0
@brianrourkeboll You're correct, my bad.
I see. The difference is that Anyway, when it occurs this kind of inline+type-specialization is indeed a huge performance gain and can transform expensive generic equality calls to very inexpensive integer comparisons. No matter how generic equality is implemented you tend to get this kind of thing. That said, it may be possible to make dramatic improvements to the generic equality non-inlined implementation, e.g.
By type specializing I mean something like Again this doesn't fully address the boxing problem by any means. |
Could the optimizer be made smart enough to look at struct types recursively and inline the comparison if it doesn't contain any members that require structural comparison? That would at least solve the problem in many common cases, with e.g. Matrix and Vector types in game engines.
Issue mentioned in F# developer stories: how we’ve finally fixed a 9-year-old performance issue.
Boxing still happens in some cases. Take this code snippet, for example, where a record is generic and its type argument is another struct record/union:
When comparing, it will still box the nested
Comparing value types with the equality operator (=) causes unnecessary boxing to occur. Tested on Visual Studio 2015 RC. I described the issue in this StackOverflow question.
Example:
My totally uninformed guess here is that the type is cast to IEquatable<_> before the strongly typed Equals is then called.
But it gets worse! If custom equality is defined:
Then the equality operator calls Equals(obj), boxing both operands in the process! Never mind the casts then required. As you can see this results in roughly twice the amount of GC pressure. Even for reference types, this is suboptimal because casts are required.
The only workaround for this problem is to systematically avoid use of this operator, which may not be feasible when using generic code in third-party F# libraries. I see no reason why the compiler shouldn't generate a direct call to IEquatable<T>.Equals without boxing or casts; a call that could then be properly inlined by the CLR in many cases.