HashIdentity.Structural is a bad choice for Dictionary comparer #574
Comments
It's good to have this documented. However, the results you're seeing from dict2 are not entirely unreasonable. NaNs are not equal to one another, so if you store 100 NaNs you get 100 NaNs back: they all count as different keys. So I don't think a breaking change is warranted for this at this point.
I wouldn't care either way, I guess, if this didn't have a performance impact, but it does. It means that PER comparison is used, which means that we can't fall back to the IEquatable.Equals method. Even that I could tolerate if there was a valid reason. But floating point numbers should never be used for equality operations. They can and do give strange results on Intel processors, given that the underlying calculations occur on 80-bit precision numbers, so values that have equivalent 64-bit representations can still compare as not equal. (Yes, this has bitten me in a real system!) So I feel that we are supporting a configuration that is invalid from the beginning...
Hi @manofstick - could you add some perf figures for the performance difference here, especially for non-floating point key types? I need to get a picture of how it compares to the other costs for dict, groupBy and distinctBy. Thanks
OK, first example. Trying to be a little bit realistic, I have based this on a Stack Overflow question by ttsiodras from 2011. His gist is here. My modified version is here.
The figures to note here are the "default" lines in comparison to the "structural" (or "dynamic") lines. The reason is that "default" uses the EqualityComparer.Default implementation, which calls IEquatable<>, and that path would be available to us if we had ER equality semantics rather than PER equality semantics. It should also be noted that this example is quite dependent on the GetHashCode() implementation (as shown by the difference between the "Custom" and the standard "Value" type times). GetHashCode is unaffected by the PER semantics, so the results would be better than those noted. (The "tuple default" is bad due to the custom treatment of tuples by the F# compiler.) So across the various types, ER takes roughly 50% to 70% of the time that PER takes, which is to my mind quite significant (although the whole thing shows that the right choice of data type is still the best way to get the best results!)
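For readers without access to the linked gists, the kind of comparison described above can be sketched roughly as follows. This is not the benchmark from the gist: the `Key` record, the `timeLookups` helper and the count of 100000 are all hypothetical, chosen only to show how the same lookups can be timed under `HashIdentity.Structural` and under `EqualityComparer.Default`. Timings will vary by machine, runtime and FSharp.Core version, so no figures are claimed here.

```fsharp
open System.Collections.Generic
open System.Diagnostics

// Hypothetical key type; F# records implement IEquatable<'T> automatically.
type Key = { A: int; B: int }

// Hypothetical helper: fills a dictionary using the given comparer,
// then times n successful lookups and returns the number of hits.
let timeLookups (name: string) (comparer: IEqualityComparer<Key>) (n: int) =
    let d = Dictionary<Key, int>(comparer)
    for i in 0 .. n - 1 do
        d.[{ A = i; B = i }] <- i
    let sw = Stopwatch.StartNew()
    let mutable hits = 0
    for i in 0 .. n - 1 do
        if d.ContainsKey({ A = i; B = i }) then
            hits <- hits + 1
    sw.Stop()
    printfn "%s: %d hits in %d ms" name hits sw.ElapsedMilliseconds
    hits

// Same workload, two comparers.
let structuralHits =
    timeLookups "structural" HashIdentity.Structural 100000

let defaultHits =
    timeLookups "default" (EqualityComparer<Key>.Default :> IEqualityComparer<Key>) 100000
```

A single pass like this is only indicative; a proper measurement would warm up the JIT and repeat the runs.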
Closing old discussion
dict, groupBy and distinctBy all use HashIdentity.Structural for the equality comparer, but it is not a good choice as it only supports a partial equivalence relation (PER), i.e.
Which outputs the following
Now, I'm not recommending the use of System.Collections.Generic.EqualityComparer.Default, as this is obviously deficient in other ways, but rather an ER-compatible version.
This would be a breaking change, although I think the current version is confusing, as I think most users would think it would act in the same way as the Default EqualityComparer, and most people would be using it in that fashion.
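The difference between the two comparers on NaN keys can be illustrated with a minimal sketch. This is not the original snippet from the issue; the names `keys`, `perSet` and `erSet` are hypothetical, and the count observed for the structural comparer is assumed to follow the PER behaviour reported in this thread, which may depend on the FSharp.Core version in use.

```fsharp
open System.Collections.Generic

// Two NaN keys and two ordinary duplicate keys.
let keys = [ nan; nan; 1.0; 1.0 ]

// HashIdentity.Structural: per this thread, floats are compared with
// PER semantics, so each NaN is expected to count as a distinct key.
let perSet = HashSet<float>(keys, HashIdentity.Structural)

// EqualityComparer.Default goes through Double.Equals, under which
// NaN equals NaN, so the two NaN keys collapse into one.
let erSet = HashSet<float>(keys, EqualityComparer<float>.Default)

printfn "structural: %d keys, default: %d keys" perSet.Count erSet.Count
```

If the structural count comes out larger than the default count, that is the partial-equality behaviour under discussion: a key that does not equal itself can never be found again by lookup.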