Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Set in uniq() When Available In Environment #395

Merged
merged 1 commit into from
Nov 18, 2015
Merged

Conversation

svozza
Copy link
Collaborator

@svozza svozza commented Nov 18, 2015

A recent discussion on Ramda (ramda/ramda#1512) prompted me to have a look at our uniq function so I ported over the benchmarks from there and ran them against our code base. The results were not pretty:

.uniq x 10,000 (heterogenous)

underscore ######################### 61ms
lodash #### 8ms
highland ############################################################ 151ms

.uniq x 10,000 (numbers)

underscore ############################## 5ms
lodash ############ 2ms
highland ############################################################ 10ms

.uniq x 10,000 (all unique)

underscore #################### 316ms
lodash # 8ms
highland ############################################################ 966ms

However, by using Set when it is available in the environment the performance can be improved dramatically.

.uniq x 10,000 (heterogenous)

underscore ############################################################ 68ms
lodash ######## 8ms
highland ##### 5ms

.uniq x 10,000 (numbers)

underscore ############################################################ 5ms
lodash ############ 1ms
highland #################################### 3ms

.uniq x 10,000 (all unique)

underscore ############################################################ 329ms
lodash ## 8ms
highland ## 6ms

I normally shy away from this sort of feature sniffing but this is such an easy gain that will benefit all our users running Node 0.12, 4.x and 5.x that I think it's worth it in this case. Thoughts?

@@ -2258,6 +2258,28 @@ exposeMethod('uniqBy');
*/

Stream.prototype.uniq = function () {
if (!_.isUndefined(typeof Set)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typeof Set is never going to be undefined (it might be 'undefined'), so this block will always run.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I know, silly mistake.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use !_.isUndefined(_global.Set). We detect Symbol for iterator support in the same way.

@vqvu
Copy link
Collaborator

vqvu commented Nov 18, 2015

This looks like a big enough win that it's worth it for me. I'll let travis run before merging.

@svozza
Copy link
Collaborator Author

svozza commented Nov 18, 2015

Yep, and we're lucky we don't have to jump through the hoops Ramda does because they use value equality across the board.

@vqvu vqvu added this to the v2.6.0 milestone Nov 18, 2015
vqvu added a commit that referenced this pull request Nov 18, 2015
Use Set in uniq() when available in environment.
@vqvu vqvu merged commit 7cb7984 into caolan:master Nov 18, 2015
@svozza svozza deleted the uniq-set branch November 18, 2015 23:19
@vqvu
Copy link
Collaborator

vqvu commented Nov 19, 2015

Something just occurred to me. Equality in sets is not the same as === (https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Set). So the current implementation will behave differently on different Nodes.

Specifically, Set uses SameValueZero, which is different from StrictEqualityComparison. Although the only difference is that NaN !== NaN, while SameValueZero(NaN, NaN) === true.

We need to check for NaN in the Set implementation and always pass it through.

Also, I personally don't think NaN !== NaN is intuitive despite IEEE's opinion on the matter, especially in the context of a uniq transform. Should we switch to using SameValueZero for the equality algorithm in 3.0.0?

For comparison

Edit: Typo. I meant to say that NaN !== NaN is unintuitive.

@svozza
Copy link
Collaborator Author

svozza commented Nov 19, 2015

No, I think we should stay with strict equality as that's what uniq is supposed to do. It's just a leaky abstraction if we allow the underlying implementation to change the semantics like that.

@vqvu
Copy link
Collaborator

vqvu commented Nov 19, 2015

You might have misunderstood me. For 2.x, we should definitely fix the Set implementation to use strict equality. My question was whether or not we should switch to SameValueZero for both implementation in 3.0.0.

@svozza
Copy link
Collaborator Author

svozza commented Nov 19, 2015

No, I got what you meant, I think we should keep strict equality for 3.0; for me, uniq always should be defined in terms of === .

@vqvu
Copy link
Collaborator

vqvu commented Nov 19, 2015

Fair enough.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants