Skip to content
This repository was archived by the owner on Apr 12, 2024. It is now read-only.

perf(copy): use ES6 Map to track source/destination objects #13209

Closed
wants to merge 4 commits into from

Conversation

jbedard
Copy link
Collaborator

@jbedard jbedard commented Oct 30, 2015

@lgalfaso here's the ES6 Map change I promised.

When running f7bf118 the larger objects are 10-20% faster, but the smaller objects are a bit slower (0-25%, but the times are so small I don't think this means much). On my box that meant the larger objects were ~400ms faster, the smaller were 1-3ms slower. Memory usage also goes down 15-30% (30 on the more complicated cases, 15 on the simpler ones).

When using the shim (manually enabling it on chrome) the larger objects are 20-25% slower, the smaller objects are a bit slower just like when using the native Map. This is because the shim causes an extra indexOf call compared to before this change (one on get, a second on set). This extra indexOf could be avoided if we remove the check if the key already exists in the shim set method, but then this wouldn't be a proper Map shim (or add a magic 3rd param to set called "iAlreadyKnowTheKeyDoesNotExist` and always pass true? :P). EDIT: 105dd5d removes this extra 20-25% when using the shim

Note that while profiling isTypedArray is what stands out the most, not Map or indexOf, see #12054.


// Object keys, must use instance as key and not call toString
&& m.set(o, 2) === m && m.get(o) === 2
&& m.get({}) === undefined;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are so specific tests necessary? Do you have examples of browsers that did it wrong?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might not be necessary. I was more concerned about bad non-native shims / helpers. An is-native check might also work but I don't think that is done anywhere else...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mzgol The set method in IE11 is not spec compliant as I found out when writing my own polyfills

https://connect.microsoft.com/IE/Feedback/Details/2003432

Edit Doesn't look like the return value from set is used, so it probably doesn't matter here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I was trying to make sure get/set were fully compliant so ES6Map could be used as if it was a real map. But it's already only a subset of ES6 (just get/set) so maybe those 2 methods don't have to be perfect either...

@petebacondarwin petebacondarwin added this to the 1.4.9 milestone Dec 1, 2015
@gkalpak
Copy link
Member

gkalpak commented Dec 9, 2015

AFAICT, very few people rely (atm) on native Map implementations being available in the browser. I believe most people, either don't care or use a shim. Thus, I wouldn't make a change that makes copy() slower for the majority of cases. (It will probably be a good change for a later time.)

But, I would be all for a change that has no effect for the majority of cases (or just a tiny impact) and improves the case where a native Map implementation is available.

There are 3 categories of "usecases":

  1. Native Map: YES
  2. Native Map: NO, Map polyfill: NO
  3. Native Map: NO, Map polyfill: YES

In case (1), we will harness the perf improvement. GOOD

In case (2), we shouldn't degrade performance. If we can achieve that via the iAlreadyKnowTheKeyDoesNotExist optimization: GOOD
(I'm OK with a negligible decrease in performance.)

Case (3), which potentially applies to a large number of people atm, is what worries me. I would investigate how is browser support and how is the performance with some of the popular polyfills.
If the vast majority of the commonly used browsers support it natively*: GOOD
ElseIf the performance is not worse than what we have now (or negligibly so): GOOD
ElseIf we can detect non-native Map implementations and fall back to our shim: GOOD
Else: NOT GOOD

(*): WRT browser support for Map, according to kangax the situation is good, but probably not good enough. IE10, Safari 7 and mobile browsers doesn't seem to have adequate support. Also, note that polyfills still might do some "patchwork" on browsers with good (but not complete) support, so there might be an impact there as well.

If all is GOOD, I would go for this change. If not, I would wait for native implementations to catch up a little more.

Just my 2c...

@gkalpak
Copy link
Member

gkalpak commented Dec 9, 2015

Note that while profiling isTypedArray is what stands out the most, not Map or indexOf

@jbedard: Does this mean the isTypedArray is the bottleneck for copy() atm ? If that's the case, we should try to improve it.
I wonder if replacing the RegExp with multiple string comparison (e.g. a switch block) would improve performance.

@lgalfaso
Copy link
Contributor

lgalfaso commented Dec 9, 2015

@jbedard: Does this mean the isTypedArray is the bottleneck for copy() atm ? If that's the case, we should try to improve it.
I wonder if replacing the RegExp with multiple string comparison (e.g. a switch block) would improve performance.

BTW, changing this to string comparisons would make the fix for ArrayBuffer more natural

@jbedard
Copy link
Collaborator Author

jbedard commented Dec 10, 2015

c8768d1 has been merged which avoids the regex. Could still be changed if it would help with ArrayBuffer though. copy could do a giant switch statement which might also avoid calling toString on the same object multiple times, not sure if that's worth it though.

Could probably change the map shim to remember the last key looked up. That would make things like like get+put only do one indexOf. I might try that later and see if it makes the shim case faster...

@jbedard
Copy link
Collaborator Author

jbedard commented Dec 11, 2015

Also note that memory usage seems to go down 15-30% when using the native map.

@jbedard
Copy link
Collaborator Author

jbedard commented Dec 11, 2015

I've added a commit which avoids the double indexOf when the shim is used. This basically makes the shim the same speed/memory as today, maybe just 1-2% slower which doesn't mean much.

I'm still not sure if this is worth having a shim like this though. Would be great if others can try it out with different apps or test data...

}
},
set: function(key, value) {
var idx = this._idx(key);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we abuse of the fact that we know we are never going to override a value?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am thinking something like

set: function(key, value) {
  this._keys.push(key);
  this._values.push(value);
}
get: function(key) {
  var idx = this._keys.lastIndexOf(key);
  if (idx != -1) return this._values[idx];
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could for this copy specific use case. But I was trying to keep this shim equivalent to the native ES6 version so it could be used anywhere...

@petebacondarwin petebacondarwin modified the milestones: 1.4.9, 1.4.10 Jan 21, 2016
@petebacondarwin petebacondarwin modified the milestones: 1.4.10, 1.5.1 Jan 28, 2016
@gkalpak gkalpak modified the milestones: 1.5.1, 1.5.2 Mar 16, 2016
@matsko matsko modified the milestones: 1.5.2, 1.5.3 Mar 18, 2016
@dcherman
Copy link
Contributor

dcherman commented Sep 8, 2016

@gkalpak Actually, I was going to suggest that we consider using an implementation closer to $$HashMap but didn't want to without profiling. Since $$HashMap does property lookups and we already have precedent of adding private $$ member variables to objects, you could possibly make a more optimal solution that does not use indexOf at all.

@@ -103,6 +103,8 @@
"getBlockNodes": false,
"createMap": false,
"VALIDITY_STATE_PROPERTY": true,
"testES6Map": true,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Redundant

@gkalpak
Copy link
Member

gkalpak commented Sep 8, 2016

@dcherman, not sure what you mean 😁

I don't think property lookups (in shape-changing, unoptimizable objects) are preferrable to indexOfs.
The usecase that $$HashMap supports and Maps don't (for Angular's purposes) is the ability to recognize copied objects based on the hashKey. But I am not sure we do rely on that extra property of $$HashMap.

If we don't, then we could replace the (presumably) slower $$HashMap with ES6Map - If we do, then we can't, so I would keep $$HashMap where necessary and use the (presumably) faster ES6Map elsewhere.

Are my assumptions wrong? Did I miss something?

@jbedard
Copy link
Collaborator Author

jbedard commented Sep 8, 2016

Today $$HashMap will be faster then ES6MapShim, but probably slower then the native Map. This is because $$HashMap uses $$hashKey while ES6MapShim uses 2-arrays + indexOf.

We could update the ES6MapShim to use $$hashKey for faster lookups, but we'd still need to fallback to the 2-array solution for any objects which we can't put a $$hashKey property on (primitives, frozen objects etc). That might make it faster in most cases though, and could then maybe replace $$HashMap? That would have the side effect of putting $$hashKey onto copyed objects when the non-native Map is used.

I could try this if there's interest...

@dcherman
Copy link
Contributor

dcherman commented Sep 8, 2016

@jbedard Primitives are easy - the stringified version of a primitive is itself the key since primitives are always === to equivalents.

Never considered the frozen object use case when I implemented my own Map polyfill, good thought.

@jbedard
Copy link
Collaborator Author

jbedard commented Sep 8, 2016

Good point, and hashKey(...) seems to handles primitives anyway. Note that you must do a little more then just convert to a string though, multiple primitives can have the same string such as true and "true" etc.

I always assumed the frozen object case is why copy has never used $$HashMap.

@gkalpak
Copy link
Member

gkalpak commented Sep 8, 2016

hashKey does already support primitives without problem (because it stores the type along with the stringified version. Frozen objects are indeed a problem for $$HashMap.

Still, ignoring the lastKey optimization of ES6MapShim and ignoring assignments and === comparisons (which should be negligible compared to lookup operations) and assuming plain $$hashKey values (not functions etc):

$$HashMap ES6MapShim
Operation: get 2 prop lookups 1 indexOf + 1 array lookup
Operation: set 2 prop lookups 1 indexOf + 2 array lookups

Considering that the objects which $$HashMap needs to look up are non-optimizable, I am not convinced that it is faster than ES6MapShim. (It could be, of course.)

@jbedard
Copy link
Collaborator Author

jbedard commented Sep 8, 2016

For large collections I think that indexOf is what makes it slow...

@gkalpak
Copy link
Member

gkalpak commented Sep 8, 2016

I don't know that much about VM internals, but is indexOf more expensive than object property lookups (again for unoptimized objects)?

@jbedard
Copy link
Collaborator Author

jbedard commented Sep 8, 2016

I think looping over every entry in an array is much slower then looking up one property, doesn't matter what VM you're using.

@gkalpak
Copy link
Member

gkalpak commented Sep 8, 2016

I think looping over every entry in an array is much slower then looking up one property, doesn't matter what VM you're using.

Unless properties are stored in an array-like structure 😛
Note also that with the current $$HashMap implementation, the lookup process has to go through prototypes as well (but this can be eliminated of course).

@dcherman
Copy link
Contributor

dcherman commented Sep 8, 2016

@gkalpak I don't know about VMs other than V8, but unoptimized objects fallback to dictionary/hashmap mode. Optimized objects use hidden classes that can point as a specific place in memory where a property is stored.

Here's one article on the subject if you're interested:
http://jayconrod.com/posts/52/a-tour-of-v8-object-representation

@dcherman
Copy link
Contributor

dcherman commented Sep 8, 2016

All of that said, we're debating how to implement a data-structure that is only unsupported in IE9/10 amongst the supported browser set I believe :)

Any changes to the internals should be non-breaking since we have a well defined interface, so we can choose not to let this discussion hold up this improvement.

@gkalpak
Copy link
Member

gkalpak commented Sep 9, 2016

I totally agree that if this improves performance (most of the time) - which it seems it does - we should defnitely get it in. But I would like to sort the $$HashMap issue first - i.e. I don't want to have two similar structures (double the size, double the maintenance overhead - even if small), where one structure suffices.

So, I do believe we should sort this issues out before merging this. Do we need $$HashMap or can we get away with just ES6Map? If we can, will it noticeably affect performance? These are the questions that I am trying to figure out.

@gkalpak
Copy link
Member

gkalpak commented Sep 9, 2016

I run through all uses of HashMap and there doesn't seem to be a case where we rely on the ability to consider different objects as the same based on the hash-key. So, ES6Map should work. I'll give it a go.

@gkalpak
Copy link
Member

gkalpak commented Sep 9, 2016

Tests pass with the native Map, but some fail unexpectedly with ES6MapShim 😕

@gkalpak
Copy link
Member

gkalpak commented Sep 9, 2016

Nevermind, it was my faulty implementation of delete. Btw, the current implementation of ES6MapShim does not handle NaN correctly.

@jbedard
Copy link
Collaborator Author

jbedard commented Sep 9, 2016

So ES6MapShim works fine as a replacement for HashMap? Thanks for trying that! I can do that, should it still be in this PR? Seems a little unrelated to the original copy stuff here...

If ES6MapShim is updated to use $$hashKey where possible then that should handle the NaN case.

@gkalpak
Copy link
Member

gkalpak commented Sep 9, 2016

Yes, it seems that ES6Map can replace HashMap (not sure at what cost performance-wise).
I have played with it locally - will make a POC PR soon.

@jbedard
Copy link
Collaborator Author

jbedard commented Sep 9, 2016

I assume performance with the native Map is about the same or maybe better then HashMap, but it will now support frozen objects (not sure if that is even possible with the use cases of HashMap) and won't require putting $$hashKey on everything. The shim will be worse, but maybe only noticeable for large collections.

We can probably improve performance of the shim by using $$hashKey and only use the 2 arrays as a fallback. But I'd be a little worried about using a $$hashKey version for copy because copied objects will be polluted with $$hashKey, unlike previously...

@gkalpak
Copy link
Member

gkalpak commented Sep 9, 2016

Voilà #15114.

@Narretz Narretz modified the milestones: 1.5.x, 1.7.x Apr 12, 2018
@petebacondarwin
Copy link
Contributor

Jason just stop providing PRs!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants