Incremental performance improvements to element creation #3169

gbj · 2023-03-18T19:44:45Z

Description

This PR includes a series of incremental improvements to element creation speed. It is a smaller effect than I'd anticipated overall, but measurable for each step. There are two general performance approaches used:

Enabling wasm-bindgen string interning, which reduces the cost of copying frequently-used strings across the WASM-JS boundary
Using element.cloneNode() instead of document.createElement(), which is significantly faster on the browsers I've tested. (I implemented this with a very simple O(n) linear search, where n is the number of distinct elements used in an app, as I'm assuming the number of HTML elements used in any app is small enough for hashing the element names and using a HashMap not to be worth it, but I didn't benchmark a HashMap approach.)
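The caching idea in the second item can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `FakeElement`, `create_element_slow`, and `create_calls` are hypothetical stand-ins so the example runs without a browser. In a real build the slow path would be web-sys's `Document::create_element` and the fast path `Node::clone_node`.

```rust
use std::cell::RefCell;

// Hypothetical stand-in for web_sys::Element so the sketch runs without a browser.
#[derive(Clone, Debug, PartialEq)]
struct FakeElement {
    tag: String,
}

thread_local! {
    // One prototype element per distinct tag name, searched linearly.
    static CACHED_ELEMENTS: RefCell<Vec<(String, FakeElement)>> = RefCell::new(Vec::new());
    // Counts how often the slow "create" path runs (for illustration only).
    static CREATE_CALLS: RefCell<usize> = RefCell::new(0);
}

// Hypothetical slow path, standing in for document.createElement().
fn create_element_slow(tag: &str) -> FakeElement {
    CREATE_CALLS.with(|c| *c.borrow_mut() += 1);
    FakeElement { tag: tag.to_string() }
}

// Returns a fresh element, cloning a cached prototype when one exists.
fn create_element_cached(tag: &str) -> FakeElement {
    CACHED_ELEMENTS.with(|cache| {
        let mut cache = cache.borrow_mut();
        if let Some((_, proto)) = cache.iter().find(|(name, _)| name == tag) {
            // Fast path, standing in for element.cloneNode().
            return proto.clone();
        }
        // First request for this tag: create it once and keep a prototype.
        let proto = create_element_slow(tag);
        cache.push((tag.to_string(), proto.clone()));
        proto
    })
}

// Number of times the slow path has run on this thread.
fn create_calls() -> usize {
    CREATE_CALLS.with(|c| *c.borrow())
}
```

Requesting `"div"` twice and `"span"` once runs the slow path only twice; the second `"div"` is served by cloning the cached prototype.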

Trade-offs:

String interning costs a constant amount for each distinct string used, in exchange for faster speed when reusing strings. This means that for applications that involve creating the same elements or setting the same attributes multiple times, it is a net win.
The node cloning approach adds some significant complexity over the more intuitive document().create_element() approach
Each level of increased runtime speed tends to come with a slight increase in binary size

I've made this in a series of commits with increasing levels of performance, each building on top of the last, so you can decide which, if any, you want to adopt.

intern element tag names
intern attribute and event listener names
intern attribute values
use cached node cloning instead of element creation

Here are local results for js-framework-benchmark for each approach:

I'd note there seems to be a worse-than-linear slowdown somewhere in element creation: current Yew scores 1.58 on "create 1000 rows" but 2.20 on "create 10,000 rows," while the node cloning approach scores 1.41 on "create 1000" and 1.80 on "create 10,000."

I won't be sad if you decide none of this is worth it, given the magnitude of the improvements is not that big. Just wanted to offer it!

Checklist

I have reviewed my own code
I have added tests

github-actions · 2023-03-18T19:47:42Z

Visit the preview URL for this PR (updated for commit 10d5f89):

https://yew-rs-api--pr3169-performance-improvem-5zc0a3lu.web.app

(expires Sun, 09 Apr 2023 16:39:04 GMT)

🔥 via Firebase Hosting GitHub Action 🌎

github-actions · 2023-03-18T19:52:14Z

Benchmark - SSR

Yew Master

Benchmark	Round	Min (ms)	Max (ms)	Mean (ms)	Standard Deviation
Baseline	10	353.656	355.595	354.452	0.673
Hello World	10	628.596	631.143	629.305	0.692
Function Router	10	2150.388	2158.887	2155.662	2.460
Concurrent Task	10	1006.972	1008.672	1007.683	0.569
Many Providers	10	1644.552	1663.200	1652.729	6.190

Pull Request

Benchmark	Round	Min (ms)	Max (ms)	Mean (ms)	Standard Deviation
Baseline	10	353.459	355.630	354.645	0.692
Hello World	10	630.781	633.112	631.369	0.736
Function Router	10	2136.385	2177.344	2144.923	11.882
Concurrent Task	10	1006.590	1008.737	1007.958	0.678
Many Providers	10	1651.992	1677.968	1659.592	9.058

github-actions · 2023-03-18T19:52:38Z

Size Comparison

examples	master (KB)	pull request (KB)	diff (KB)	diff (%)
async_clock	101.879	104.199	+2.320	+2.278%
boids	171.810	174.132	+2.322	+1.352%
communication_child_to_parent	92.598	94.914	+2.316	+2.502%
communication_grandchild_with_grandparent	103.522	105.841	+2.318	+2.239%
communication_grandparent_to_grandchild	99.693	102.019	+2.325	+2.332%
communication_parent_to_child	89.931	92.247	+2.316	+2.576%
contexts	106.125	108.443	+2.318	+2.185%
counter	87.972	90.289	+2.317	+2.634%
counter_functional	88.307	90.624	+2.317	+2.624%
dyn_create_destroy_apps	90.819	93.140	+2.320	+2.555%
file_upload	102.270	104.142	+1.872	+1.831%
function_memory_game	164.169	166.488	+2.319	+1.413%
function_router	331.980	333.572	+1.592	+0.479%
function_todomvc	159.681	162.001	+2.320	+1.453%
futures	225.227	227.547	+2.320	+1.030%
game_of_life	108.117	110.438	+2.320	+2.146%
immutable	182.636	186.136	+3.500	+1.916%
inner_html	84.624	86.941	+2.317	+2.738%
js_callback	110.230	112.552	+2.321	+2.106%
keyed_list	198.554	200.873	+2.319	+1.168%
mount_point	87.732	90.052	+2.319	+2.644%
nested_list	111.226	113.543	+2.317	+2.083%
node_refs	94.783	97.101	+2.317	+2.445%
password_strength	1542.321	1544.560	+2.238	+0.145%
portals	95.771	98.089	+2.317	+2.420%
router	303.426	305.024	+1.599	+0.527%
simple_ssr	140.751	143.052	+2.301	+1.635%
ssr_router	368.772	370.384	+1.611	+0.437%
suspense	107.342	109.661	+2.319	+2.161%
timer	90.846	93.164	+2.318	+2.552%
todomvc	142.163	144.483	+2.320	+1.632%
two_apps	88.618	90.943	+2.325	+2.624%
web_worker_fib	152.561	154.892	+2.331	+1.528%
webgl	87.260	89.579	+2.319	+2.658%

⚠️ The following examples have changed their size significantly:

examples	master (KB)	pull request (KB)	diff (KB)	diff (%)
async_clock	101.879	104.199	+2.320	+2.278%
boids	171.810	174.132	+2.322	+1.352%
communication_child_to_parent	92.598	94.914	+2.316	+2.502%
communication_grandchild_with_grandparent	103.522	105.841	+2.318	+2.239%
communication_grandparent_to_grandchild	99.693	102.019	+2.325	+2.332%
communication_parent_to_child	89.931	92.247	+2.316	+2.576%
contexts	106.125	108.443	+2.318	+2.185%
counter	87.972	90.289	+2.317	+2.634%
counter_functional	88.307	90.624	+2.317	+2.624%
dyn_create_destroy_apps	90.819	93.140	+2.320	+2.555%
file_upload	102.270	104.142	+1.872	+1.831%
function_memory_game	164.169	166.488	+2.319	+1.413%
function_todomvc	159.681	162.001	+2.320	+1.453%
futures	225.227	227.547	+2.320	+1.030%
game_of_life	108.117	110.438	+2.320	+2.146%
immutable	182.636	186.136	+3.500	+1.916%
inner_html	84.624	86.941	+2.317	+2.738%
js_callback	110.230	112.552	+2.321	+2.106%
keyed_list	198.554	200.873	+2.319	+1.168%
mount_point	87.732	90.052	+2.319	+2.644%
nested_list	111.226	113.543	+2.317	+2.083%
node_refs	94.783	97.101	+2.317	+2.445%
portals	95.771	98.089	+2.317	+2.420%
simple_ssr	140.751	143.052	+2.301	+1.635%
suspense	107.342	109.661	+2.319	+2.161%
timer	90.846	93.164	+2.318	+2.552%
todomvc	142.163	144.483	+2.320	+1.632%
two_apps	88.618	90.943	+2.325	+2.624%
web_worker_fib	152.561	154.892	+2.331	+1.528%
webgl	87.260	89.579	+2.319	+2.658%

WorldSEnder

Thanks for the PR and the performance comparisons.

My conclusion is that I favor interning attribute keys, tag names, and event types by default, but not attribute values. I would also keep it opt-in for the user in some sense.

WorldSEnder · 2023-03-18T20:14:54Z

packages/yew/src/dom_bundle/btag/mod.rs

-                .create_element(tag)
-                .expect("can't create element for vtag")
+            thread_local! {
+                static CACHED_ELEMENTS: RefCell<Vec<(String, Element)>> = Default::default();


I'm not sure I can follow your argument for using a linear search here. Since it's caching by tag, we might even be able to fine-tune the hashing used to avoid collisions, but even without this, a HashMap would be less surprising. The "usual" website uses between 40-100 different tagNames.

I ran new Set(Array.from(document.querySelectorAll("*")).map(e => e.tagName)) on the top 50 list. A few outliers are explained by the liberal use of custom elements, especially on the Google pages. The above also counts html, body, and a few other elements that probably do not appear in the app itself, but I think expecting the average yew app to use at least 30 different elements is reasonable. I don't see the linear search being faster than a lookup in the map, and the memory overhead is most likely negligible.

Would you try this with a HashMap with a default capacity of, say, 32?

gbj · 2023-03-18

Sure, let me give a bit more of my reasoning, since we should start from the assumption that a HashMap is the right call here.

For any given comparison between O(n) and O(1), it's good to keep in mind that if n is relatively small and the constant cost behind the O(1) is relatively large, the O(n) approach may actually be more efficient. For example, if we only have 2 items in the Vec, it will obviously be cheaper to do a linear search than to look it up in a HashMap. If we have 10,000 items, it will obviously be cheaper to look it up in the HashMap.
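One way to check this trade-off empirically is a small harness like the one below. It is only a sketch under assumptions: the tag names (`tag0`, `tag1`, ...) and the `u32` payload are placeholders for real elements, and timings from a loop like this are noisy and machine-dependent, so they only hint at where the crossover lies.

```rust
use std::collections::HashMap;
use std::time::Instant;

// Linear scan: O(n) string comparisons, no hashing.
fn find_linear<'a>(cache: &'a [(String, u32)], tag: &str) -> Option<&'a u32> {
    cache.iter().find(|(name, _)| name == tag).map(|(_, v)| v)
}

// Hash lookup: O(1) on average, but every call pays for hashing `tag`.
fn find_hashed<'a>(cache: &'a HashMap<String, u32>, tag: &str) -> Option<&'a u32> {
    cache.get(tag)
}

// Builds both caches with `n` hypothetical tag names ("tag0", "tag1", ...).
fn build_caches(n: u32) -> (Vec<(String, u32)>, HashMap<String, u32>) {
    let vec: Vec<(String, u32)> = (0..n).map(|i| (format!("tag{i}"), i)).collect();
    let mut map = HashMap::with_capacity(32);
    for (name, id) in &vec {
        map.insert(name.clone(), *id);
    }
    (vec, map)
}

// Times `iters` lookups of the last key, the worst case for the linear scan.
// Assumes n >= 1. Returns (linear, hashed) elapsed time in nanoseconds.
fn time_lookups(n: u32, iters: u32) -> (u128, u128) {
    let (vec, map) = build_caches(n);
    let key = format!("tag{}", n - 1);

    let start = Instant::now();
    for _ in 0..iters {
        std::hint::black_box(find_linear(&vec, std::hint::black_box(&key)));
    }
    let linear_ns = start.elapsed().as_nanos();

    let start = Instant::now();
    for _ in 0..iters {
        std::hint::black_box(find_hashed(&map, std::hint::black_box(&key)));
    }
    let hashed_ns = start.elapsed().as_nanos();

    (linear_ns, hashed_ns)
}
```

Calling `time_lookups(30, 1_000_000)` and comparing the two numbers gives a rough feel for the n = 30 case discussed here. A native micro-benchmark does not capture WASM codegen or DOM costs, so it complements rather than replaces a full benchmark run.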

I made the guess that n = 30 is somewhere in the "not a significant difference" range. I agree this is important to actually test rather than making an assumption.

Relative to the cost of DOM rendering itself these differences are likely minimal.

Binary size: Yew already has Vec<Element> in it so this doesn't add meaningful binary size. It doesn't (afaict) have a HashMap<String, Element> anywhere, so this is a new data structure to be monomorphized and included in the binary.

I did just do a HashMap version with capacity 32 as you suggested. Here are the benchmark results

I'm pleased to say you're right and I'm wrong here: even at this small n the HashMap is winning. You can see it when creating 1000 elements if you average "create 1000" and "append 1000", and the difference is much bigger at 10,000. Of course, on this particular run this wasn't enough to overcome the general statistical noise, so the Vec approach was "faster" overall, but that difference is not significant.

Note however that the HashMap version adds another 4kb to the WASM binary size.

So that's the tradeoff to consider in using a HashMap instead: even faster element creation that may or may not be measurable vs. 4kb in the binary.

ranile · 2023-03-19

Is 4KB (I assume you meant bytes with a B and not bits) really that much of a difference when the WASM binary is highly compressible and is streamed to the client? I'm not sure. In any sizeable application, the binary size can be in MBs. 4KB is merely a drop in the bucket for that.


gbj · 2023-03-21T00:10:53Z

I've just pushed a few changes incorporating the feedback here, so the included optimizations are

cache element creation and use clone_node() (supersedes interning tag names)
intern attribute names and event types

I've added an enable-interning feature in yew for convenience that enables the same feature in wasm-bindgen, so users don't have to add a wasm-bindgen dependency themselves. This is off by default.

Unless I've missed something, this is done from my end.

futursolo · 2023-03-21T00:21:50Z

I am not sure if we should be providing a feature for interning. For any sizeable application, you would need to have wasm-bindgen as a dependency, which is where interning matters.

WorldSEnder · 2023-03-26T02:11:42Z

Looks good to me from an implementation perspective. Note that some documentation should be provided for users. Can you add a paragraph to the optimizations documentation on how to enable that, after we decide whether we want a feature in yew?

I'd argue for an "enable-interning" feature in yew directly:

it would be easier to teach to users, as it's just adding a flag and more prominently visible
on that note, it should have documentation of that feature in the crate level docs
if wasm-bindgen ever updates or changes the mechanism, we could change transparently to the new mechanism (hopefully)
a "sizeable" application might also put all the "ugly" wasm-bindgen stuff into a separate component library, and only enable string interning in the final executable.

I don't see the downsides of the feature forwarding.

futursolo · 2023-03-26T09:06:01Z

> if wasm-bindgen ever updates or changes the mechanism, we could change transparently to the new mechanism (hopefully)

If wasm-bindgen adopts a different method or deprecates interning (e.g., WebIDL?), not providing this transitive feature means one less maintenance burden for us. If wasm-bindgen can transition transparently, it is only natural to assume they would do so; if they have to introduce it as a breaking change, I do not think we can apply it without it being a breaking change either. That means avoiding this feature would help us avoid one potential breaking change.

> a "sizeable" application might also put all the "ugly" wasm-bindgen stuff into a separate component library, and only enable string interning in the final executable.

wasm-bindgen not only provides binding features, but also other things like #[wasm_bindgen(start)] to register an additional entry point, and wasm_bindgen::prelude::*, which provides things like JsCast, UnwrapThrowExt, etc., all of which are useful for sizable applications at the application level.

> it would be easier to teach to users, as it's just adding a flag and more prominently visible
> I don't see the downsides of the feature forwarding.

I would see this as maintenance overhead that can otherwise be avoided.

There are implications around interning, as it requires every string that is passed through to the JavaScript APIs to be hashed (it needs to look up whether the string is interned). This cost is added on top of the existing UTF-8 to UTF-16 encoding / decoding cost.
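The cost described above can be made concrete with a simplified model of an interning cache. This is an illustrative sketch, not wasm-bindgen's actual implementation: `Interner`, `Passed`, and `pass_to_js` are hypothetical names, and `encode_utf16` merely stands in for the work done when a string is not found in the cache.

```rust
use std::collections::HashMap;

// What crosses the (simulated) WASM-JS boundary for one string.
#[derive(Debug, PartialEq)]
enum Passed {
    // Cache hit: only a small handle to an existing JS string is sent.
    Handle(u32),
    // Cache miss: the whole string is re-encoded (UTF-8 to UTF-16).
    Encoded(Vec<u16>),
}

#[derive(Default)]
struct Interner {
    handles: HashMap<String, u32>,
    // Number of hash lookups performed, to make the per-call cost visible.
    lookups: usize,
}

impl Interner {
    // Pays the one-time cost of registering `s`; returns its handle.
    fn intern(&mut self, s: &str) -> u32 {
        let next = self.handles.len() as u32;
        *self.handles.entry(s.to_string()).or_insert(next)
    }

    // Every string is hashed and looked up, whether or not it was interned.
    fn pass_to_js(&mut self, s: &str) -> Passed {
        self.lookups += 1;
        match self.handles.get(s) {
            Some(&handle) => Passed::Handle(handle),
            None => Passed::Encoded(s.encode_utf16().collect()),
        }
    }
}
```

In this model a string such as an attribute name that was interned once is passed as a handle afterwards, while a one-off string pays for both the lookup and the encoding. That is why interning helps with repeated strings and can hurt with unique ones.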

Hence, if we include interning as a feature, we need to provide documentation around this feature flag so users can fully understand the implications and when to use it. This is something we can avoid by simply pointing users to wasm-bindgen's intern function documentation.

At the end of the day, I wouldn't be against adding this feature flag, if you think it's still worth it after considering:

Users will likely need wasm-bindgen as a dependency.
We have to write documentation for this.
We have to potentially handle issues and questions around users / libraries enabling / using this feature incorrectly.
It may potentially become a breaking change for us.

WorldSEnder · 2023-03-27T02:35:14Z

> At the end of the day, I wouldn't be against adding this feature flag, if you think it's still worth it after considering:
> 1. Users will likely need wasm-bindgen as a dependency.
> 2. We have to write documentation for this.
> 3. We have to potentially handle issues and questions around users / libraries enabling / using this feature incorrectly.
> 4. It may potentially become a breaking change for us.

All good points. After considering, I still think we should mention string interning in the crate level and optimization docs, but just point users to the wasm-bindgen feature directly. Should be less risky and less taxing for us in the long run and equally as usable.
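For reference, opting in from an application's own manifest could look like the fragment below. It is a sketch under assumptions: the version numbers are placeholders, and it relies on Cargo unifying features so that the wasm-bindgen build used by yew is the same one the application enables interning on.

```toml
[dependencies]
# Placeholder versions; use whatever the application already depends on.
yew = { version = "0.20", features = ["csr"] }
# Turn on wasm-bindgen's own interning feature instead of a yew-level flag.
wasm-bindgen = { version = "0.2", features = ["enable-interning"] }
```

With the feature enabled, strings registered through wasm-bindgen's intern function are cached. Without it, that function is documented as doing nothing.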

ranile

If you merge the changes from master, the CI should be green

ranile · 2023-04-01T22:15:21Z

packages/yew/Cargo.toml


 [features]
 ssr = ["dep:html-escape", "dep:base64ct", "dep:bincode"]
 csr = []
 hydration = ["csr", "dep:bincode"]
+enable-interning = ["wasm-bindgen/enable-interning"]


Can you please remove this feature? (see above discussion)

Suggested change

enable-interning = ["wasm-bindgen/enable-interning"]

If you would like to also add documentation for interning, it should go in optimizations docs

ranile

Looks good to me! Thanks for taking the time to work on this

voidpumpkin

🚀

gbj added 6 commits March 18, 2023 12:28

enable interning

bf0a94c

intern tag names

adcc39d

intern attribute keys and event listener types

6712d9e

intern attribute values

28653c4

cache and clone elements

dee538a

clean up the node cloning version a bit

c64502b

WorldSEnder reviewed Mar 18, 2023

View reviewed changes

gbj added 3 commits March 19, 2023 17:12

use HashMap instead of Vec for element cache

fbcf898

Revert "intern attribute values"

4a63a87

This reverts commit 28653c4.

add enable-interning feature to Yew that activates the same in wasm…

b58417f

…-bindgen

ranile reviewed Apr 1, 2023

View reviewed changes

ranile added the performance and A-yew (Area: The main yew crate) labels Apr 1, 2023

futursolo mentioned this pull request Apr 2, 2023

Encode Path Parameters in yew-router #3187

Merged

2 tasks

voidpumpkin added the S-waiting-on-author Status: awaiting action from the author of the issue/PR label Apr 2, 2023

remove interning feature

10d5f89

voidpumpkin removed the S-waiting-on-author Status: awaiting action from the author of the issue/PR label Apr 2, 2023

ranile approved these changes Apr 2, 2023

View reviewed changes

ranile requested review from WorldSEnder and futursolo April 2, 2023 19:28

voidpumpkin approved these changes Apr 2, 2023

View reviewed changes

voidpumpkin merged commit bdf5712 into yewstack:master Apr 2, 2023

gbj mentioned this pull request Apr 16, 2023

Some front-end performance suggestions saru-tora/anansi#23

Open
