Note: Implementation uses manual DOM manipulations [sticky] #772
I believe the WASM implementations like … are worth mentioning.
This benchmark should not force any programming style; it should just be a framework for exercising different libraries that perform certain functionality. If you want to consider the "effort to write the code", there are better ways to measure that. You can add "lines of code" of the implementation as a measure that shows how much code a developer will need to write to implement the included functionality. As I mentioned in the jQuery issue, this benchmark should allow all types of libraries. People can decide to use a library or not based on the performance as well as the lines of code (a measure of effort). Personal preference for syntax or style should not be covered here.
I think lines of code are a poor metric for effort. I can write very concise code that is hard to reason about yet is very performant. Or I can write Elm, which is easy to write but has a disproportionate amount of LoC. That being said, people can just look at the implementations and make their own judgement call, assuming they do equivalent things.

The problem with non-data-driven approaches is they become ultimately meaningless from a comparison perspective, outside of serving as a guidepost for optimal performance. The scenario presented here is an approximation of a category of solution but shouldn't represent a specific one. Let me explain why this matters. I think there are 3 factors at play. Having knowledge of the specific problem allows certain types of optimizations that would make no sense for a general solution. However, restricting these lower-level libraries makes no sense, as there is no reason jQuery should adopt React's abstraction. It's an artificial constraint.

Conversely, every library has an escape hatch back to VanillaJS. So an implementor could just grab a ref and wire everything up by hand. Sure, it bypasses the library, but it is no more effort or complexity than the VanillaJS implementation. We've seen that in implementations here. It is a tempting approach to get performance. One could argue it is even idiomatic. But any library could do the same thing.

We've gotten to a place where declarative libraries are knocking on the door of vanilla performance anyway, so it's a shrinking zone, and I'm not even sure what the idiomatic jQuery implementation looks like anymore. How vanilla is it? I am a little curious how jQuery performs these days myself, and we have enough reference builds it could be compared against, so maybe it is worth a go. Like, how does it compare against optimal WASM written in Rust? But without some common ground these stop testing the same thing. Arguably we are past that point already, but it is still of value at minimum for categorizing the solutions.
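For illustration, a minimal sketch (mine, not code from this repo) of the "escape hatch" described above, assuming React: a component that grabs a ref and wires the DOM up by hand, bypassing the library's declarative rendering. The component and data shape are hypothetical.

```jsx
import { useRef, useEffect } from "react";

// Hypothetical component: renders rows imperatively through a ref,
// VanillaJS-style, instead of mapping rows to JSX.
function Rows({ rows }) {
  const tbody = useRef(null);

  useEffect(() => {
    const el = tbody.current;
    el.textContent = ""; // throw away the old rows
    for (const row of rows) {
      const tr = document.createElement("tr");
      tr.textContent = row.label;
      el.appendChild(tr);
    }
  }, [rows]);

  return (
    <table>
      <tbody ref={tbody} />
    </table>
  );
}
```

React never diffs the rows here; the update logic is exactly what a VanillaJS implementation would write, which is the ambiguity being discussed.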
Well, LoC is one of the many things that show the effort it took to write some code. I don't mean that this is the best solution, but at least it gives some context. For example, if there were a LoC number, I could see that the raw wasm-bindgen implementation is too verbose. Maybe another way is to add a hover effect to show the code for a certain framework.
That's my point. There is no reference for its performance, and because of that, people everywhere say that "jQuery is not good since it is slow!".
These are hard to detect, and it gets into a gray area. In my opinion, the programming style or approach is not a good way to detect this. Instead, if some implementation explicitly drops from the official API to the vanilla JS API in the human-written code, then that is a good sign for detecting this. On the other hand, there may be some frameworks that have made these optimizations their "official API". For example, they say: do this if you want to achieve this functionality. I think we should allow these types of implementations as long as they are considered "first-class API". This is not hard to imagine if WebAssembly finally gets direct access to Web IDL APIs. That would mean wasm-based frameworks could beat JavaScript. If this happens, we should start porting our JavaScript libraries to Wasm. Using AssemblyScript, this will not be that hard! Maybe we get a "solid-wasm" someday!
a jQuery impl would be pretty much identical to vanilla, both in perf and imperative nature. jQuery is "slow" because people tend to over-use selectors in loops (and selectors were done in software/Sizzle, not querySelectorAll). jquery code frequently manipulates attrs rather than properties, and uses things like innerHTML and string parsing for creating DOM elements, e.g. `$('<div>...</div>')`. if you avoid using these jQuery facilities, you just end up with a super thin sugar layer over the vanilla dom, and it all becomes the same...and pointless. btw, https://github.com/franciscop/umbrella is a modern/tiny jQuery alternative, with a matching API.
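To make the contrast concrete, a small sketch (mine, not from the thread) of the jQuery habits described above versus the property-based equivalent; element ids and classes are made up:

```js
// The "slow" jQuery habits: string parsing to create elements,
// and going through the attribute layer.
$("#content").append("<div class='row'>hello</div>");
$(".row").attr("class", "row danger");

// The thin-sugar-over-vanilla equivalent: direct node creation
// and plain property writes.
const el = document.createElement("div");
el.className = "row";
el.textContent = "hello";
document.getElementById("content").appendChild(el);
el.className = "row danger";
```

Both end in the same DOM state; the first pays for HTML parsing and attribute handling, the second is essentially what VanillaJS does.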
That's why I am interested in seeing (one or more) implementations here. When there is no point of comparison, we can't compare them. Here it points to different jQuery-like libraries. It would be nice to implement these and compare them against each other. We could also have a jQuery 4 implementation, which would be interesting to look at.
@aminya Well, jQuery is a library.. Some people may not know how to benchmark JavaScript, but if you want to compare library performance you can easily use jsbench.me for that. Based on your mentioned link I created a simple benchmark for jQuery and its friends. I included ScarletsFrame in the test as a sample framework because it has jQuery-like features. Why is jQuery not included? Because the main goal of this project is framework benchmarking.

```html
<!-- The framework may need to compile/parse this template on the fly -->
<div id="content">
  <div @for="val in list">dummy {{ val.text }}</div>
</div>
```

```js
// Depending on the framework's design, the implementation or the
// available features may be different from other frameworks
var content = new Framework('#content', {
  list: [{text: "Henlo"}]
});

// A data-driven framework may have features that don't
// require you to update the DOM manually

// Editing the first array value
content.list[0].text = "Hello"; // Framework may already bind this property and update the DOM

// Add new data into the array
content.list.push({text: "World!"}); // Framework may immediately add the new element into the DOM
```

With jQuery: example result

```html
<div id="content"></div>
```

```js
var content = $('#content');
var list = [{text: "Henlo"}];

// We need to manually update the DOM to match the list
for(var i = 0; i < list.length; i++){
  // Bottleneck: jQuery needs to parse the string to generate a new element
  content.append(`<div>dummy ${ list[i].text }</div>`);
}

// Editing the first array value
list[0].text = "Hello"; // We need to manually update the DOM after changing this
content.children().eq(0).text(`dummy ${ list[0].text }`);

// Add new data into the array
list.push({text: "World"}); // Again.. we need to manually update the DOM
content.append(`<div>dummy ${ list[1].text }</div>`);
```

For me, not including jQuery in the test is acceptable, because as a framework its level of implementation is different. It's not that I hate jQuery; I love its simple syntax for some specific cases. However, a framework is more optimized for handling multiple cases in an elegant way.
Calm down bruh.. Don't light the war, or I will join the forces..
To be honest: I'm not too excited about maintaining a few jQuery-based (or similar) frameworks.
These types of benchmarks are far from real-world operations (who wants to calculate the length!). In contrast, the benchmarks in this repository resemble real-world situations quite well.
@aminya It's true that the benchmark I provided is about counting the available elements that have … The test code on JSBench is editable; you can just modify it into …

A library provides a collection of small parts or tools which help the developer build their own design or architecture. But with a framework, these small parts are managed by the framework itself to provide the medium parts, so the developer can immediately design their app without needing to manage the DOM directly.

The benchmark in this repository is about comparing how frameworks handle these small or native JavaScript functions in some given situations. If using jQuery, we would need to manage or access the DOM directly, just like VanillaJS does. Instead of that, we might need to use query selectors more often, or parse text for creating elements, to avoid the benchmark's source code being very dependent on native DOM functions.
@krausest Do you agree that the WASM libraries should probably be marked as reference builds due to this issue? I think there is more grey area elsewhere, but …

Due to some older libraries being removed for inactivity, I think the number of implementations using imperative methods for select row has gone down. I'm on board with tightening up the definition of data-driven a bit here, although I suspect at this point there might be enough libraries in the greyer area that they could almost fill their own chart. That's ignoring explicit/implicit event delegation, which I think is hard to argue against because it can be "the way" to do stuff in some libraries. I've tried my best to look at every implementation (except …)
Clean: solid (after #794), …

Dirty Model (put selected on each row data): mikado, …

Direct DOM Manipulation: vanillajs, …

Reflections: Some popular libraries are dirtying the model: lit-html, ember. lit-html is especially interesting, as lit-element does not. Given the lit-element version doesn't do a WC per row, I imagine the majority of the overhead between these 2 approaches might be something like this. The vast majority of libraries using direct DOM and dirty models are in the top 30, but not all of them. The top 30 looks very different if this is corrected.
Counter-proposal: instead of trying to define the impossible (and inflammatory), like what's a hack / dirty / cheating etc, solve the problem from the other side by reporting results to only 1 significant digit in the results app. Aka all these frameworks are at 1.0, these at 1.1, these at 1.2, etc. No 1.06, 1.07, etc, which is meaningless precision anyway. Randomize the sort order of results that are equal to that precision. So from the m85 results, vanilla, mikado, solid, stage0, sinuous, fidan and domc are all in the 1.0 club, and a random one will happen to be "first" each time the page reloads. Congrats guys, time to focus on something other than the last 1% or, even worse, the hope that you get a "lucky run" and show up at the top of the next results page, or, even worse than that, whether another dev's 1% improvement was cheating or not.

For framework developers, have an internal flag in the results app to switch it into "meaningless mode" where 3 significant digits are used instead of 1. At least, that's how I used to run it when I was comparing two approaches in Surplus. I think this would maximize the really awesome parts of this benchmark -- devs using it to improve their frameworks and cross-pollinate ideas -- while minimizing the really sucky parts -- devs trying to stick stink tags on their competitors.
@adamhaile Your suggestion seems out of the scope of this issue. You should probably create another issue for that, so it is not lost here.
@ryansolid I think that the categorization of the libraries is certainly a good approach. In combination with the hover effect for showing the source code (and probably the number of lines of code), we can quickly get an idea of what a library's code would be like.
@adamhaile You're certainly right about the precision, and I really considered reducing it, but wouldn't the necessary rounding make the ranking quite unstable? (Like dominator 1.24 => 1.2, domvm 1.25 => 1.3. I guess that won't be as stable as we'd want it to be.) But if I find the time I'll try to make the compare mode easier to use. I think this could help.

@ryansolid Regarding the dirty model: I currently think this way of modeling the row state might be considered okay. If I remember right, this was good practice for some frameworks (was it ember or even jsf?) long ago, when it was hard to call functions in your templates. But I think we should mark the following implementations with issue 772 for reaching out to parent nodes in client code: … Would you agree?
I feel like I didn't quite explain my issue with the dirty model. In one sense you could say it tests something different. Per-row selection is a lot like "update every 10th row". That being said, everyone wants it to be an O(2) operation instead of O(n), and what I'm doing internally in Solid isn't that different: I'm iterating over all rows but only doing heavier computations on the ones affected. No, it's more that it changes how you write your implementation in a way that could be ambiguous. Like, what's the difference between:

```js
el.className = "danger"
row.class = "danger"
```

If it's a proxy, those could be the exact same thing. Some libraries use a technique that doesn't proxy to a change detection system but actually proxies directly to one-to-one element writes. The data model can't be re-used and only ties to those exact DOM nodes. Those libraries already have syntax to perform the operation in the template the other way, but sort of sidestep it to do it that way. Which is arguably fine, but it's ambiguous.

Once you change the selection to an implementation that is about setting individual rows, it opens this up. Take this further: once you mark the row, you can start caching the previously selected row. It is essentially the same as holding the TR like the vanillajs implementation does. To be fair, we can't really crack down on this, nor is it necessarily a bad technique, beyond the fact that it essentially doesn't scale beyond the local template. But it works fine here. Not all libraries that put selected on the model do this; it's just that the change to how you write the implementation opens the door.

Event delegation personally isn't an issue to me, but others would disagree. Mostly, it is the de facto technique for this. Libraries that don't have it built in tell you in their docs that it's an idiomatic way to solve these sorts of problems. It's been used universally since it first could be, and almost every library either does it automatically or would have its implementors do it. Implicit event delegation like you see in many frameworks often has problems with DOM standards like Shadow DOM retargeting and knowledge of composed events. Almost every VDOM library here would fail that. But a library like lit-html, used with Web Components all the time, wouldn't add event delegation in the core for that reason. Yet they aren't supposed to use the technique where it makes sense while everyone else can? Unlike what we've talked about above, it isn't avoiding the library's built-in syntax; it's just the way to solve the problem. All that being said, it isn't part of the library and takes implementation-specific code, so it definitely doesn't look as nice. I'm going to have to defer any perspective on this to others, but I know the Web Component crowd would probably say disallowing it doesn't make sense.

I guess what I'm saying is that in my opinion this is sufficient for now and we see how things go. I do think both of these might warrant further examination or refinement, but I can't completely fault implementations for doing so. Mind you, is it only those 4 libraries doing explicit event delegation? I would have expected more.
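To illustrate the proxy point, a minimal sketch of the general technique (my illustration, not any specific implementation here): a proxy maps a model write one-to-one onto an element write, so `row.class = "danger"` literally is an `el.className` assignment.

```js
// Hypothetical row proxy: property writes go straight to one DOM node.
// The "model" object cannot be reused; it is tied to this exact element.
function bindRow(el, data) {
  return new Proxy(data, {
    set(target, prop, value) {
      target[prop] = value;
      if (prop === "class") el.className = value; // one-to-one element write
      return true;
    },
  });
}

const tr = document.querySelector("tr");
const row = bindRow(tr, { class: "" });
row.class = "danger"; // identical effect to tr.className = "danger"
```

And explicit event delegation, as discussed, is the long-standing pattern of one listener on a container instead of one per row (again a generic sketch; the handler name is made up):

```js
const tbody = document.querySelector("tbody");

// One listener for the whole table; the row is resolved per event.
// Note: inside Shadow DOM, event.target is retargeted, which is part of
// why libraries like lit-html don't bake this into their core.
tbody.addEventListener("click", (e) => {
  const tr = e.target.closest("tr");
  if (!tr) return;
  selectRow(tr.dataset.id); // hypothetical handler
});
```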
Another way to look at the dirty model thing: pretty much any library could implement it this way, and it improves things to O(2) from O(n). Which means it's an easy win for every library in the first group here. Should we just go ahead and do that? It has a definite performance improvement. It's not quite as performant as direct DOM manipulation, but depending on the type of library it can be closer to that than to doing it the way most libraries do. It isn't not data-driven; it just changes the test from a delegated state change to a partial update. This especially benefits granular reactive libraries (like mine, or Sinuous). Anyone have opinions? I thought the unwritten rule to mirror real scenarios was that you wouldn't want to dirty the model. But if this is just a presumption, it's easy to fix.
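Roughly, the two shapes being compared look like this (a sketch under assumed names; `state` and `row` are illustrative, not from any implementation):

```js
const state = { selectedId: null, selectedRow: null };

// Clean model: selection is one piece of delegated state; every row's
// template re-evaluates `row.id === state.selectedId` on change — O(n).
function selectClean(row) {
  state.selectedId = row.id;
}

// Dirty model: a `selected` flag lives on each row; only the previously
// selected row and the new one update — the O(2) partial update above.
function selectDirty(row) {
  if (state.selectedRow) state.selectedRow.selected = false;
  row.selected = true;
  state.selectedRow = row;
}
```

The second version also makes it natural to cache the previous row, which is the door-opening concern raised earlier in the thread.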
I took a look at what impact it has for React and Angular. One model is immutable, the other is mutable.
yeah, the dirty model will boost a lot of libs that currently try to keep it clean. @ryansolid thanks for putting together the thorough review in #772 (comment). i'm a fan of simply adding visible feature flags to the table according to that categorization (maybe even with manual event deleg?). people should be able to make their own judgement calls as to whether the lib & impl speed comes at an acceptable purity/imperativeness trade-off for their coding style or situation. i think the rows should briefly describe the flag (as the metrics do) rather than only linking the relevant GH issue.
@StefansArya The reason I'm making the distinction with proxies is not just that it is essentially direct manipulation, since the goal of most reactive systems is to achieve that. I would say direct binding of that nature to the DOM element, like a couple of implementations do (not ScarletsFrame necessarily), makes it not transcend those exact nodes, whereas an event-based granular system is basically the same with a subscription mechanism on top. It's that, and it's testing a different thing. I was noting that this implementation basically opens the door for abuse, specifically since it's easy to cache that one row, whereas not putting it on the model doesn't.

The reason the dirty model is awkward goes beyond the fact that it can't be shared (although in a couple of implementations it cannot be); it's that the selected state, arguably a UI state, is global. Sometimes maybe that is desirable, like you showed in your example with different views of the same thing. But I mean more like: you have the same list rendered multiple times, being selected temporarily (i.e. not something you'd store in a DB) for different reasons. Do we add 3 selected states here? What if you go somewhere else in the app, should it still be selected? Does selected even make sense on the model?

@krausest I definitely would, and have, implemented it that way in real apps where it was causing performance overhead (KnockoutJS with Web Components was at times a bit of a mess). The challenge is that I, and I think many framework writers, take that test to exist in this suite because it's trying to test a different case. Putting it on the row is basically the same as a partial update of every 10th row. I would go as far as to say that if it had been considered ok to put selected on each row, the DOM hack might never have existed. The problem is, if posed with this problem in isolation as a benchmark, wouldn't you just do the fastest method? I think people have mostly been pretty respectful of the data setup part, even if it means wrapping it with their own primitives. Some of the dirty model methods don't even put selected in the initial creation, so as not to make that part of the benchmark different. But you confirmed what I suspected: it is in every library's best interest to change its implementation this way. It just turns the test into a different thing that we are already testing.

I think this comes down to which goal of the project is more important. From the perspective of common cases solved by every library, I think it's fine to dirty the model. But for the value to the framework author of solving certain classes of problems, it is a shame. I.e., I used the selection test to achieve a different solution within those constraints. I haven't tested recently, but dirtying the model might actually be faster than what I came up with. But I acknowledge this class of problem might not be real and this constraint might be imagined. I don't believe that, from my experience, but I can't necessarily convince someone else of it. I just assumed the test was included because it was a different challenge for the library, not just because it's a common thing someone would do. Most tests here test a unique thing.
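One way to picture the "selected is UI state, not model state" argument (my sketch, with made-up names): keep selection in per-view state keyed by id, so the same row data can back several lists with independent selections.

```js
// Shared data model: nothing UI-specific on the rows themselves.
const rows = [{ id: 1, label: "row 1" }, { id: 2, label: "row 2" }];

// Each view owns its own selection; the model stays clean and reusable.
const viewA = { selectedId: null };
const viewB = { selectedId: null };

viewA.selectedId = 2; // selected in view A only
// rows[1] is unchanged — no `selected` flag was added to the data,
// so navigating elsewhere or rendering the list again carries nothing over.
```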
@krausest I guess we will need different visualizations at some point or we are going to have a lot of red. From the note on the dirty model it is clear it isn't considered a bad thing. Is it strange that I'm waiting to see what the visual looks like before I go and convert a bunch of implementations over to selected-on-the-model rows? Saying this is ok makes it something I feel compelled to do to get the best numbers. Truthfully, if this is ok, I'm not sure it is even a thing anymore. People would just write their tests differently from the start. There is no real reason not to do it the more performant way.
i'm in the camp of keeping the models clean, since i usually consider the models to be data and …
Since your range has a floor at 1.00, truncation would be better than rounding, as it gives you equal-sized buckets. Aka 1.0000 to 1.0999... all go to 1.0, 1.1000 to 1.1999... go to 1.1, etc. Stability would be much better. For instance, Surplus was listed sixth in the m83 rankings but first in the m84 ones. That's made-up precision: in truth, all you can say from the data is that several frameworks tied for first. Truncating before ranking would convey that. Some frameworks that regularly score near an edge might flip between buckets, but that's much less churn than there is at present, where all the frameworks jump around depending on how close they are to their neighbors.

None of this is meant as any kind of slam on your benchmark. Not at all! One of the things that makes this benchmark awesome is all the work you've done to remove as much variability as possible. It's the best tool out there; it just unfortunately doesn't have the power to do what it's claiming, which is provide a framework-by-framework ranking. That's without even getting into the question of whether precision = accuracy, aka whether this benchmark is saying something "true" about the frameworks. That latter question is why I'd argue for a fairly conservative, single digit of precision in the rankings.
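A small sketch of the proposed scheme (names and numbers are made up, not from the results app): truncate each factor to one decimal before ranking, then shuffle within equal buckets so no fake ordering appears among ties.

```js
const results = [
  { name: "vanillajs", factor: 1.0 },
  { name: "solid", factor: 1.05 },
  { name: "domvm", factor: 1.25 },
];

// Truncation, not rounding: 1.05 -> 1.0, 1.25 -> 1.2 (never bumped up a bucket).
const truncate = (f) => Math.floor(f * 10) / 10;

// Fisher-Yates shuffle, so ties within a bucket get a random order.
function shuffle(arr) {
  for (let i = arr.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [arr[i], arr[j]] = [arr[j], arr[i]];
  }
  return arr;
}

// Group by bucket, order buckets, randomize inside each.
const buckets = new Map();
for (const r of results) {
  const b = truncate(r.factor);
  if (!buckets.has(b)) buckets.set(b, []);
  buckets.get(b).push(r);
}

const ranked = [...buckets.keys()]
  .sort((a, b) => a - b)
  .flatMap((b) => shuffle(buckets.get(b)));
```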
A preview for the new result table can be seen here: It features a few changes:
I think it's important to remember that the purpose of the benchmark is not for the sake of the frameworks themselves (or the person writing the benchmark)... it's for the sake of users who want to evaluate the performance of different frameworks. So it's pointless to have a React implementation which does a lot of direct DOM manipulation, because that's not idiomatic in React, and a user would get a false impression: they will think React is really fast, but when they write idiomatic React code in their own app it ends up being far slower, since their app isn't doing direct DOM manipulation.

So it's not really about direct DOM manipulation (or "data driven", or whatever); it's purely about idiomatic usage of the framework. And what is considered idiomatic varies from framework to framework (and is somewhat subjective), so that's really hard to do without causing a lot of drama.

So what I would say is to have two categories: idiomatic and fastest. Fastest can use basically any trick it wants (while staying within the framework), whereas idiomatic is supposed to be extremely uncontroversial standard code, without any performance tricks at all. The idiomatic code should be extremely simple and clean, the sort of code you would see in a tutorial for the framework. It's okay for the idiomatic benchmark to use direct DOM manipulation or event delegation, as long as that is considered idiomatic for that framework. And what is considered "idiomatic" should be conservative (not permissive): if there is doubt over a performance trick, then the trick shouldn't be used. If you want maximum permissiveness, you should create a "fastest" benchmark instead.

This provides useful information for the user: they can see what the typical performance of a framework is (when written in idiomatic style), and also what the potential maximum performance is (when using escape hatches and kludges). In real apps you end up needing both of those things, since real apps are first written in idiomatic style and then kludges are put in when extra speed is needed.
These implementations use direct DOM modification in the end-user code. This means DOM-updating code was written specifically tailored to these tests.
Those implementations are expected to perform very close to vanillajs (which consists entirely of manual DOM manipulations).