Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use global regexes (cached) for small speed improvement #2501

Closed
lukeapage opened this issue Mar 12, 2015 · 57 comments
Closed

use global regexes (cached) for small speed improvement #2501

lukeapage opened this issue Mar 12, 2015 · 57 comments

Comments

@lukeapage
Copy link
Member

I wonder if this will help with speed given how much we use regex's

http://jsperf.com/regex-caching-lp

@seven-phases-max
Copy link
Member

(I added a few browsers there) Curiously it looks like it's dramatically faster only in Chrome :). I guess it's still could be a neat piece of code to use for those regexps that execute the most (e.g. vars, properties till : etc). But it does not look like a solution for those #2339 things.

@rjgotten
Copy link
Contributor

Curiosly it looks like it's dramatically faster only in Chrome :)

V8 likely implements regex literals strictly per ECMAScript 5, which states that regex literals are compiled each time the literal is evaluated, as if it had been declared using the new RegExp(pattern, flags) syntax.

The other engines are probably still implementing the ECMAScript 3 behavior, which states that regex literals are compiled once, before the containing program or function is compiled and executed, and any use of the corresponding literal is treated as a reference to the same compiled object.

Or they're using a hybrid form, where the regex state machine is compiled and created only once, but fresh instances of new RegExp are created around it each time the literal is encountered.

@jonschlinkert
Copy link
Contributor

V8 likely implements regex literals strictly per ECMAScript 5, which states that regex literals are compiled each time the literal is evaluated

You should see what caching can do when the regex is dynamically created from a string based on options or user input etc. For example:

var cache = {};
function createRegex(str) {
  return cache[str] || (cache[str] = new RegExp('foo' + str + 'bar'));
}

I created this not long ago. Really, the code above is all that's needed if the only input is a string, still it's pretty interesting how much of a difference caching can make

@matthew-dean
Copy link
Member

If you look at parsers like acorn (http://marijnhaverbeke.nl/acorn/), and a few others that have reputation for speed, you'll notice that there are barely any regexes at all, and instead use sequences of OR || statements, and often check individual characters against numerical character codes. The code ends up being more verbose, but from what I've read, JS engines can optimize the crap out of very linear logic statements and static functions. Regexes are more of a mixed bag, and things like backtracking in regex can be expensive.

In Less, we often have a series of OR checks for regexes, and those regexes themselves have backtracking OR switches. Pragmatically, it's easy to maintain, but that pattern is more likely to negatively impact performance.

I've been thinking about how we can very gradually improve speed of individual node matching. I was reading this - https://github.com/felixge/faster-than-c (a very good read about making JS faster than C), and he advocates benchmark-driven development. I was wondering if we could create some kind of test harness, like our own kind of JSperf, that has sequences of .less input, and tests the performance of matching to a single matching function in the Less library (and verifies that the output is as expected). Then a dev could tweak that function, run more tests, save / graph the results.

That way we don't have to KNOW what would be faster so much as try something and see if it's faster. If it's faster on all supported platforms, make a PR for that particular function, badda bing, badda boom.

I'm not sure the best way to go about that, but it would be great to be able to quickly test all these great ideas.

@matthew-dean
Copy link
Member

Also, would there be any value at looking at JS-based CSS parsers like https://github.com/css/gonzales to see if there are ideas we could port? Surely there would be relevant pattern matching from something like that. I know each of our entities gets a lot more complex, but just an idea.

@rjgotten
Copy link
Contributor

If you look at parsers like acorn (http://marijnhaverbeke.nl/acorn/), and a few others that have reputation for speed, you'll notice that there are barely any regexes at all, and instead use sequences of OR || statements, and often check individual characters against numerical character codes.

That type of parser formally separates itself into a lexer/tokenizer and a LALR(k) parser. The power and speed of that approach comes from employing generalized regular expression matches on a very small scale (mostly for recognizing identifiers, filtering whitespace, etc.) and limiting yourself to a fixed set of hand-coded lookaheads by hand-writing a parser based on accepting the lexer's tokens via a set of production rules formulated as a BNF grammar.

You could do that with Less as well, but you'd need to build on a compliant CSS parser first, which can tokenize and parse CSS selectors, which (if I understood correctly from the original discussion) is really where regex lookahead starts to hurt a lot.

I remember CSS 2.1 having a BNF grammar for the language in the W3C recommendation. I assume CSS 3 is no different and does the same. You could still short-circuit a lot of the grammar, since you don't need to know every little detail of it.

@matthew-dean
Copy link
Member

@rjgotten Is this what you meant? http://dev.w3.org/csswg/css-syntax/

They really break it down. Just came across that recently.

@lukeapage
Copy link
Member Author

lukeapage commented Mar 12, 2015 via email

@matthew-dean
Copy link
Member

I've read that parser generators can leave you with a different set of problems (advocating hand-writing parsing), but I don't really know one way or the other. I think as long as we can benchmark the results, however we got there is all good.

@lukeapage
Copy link
Member Author

Can you run this through on a few browsers?

http://jsperf.com/regex-string-size

I'd expect if size of the string is less's performance problem (as shown by chunkInput) I'd expect to see the speed decreasing with size of the string. Any idea what wrong with the test if that isn't happen? Got to run for a flight or I'd have done more testing...

@matthew-dean
Copy link
Member

On my setup, Safari performed best, with all the non-cached operations happening at roughly the same rate, and the cached operations at a different rate. Chrome 41 shows performance issues as the string size increases. Chrome 42 does not show as much issue. Firefox trails way in last with every operation performing the same rate, cached or not.

I think we can expect that every JS engine will optimize each regex differently, and even the same regex will perform differently with each input. So, nothing will beat testing.

I did another arbitrary test here (http://jsperf.com/regex-string-size/2) , just comparing OR in regex vs. testing two regexes and matching one or the other. I also changed up where the backtracking happens in the regex. And the browsers don't all change performance at the same rate. In fact, Firefox does better at an OR between two regexes than one of the regexes which backtracks in a different place, while Chrome and Safari always do much better at a single regex.

At the very least, the data shows caching is generally better than not caching, except Firefox. But then, it's certainly not worse.

I think we can start with what @jonschlinkert said. For any place in the code where we're testing against multiple regexes, it should still be faster to dynamically create a regex and test against a single one. And even faster if the regex isn't being redefined every time the function is visited. (Implement caching). And then run benchmarks on changed functions.

@lukeapage
Copy link
Member Author

lukeapage commented Mar 13, 2015 via email

@seven-phases-max
Copy link
Member

I'd expect if size of the string is less's performance problem (as shown by chunkInput)

I think it's not just the size of the input string but the "the size + some lookahead regexs". I.e. only certain re patterns timings may increase exponentially as the input size grow and chunkInput just makes those to bail out earlier.
So to get the real picture a test needs to involve one of those "bad" regexs (usually those with ?----+/*) and the input string also needs to be not not-ambiguous for that particular pattern. Simple patterns (like this .mixin one above) find their match in the input immediately hence the input size does not affect too much.

P.S. (a bit offtopic) When making the highlighter I found this tool to be quite useful for debugging of such "bad" patterns (it really helps to detect some problematic parts earlier though you still need to invent/predict those ambiguous inputs that may put the regex down yourself (so it's still kinda tedious thing)).

@matthew-dean
Copy link
Member

you still need to invent/predict those ambiguous inputs that may put the regex down yourself (so it's still kinda tedious thing)

I agree, it's a ton of work to try to predict what will work. That's why it would be great if we could staple a benchmarker integrated to the Less lib directly (only loaded for tests / development). Change code, and if the results are accurate, and it's faster, keep code. If we could turn on the benchmarks just for function X and run all the less tests (which would throw a bunch of .less patterns and file sizes, yes?), it seems like it would give a more "real-world" picture.

@matthew-dean
Copy link
Member

@seven-phases-max The highlighter and other tools are cool too, definitely should check those out. Let's hit this from all sides. Maybe we could make "performance" the next major feature (other than features in progress)?

@lukeapage
Copy link
Member Author

lukeapage commented Mar 14, 2015 via email

@rjgotten
Copy link
Contributor

Im most interested in fixing the regexs making parsing slow in non-v8 with large files
(...)
Im not really interested in performance that much - i think code readability is more important, but at the moment we need to fix non v8 browsers.

Well, if you want to do something about both: start by separating your parser into a real lexer/tokenizer and a token-based parser. Right now, you still have a lot of tokenization happening inside the actual parsing routines. And if your tokenization is happening mostly based on regexes, that's really like extending an invitation to all the degenerate backtracking cases to come and join the party.

Performance-wise, the good thing about having a separate lexer is that you generally only need regexes to recognize a small pool of token types from text and as tokens almost exclusively consist of one word, it means you have very little chance for degenerate backtracking in a long chain to recognize a token.

The parser no longer receives a stream of text, but a stream of strongly-typed tokens. Tokens are always the same (even if they can be used in different roles). The 'real' backtracking that disambiguates language constructs will then happen inside the parser only and won't have to involve complex backtracking in regexes anymore.

The maintainance/readability upshot of this is that you get named constants for your tokens that are much easier to work with and your parser will contain only the isolated parser logic, being much easier to follow than also receiving the added cognitive load of dealing with the inlined regexes and text stream processing.

Also, if you formally separate lexer and parser, you can more easily handle another current issue, namely the one regarding comment parsing.

Structured tokens means you can attach debug information like line and column number and a sample of surrounding input directly to each token. That enables you to keep that process intact while choosing to omit comments whenever you 'fetch a next token' from the lexer. I.e. you pretty much get the whole 'selectively allow comments' idea for free.

@lukeapage
Copy link
Member Author

yes, it would be nice, but its a big change that I don't really have time to do - I inherited the existing parser and never felt like I had enough time to re-write it. It is a different approach from the current "chunker", which just aims to split the input up into things about 200bytes long, guessing boundaries, but it gets unstuck by things like

url(http://) }
// vs a comment

I started with something simple - replacing regex's that don't have any logic in them. they aren't the ones causing the problem, but still.. f8de5bc

@lukeapage
Copy link
Member Author

omit comments whenever you 'fetch a next token' from the lexer. I.e. you pretty much get the whole 'selectively allow comments' idea for free.

We can do that already - that isn't the problem with comments, the comments is how they are represented on the ast is all wrong.

@lukeapage
Copy link
Member Author

A way towards a lexer might be to start using save more and slowly convert regex's to only use str and char. At that point a lexer can easily be introduced and the str and char's swapped out for token comparisons.

@rjgotten
Copy link
Contributor

that isn't the problem with comments

Yeah; I kind of glossed over the fact that comments are the one thing that's already handled at the lexer level (i.e. parser-input). My bad.

@lukeapage lukeapage changed the title Speed improvement use global regexes (cached) for small speed improvement Mar 15, 2015
@matthew-dean
Copy link
Member

But thats in node / v8... Im most interested in fixing the regexs making
parsing slow in non-v8 with large files.. the hardest part will be finding
which ones that is.

That probably means the benchmarking piece may need to be browser-based somehow.

But, I would also say that it's reasonable that if we address performance bottlenecks in V8, it's very likely to solve things on other engines. For instance, if there are pieces which cause exponential slowdown, and V8 is starting with a small amount of slowdown, then just addressing that small piece should speed up other engines by orders of magnitude. Not in every case, but I don't really see this as optimizing for a particular JS engine (although that's nice when that happens in the case of V8), but just figuring out more efficient algorithms.

So, I'm interested in those other engines too; I just don't think they need to be specifically targeted in order for performance to increase.

@matthew-dean
Copy link
Member

I just came across this relevant quote by programmer Jaime Zawinksi:

Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.

lol

@seven-phases-max
Copy link
Member

Now they have two problems.

Well, the problem is not in regular expressions but in how one uses them. As a C++ programmer who has never used regexs before ~2012 or so I can assure you ;). Things like "OK, I just replace this regex with a bunch of my spagetti ifs" do not work too. One has to write a pretty descent tokenzer/lexer/parser/whatever abstraction layer/kernel to keep things going smooth.

@matthew-dean
Copy link
Member

I know, no principle in programming is absolute. Just funny because it contains some truth.

@jonschlinkert
Copy link
Contributor

Now they have two problems.

haha, awesome

@matthew-dean
Copy link
Member

Just, FYI, @lukeapage (et al) I decided to follow up with this:

I looked briefly at switching to a parser generator - basically its a
rewrite. Id love to use a parser generator to make it all generated.

I've been looking at a few libraries and reading on different parser generator methods and algorithms. However, I don't want to clutter this thread here until I find a good candidate (which I may have). Just wanted to mention it on the off chance anyone else was doing that work. Once I find something that MAY work, then I may need to try to generate the parser in order to compare performance.

@matthew-dean
Copy link
Member

Also, 390ms would still be a win if it was 150ms instead, so I'm going to look at parser generators anyway when I have time. I'll share if/when I find something interesting.

@lukeapage
Copy link
Member Author

lukeapage commented Mar 26, 2015 via email

@matthew-dean
Copy link
Member

@lukeapage Were you asking @rjgotten? I was referring to his measurements.

@lukeapage
Copy link
Member Author

lukeapage commented Mar 26, 2015 via email

@rjgotten
Copy link
Contributor

Eval tree? What happens in that stage?

The bulk of the compiler's work. It's the phase where variables are resolved, mixins are called, etc.
which eventually should produce a reduced tree of nested plain rulesets with property rules.

(Afaict In the post-visit phase the nested selectors are resolved into their final complete form. And then the complete thing gets handed off to the CSS generation phase, which is simply where genCSS is called recursively on the final tree to build up the final CSS output.)

were those measurements from master or last release?

From master.

@matthew-dean
Copy link
Member

Just curious, what's the reason that things are converted to plain but nested rulesets and then again reduced to CSS rulesets? Couldn't the intermediary step be skipped? I mean, I know they're sequential, but just wondering if there would be a way to reduce object creation. After all, there's still the matter of frequent huge GC dumps, although that may have been mitigated with chunkInput turned back on.

@seven-phases-max
Copy link
Member

It's not that they are specifically converted to plain rulesets. They just become plain because their internal stuff (e.g. vars and mixins) is "expanded". So, yes, it's just one step ("flatten rulesets") after another ("expand ruleset internals")...

@rjgotten
Copy link
Contributor

@matthew-dean
Post-processing visitors also handle stuff that needs the globally final state where everything has been 'expanded' / 'resolved' / etc. , but not yet unnested into plain CSS rulesets. E.g. the prop +: value merging syntax and, if I'm reading it correctly, @<directive> bubbling.


Regarding reduction of object creation; I think @lukeapage hit the nail on the head with the scopes involved in mixin call evaluation.

I have a rather complex system in a current project that maps a set of names to character codes, for use with font-based icons. The supports leaving a character code undefined for one named icon, at which point it should take the code of the previous icon, incremented by one.

Very nice for custom icon font definitions as you can keep a neat and tidy ordering without having to remap codes the whole time. However; it never managed to scale well and gave me ridiculously long compilation times when I got into larger icon sets.

Anyway, it has quite a bit of branching-logic inside a tight loop that has to iterate over the list-of-lists I used to model the dictionary / map, which means mixins ... with guard conditions ... and overloaded signature matching ... on each iteration.

As a matter of experiment, today I removed the feature where undefined codes are automatically filled based on increments and I wrote a few dedicated custom functions to directly perform key lookups on a list-of-list (I love my new @plugin; makes this so--- easy to integrate ^_^) without having to write mixin-based iteration.

Doing away with the mixin-based branching-logic and iteration this way shaved 7 seconds off of the eval tree phase of the compilation of that project's file set and that was with a fairly limited set of just 50 icons.


As an aside:
I have since restored the auto-increment feature. Managed to write a function that creates a lookup structure out of a detached ruleset's properties. That gave me the option of writing an auto-incrementing loop based on mixins within the detached ruleset that is eventually converted into a key->value map. It's quite nice. If my employer allows, I may be able to arrange the complete set of map functions being made available publicly.

@seven-phases-max
Copy link
Member

I think @lukeapage hit the nail on the head with the scopes involved in mixin call evaluation.

It's the problem of the language. I don't think @lukeapage or anyone else can do too much about this, i.e. of course there can be some performance improvement tricks here and there, but nothing can cancel the whole picture: all this stuff just has to be re-evaluated in each scope because every scope may have different variables and mixins that may change the behaviour of another mixin call/match.

(Meanwhile there's way to "iterate" over a list w/o any loops at all... It's just has to be different list not one of a variable ;) (of course since this method is also mixin-based it probably can't be as fast as dedicated JS-function).

@rjgotten
Copy link
Contributor

of course since this method is also mixin-based it probably can't be as fast as dedicated JS-function

I guess the takeaway here is that if you need to do heavy-lifting stuff and need to do it fast, you really need the JS escape hatch. ;-)

@seven-phases-max
Copy link
Member

WebGL to go! (who needs these stone-age text based icon descriptions when you can iterate and render right in GPU :)

@SomMeri
Copy link
Member

SomMeri commented Apr 3, 2015

I found alot of recalculations - e.g. generating a scope for each mixin call seemed to take a large part of the calculation, even when it wasn't used.

@lukeapage Would lazy scope generation be an option? The disadvantage is that the code may end up more complicated. Where is in the code is the scope generated?

@SomMeri
Copy link
Member

SomMeri commented Apr 3, 2015

About parser tokenization/generation: less4j is written that way and I am not sure benefits are worth complete rewrite. Adding features like detached rulests is somewhat faster I think, but you are loosing complete control you have now.

Css/less does not break down into nice strongly typed tokens, so I have to rely on semantic predicates and "tricks" more then I like. Less.js knows the context, the code it uses to parse selector is completely unrelated to code used for property names or anything else. Less4j on the other hand has to use the same tokenizer for everything which leads to very small almost weakly typed tokens.

Similar problem happen with whitespaces. They are ignorable almost everywhere except some places. It was solvable of course, but somewhat easier in hand made parser. A lot of what is in css is like that - many general rules have at least some exceptions.

If I would be doing third compiler, I would likely go with generation again (depending on grammar), but it is pretty tight. It can be that current less.js architecture has downsides I do not know about, I did not looked that much into parser part. What I seen looked straightforward.

@rjgotten
Copy link
Contributor

rjgotten commented Apr 5, 2015

@SomMeri

Less.js knows the context, the code it uses to parse selector is completely unrelated to code used for property names or anything else. Less4j on the other hand has to use the same tokenizer for everything which leads to very small almost weakly typed tokens.

But that's what a tokenizer (or rather: lexer) is and does: it generates tokens. It's then up to a parser to decide how particular sequences of tokens are allowed to fit together and what meaning is attributed to those sequences.

You may think those tokens 'weakly typed', but infact they are very specifically typed: identifier, opening-brace, closing-brace, etc. The wider and weaker you make that definition, the harder it becomes to manage token extraction correctly without complicating the lexer into a full-blown parser with need for complex backtracking etc. implemented directly on the character stream with regexes.

Tokenization is meant as a pre-processing step that 'pre-shrinks' the input set, if you will and prevents you from having to run painfully large regular expressions over huge input strings.

If fully automated generation isn't your thing; you could always write a lexer/parser separation by hand. Full control and all the maintainance (and theoretical performance) benefits.

@SomMeri
Copy link
Member

SomMeri commented Apr 7, 2015

Some grammars are better fit for that than others. Parsing java (minus error handling) or python this way is likely to be easy and straightforward - they were designed in clean way that makes it easy. Css and thus less have features that makes it less perfect fit. The thing about css/less is that it sometimes resembles multiple grammars merged into one file. I seriously considered having special grammar for selectors where I would parse them only on superficial blob and then run through secondary parser.

Take identifiers - one way or the other they are pushed into parser instead of lexer. Less does not have keywords, so all could be ending tag in selector, could be all as config for extend or could be just an identifier. Same with variable names and at-rules, same with when etc.

Selector inside extend needs to be parsed almost as normal one, except that all in the end. Nothing unsolvable, but if I would be able to re-tokenize part of grammar that would be awesome. ANTLR actually supports something like that - you can switch tokenizers on the fly, but not in a way needed here.

In java, you can throw away most whitespaces. In python, whitespaces generarely matters. In less, they dont except few exceptional places - so you end up sacrificing performance or doing tricks on those.

As I told, it is nothing unsolvable. Les4j is very close to less.js and has everything I am aware of minus options and new features. But, it does slow you down every time something like that pops up and it forces me to do maintainability or performance trade offs I would prefer to avoid.

Side-note: it happened to me twice that I googled a problem and answer on antlr mailing list was: "best to redesign your grammar". They were right, but we can not really do that.

@SomMeri
Copy link
Member

SomMeri commented Apr 7, 2015

The other thing to be considered is error handling and error reporting. I have read about teams switching from generator to hand-made parser for that reason - it turned out easier that way.

If you really want to get it right with parser, your grammar can become big/convoluted due to error handling rules. I think that ANTLR 4 was trying to make it easier, but I did not looked at it yet. (ANTLR 4 should be also faster and more programmer friendly.)

It is something to be considered before switch, because you do not want to end up switching to other generator or hand-made again because of that.

@matthew-dean
Copy link
Member

One problem with ANTLR is that it seems to make extremely verbose parsers. The grammar definitions seem fairly complicated to design / maintain. An auto-generated parser may be the way, but I'm not sure ANTLR is that solution IMO.

@matthew-dean
Copy link
Member

What about something like this: https://github.com/Hardmath123/nearley

@rjgotten
Copy link
Contributor

Take identifiers - one way or the other they are pushed into parser instead of lexer. Less does not have keywords, so all could be ending tag in selector, could be all as config for extend or could be just an identifier. Same with variable names and at-rules, same with when etc.

You're not thinking small enough and as a result you're still making the tokenizer too powerful.

For at-rules and variables the @ itself would be an AT token output by the lexer. The parser would read that token and (depending on its current parsing context) would open an at-directive, variable declaration, variable evaluation, etc. parsing context.

When the parser asks the lexer for the next token it does so with explicit instruction to not allow whitespace inbetween. The parser then expects the lexer to respond with an IDENT token. The parser then decides what to make of that token based on its string value and will compose the full at-rule name, variable name, etc. accordingly.

If the lexer doesn't reply with an IDENT token or the parser can't make anything of the value of the IDENT token, then the parser pops its current parsing context and returns to the previous one to retry the next valid production rule.

If you work with generalized tokens this way, then when the parsing context is popped you can also shift all previously lexed tokens back onto the front of the token 'stream' as the tokens themselves are free of context. This means you don't waste cycles on relexing. You only waste a bit of memory in keeping processed tokens around in a memory buffer until you can be 100% sure they can be discarded.

@SomMeri
Copy link
Member

SomMeri commented Apr 23, 2015

The parser then decides what to make of that token based on its string value and will compose the full at-rule name, variable name, etc. accordingly.

Doesn't that require semantic predicate e.g. js/java/whatever condition? They were supposed to be used as last resort, since slow down algorithm and can mess up with prediction (from what I remember plus probably depends on what generator you use).

I was doing it that semantic predicates way in less4j (minus last part - generator control that not me) and just finished removing them - moving as much as possible into "pure" grammar.

@seven-phases-max
Copy link
Member

A remark not related to the original @lukeapage goal, but to note something important before you go too far along that "parser refactoring" road (as of the recent posts here). Before there were already rumours that for lessc performance in node (and thus less.js performance in a V8 browsers) it's not the parsing stage that is the real bottle-neck but the css generation stage (when compiling some "average" less sources).
Here's a lessc compiling bootstrap.less (just for example) V8-profiling snapshot (you can open it in either Blink browser via "developer tools > profiles > Load") for the documental reference. As you can see it's "parsing"/"toCSS" -> ~30%/70% or even worse (the profiling may drift in +/-10% between several runs because of some pretty random GC or so).
(The timeline of the profiling data is sort of virtual i.e. 1 sec is not real 1 second there).

In other words if lessc performance is concerned, any improvements are to begin with the CSS generation stage ("ruleset eval", "mixin-lookup" in particular + related stuff, then ruleset and rule visitors etc.) and not with the parser.

@rjgotten
Copy link
Contributor

rjgotten commented May 8, 2015

I'm going to second what @seven-phases-max is stating; the results I can get from a few of my own (quite large) less codebases also indicate that the bottleneck on anything of substantial size and complexity is in the ruleset/mixin eval phase.

I notice there's a lot of array cloning (via slice) and concatenation going on. My guess is you could fix up performance quite a bit by finding a way to eliminate that and using a more proper push/pop stack approach.

And ofcourse; type stability, type stability, type stability. Can't really state that one enough.

@matthew-dean
Copy link
Member

Well, let's attack it from all sides, shall we? But in order to do that, we need to be able to quantify improvements: that is, being able to hook in timers (in a dev process) for the time spent in each function.

I think we could do something like have a dev function that monkey patches selective Less functions. That way you wouldn't have any code left in production builds. It could even be a Less plugin. Basically, you override each function with a function of the same name that takes a Performance.now() measurement before and after the original function call, and increment the total count as you go. Output the totals per function at the end, easy peasy.

Once we have that, then we can quantify improvements with compiling different Less stylesheets, at least how it runs in current versions of V8. Otherwise we'd just be guessing if replacing an array slice or concat makes any measurable difference. I know we have total time in our benchmarks, but it would be nice to have function-level benchmarking, so that retooling a specific function is easier to measure.

@rjgotten
Copy link
Contributor

Basically, you override each function with a function of the same name that takes a Performance.now() measurement before and after the original function call, and increment the total count as you go. Output the totals per function at the end, easy peasy.

That's more or less what the JS profiler in both Firefox and Chrome's developer tools are already designed to do.

(Sidenote: Firefox's is also pretty nice in that it supports an 'inverted call tree' setting. It makes it easy to identify the biggest spenders at the lower levels of the call stacks without having to drill down through a huge graph manually.)

There's a NodeJS package that creates bindings to the V8 profiler that's active in Node 0.11, which I think is exactly the kind of telemetry data you're looking for:

https://www.npmjs.com/package/v8-profiler

@matthew-dean
Copy link
Member

@rjgotten Oh nice! I'll take a look.

@stale
Copy link

stale bot commented Nov 14, 2017

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants