Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Imports referenced more than once seem to be parsed multiple times #2640

Open
royriojas opened this issue Jul 17, 2015 · 23 comments
Open

Imports referenced more than once seem to be parsed multiple times #2640

royriojas opened this issue Jul 17, 2015 · 23 comments
Labels

Comments

@royriojas
Copy link

Hi guys,

I just notice the build for a somehow complex less file was taking about 4s. I did some debug and found that the imports are parsed several times. Even when they were already previously parsed.

I have made a simple change to the import-manager as a proof of concept

royriojas@b84e7c6

with that small change the build time was reduced to 1.5s, I guess because in my use case some files: (constants, mixins and others) were included several times from different files

The change I did is very naive, and we can make sure we create a cache for each instance of the less object created.

Also, if we store this cache to a local file in the users filesystem we can reuse it for the next run and this will a build be reduced to just milliseconds. Are there any plans of enabling something like that. Would be really nice to have.

@seven-phases-max
Copy link
Member

This makes sense, though I'm afraid the implementation does not take certain edge cases into account (@import may be context depended).

Consider the following example:

// master.less
a {
    @i: 1;
    @import (multiple) "a";
}

b {
    @i: 2;
    @import (multiple) "a";
}
// a.less
@import "b@{i}";
// b1.less
div {
    color: red;
}
// b2.less
div {
    color: blue;
}

But it's not that really complicated as it may look. Taking into account every @import except (multiple) implies (once), for simplicity I guess it would be fine to keep the same simple cache code and just don't use it for (multiple).
More over, since (once) means any already imported (and thus cached ) file should not be evaluated at all, there's no need for a cache at all too - the compiler may safely skip reading/parsing a duplicated (non-(multiple)) import completely (somewhere earlier in the code, e.g. just right after resolvedFilename is ready).

@royriojas
Copy link
Author

@seven-phases-max sure I knew my implementation does not covered all the cases, but at least it was a proof of concept of something that can be used to include other kind of improvements like incremental parsing or incremental building

I was also looking to see how complicated will be to provide incremental parsing to less. I mean most of the times the compilation of the less files are quite trivial and takes very few milliseconds, but I have a couple of files that are taking at least 4s to complete the parsing.

I was thinking that caching the result of the parsing of each file and only parse it again if the file or any of its dependencies changed (imported files, inlined uris, etc).

I know that we can set a watch over the files and deps and recompile when any of them changed. I have written a module that does that and I'm aware there are other as well, but they require be in watch mode.

Would be really nice to be able to persist the cache and use it to decide which files to parse (obviously only the ones that changed).

I have done something similar with browserify persisting the cache that watchify uses. That way I don't have to use watch mode. I can just run severval times the bundly process and it only recompiles the files that were changed and it is blazing fast, well it depends on the amount of files changed of course, but for any real development from commit to commit only a few files are altered.

I know that this might be out of the scope of the main less library, but it would be nice to provide hooks so others can write the modules that do this kind of work.

What do you think?

@seven-phases-max
Copy link
Member

I doubt I can have any particular opinion until I see some examples of what kind of hooks that could be... (sure, no bias against something useful) :). Technically I can't see anything wrong if there's some hook (probably provided via same plugin API as other custom stuff is?).


Speaking of incremental compilation (not just parsing) in general, don't forget of Lazy-Evaluation. E.g. in:

@import "something";
@some-var: a;
@import "something";
@some-var: b;

the result of "something" may be different even if the file is not changed.

So while parsing is definitely something that can be easily cached like above, the compilation of an imported file actually can't in general (as each imported file is compiled in context of where it's actually imported). Not counting direct caching of the whole project result of course (e.g. just "master.less" -> "master.css"), but this would not need any special support inside the compiler itself.

@royriojas
Copy link
Author

Hi @seven-phases-max very interesting corner cases. Yeah those scenarios where a variable is mutated inside other file it's something I haven't actually suffered from. But indeed others might be relying on it.

In any case would be interesting to think about how can some hooks be provided to make something like incremental parsing, if that info is persisted the module could just reuse it. I would like to do some more research on that matter.

Something like this would really help in my case:

var opts = { 
  file: 'path/to/file',
  // this cache can be passed by reference like browserify does 
  // and can be the one previously persisted deleting the entries for the files that changed
  // between executions.
  parseCache: {} 
};

less.render(content, opts, function (err, output) {
  // do something with the css here
  var result = output.css; // the compiled result

  // Since the parseCache property was passed by reference it 
  // can be persisted. In this scenario the less module 
  // doesn't have to store the cache, just merely use it
  // and store the result of the parsing of the files in it
  //
  console.log(opts.parseCache); 
  // {
  //    'path/to/file1.less' : {
  //       root: [root object],
  //       contents: '' // do you use the original content? if not we only need the root object.
  //    }
  //    'path/to/file2.less' : {
  //       root: [root object],
  //       contents: '' // do you use the original content? if not we only need the root object.
  //    }
  // }
});

Would you consider doing something like that? exposing to a plugin might also work :)

Couple of questions...

Does the parsing of the file depends on or is affected by the imported resources? If the answer is true a tree structure of which files depend upon which others would help. But if it doesn't then with something like the above would be enough for my use case. Reducing from 4s to 1.5s is still a big win.

Just to clarify, on your initial response you do believe that it is possible to store the parsing cache without hitting much corner cases right?

Thank you for your time and patience :)

@slorber
Copy link

slorber commented Dec 3, 2015

Hey @royriojas I have been trying your solution and it gives me good results too.

I went from 4sec to 1.5sec on the initial less compilation.

The problem is that I have a gulp watch that triggers a less.render and when it does it seems the cache kicks in even for modified files.

Maybe it's because I use a gulp plugin: https://github.com/plus3network/gulp-less/blob/master/index.js

However I've found a workaround by putting this at the beginning of the render() method:

        var importsCache = require('./imports-cache');
        importsCache.clean();

I have passed from 4sec to 1.5sec for incremental compilation too with this trick.

Now I suppose it could even get better if there was a way to tell Less exactly which key to invalidate from the cache.

@seven-phases-max would you think something like that would be possible? I don't care that much if the compilation happens everytime, but if at least we could cache the file parsing it would already be a huge win for incremental compilation speed.

@slorber
Copy link

slorber commented Dec 3, 2015

Hey I have hacked something that seems to work fine.

var lessImportsCache = require("gulp-less/node_modules/less/lib/less/imports-cache");

        gulp.watch('./src/**/*.less').on('change', function(file) {
            gutil.log(gutil.colors.yellow('LESS file change:' + ' (' + file.path + ')'));
            lessImportsCache.remove(file.path);
            gulp.run('styles');
        });

(I removed the code that cleans the cache at each render)

The initial compilation is still 1.5sec, and the next ones are like 1sec (instead of 4sec in all cases).

@seven-phases-max do you think my solution is safe in regard to your corner cases?

I don't think we use any of these corner cases as our less code is a LOT of mixins composing each others, so anyway I'll probably use that solution in dev env and use a regular less for production.

@seven-phases-max
Copy link
Member

@slorber At first glance, yes. As I mentioned above, caching parsing result is totally safe unless your project includes some tricky (multiple) imports.

@slorber
Copy link

slorber commented Dec 3, 2015

@seven-phases-max I've made a PR.
I think an option to enable this performance improvements is worth to be proposed to the users, even if it may be unsafe for certain codebases

@royriojas
Copy link
Author

@slorber, @seven-phases-max

I made a cli tool called simpless based on my branch. and I did handle the watch case there as well in a pretty much the same way @slorber did with gulp.

https://github.com/royriojas/simpless/blob/master/index.js#L211

I was thinking that if I can persist the cache to disk, like I did for persistify, esbeautifier and eslint

then we could speed up tremendously the less compilation times. It can go from seconds to just milliseconds.

I remember I did some proof of concept of that, but in the end I find something that prevented the serialization. Don't really remember what was the issue, but maybe a circular reference or something.

Basically if less provides a cache, even if only for parsed files, if that cache can be persisted, then incremental builds will be easier to do and watch modes will be super fast.

What do you think?

@slorber
Copy link

slorber commented Dec 3, 2015

@royriojas You are using the command line. Don't use the command line and your problem will be solved.

I don't have this problem becauseI use gulp, and my node process stays alive ,and the cache remains populated in memory over time. This means I don't need the cache to persist on the disk, I just need to use gulp.watch. Only the first less render starts with an empty cache and the next ones will have a populated cache (if you use the withManualImportsCachingInvalidation option at least)

However in my experience this did not reduce the compilation time so much, probably because I use a LOT of mixins and the IO and parsing time is maybe not the only bottleneck in my app.

@matthew-dean
Copy link
Member

Basically if less provides a cache, even if only for parsed files, if that cache can be persisted, then incremental builds will be easier to do and watch modes will be super fast.

I'm interested in this as well, as I have a use case for executing numerous incremental builds by only changing one variable at a time.

@royriojas
Copy link
Author

@slorber, simpless provides a watch mode as well, and I would argue the command line is way more powerful than gulp, grunt or others, but that's merely my opinion... :)

There are other cases where you don't have the ability of run the watch mode, for example:

  1. at the beginning of your day
  2. on ci server
  3. you have multiple targets and watching all of them is convoluted.

@slorber
Copy link

slorber commented Dec 3, 2015

I understand your point but I think using such a persisted cache could also be very dangerous as there are many ways that persisted cache can be updated.

For example you are on master and populate that cache. Then you close your computer and the next morning you checkout dev and launch your command-line tool. If you load the cache from disk at this point you will get the cached files from master.

It is simpler to trash that cache everytime you stop the watch process, or you need some system to check cache consistency when hydrating the cache (like using md5 checksums on files...).

@royriojas
Copy link
Author

@slorber I have implemented such a module already file-entry-cache and does exactly what you mentioned.

It is what is used inside persistify which persists the cache from browserify to disk. It is safe to switch branches and everything, since the module is smart enough to detect the changed files and remove them from the cache

@slorber
Copy link

slorber commented Dec 4, 2015

@royriojas that's nice if that works fine for you.

Your tools are probably nice however many people are already using Grunt or Gulp or something else instead of the command line, for good or historical reasons. In my case I'm clearly not interested changing my whole build setting just to lower a little bit more the less compilation.

The more naive approach we both used divices my compilation time by 3 and now I compile incrementally in 1sec. In my case it's not worth looking further and even if simpless perform better I still think this optimization should be available to all less users.

@slorber
Copy link

slorber commented Dec 4, 2015

Also note that when using a persistent cache and the command line, you have to read the files from the disk everytime the command run (the os can cache too however), while in my case the parsed files remain in memory.

If I run less render twice:

  • first with an empty cache, i compile in 1.5sec
  • second with a full cache (means no IO nor parsing), I compile in 1sec

So, in my case I guess the IO + parsing for all my less files takes something like 0.5sec.

The remaining 1sec is then probably only compilation of parsed files and IO write of output CSS, which seems kind of harder to optimize (?)

If I were to use your solution, when running the command line, my cache would be full so I can still expect 1sec of compile + write CSS. However there's still the additional cost of reading from cache, so it would actually be worse than with gulp and keeping the cache in-memory over time.

I don't have measures to know how much is reading file and how much is parsing.

If your tool simpless keep parsed files in memory with a watch mode then it's fine and will lead to similar results than I have with gulp but I don't expect anything better unless you explain me why it would be better.

I can't give a try to simpless because it does not seem to support things I need in less like some plugins and other options.

@royriojas
Copy link
Author

I guess there is a misunderstanding here. I don't want to promote
'simpless' over 'less' actually I would like less to provide the cache
option in the module. That way everyone gets the benefit. Even you using
gulp or grunt or others using different tools like me. If I wish to have a
feature that all tools provide is incremental building.

That said I'm open to help to implement this feature in less. Maybe
implementing a custom file manager will be a better way to do it?

On Thu, Dec 3, 2015, 7:33 PM Sébastien Lorber notifications@github.com
wrote:

Also note that when using a persistent cache and the command line, you
have to read the files from the disk everytime the command run (the os can
cache too however), while in my case the parsed files remain in memory.

If I run less render twice:

  • first with an empty cache, i compile in 1.5sec
  • second with a full cache (means no IO nor parsing), I compile in 1sec

So, in my case I guess the IO + parsing for all my less files takes
something like 0.5sec.

The remaining 1sec is then probably only compilation of parsed files and
IO write of output CSS, which seems kind of harder to optimize (?)

If I were to use your solution, when running the command line, my cache
would be full so I can still expect 1sec of compile + write CSS. However
there's still the additional cost of reading from cache, so it would
actually be worse than with gulp and keeping the cache in-memory over time.

I don't have measures to know how much is reading file and how much is
parsing.

If your tool simpless keep parsed files in memory with a watch mode then
it's fine and will lead to similar results than I have with gulp but I
don't expect anything better unless you explain me why it would be better.

I can't give a try to simpless because it does not seem to support things
I need in less like some plugins and other options.


Reply to this email directly or view it on GitHub
#2640 (comment).

@slorber
Copy link

slorber commented Dec 4, 2015

Ohh ok so if you can do that in less directly it would be really great and I'll be happy to use that :)

Does it mean less will have to implement some kind of watch mode? because currently I don't see any and it would be a great progress (like browserify or webpack already do)

@matthew-dean
Copy link
Member

I was revisiting this thread today when it was linked to, and I think there are some off-topic items and possibly separate requests attached.

In the issue, it's stated that Less parses the same import file multiple times during a single render. If that is the case, that is a definitive bug. Yes, imports can be evaluated in many different ways depending on context. But static file A, once parsed, should never be parsed again during that single render. If it is, that's a bug. A file can only have unique content once loaded, and therefore only needs one parsing tree.

Now, the thing I'm not sure of is if there is something in Less's evaluation stage that depends on unique parses. I doubt it, but it would need testing. Individual imports may get "attached" to the parse tree, so I'm not sure.

The separate issue is about allowing the parse tree to persist for incremental rendering, or being able to export or cache individual parsed imports. That should probably be filed as a new issue, I'm thinking.

@royriojas
Copy link
Author

@matthew-dean I agree with you. This issue should probably be closed, and new issues should be created based on the discussions on this thread.

@matthew-dean
Copy link
Member

Well, I'm going to mark it as a possible bug, but yes the rest can be added as a separate issue.

@slorber
Copy link

slorber commented Nov 24, 2016

For info, I'm running my app with a custom less fork and implementing caching lead to significant better performances on incremental compilations

https://github.com/slorber/less.js

It's certainly far from perfect implementation as I don't know enough less internals but it works for me.

As far as I know, to this day, the only major problems I had was due to running multiple less compilations in parallel in my gulp build, and having a bad cache cleanup system in my build that would sometimes remove an entry from the cache that is currently in use. If you don't run multiple compilations in parallel I had no problem. One solution would be that less creates a cache per compilation/entrypoint maybe

I'm not really able to give better feedback, it works for me and I'm not touching it anymore :p

@teazean
Copy link

teazean commented Apr 13, 2017

yep, I need this feature, too。

@matthew-dean matthew-dean changed the title cache the parsing of an import if it was already seem before Imports referenced more than once seem to be parsed multiple times Nov 27, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

6 participants
@royriojas @matthew-dean @slorber @teazean @seven-phases-max and others