Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make browserify recognize its own bundles #1151

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

jmm
Copy link
Collaborator

@jmm jmm commented Mar 6, 2015

Summary

  • It's a pain in the neck to use a browserify bundle in a subsequent bundling operation.
  • Wouldn't it be nice if you could just require() a browserify bundle in another bundle?
  • Pragma: prepend on output, detect on input
  • Demo

I think this is pretty light-weight and useful, but it seems like the kind of thing I should get some thumbs up on before going ahead with.

This involves changes to browserify and several dependencies, so the branch this points to is just a placeholder. The real commits are here:

Tests pass except for one on browserify that I think will be remedied easily enough.

My anecdotal testing suggests this would have a negligible impact on performance, but I welcome suggestions on how it should be tested.

Details

Currently it's kind of a pain in the neck to bundle a browserify bundle A in a new browserify bundle B, e.g.:

browserify('./entry.js')

entry.js

// bundle-a is a browserify bundle
require('./bundle-a');

If you just do that it's highly prone to failure. There are at least 2 failure modes:

  1. Bundle operation B errors out because it attempts to resolve paths in require() calls in bundle A and doesn't find them. It's unnecessary and undesirable to even do this because bundle A already contains all of the files referenced by require() calls that it needs to contain.
  2. Bundle operation B resolves paths from require() calls in bundle A to files that don't have anything to do with anything and includes them in the bundle even though they're not referenced by anything.

I've seen confusion about how to consume a bundle in a subsequent bundling operation a number of times here and on stackoverflow1. Generally it can be cured by:

  1. Setting noParse for bundle A in the bundle B operation; or
  2. Running the standalone bundle through derequire; or
  3. Minifying the bundle

But each of those approaches requires that you and / or consumers of the module keep track of which modules are browserify bundles and / or modify them or do extra work to use them, instead of being able to just require them like any other module.

And to boot, noParse currently suffers from path mis-matching issues that are affecting a lot of things in browserify right now (e.g. it's sensitive to including / omitting file extension for one thing).

And a lot of users don't even know about doing those things. Neither the noParse documentation nor the notes about derequire even mention that use case. Also, standalone bundles used to be processed through derequire on creation and that was discontinued. I don't think it's obvious, especially to users used to the former behavior, that it won't work to:

just require() a browserify bundle in another bundle.

Wouldn't it be nice if you could do that, without breaking anything else? I think there might be a way. This PR is a proof of concept of a method for doing that.

In a nutshell, when bundling:

  • On output, prepend a pragma kind of thing, similar to the ECMAScript use strict directive. The current format I'm testing is: ({"compiler": "browserify", "version": "1.23.456"});
  • On input check the beginning of each bundled file for the pragma. When found, noParse the file. Probably omit transforms as well (but that could be an option).

I'm guessing that the pragma format(s) I'm testing wouldnt wouldn't break one bundle in a million in the wild, but if that doesn't seem safe enough then a (bit) longer, more arbitrary value could be added, like b1c2247fb27f6c8ffb66871b1cfbd30e, which I expect would reduce the likelihood of breakage to one in a trillion.

These are the ideas I've iterated through for the format of the pragma (chronological order):

  • "BROWSERIFIED"; (or with ')
  • "BROWSERIFIED@1.2.3"; (or with ')
  • "compiler:browserify@1.2.3"; (or with ')
  • '{"compiler": "browserify", "version": "1.23.456"}';
  • ({"compiler": "browserify", "version": "1.23.456"}); (current)
  • More compact possible alternative: ({"browserify": "1.23.456"});

Another option could be to apply special handling by file extension(s), like .bify.js, .browserify.js, or .browserified.js, but I favor the pragma approach if it's one or the other.

Thoughts?

1.

jmm added a commit to jmm/browserify-pragma-demo that referenced this pull request Mar 6, 2015
@ghost
Copy link

ghost commented Mar 6, 2015

This seems pretty good and can avoid the huge slowness of derequire. Pragma-wise, I like the simple compact strings but that decision should be up to you since you're taking the initiative here. If there are no other concerns from anyone else, just publish browserify-pragma to npm and I'll merge the changes into browser-pack and module-deps.

@jmm
Copy link
Collaborator Author

jmm commented Mar 6, 2015

Great, thanks for the consideration!

Pragma-wise, I like the simple compact strings but that decision should be up to you since you're taking the initiative here.

Ok, thanks. I'm leaning a bit towards one of the later formats like (JSON object);, because it leaves the door open to most readily embed more data in these things in a structured way in the future without changing the basic format, if that were to become desirable. (And to my mind the byte length it adds to the bundle is not unreasonable.) But I'm not married to any of them and I'm open to suggestion about what would be the wisest format.

Thanks for your feedback. For the string versions do you have an opinion on with / without version #? I just thought it might be a nice piece of metadata for debugging, but it's probably not essential. I'll mull this over more, and I'd love to hear comments from anyone else who has an opinion too. It would obviously be ideal to get it "right" the first time. @thlorenz @defunctzombie @hughsk @domenic @calvinmetcalf @feross @zertosh @dominictarr do you have an opinion?

If there are no other concerns from anyone else, just publish browserify-pragma to npm and I'll merge the changes into browser-pack and module-deps.

For now I've published browserify-pragma@0.1.0 to npm, utilizing the pragma format listed as "current" above. Needless to say I need to write some tests for this package.

@zertosh
Copy link
Member

zertosh commented Mar 6, 2015

Sounds interesting. Just as an FYI, I recently made some improvements to derequire that cut it's run time by more than 50% (see calvinmetcalf/derequire#23). There are still more improvements that can be made there.

@jmm
Copy link
Collaborator Author

jmm commented Mar 6, 2015

@zertosh That's good to know. I wonder how desirable / necessary it is to avoid running transforms on an existing bundle.

@jmm
Copy link
Collaborator Author

jmm commented Mar 6, 2015

@zertosh Do you have any idea how much more the performance could be improved? I'd have to double check, but I think I remember seeing an issue comment describing browserify as being slower by a factor of 10 due to derequire. If that's at all accurate then even a 50%+ faster derequire could still result in a significant slowdown in browserify.

@zertosh
Copy link
Member

zertosh commented Mar 6, 2015

@jmm: The test suite runs in ~780ms. I just quickly switched it from esprima-fb to acorn and it ran in ~720ms. So that's another easy win (I'll put in a PR for that later). I have to see what it is exactly that esrefactor is doing, maybe that can be improved.

Sigh, I dream of a world where build tools passed around ASTs instead of raw source. I have high hopes for estree, then maybe that'll happen.

@zertosh
Copy link
Member

zertosh commented Mar 7, 2015

@jmm @calvinmetcalf, I managed to really improve derequire, there is a little bit more I can squeeze out of it, but I'll do that tomorrow (see calvinmetcalf/derequire#26). The tests went from ~780ms to ~300ms, and derequiring react-with-addons.js went from 7.2s to 0.4s - yes a 95% drop.

@zertosh
Copy link
Member

zertosh commented Mar 8, 2015

@jmm calvinmetcalf/derequire#26 is "done" if you want to give it a try. The final perf metrics are: To derequire react-with-addon.js for two tokens it took ~9387ms before, and now ~242ms.

@substack After calvinmetcalf/derequire#26 gets merged, I think it might be worth exploring bringing back derequire into browserify. It would be awesome if you could tell browser-pack what function name to use for require. This way, you could implement derequire as a plugin that ran after label on a per row basis so it can cache the results when doing incremental rebuilds.

@calvinmetcalf
Copy link
Contributor

confirmed, with @zertosh's changes the perf test went from 42127ms (aka 42 secconds) to 742ms

@jmm
Copy link
Collaborator Author

jmm commented Mar 9, 2015

@zertosh

I managed to really improve derequire...yes a 95% drop.

Wow! Nice work.

This diff dramatically increases the speed of derequire by: (1) eliminating an unnecessary ast parse
...
Sigh, I dream of a world where build tools passed around ASTs instead of raw source. I have high hopes for estree, then maybe that'll happen.
...
This way, you could implement derequire as a plugin that ran after label on a per row basis so it can cache the results when doing incremental rebuilds.

To your point, if derequire is integrated directly with browserify, could you conceivably have detective return an AST and have derequire consume it, saving an additional 1 small parse per row or 1 big parse per bundle?

@calvinmetcalf
Copy link
Contributor

@jmm we could go one farther actually and since detective is finding all of the requires detective could also change the function name as it finds it and then in the latter step when browserify wraps in in the function (require, exports, module){} it could use the modified name for require. This would likely be the most performant version possible but would have some issues with people not wanting their code mangled like that.

@jmm
Copy link
Collaborator Author

jmm commented Mar 9, 2015

@calvinmetcalf Yeah, I considered that too. I haven't thought it all the way through yet, but I figured if derequire performance is now acceptable it might be a cleaner separation of concerns to leave detective doing what it does (but return the AST) and let derequire do its mangling thing. As everyone has noted, either way it requires integration with browser-pack (which I don't think will be hard to implement).

Are there any good use cases for people to care about this and not want the mangling? browser-unpack'ing a bundle to do something with it is one I guess. So either way, whether detective or derequire does mangling maybe there should be an option to enable / disable it. In that case maybe the pragma proposal this PR is about would still make sense so that browserify will recognize its own unmangled bundles.

@zertosh
Copy link
Member

zertosh commented Mar 9, 2015

@jmm @calvinmetcalf

could you conceivably have detective return an AST and have derequire consume it

Yeah, this would be a small change in derequire. In detective, we'd have to make sure to pass ranges as true, and have it return the AST to module-deps, which then returns it to browserify.

Are there any good use cases for people to care about this and not want the mangling?

Oh yeah, as you noted, browser-unpack is one - so bundle-collapser too. Which makes this all really tricky because that's a really important tool. I'm not sure what's the right path here.

@jmm
Copy link
Collaborator Author

jmm commented Mar 9, 2015

@zertosh

and have it return the AST to module-deps, which then returns it to browserify

Yeah, I was thinking that if this were implemented in the plugin fashion you suggested, module-deps could just populate row.ast.

@jmm
Copy link
Collaborator Author

jmm commented Mar 9, 2015

@zertosh

Oh yeah, as you noted, browser-unpack is one - so bundle-collapser too. Which makes this all really tricky because that's a really important tool. I'm not sure what's the right path here.

How feasible would it be to bless something, _dereq_ or whatever it needs to be, as a reserved function name that gets special treatment in browserify like require does and make the necessary tools aware of it so that they're both handled the same?

@calvinmetcalf
Copy link
Contributor

How feasible would it be to bless something, dereq or whatever it needs to be, as a reserved function name that gets special treatment in browserify like require does and make the necessary tools aware of it so that they're both handled the same?

ok so the issue there is that the output of browserify isn't the only place derequrie is used, ember uses it BEFORE it gets sent to browserify because they locally define a function called require that screws up browserifying.

more generally

originally derequire modified the ast and then used that to recreate the code. This lead to people complaining that it was messing up their white space and inappropriately adding/remove semicolons, so I switched to modifying the code as a string instead of using the AST to recreate the code. This is in line with what browserify does where it creates AST to find the requires but it does not use that to create the final output , so solution might be to pass the code and ast to derequire and then return modified code.

@zertosh
Copy link
Member

zertosh commented Mar 11, 2015

@jmm: I included an undocumented browserify plugin that runs after the pack step. When you use bundle-collapser as a plugin, it runs after label (so before pack). So it's fine to run both as plugins. It does add time to rebuilds - but not that much. Do you think the added complexity of having "blessed" functions is worth it?

@jmm
Copy link
Collaborator Author

jmm commented Mar 11, 2015

@zertosh

I included an undocumented browserify plugin that runs after the pack step....

Ok, so if / when derequire is integrated with browserify that'd probably be the way.

Do you think the added complexity of having "blessed" functions is worth it?

I was just wondering if that might be a feasible way to run derequire when browserifying and still be able to use tools like browser-unpack / bundle-collapser later (by making them treat _dereq_() like require()). As @calvinmetcalf pointed out it's not as simple as being able to give that kind of special treatment to _dereq_. I guess browserify could designate it's own replacement (something other than _dereq_) as special and related tools could handle that specially.

What are your opinions @zertosh @calvinmetcalf on these issues:

  • Should users get to care about what's in the output, specifically whether it contains the original require() calls or has them replaced with _dereq_() or something else? For example, to later run bundle-collapser, or browser-unpack to roundtrip a bundle back to the original source files.
  • To that point, if derequire is integrated with browserify do you see it being mandatory or optional? (And enabled / disabled by default?)

I'm trying to figure out if integrating derequire with browserify would consistently solve the use case of re-bundling a bundle. Was that the main reason it was integrated previously, or was it for the sake of AMD loaders?

@calvinmetcalf
Copy link
Contributor

so previously when derequrie was included in browserify it was only run with the standalone option was passed, the idea being that standalone creates a self contained bundle you can use anywhere and thus changing the requires makes sense

@jmm
Copy link
Collaborator Author

jmm commented Mar 11, 2015

@calvinmetcalf Right, understood. My understanding is that derequire has at least 3 use cases / results in connection with browserify:

  1. Feed derequire's output to browserify as input (as in your ember example).

  2. Feed browserify's output to it to accomodate AMD loaders.

  3. Feed browserify's output to it to prevent browserify from parsing require() calls when the bundle becomes input to a new bundle.

Was # 3 the / a reason for incorporating derequire into browserify before, or was it just a side effect of incorporating it to achieve # 2?

I just checked and browser-unpack will unpack a standalone bundle, so that brings me back to the questions in my previous post.

@calvinmetcalf
Copy link
Contributor

#2 was my original aim, specifically dojo is fucking terrible

On Wed, Mar 11, 2015 at 1:43 PM Jesse McCarthy notifications@github.com
wrote:

@calvinmetcalf https://github.com/calvinmetcalf Right, understood. My
understanding is that derequire has at least 3 use cases / results in
connection with browserify:

  1. Feed derequire's output to browserify as input (as in your ember
    example).

  2. Feed browserify's output to it to accomodate AMD loaders.

  3. Feed browserify's output to it to prevent browserify from parsing
    require() calls when the bundle becomes input to a new bundle.

Was # 3 the / a reason for incorporating derequire into browserify before,
or was it just a side effect of incorporating it to achieve # 2?

I just checked and browser-unpack will unpack a standalone bundle, so that
brings me back to the questions in my previous post.


Reply to this email directly or view it on GitHub
#1151 (comment)
.

@zertosh
Copy link
Member

zertosh commented Mar 11, 2015

No. 3 is my only use.

To that point, if derequire is integrated with browserify do you see it being mandatory or optional?

I'm cool with it being optional - could just be a boolean called "derequire", if you need more options, then use the plugin. I'm not sure what the use case would be, but just as an FYI, it's perfectly ok to use derequire on a bundle with exports (so non-standalone). Regardless of the name given in externalRequireName, that variable won't get renamed since it's a global and not created in any scope.

@jmm
Copy link
Collaborator Author

jmm commented Mar 11, 2015

@calvinmetcalf

#2 was my original aim, specifically dojo is fucking terrible

Ok, thanks. If you could give me your feedback on the questions here when you get the chance that'd help me wrap my head around the situation. When I opened this PR I didn't think derequire was even an option for solving the rebundling use case I was trying to address, so I'm trying to figure out if with the performance improvements it's the best way to accomplish that, or if potentially re-integrating derequire into browserify and solving that use case are separate issues.

@jmm
Copy link
Collaborator Author

jmm commented Mar 11, 2015

@zertosh

No. 3 is my only use.

Ok, thanks for the feedback

I'm not sure what the use case would be, but just as an FYI, it's perfectly ok to use derequire on a bundle with exports (so non-standalone). Regardless of the name given in externalRequireName, that variable won't get renamed since it's a global and not created in any scope.

Ok, thanks. I wasn't worried about that, but I actually have a PR on browser-pack (#51) proposing to allow users to export a global require() from standalone bundles. So that'd be a use case for using derequire on a bundle that exports require().

I'm cool with it being optional - could just be a boolean called "derequire", if you need more options, then use the plugin.

I hope that @substack and / or some other people who are more familiar with the previous go around will have a chance at some point to comment on the idea of re-integrating derequire into browserify. If the outcome is that it either doesn't happen or it's optional then I think my proposal still makes sense (either on all bundles or just those where derequire is not enabled) to enable browserify to consume its own bundles without choking on them either way (with or without derequire). If derequire is re-integrated non-optionally, then I think it would obviate the need for this PR.

@zertosh
Copy link
Member

zertosh commented Mar 11, 2015

@jmm Sounds good. Thank you for taking charge of all this! I hope I didn't derail your work by coming out of nowhere with the derequire improvements.

@jmm
Copy link
Collaborator Author

jmm commented Mar 11, 2015

Thanks @zertosh. I think that's a definite possibility in the sense that my proposal may become obsolete, but I certainly can't fault you for awesome work improving derequire. I admit it will be a bit of a bummer if my effort on this was for naught, but if that happens it's because a better solution emerged. I also realize in hindsight that I should've checked in on what's been happening with derequire before working on a new solution :) I failed to adhere to one of the prime virtues of programming: laziness.

@calvinmetcalf
Copy link
Contributor

so my vote was that if we wanted to include derequire, just have it enabled when standalone is enabled, having a --externalizable (or better name) which also does it could work too.

@jmm
Copy link
Collaborator Author

jmm commented Mar 11, 2015

@calvinmetcalf Ok, thanks for clarifying. I think it would be ideal to have the option to get back out what you put in (meaning your original source after going through whatever transforms / plugins you configure), but I'm not sure if it's compelling to provide that.

samuelsimoes added a commit to fluxo-js/fluxo that referenced this pull request Nov 2, 2015
browserify/browserify#1151

obs.: I don't like the need to use something like this, maybe we can
find a better alternative to module loading. For this moment this commit
apply the derequire in order to keep things working.
lizozom added a commit to lizozom/VPAIDFLASHClient that referenced this pull request Nov 23, 2015
…ses browserify, since you can't browserify a browserified module (See browserify/browserify#1151)

Adding the option below to package.json allows including the raw JavaScript, allowing the requiring module to handle the browserification.
To include the module with require you should

npm install vpaid-flash-client

and then include it like this:

var VPAIDFLASHClient = require('vpaid-flash-client/js/VPAIDFLASHClient.js');
@jmm
Copy link
Collaborator Author

jmm commented Nov 30, 2015

@calvinmetcalf re: your comment, what makes it so terrible? Does it do regex matching or something? Any idea how many widely used AMD loaders there are? (Or CJS bundlers for that matter?)

The slickest thing, of course, would be to use scope tracking on input to ignore require() calls in scopes that define a require, but...performance. As an experiment, I made an adaptation of Detective that does that. It may not yield the best performance, but for convenience I did it using Babel, since it provides scope tracking for "free" (as in effort, not performance). I don't know that doing it any other way would yield acceptable performance either.

@calvinmetcalf
Copy link
Contributor

So the issue with dojo was it tried to load via http every require call in the package even though they were local ones. I think you're the issue was just that dojo wasn't written with interoperability with other loaders in mind so it assumed all require calls were for it.

Derequire btw works exactly like you suggested looking for require calls by scope and the current version isn't too bad performance wise.

@jmm
Copy link
Collaborator Author

jmm commented Nov 30, 2015

@calvinmetcalf

So the issue with dojo was[...]

Ok, I see, thanks for clarifying. So that'd be the same deal with any AMD loader, right? Unless there are some that scan for them with scope tracking.

I think you're the issue was just that dojo wasn't written with interoperability with other loaders in mind so it assumed all require calls were for it.

Yeah, I mean, essentially the same deal here right?

Derequire btw works exactly like you suggested looking for require calls by scope and the current version isn't too bad performance wise.

Yeah, I know from this thread that the performance was way improved. I haven't tested it myself, but from what was discussed here it sounds like that's no longer a blocker. I just meant that it would be slickest if the require()s could be detected with scope awareness on input because then it wouldn't be necessary to mangle them on output or add anything to the output.

From what's been discussed here it sounds to me like restoring Derequire by default on standalone bundles would be a reasonable thing to try in a major. I've also wondered if it makes sense to pursue this pragma idea further because it wouldn't be necessary to mangle the require() calls (but maybe no one cares) and I suspect it would have lower performance overhead, but to be effective for interop between Browserify and other stuff would require other module loaders / bundlers to adopt it as well (so something generic like "standalone-bundle";). That's why I was wondering how many widely used ones there are. That wouldn't help with versions of whatever lib from before they became aware of the pragma though. And I don't know if the performance difference between that and where Derequire is now would be meaningful even if it is lower.

@ForbesLindesay
Copy link
Contributor

The only thing that will realistically work across all the package managers we want to support is to either restore derequire or use remove-require as shown in browserify/browser-pack#53. Right now I would say this is probably browserify's most urgent issue.

@alundiak
Copy link

Any 2016-like updates on this issue?

@rhewitt22
Copy link

@alundiak

Just use derequire cli as demonstrated in the docs:

browserify main.js --standalone Foo | derequire > bundle.js

wildlyinaccurate added a commit to wildlyinaccurate/angular-relative-date that referenced this pull request Jun 12, 2016
This is a workaround for the browserify issue reported in ##20, since
browserify/browserify#1151 seems to be dead in the
water.
cburgmer added a commit to cburgmer/inlineresources that referenced this pull request Nov 5, 2016
@jfirebaugh
Copy link

jfirebaugh commented Mar 22, 2017

Would the two failure modes described in the OP be fixed by "simply" (quotes because I have no idea how simple it would actually be) fixing browserify/detective#22?

@nickkolok
Copy link

Why doesn't browserify just insert a small string mark (like "use strict';) in each scope where requires should not be resolved?

function(..., require, ...){
  "browserify_requires_resolved";
  // ...
  require(...);
  //...
}

@dominictarr
Copy link
Contributor

@nickkolok that is a pretty good suggestion actually!

@goto-bus-stop
Copy link
Member

This PR makes detective ignore require() calls inside browserified bundles: browserify/detective#79

It doesn't do full scope analysis because that would be really slow; instead it only checks for the browserify pattern where the inner require is a parameter to the module factory function, and skips walking functions with a require parameter entirely.

Would be cool to get some thoughts on that from folks here!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.