Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add special handling for spawning npm #12

Closed
wants to merge 2 commits into from

Conversation

addaleax
Copy link
Member

As laid out in istanbuljs/nyc#190, spawn-wrap’ing npm does currently not work when using the npm binary that comes bundled with Node.js.

This adds special handler code for the case that npm is the program that should be executed, looking for the npm executable in PATH and invoking the shim with it as an argument.

/cc @bcoe

@@ -133,6 +133,32 @@ function wrap (argv, env, workingDir) {
options.envPairs.push((isWindows ? 'Path=' : 'PATH=') + workingDir)
}

if (file === 'npm') {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be tempted to isolate this logic in a helper function, I find it's a good way to keep icky code like this isolated so that people can just concentrate on the inputs and outputs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You’re right – I’ll move the code for looking up npmPath into its own function. I’d like to keep the rest in place, if you don’t mind, because it resembles what happens in the blocks for handling file === 'sh' || … and file === 'node' || ….

@bcoe
Copy link
Member

bcoe commented Mar 31, 2016

@addaleax looking solid 👍 I'm noticing on AppVeyor, as with your other pull request, we're failing on just node@0.10.x. I'd like to figure out these build failures before we land both branches. I can start working on getting your work out the door though \o/

@addaleax
Copy link
Member Author

Sounds good – this repo never passed on AppVeyor with Node v0.10, btw.

@addaleax
Copy link
Member Author

Updated with your suggestion 😄

@isaacs
Copy link
Collaborator

isaacs commented Mar 31, 2016

First off, great find, and good job digging through the weeds to figure out the case where this is failing.

However, adding a special case for a bin named "npm" is not the right fix here. Something is going wrong with the #! line parsing, and the right fix is to address that parsing bug rather than add a special case around the executable script where that parsing bug occurs. (If for no other reason than that another script some day might use the same shebang, and thus hit the same parsing bug.)

Usually I dig into this sort of issue by adding a bunch of logging in the guts of spawn-wrap. But if you can provide a limited test-case, I'll probably have some time to dig in myself in the next few weeks.

@isaacs
Copy link
Collaborator

isaacs commented Apr 3, 2016

Aha! Ok, forget what I said in my previous comment. Spawn-wrap doesn't do shebang reading or evaluating. I was misremembering :)

I'm still not quite convinced this is worth special-casing, but I'm less unconvinced. Here's what's going on in spawn-wrap, and the choice to be made.

Spawn-wrap works by doing 2 things:

  1. It creates a shim file, and makes sure that shim file is executed instead of executing node directly, for any exec or spawn where node is being explicitly invoked.
  2. It puts the path to that shim file first in any PATH environment variable passed to any child process, so that just calling node from other programs (like bash scripts, or /usr/bin/env node shebang programs) will resolve to the shim instead of the node binary.

If you have a file that is invoked without calling spawn or exec with node, either via PATH lookup or by providing the explicit path (like calling npm directly, or doing ./my-node-program.js), then the shebang of that script is what's used by the system to figure out what to invoke. If that shebang line is relying on the PATH via #!/usr/bin/env node then it'll resolve to the spawn-wrap shim, and all is good.

However, if the shebang is an absolute path to a specific node binary, then the PATH preference doesn't matter, and the node binary will be invoked directly by the system, and spawn-wrap's shim is not used, and the wrap is escaped.

There are two other cases where you'll escape the wrap:

  1. A program (other than a spawn-wrapped node program) that specifically calls /path/to/node foo.js instead of just node foo.js or /usr/bin/env node.
  2. A program (other than a spawn-wrapped node program) that modifies the PATH prior to invoking node foo.js or /usr/bin/env node.

In those two cases, ¯\_(ツ)_/¯, what can you do?

The question is whether this third case, of spawning a script with an explicit shebang, should be covered. I'm on the fence.

In all three (explicit node binary call, modifying the PATH explicitly, or a fully-absolute shebang), there's a pretty clear user intent to not allow wrapping.

On the other hand, the person who wrote that shebang is far away, and you're deciding to wrap your test call on your computer, so your intent matters more.

To do this right, we'd have to do the following for any program before deciding not to wrap it:

  1. Resolve the file via path lookup
  2. Read the first 256 bytes of the file
  3. If it's a shebang, and a shebang to node, and absolute, then shim it.

I don't see how to do this without doing synchronous IO in an otherwise async function. But this program does a bit of that anyway, and it's gonna be pretty fast. (If you're using spawn-wrap on a production server to run npm test, you have bigger problems, most likely!)

@isaacs
Copy link
Collaborator

isaacs commented Apr 3, 2016

Can y'all check if the patch in #13 fixes the issue as well?

I'm still not sure if this is a rare enough case to just go ahead and look for the name "npm", but reading the file's shebang takes care of any other cases like this one.

@addaleax
Copy link
Member Author

addaleax commented Apr 3, 2016

@isaacs Not sure whether you’ve read in the mentioned nyc issue – Its actually even a bit of a combination of the edge cases you mentioned (fixed executable in the shebang + modifying PATH to match that fixed executable).

But either way, I think closing this PR in favour of one of the others is a good idea 😄

@bcoe
Copy link
Member

bcoe commented Apr 3, 2016

@addaleax @isaacs this sounds smart to me, and we can perhaps rollback #12 if the change to node lands.

My only comment is, would it be worth using an approach like head rather than readFileSync, I imagine this might be a slight performance boost especially if people are instrumenting large files -- or am I micro-optimizing? I know the AVA folks are performance concious.

@bcoe bcoe reopened this Apr 3, 2016
@bcoe
Copy link
Member

bcoe commented Apr 3, 2016

@addaleax oh, one other comment, any objection to pull in the test I added in #14?

It just fakes an npm bin, so that we exercise the new logic.

@bcoe
Copy link
Member

bcoe commented Apr 3, 2016

@addaleax if you rebase with master when you update this pull, appveyor should be passing now (I just removed Node 0.10).

@addaleax
Copy link
Member Author

addaleax commented Apr 3, 2016

@bcoe I think I’d have used something like head too, but it probably won’t make too much of a difference. If the target file is actually large, the JS parsing will overshadow any shebang parsing by orders of magnitudes (You can experiment with things like eval('(function() { var a = 0; ' + 'a++;\n'.repeat(100000) + ' return a; })'), but that will obviously depend on the actual file contents).
And for reading the file, maybe the synchronous I/O is a bit of a problem if the calling process would like to do something else – which is not the case for nyc – but other than that, it should just fill the kernel’s fs cache a little earlier.

And yup, I’ll add your test from #14 to this. I’d like to rename the file to npm and add the directory to the PATH, though, if that’s okay?

@bcoe
Copy link
Member

bcoe commented Apr 3, 2016

@addaleax seems reasonable to me, do we need to read the file? you don't think it's enough that the bin is npm (your original fix seems to work quite well).

@addaleax
Copy link
Member Author

addaleax commented Apr 3, 2016

@bcoe Uh, not sure – I assumed when you mentioned readFileSync you were talking about the shebang stuff (#13)? I will keep this PR free of sync I/O other than which, yes, and just check for npm as the command.

@addaleax
Copy link
Member Author

addaleax commented Apr 3, 2016

Okay, so, the (new) AppVeyor tests are failing, and I haven’t done much Windows development in a long time… maybe one of you could take a look at it?

addaleax and others added 2 commits April 4, 2016 02:38
As laid out in istanbuljs/nyc#190, spawn-wrap’ing `npm` does currently
not work when using the `npm` binary that comes bundled with Node.js.

This adds special handler code for the case that `npm` is the
program that should be executed, looking for the `npm` executable
in PATH using `which` and invoking the shim with it as an argument.
@addaleax
Copy link
Member Author

addaleax commented Apr 4, 2016

Ignore that, everything seems fine now.

@addaleax
Copy link
Member Author

addaleax commented Apr 4, 2016

@bcoe The “new problem” mentioned in istanbuljs/nyc#190 with recursive npm run invocations should be fixed when including this PR in its current state (I originally missed that there is also the possibility of shell-based exec('npm …') calls, but your tests thankfully pointed that out).

@isaacs isaacs closed this in #13 Apr 5, 2016
@addaleax addaleax deleted the npm-support branch April 5, 2016 12:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants