Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discussion: moving npm out of the core repo #1419

Closed
rvagg opened this issue Apr 14, 2015 · 19 comments
Closed

Discussion: moving npm out of the core repo #1419

rvagg opened this issue Apr 14, 2015 · 19 comments
Assignees
Labels
discuss Issues opened for discussions and feedbacks. meta Issues and PRs related to the general management of the project. npm Issues and PRs related to the npm client dependency or the npm registry.

Comments

@rvagg
Copy link
Member

rvagg commented Apr 14, 2015

This is something that has come up in a few separate discussions I've had recently and each time we have an npm update PR all I can think about is all of the cruft we keep on adding to the repo which makes it take even longer to clone which adds to the burden of contribution; also the fact that the diff is impossible to read because of all of the bundledDependencies.

So I have three main questions:

  1. Is there an appetite to deal with this and split npm out of the repo in some way?
  2. If we move it out, what is the nature of the coupling? I'm working on an assumption that we still want it to be built in to binaries and distributed and probably also tested in our CI as part of builds and maybe even regular test runs but perhaps I'm not thinking outside of the box enough?
  3. What are the technical possibilities for changing the current situation for npm in core? Perhaps as simple as specifying the version somewhere in the repo, as approved by the npm client team, then having configure download and unpack the appropriate tarball for that version, similar to the Intl download stuff or even simpler.

/cc @othiym23

@bnoordhuis
Copy link
Member

Perhaps as simple as specifying the version somewhere in the repo, as approved by the npm client team, then having configure download and unpack the appropriate tarball for that version, similar to the Intl download stuff or even simpler.

That assumes the machine where configure runs can make outgoing HTTP requests. It's one thing I don't like about the ICU integration, because it's not necessarily true.

Not many people need i18n but everyone needs npm; issues related to downloading are bound to pop up more frequently.

@rvagg
Copy link
Member Author

rvagg commented Apr 14, 2015

we could still bundle it in source tarballs

@YurySolovyov
Copy link

+1 to "out-of-repo, but still in source tarballs" option, though it would be up to person who makes releases to put it there.

@mscdex mscdex added meta Issues and PRs related to the general management of the project. npm Issues and PRs related to the npm client dependency or the npm registry. labels Apr 14, 2015
@Fishrock123
Copy link
Contributor

See also: #252 (original issue)

I think I'm generally for this if it can be done neatly.

@Fishrock123 Fishrock123 added the discuss Issues opened for discussions and feedbacks. label Apr 14, 2015
@elinformatico
Copy link

Nice

@othiym23
Copy link
Contributor

I am +1 on this change if and only if I'm not expected to do the infrastructural work necessary to make it happen. :D

@cjihrig
Copy link
Contributor

cjihrig commented Apr 14, 2015

+1 for this. We would need to account for the currently floating patches.

@Fishrock123
Copy link
Contributor

We would need to account for the currently floating patches.

Ah yes. This would be much easier without those.

Could we have a script that clones npm@latest-stable down and then uses @othiym23's stuff* to apply the patches?

*I forget what exactly it is.

@rvagg
Copy link
Member Author

rvagg commented Apr 16, 2015

my take-away from the brief discussion @ the TC call today was that ppl are generally in favour of this (although Ben has already stated his reservations about downloading) but that nobody is willing to put up their hand to do the work, neither I nor @othiym23 specifically.

@Fishrock123
Copy link
Contributor

This would be far easier if npm's deps "just worked" for io.js

General thought:

  • make a bundled-npm package on npm that is npm with deps bundled.
  • grab that and unzip via script when producing builds / npm test / told to make install with it.

But, we have stuff to put ontop of that, so while maybe that could work, we'd need instead:

  • a bundled-npm-iojs that also uses git instead of just npm
  • @othiym23's scripts within that module to apply the git patches
  • same as above

Maybe this is too simple, I'm not sure.

@othiym23
Copy link
Contributor

make a bundled-npm package on npm that is npm with deps bundled.

npm is already shipped with all of its dependencies bundled, and has been since the beginning (otherwise it would be incapable of bootstrapping itself)

@othiym23's scripts within that module to apply the git patches

This is a very simple use of quilt and git-quiltimport. It is my strenuous and enduring hope that node-gyp will be updated soon enough to render this a moot point, because I don't like maintaining floating patches any more than io.js / Node core do.

a bundled-npm-iojs that also uses git instead of just npm

Can you expand on what this might be?

@othiym23
Copy link
Contributor

I moved stuff over from quilt (which is likely to feel extremely archaic and weird to anybody who's not already familiar with it) to using Stacked git, which has a workflow not entirely dissimilar from regular git. Don't try to learn anything from the homepage, which is pretty much useless, but check out the tutorial.

Here's how I created the patch series, and how I land new npm versions in io.js, for now:

  1. cd npm
  2. git checkout vX.Y.Z (where vX.Y.Z is the latest stable version)
  3. make release
  4. cd ../iojs
  5. git fetch --all
  6. git checkout v1.x (this will change to master as soon as I'm told that's where we want it)
  7. git merge --ff-only origin/v1.x
  8. git checkout -b npm-X.Y.Z
  9. rm -rf deps/npm
  10. cd deps
  11. tar xfz ../../npm/release/npm-X.Y.Z.tgz
  12. cd .. && git add deps/npm && git commit -m "deps: upgrade npm to X.Y.Z" && git rebase --whitespace=fix v1.x

That's the basic workflow. To create the patch quilt, I then did (the first time):

  1. stg init
  2. stg pick 742b68d (this is @cjihrig's patch combining the changes to to get node-gyp working with io.js)
  3. stg pick 71f9251 (this is @piscisaureus's patch with the Windows fixes to fix the DLL name issue)
  4. stg export, which creates a directory named patches-npm-X.Y.Z with the two patches + the control file
  5. mv patches-npm-X.Y.Z ../iojs-npm-patches

Then, to replay that later, I just do

  1. stg init on the npm-X.Y.Z branch
  2. stg import --series ../iojs-npm-patches/series

Afterwards, I

  1. git push npm npm-X.Y.Z
  2. create a PR including the relevant bits of the changelog

There are additional steps you can take to let stg know that the patches have been merged upstream and to clean up the local branch, but there really isn't a point, because typically the branch doesn't live beyond this point, so I end up just deleting the local branches.

Automating that and incorporating it into the build process shouldn't be too tough, but it's more than I've got time to deal with.

@Fishrock123
Copy link
Contributor

npm is already shipped with all of its dependencies bundled, and has been since the beginning (otherwise it would be incapable of bootstrapping itself)

Doh. That will make things a little easier.. at some point. :)

a bundled-npm-iojs that also uses git instead of just npm

Can you expand on what this might be?

Maybe something like maintaining an npm clone as a submodule with deps so that we can float patches, and publishing it to something like that.

The point is to offload the patch business elsewhere still so that our build & users don't have to do it.

  1. git checkout v1.x (this will change to master as soon as I'm told that's where we want it)

I'll warn you already; that will be from now on. :)

Automating that and incorporating it into the build process shouldn't be too tough, but it's more than I've got time to deal with.

That's fine, thanks for the write up. I'll see what I can do I suppose.

Also, really wish this could be done with just git but oh well.

@othiym23
Copy link
Contributor

Also, really wish this could be done with just git but oh well.

The way I was moving those changesets around before was just using git cherry-pick on the two shasums for @cjihrig and @piscisaureus's commits, but that felt pretty heuristic, and the process above took longer to write down than it did to set up.

@Fishrock123 Fishrock123 self-assigned this Apr 27, 2015
@chrisdickinson
Copy link
Contributor

Apologies, I probably should have poked my head into this earlier. If it's too late, feel free to ignore me!

I'm not sure I like the ./configure --download-npm route, for similar reasons as @bnoordhuis.
I also don't think that dependency-updating commits are likely to make download times much larger, so long as we're eagerly deleting image files from the source tree. Text files will be eagerly compressed+delta'd by git.

I'd prefer a script – tools/update-dep.js – that would take a dependency name + version and perform the following tasks:

  1. Download the tarball from a configured URL template and verify the shasum.
  2. Extract it to the target name.
  3. Commit the results.
  4. Run tools/deptool/<name>/0000-scriptname.js through tools/deptool/<name>/NNNN-scriptname.js against the tree, re-committing the tree after each script runs so that the results of a script run can be reviewed during PR.

This lets us:

  • Continue to use the process of vendoring our dependencies in git.
  • Separate the results of different "patches" out.
  • We have the ability to address changes that cannot be stored in patches (for example, switching all of openssl's links to #includes, when they may add more links in a new version).
  • Keep the result of running ./configure with no args close to what the release builds will produce.

Thoughts on this?

@Fishrock123
Copy link
Contributor

So posterity, the route I was going to try implementing before this came up in irc is:

  1. add --download=npm to configure.
    • download the latest stable npm, unless otherwise specified somehow
  2. install with npm if it exists, so long as configure was not run with --without-npm.
  3. use --download=npm in some way for the build, so that releases still ship with npm (including tarballs)

Now, since npm currently does not work with io.js, we'd have to have it to download a patched npm that does work with io.js.

This probably means maintaining a clone of npm until npm works without patches.
This saves it from having to be patched on build. I'm willing to do the grunt work for this, although it wouldn't be anything more than we already do, just not in this repo.

This is a temporary hassle, but once we don't have to patch npm, it would only be the above 3 points.


@chrisdickinson

Continue to use the process of vendoring our dependencies in git.

npm isn't really a dependancy per say. We don't actually use it in core, we just ship with it. (and should. we still would ship releases with it.)

Separate the results of different "patches" out.

If you are talking about patches to npm's deps, those would still be available on the temporary clone.

We have the ability to address changes that cannot be stored in patches (for example, switching all of openssl's links to #includes, when they may add more links in a new version).

How does this apply to npm?

Keep the result of running ./configure with no args close to what the release builds will produce.

It still will once you have a version of npm.

EDIT: updated for clarity

@chrisdickinson
Copy link
Contributor

I think there's generally a need for a tool that downloads new versions of deps and applies modifications to them – that way those modifications are local to our repo, versioned, and can be reviewed. It'd be nice if we could make upgrading deps as straightforward as running the tool with a depname and version, then issuing a PR with any new/changed modification scripts + the results of running those scripts.

npm isn't really a dependancy per say. We don't actually use it in core, we just ship with it. (and should. we still would ship releases with it.)

The node package installers expose two executables – npm and node – and we depend on npm to make the first of the two available. Our code may not rely on npm being there, but our users rely on our package to contain npm. It might be a peer dep, or a transitive dep, but many humans & their computers depend on us to make npm available, so the io.js package depends on npm.

If you are talking about patches to npm's deps, those would still be available on the temporary clone.

This keeps all patches / modifications local to the context they are used in – io.js – instead of a cloned copy of the dependency.

How does this apply to npm?

A script could run to delete all image files out of the source tree after import, further reducing the size of a fresh clone. Or a script could run to ensure proper file casing.

@Fishrock123
Copy link
Contributor

@rvagg do you still want this for a convincing reason? I'm basically getting "nah" for every option, and that the current state is o.k. (Which I somewhat agree with by now.)

@rvagg
Copy link
Member Author

rvagg commented May 23, 2015

yeah, I never had a deep conviction about this, it's more of an ideal for me that we may work to one day, I just find the whole bundling process very noisy, particularly now with the pace of development on both projects.

to be revisited one day!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discuss Issues opened for discussions and feedbacks. meta Issues and PRs related to the general management of the project. npm Issues and PRs related to the npm client dependency or the npm registry.
Projects
None yet
Development

No branches or pull requests

9 participants