
package is published with all test files #235

Closed
jason-ha opened this issue Jan 20, 2021 · 14 comments

Comments

@jason-ha

The published package should be limited to the content required to use it and should exclude test files. Test files currently make up over half of the 1.19.0 package.
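
One way to do that (a sketch; the file list below is illustrative, not a claim about resolve's actual layout) is a "files" allowlist in package.json, so only what is listed ends up in the published tarball:

    {
      "name": "resolve",
      "version": "1.19.0",
      "main": "index.js",
      "files": [
        "index.js",
        "lib/"
      ]
    }

Anything not listed (test/, fixtures, CI config, etc.) is then left out of what npm publishes.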

@ljharb
Member

ljharb commented Jan 20, 2021

Tests should be published with every package, so that npm explore foo && npm install && npm test always works.
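
Concretely, that workflow is something like this (a sketch; foo stands in for whatever installed dependency you want to check):

    # open a subshell inside the installed copy of the dependency
    npm explore foo
    # then, from inside node_modules/foo:
    npm install   # installs the package's devDependencies
    npm test      # runs the test suite it was published with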

Installed package size is irrelevant; if you never require the test files, they'll never affect the runtime of your application.

Duplicate of #58. Duplicate of #44. See #105 (comment).

@ljharb ljharb closed this as completed Jan 20, 2021
@jason-ha
Author

Tests should be published with every package, so that npm explore foo && npm install && npm test always works.

I will actively disagree that such a sequence of commands should always hold. For the projects I have surveyed, it does not hold, nor is it required. Testing should be performed prior to any publish. A package could incorporate a self-validation process, but the full test suite is not of interest - go to the source if you want that.

Installed package size is irrelevant; if you never require the test files, they'll never affect the runtime of your application.

Duplicate of #58. Duplicate of #44. See #105 (comment).

This is not about the binary size of the package, but about the surface area of the code. I came to this issue by means of security scanning. The test content in this package raised issues. Excluding content that is not required reduces possible security issues. Please help enhance reliability by limiting the content to what is actually required.
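
For reference, what a given version ships can be inspected without installing it; a couple of commands I'd expect to show this:

    # list the files the published tarball would contain
    npm pack resolve@1.19.0 --dry-run
    # or ask the registry for its own numbers
    npm view resolve@1.19.0 dist.fileCount dist.unpackedSize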

@ljharb
Member

ljharb commented Jan 22, 2021

The source might not be available after installation, if the machine is offline.

Unfortunately, automated security scanning of files that are merely present on the filesystem is not a viable technique - instead, build a dependency graph, and only scan the files you use.
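
A rough sketch of that with a recent npm (the scanner command is a placeholder for whatever tool you use):

    # enumerate only the directories in your actual (production) dependency tree
    npm ls --omit=dev --all --parseable > prod-deps.txt
    # scan just those paths instead of everything on disk (placeholder command)
    xargs your-scanner < prod-deps.txt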

@jason-ha
Author

Sorry, I'm not seeing the applicability of the machine being offline. Testing for the package can be done once, when the package (or a new version) is first used. There is no need to run the tests again until there is another change, and in those cases the machine must be online in order to get the package anyway.

I agree a dependency graph is ideal. However, there is risk if dependencies are ever misunderstood, and as code evolves that is likely to occur. A complete scan of everything present is more robust. I think we can agree that limiting exposure will always be the more secure pattern.

@ljharb
Member

ljharb commented Jan 27, 2021

I often need to run testing on installed dependencies, long after initial installation, when debugging an issue.

@jason-ha
Author

jason-ha commented Mar 3, 2021

That is not a flow I am following. You are running tests on a package installed a long time ago, where nothing has changed in the meantime? And expecting different results than syncing the source to that version and running the tests? Syncing the source should always be an option, right? I am assuming it is more than just convenience; otherwise I don't see it outweighing limiting exposure.

@ljharb
Member

ljharb commented Mar 3, 2021

@jason-ha many things may have changed in the meantime - for example, the node version itself, or my own code that’s now calling the package with different options.

@jason-ha
Author

jason-ha commented Mar 3, 2021

Okay, the node version, though that feels like a bit of a stretch -- well, let's make sure we are talking about the same situation. I think we are talking about installing the resolve package as a dependency of some other project and testing resolve in that context, right? So you try to keep old (& random?) versions of resolve validated in this way? That seems like a maintenance nightmare if you're trying to support all of the major and minor versions of resolve.

I don't understand how your own code calling "the package" with different options is going to change the test results of "the package". The tests should be self-contained. I don't see that impacting the use case mentioned before: npm explore foo && npm install && npm test.

BTW, I really do want to understand. I hope at the end of this you can either convince me that more packages should ship at least some tests and that this is worth the security complications, or that we find patterns that work well without packaging test code and carry fewer security complications. Thanks for continuing the discussion.

@ljharb
Member

ljharb commented Mar 3, 2021

As the maintainer of resolve, I don't need resolve itself to work this way, because I have the repo on my machine already.

However, let's say I wasn't the maintainer of resolve. Then, I'd want to be able to have it installed in my node_modules, and not have to have internet access (I could be on a plane, or traveling, or the power's out) nor have to track down the repository location (which might not even be permanent, because repos can be deleted, companies can go out of business, or I might be in a country in which access to a website is blocked), in order to review the contents of the dependency.

"review" might mean reading the tests (which serve as a form of documentation), or running the tests (to verify that, in my exact scenario, the maintainers' expectations hold), or making a temporary patch in preparation for an upstream PR and verifying it.

@jason-ha
Author

jason-ha commented Mar 4, 2021

As a user of resolve, I currently place more value on not having the test code, and that is true for all of the packages my org is using.

In the use pattern you listed, you still require internet access for the npm install step. I don't think the need for the pattern is high when offline. I would be curious, for the things you listed, how many times in the past 3 months you have exercised them.

I also would recommend against using packages that are not maintained or can't be easily found. You should probably be looking for a replacement.

I am sure this is a trade-off/balance issue. Assuming internet access for a moment, there are processes to do everything you talk about that do not require test code in the package. It also seems like you don't really want the maintainer's tests; it sounds like you want a set of acceptance tests. You care about how the package behaves in your particular case, so you can develop tests that confirm that. These can be kept functional without having to set up a special situation. (Running npm install in the context of that package inside your node_modules does not give you the same environment as your use of the package anyway.) Such tests even help you when you pick up a new version - and they are ready to run without an extra explore/install.
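
To make the acceptance-test idea concrete (a sketch; the file name and assertions are illustrative, and it assumes resolve's documented sync API):

    // acceptance.test.js - run from the consuming project, no npm explore needed
    var assert = require('assert');
    var resolve = require('resolve');

    // the behavior this project actually relies on:
    // package ids resolve to an absolute path on disk
    var resolved = resolve.sync('resolve', { basedir: __dirname });
    assert.ok(/node_modules[\/\\]resolve/.test(resolved), 'resolves installed packages');

    // and core modules are recognized as such
    assert.strictEqual(resolve.isCore('fs'), true, 'fs is reported as core');

    console.log('ok - resolve behaves the way this project expects');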

@ljharb
Member

ljharb commented Mar 4, 2021

Having the test code is an inconvenience for you; not having it makes a different use case impossible, and "possible" should always override "convenient".

I would require internet access for the initial npm install step, but internet access is a transitory thing. By the time I need the test files, I might not have internet access - and in non-covid times, I would have used that many times in a 3 month period, because I'd be traveling.

@jason-ha
Author

jason-ha commented Mar 8, 2021

I agree "possible" trumps "convenient". But it seems to me that suggests not having the test code present:
a) for a fully robust security scan we want to scan all of the files present, and
b) there is no mechanism to remove test code from packages, or to avoid downloading it in the first place, other than excluding it from the package, and
c) there are options available for your case which seem to have advantages over current practice:

  1. Get the source. Beyond the current state of the package you have installed, you can probably see whether there are recent changes related to the issue you are working on. And assuming source control like git, no internet access is required once you have a clone (sketched after this list). As you develop a test case that you care about, contribute it back to the source to ensure continued support.

  2. Write acceptance tests. These have the similar advantage of tracking the particular case(s) that are relevant to you. I think you are indicating that you modify tests a lot (at least pre-covid), but do you keep making the same modifications over and over?
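
A rough sketch of option (1); the repo URL and the tag format are my assumptions about how resolve is hosted and tagged:

    # once, while online: keep a local clone next to your projects
    git clone https://github.com/browserify/resolve.git
    cd resolve
    # later (the git part works offline): sync to the version your lockfile pins
    git checkout v1.19.0
    # devDependencies still need the registry or a warm npm cache
    npm install && npm test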

So npm explore && npm install with a later npm test is a convenience, whereas avoiding the test code is currently an intractable problem for the consumer.

I don't want to inconvenience any productive workflow. To make both sides a matter of convenience, I think it requires changes to npm packaging standards. Ideally there'd be structure and commands for specifying whether you want to install test code or not. Lacking the simplicity of a command, a structure or properties that identify product code versus test code versus documentation, etc., would work. Maybe you have a simpler suggestion?

To get a more complete picture of your process: when traveling, do you run through all installed packages for all of the things you might be working on and npm explore & install each one? Do you have a script for that?

@ljharb
Member

ljharb commented Mar 8, 2021

The nature of JavaScript is that it's impossible to do a "fully robust security scan" merely based on the filesystem. You have to scan your program's actual dependency graph.

Additionally, even if I updated all of my packages to remove test code, there are packages in wide use that ship tests that will never again be updated, so it's simply not a solvable problem.

Not all of them; I do it manually as I debug and the need arises.

@jason-ha
Author

jason-ha commented Mar 9, 2021

I am not sure I know what you mean when claiming it is impossible to do a security scan based on files. In any case, we still want to do the best security scan that we can. And it is absolutely the case that systems will be more successful when we can reduce the scope of what is possible. Reducing surface area is a best practice.

There is no ask here for you to be responsible for packages you have no influence over. By changing the pattern for most packages, the issue becomes tractable. If there is a small handful of never-changing packages, then those can be dealt with: they won't be changing, which reduces some risk vectors from them and makes special-casing possible.

For your process, you have some investment beforehand to know what you might want, rather than being able to tackle any general issue. If you miss something, then I'll assume you either don't work on that issue or work without it. Okay. Do you also run the tests for those packages as part of the prep, or otherwise check that the tests are likely to run later? It might be interesting to know how often packages are repeated targets for this prep, but I am not expecting you to have any real data on that.

I am curious to see conversations that have gone the other way in recent history - where there were requests to add test code to packages. Unfortunately, the most reliable way to learn that here would be to remove the test code and see if anyone raises an issue. The only other thought I have is some kind of survey, but how would you get the right participation?
