Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Proposal] Dependency Files as Inputs With Symlinked Packages #944

Open
ObliviousHarmony opened this issue Oct 26, 2023 · 6 comments
Open

Comments

@ObliviousHarmony
Copy link
Contributor

Problem

Wireit does not provide a way to easily include the files of symlinked dependencies in fingerprinting. Unlike normal dependencies, changes to these will not be reflected in a lock file and will not change the fingerprint. This can be somewhat addressed by using the "dependencies" feature in Wireit, however, this is not always the case:

  • Repositories that use wireit in isolation in each package and rely on another solution for executing dependency and transitive dependency scripts. These won't define "dependencies" in wireit since that would result in multiple executions of the same scripts.
  • Packages that do not use wireit don't define files to fingerprint or dependencies to include transitively.
  • Packages with vanilla JS or private 'devDependencies' may not have a script to depend on. This can (probably?) be solved using a Wireit-only script that does nothing but define files and dependencies, but that feels like a hack.

Proposed Solution

Provide a new dependencyFiles option that treats the files included in dependencies and transitive dependencies as fingerprinting inputs. By default this should include dependencies, devDependencies, and peerDependencies since changes to any of these might result in changes to outputs. This option should have these settings:

  • true: Enables the feature.
  • exclude-dev: Enables the feature and ignores devDependencies.
  • false: Disables the feature (default).

Performance Considerations

While it would be easy to simply include all files, in the case of a symlink, there will be many files that we have no business treating as inputs. There are some steps we can take to fingerprint the smallest number of files possible:

  • Only include dependencies that don't cause package lock file changes. We can check for file: dependencies as well as patterns used by workspaces in currently supported package managers.
  • Only fingerprint files included by NPM.

Benefits

Ultimately the goal here is to provide a way for wireit to be used with an individual package in isolation. By treating dependency files as inputs we can rely on wireit to handle caching and watching for a package regardless of whether or not its dependencies are using it too.

In my particular use-case, I am using pnpm and want to rely on --filter to provide a consistent syntax for developers to run commands in our monorepo. Using Wireit's "dependencies" means having various side-effects when using --filter on scripts that use it. Including dependency outputs as package inputs means that each one can be built independently and we can rely on our own tooling to handle hierarchical tasks.

This doesn't seem incompatible with the existing "dependencies" functionality either since downstream file changes would happen prior to the generation of a new fingerprint upstream. It will also play nicely with --watch since downstream changes would automatically cascade.

Alternatives Considered

Add node_modules to "files"

You can technically already get something kind of like this by pairing the "allowUsuallyExcludedPaths" setting with explicit node_modules globs. The caveat here is that you'll need to be very explicit in order to minimize unnecessary inputs. By relying on package.json files to read dependencies we will avoid the headaches and pitfalls associated with manually maintaining a list like this. You can avoid these pitfalls with a custom tool but I believe this feature makes more sense to implement directly in Wireit.

@ObliviousHarmony
Copy link
Contributor Author

If this is something that would be a good fit for Wireit I wouldn't mind taking a crack at implementing it!

@aomarks
Copy link
Member

aomarks commented Oct 26, 2023

So am I understanding correctly?

You have package A, which depends on package B. Package A uses wireit, but package B does not. You want A to re-run whenever B changes.

I think the first solution I would reach for in this situation is to just include the relevant files of B in the "files" array for A. To keep things organized, you could create a "files-only" target and use that as a dependency:

{
  "name": "a",
  "wireit": {
    "build": {
      "files": [
        "lib/**/*.js"
      ],
      "dependencies": [
        "b:files"
      ]
    },
    "b:files": {
      "files": [
        "../b/lib/**/*.js"
      ]
    }
  }
}

Would something like that be a good solution for your use case?

@ObliviousHarmony
Copy link
Contributor Author

The case of a package using wireit while another does not was just another one that I noticed when I was writing this proposal. In my specific use-case I have a number of packages with this kind of config:

{
  "scripts": {
    "build": "pnpm --filter=\"...$npm_package_name\" run build:project",
    "build:project": "wireit"
  },
  "wireit": {
    "build:project": {
      "command": "tsc --project tsconfig.json",
      "files": [
        "tsconfig.json",
        "src/**/*.{js,jsx,ts,tsx}",
        "typings/**/*.ts"
      ],
      "output": [
        "dist/**"
      ]
    }
  }
}

When I run pnpm build for this package it will run build in all of the dependent packages first (with concurrency) and then the top-level package. wireit will make sure that any changed packages are updated, but without additional configuration, these changes won't cascade. Although all of these packages do use wireit, if we used the "dependencies" feature, the build script would double-cascade when using --filter.

Would something like that be a good solution for your use case?

This is an interesting solution and I think it would work if also paired with one for #23 that automatically reads workspace dependencies. Assuming that each dependency had a wireit-only "files" script like this, a configuration that depended on all of its workspace "files" would have the same result. This could then be depended on by the "build" script like in your example and cascade file changes but not actual scripts.

@ObliviousHarmony
Copy link
Contributor Author

For the use-case of a non-wireit dependency, however, you'd also need to define all of the transitive dependencies too and honestly it sounds like a bit of a mess. Maintaining lists of dependencies by hand separately from the typical NPM dependencies seems a little error-prone and likely to result in accidental drift.

@ObliviousHarmony
Copy link
Contributor Author

One thing I do like about this proposal though is that it feels more intuitive. Package lock files are included in the fingerprint because we are acknowledging that changes in dependencies should invalidate the cache. Symlinked dependencies aren't included in that fingerprint but should also invalidate the cache. Since only the output files have any bearing on the current package, including those in the fingerprint just seems natural.

@ObliviousHarmony
Copy link
Contributor Author

I implemented this in our monorepo using a.pnpmfile.cjs hook file. It grabs all of the workspace dependencies for each project and constructs a config that looks like this:

"wireit": {
	"dependencyOutputs": {
		"allowUsuallyExcludedPaths": true,
		"files": [
			"node_modules/@woocommerce/eslint-plugin/configs",
			"node_modules/@woocommerce/eslint-plugin/rules",
			"node_modules/@woocommerce/eslint-plugin/index.js",
			"package.json"
		]
	}
}

It's definitely not as pretty as a native solution in Wireit could be but It does feel intuitive when paired with a different solution for task orchestration. We can use Wireit alongside PNPM's --filter option without having to worry about anything and it supports Wireit's full feature set. This ends up working out better in that we don't have to try and get folks to remember some commands can use --filter while others cannot.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants