Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add triggersRerun setting to control fingerprint inheritance #238

Merged
merged 17 commits into from
Nov 7, 2022
Merged

Conversation

aomarks
Copy link
Member

@aomarks aomarks commented May 16, 2022

Background

Currently, if script A depends on script B, then script B's fingerprint is automatically included in script A's fingerprint. That means if an input file for script B changes, which causes B to re-run, then script A will re-run too — regardless of whether script B actually emitted different output.

This is a good and safe default because it means you aren't required to specify the input files for every script when those input files are generated by a dependency; Wireit assumes that any time script B runs, script A could be affected. However, it comes with the trade-off that scripts will sometimes re-run even when none of their input files changed.

This PR

This PR adds a "triggersRerun": false setting that can be annotated on dependencies. This prevents the fingerprint from being inherited, so running B won't necessarily cause A to run as well.

This can be used for more optimal builds via fewer re-runs, but requires that input files are always fully specified.

This will also be very useful for service mode, because a triggersRerun:false dependency won't require a restart across watch iterations. (Restarts will happen only if the fingerprint is different between iterations). For example, the script that builds the service itself can be a regular dependency (causing restarts), but the script that builds the assets served by the script can be a triggersRerun:false dependency (not causing restarts).

Note in order to have a place to put this annotation, dependencies can now be objects. They can still be plain strings as well, but must be objects to receive annotations. (This also opens the door for other annotations on dependency edges we may want in the future. E.g. #23 is probably better expressed as a "workspaces": true annotation instead of a magic $WORKSPACES: prefix as previously planned).

Fixes #237

Example

In this example, bundle has a triggersRerune:false dependency on build. Importantly, it also includes lib/**/*.js in its files array, which are the specific outputs from tsc that rollup consumes. Including these input files wasn't necessary before, but with triggersRerun:false it is now critical.

The advantage of this configuration is that if tsc re-runs but doesn't produce different .js files (for example, if it only produced different .d.ts files), then rollup won't need to re-run. We've also excluded the lib/test directory, because we know test files aren't included in our bundles.

{
  "scripts": {
    "build": "wireit",
    "bundle": "wireit"
  },
  "wireit": {
    "build": {
      "command": "tsc",
      "files": ["src/**/*.ts", "tsconfig.json"],
      "output": ["lib/**"]
    },
    "bundle": {
      "command": "rollup -c",
      "dependencies": [
        {
          "script": "build",
          "triggersRerun": false
        }
      ],
      "files": ["rollup.config.json", "lib/**/*.js", "!lib/test"],
      "output": ["dist/bundle.js"]
    }
  }
}

src/analyzer.ts Outdated Show resolved Hide resolved
src/analyzer.ts Outdated
}
} else if (maybeUnresolved.type === 'object') {
specifierResult = findNodeAtLocation(maybeUnresolved, ['script']);
if (specifierResult == null) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming this is specifically == to catch both undefined and null?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I copied this style from the other code @rictic wrote in this file. I have been in the habit of using === undefined when we know we don't need to handle null (a habit I learned from @justinfagnani) -- but maybe @rictic has a reason to use == null here?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I use == null to signify that either we do need to handle both (good), or that I'm not sure (bad - and I leave a TODO).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switched everything to check === undefined when the type indicated that it could not be null (which is almost everything).

Copy link
Member

@rictic rictic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a jump to definition and a codeaction test in an object dependency

@@ -14,33 +14,57 @@
},
"command": {
"markdownDescription": "The command to run.\n\nThis is a shell command that will be executed, with all binaries from npm dependencies and devDependencies available.\n\nFor example:\n\n```json\n\"command\": \"tsc\"\n```\n\nFor more info, see https://docs.npmjs.com/cli/v8/using-npm/scripts#environment",
"type": "string"
"type": "string",
"minLength": 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh nice

schema.json Outdated
"minLength": 1
},
"soft": {
"markdownDescription": "If `false` (the default), the cache key of the dependency is automatically included in this script's cache key. This means that whenever the dependency re-runs, this script will re-run too, even if the output produced by the dependency didn't change.\n\nIf `true`, Wireit won't assume that this script needs to re-run just because the dependency re-ran. Instead, the dependency will still run first and be kept up-to-date, but whether this script runs is entirely determined by `files`. Be sure that any input files this script needs from the dependency are specified in `files`.\n\nFor more info, see https://github.com/google/wireit#soft-dependencies",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docs on hover should lead with the operationally useful information first, then give more background in subsequent paragraphs, or link to the docs site.

What do you think about this for an opening paragraph:

When `false` (the default), whenever this dependency runs, this script 
(the dependent) will be marked stale and need to re-run too. When soft is 
`true` Wireit won't assume that the dependent is stale just 
because the dependency ran. This can reduce unnecessary re-building 
when `files` captures all of the relevant output of the dependency.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, I like the way this is phrased. Also updated the README docs more along these lines. PTAL.

schema.json Outdated
},
"soft": {
"markdownDescription": "If `false` (the default), the cache key of the dependency is automatically included in this script's cache key. This means that whenever the dependency re-runs, this script will re-run too, even if the output produced by the dependency didn't change.\n\nIf `true`, Wireit won't assume that this script needs to re-run just because the dependency re-ran. Instead, the dependency will still run first and be kept up-to-date, but whether this script runs is entirely determined by `files`. Be sure that any input files this script needs from the dependency are specified in `files`.\n\nFor more info, see https://github.com/google/wireit#soft-dependencies",
"enum": [true, false]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason to prefer enum over boolean? IDK either way, just curious

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No reason, I just copied it from the "enum": [true, false, "if-file-deleted"] that was already here. boolean makes sense though.

src/analyzer.ts Outdated Show resolved Hide resolved
src/script.ts Outdated
@@ -31,7 +31,8 @@ export interface ScriptReference extends PackageReference {

export interface Dependency<Config extends PotentiallyValidScriptConfig> {
config: Config;
astNode: JsonAstNode<string>;
specifier: JsonAstNode<string>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The term specifier is much clearer, +1

src/executor.ts Outdated
@@ -310,7 +310,7 @@ class ScriptExecution {
}

async #executeScript(
dependencyStates: Array<[ScriptReference, ScriptState]>
dependencyStates: Array<[Dependency<ScriptConfig>, ScriptState]>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On reflection, ScriptConfig should be the default value of the generic parameter to Dependency, and only potentially invalid dependencies should be specially marked

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 done

@aomarks
Copy link
Member Author

aomarks commented May 17, 2022

In a discussion @rictic and I had yesterday, we mulled over names that might be better than "soft" for this feature.

We considered some names that tried to encode more meaning, things like:

  • output-in-input-files: true
  • inherits-fingerprint: false
  • always-runs: false

But we had trouble thinking of one that felt really clear. One way these names can be confusing is that it can be ambiguous what the directionality is (e.g. is it the dependency that always runs, or the dependee?)

I think we agreed that there might actually be a benefit to using a fuzzy term like "soft", because it basically requires the user to look at the actual definition (in the docs or by hovering if they have the vscode extension), rather than giving them a false impression. Hopefully it is then memorable.

@rictic suggested "weak" as an alternative to "soft". I prefer "weak" as well.

Any other thoughts from reviewers here?

@aomarks
Copy link
Member Author

aomarks commented May 17, 2022

Please add a jump to definition and a codeaction test in an object dependency

Added a test, but the originSelection squiggles are off by 2 spaces, and I haven't figured out why. Does anything jump out to you as to why, @rictic (see latest commit)?

2 spaces matches the indentation level, so I'm guessing it's something to do with finding the range for the parent object, instead of the string itself. But I can't see why, because we're getting the range for dependency.specifier, which is the AST node for the string, regardless of whether it was inside an object.

@aomarks aomarks requested a review from rictic May 17, 2022 16:17
~~~`,
// TODO(aomarks) The ~~~ is 2 spaces ahead of where it should be.
originSelection: `
"script": "b"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I bet I know what this is. The comparison that we use ignores leading and trailing whitespace. It probably should just ignore leading newlines. I bet if you add a couple of spaces to this line it will fix it

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah that was it, done.

@rictic
Copy link
Member

rictic commented May 17, 2022

We might want to release a new version of the extension before we release wireit, so that by the time people update wireit they'll probably already have the version extension that supports object deps

@rictic
Copy link
Member

rictic commented May 17, 2022

Ultimately, I think that soft is a little better than weak. Weak in the sense of WeakMap or WeakRef suggests to me that the dependency might not run, rather than the dependent.

Soft doesn't have that association so I think it's slightly better.

@augustjk
Copy link
Collaborator

I like soft over weak for reasons @rictic mentioned above.

Something that occurred to me though, would there be any benefit to having a script level option that treats all of its dependency as soft? That would be something like always-use-files-for-freshness or file-only-fingerprint kind of thing?

Since putting even just one dependency that is soft means the user must be sure to specify a comprehensive files input for the script, I wonder if it makes more sense that it's at the script level, rather than per dependency.

@aomarks
Copy link
Member Author

aomarks commented May 17, 2022

Something that occurred to me though, would there be any benefit to having a script level option that treats all of its dependency as soft? That would be something like always-use-files-for-freshness or file-only-fingerprint kind of thing?

Since putting even just one dependency that is soft means the user must be sure to specify a comprehensive files input for the script, I wonder if it makes more sense that it's at the script level, rather than per dependency.

If a dependency is soft, you only need to be sure to specify the input files you care about from that one dependency. And there are legitimate cases where you want some dependencies soft, and some not.

I think allowing you to set it for a whole script could be a little dangerous, because you might set that flag, and then later add a dependency without actively considering whether it's safe for that dependency to be soft. With it being explicit per dependency, you're more likely to think about it on a per-dependency basis.

src/analyzer.ts Outdated
}
} else if (maybeUnresolved.type === 'object') {
specifierResult = findNodeAtLocation(maybeUnresolved, ['script']);
if (specifierResult == null) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I use == null to signify that either we do need to handle both (good), or that I'm not sure (bad - and I leave a TODO).

src/executor.ts Outdated
): Promise<ScriptState> {
let allDependenciesAreCacheable = true;
const filteredDependencyStates: Array<
[ScriptReferenceString, ScriptState]
> = [];
for (const [dep, depState] of dependencyStates) {
if (dep.soft) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@aomarks aomarks changed the title Add soft dependencies, which don't inherit cache keys Add soft dependencies, which don't inherit fingerprints May 17, 2022
@aomarks aomarks enabled auto-merge (squash) May 17, 2022 20:57
@aomarks aomarks disabled auto-merge May 17, 2022 20:58
@aomarks
Copy link
Member Author

aomarks commented May 17, 2022

I had a discussion with @justinfagnani about this change, and I've written up a draft proposal for an alternative way to think about achieving the same ends, but in a way that might be more intuitive and useful: #245

@aomarks aomarks changed the title Add soft dependencies, which don't inherit fingerprints Add rerunOnChange setting to control fingerprint inheritance Nov 6, 2022
@aomarks aomarks changed the title Add rerunOnChange setting to control fingerprint inheritance Add triggersRerun setting to control fingerprint inheritance Nov 6, 2022
@aomarks
Copy link
Member Author

aomarks commented Nov 6, 2022

This PR is ready for another review.

  • I've merged in the new services changes. For services, this setting controls whether the service needs to restart or not in watch mode when that particular dependency changes.

  • Added tests and documentation about how this setting interacts with services.

  • I've renamed soft:true to triggersRerun:false. Although it's more verbose, I think it gives a better impression of what the setting does. I think it also works for both standard and service scripts.

    I would still like to pursue the idea of "output slices" described in Proposal: changes to output and fingerprinting #245 later, but I think we can still land this in the meantime. If we do add output slices, then that will be an additional, more sugary way to express the same thing for some use cases.

    However, I think this setting will still have a place -- [1] for services that read output dynamically (where it's not a question of which output you consume, it's more about whether the binary reads it at startup time or request time), and [2] because maybe sometimes it won't make sense to have your dependency declare an output slice (maybe it's just weirdly specific, or maybe you don't own that other script).

cc @justinfagnani

@aomarks aomarks requested a review from justinfagnani November 7, 2022 22:14
@aomarks aomarks merged commit 98e025e into main Nov 7, 2022
@aomarks aomarks deleted the soft branch November 7, 2022 23:40
@aomarks aomarks mentioned this pull request Nov 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Option to stop inheritance of cache keys from dependencies
4 participants