-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add @source not
support
#17255
Open
RobinMalfait
wants to merge
17
commits into
main
Choose a base branch
from
feat/source-not
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Add @source not
support
#17255
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
41102c9
to
39588cf
Compare
d3700cc
to
41e1ac2
Compare
This is a chunky commit which includes a refactor and all the necessary changes. Sorry that it's a bit confusing to read because so many things changed. The big picture changes: 1. We were dealing with `GlobEntry` enums, but we split this up to better model what's going on. - `GlobEntry`: literally represents a `Glob`. We still return these so our frontends like `@tailwindcss/postcss` can still use them. - `SourceEntry`: in the `node` crate, a `SourceEntry` is what used to be a `GlobEntry` but also has a negated flag now. This represents a `@source not '…'` for example. This will map to a `PublicSourceEntry` inside of the `oxide` crate. - `PublicSourceEntry`: in the `oxide` crate, this is exactly the same structure as the `SourceEntry` in the `node` crate. - `SourceEntry`: this is an enum that better represents the domain we are dealing with.: ```rs pub enum SourceEntry { Auto { base: PathBuf }, IgnoredAuto { base: PathBuf }, Pattern { base: PathBuf, pattern: String }, IgnoredPattern { base: PathBuf, pattern: String }, } ``` 2. One big realisation we had is that auto source detection and the ignored files/extensions/... _basically_ map to a giant set of `.gitignore` rules. This is also how we modeled it now. The `Scanner` will get a set of `sources` (`Vec<SourceEntry>`) and generates the necessary `.gitignore` rules. One nice benefit with this approach is that it unlocks a few features for free: 1. No need to duplicate the rules in several spots anymore (scanning files initially, scanning files in subsequent calls, scanning specific files via `scanFiles()`). They all use the same "rules". 2. No need to potentially do multiple file system walks anymore. All the rules are setup once, and we will be able to walk over the files and folders that adhere to the rules. Bonus points: _if_ you reach a folder that we know with 100% certainty is ignored by the gitignore rules, then we can skip that folder entirely. 3. `@source` order matters, but it allows you to explicitly ignore an allowed file, or explicitly allow an ignored file. 3. We split the internal `Scanner` API, it essentially looks like this: 1. The `Scanner::new(sources)` sets up the world and creates a `walk_builder` (the big engine behind all of this) 2. `Scanner.scan_sources(…)` — the only thing it does is use the walk_builder and stores information based on the info we found: 1. Collect `dirs` 2. Collect `files` 3. Collect `extensions` 3. `Scanner.scan_content` — this is what the `@tailwindcss/cli` calls via `scanner.scanFiles(…)`. This also just registers the changed files and content into the `scanner`. For `ChangedContent::File(_)` we verify that the file is allowed by all the rules mentioned above. 4. `Scanner.extract_candidates` — essentially take whatever is stored in `self.changed_content` and parse these files by extracting all the candidates. This is used by `scan_sources` and `scan_content`. Co-authored-by: Philipp Spiess <hello@philippspiess.com>
Essentially we used to combine files and globs into a shared `paths`. Now these are split again, and instead of using tuple syntax you can get them from a `ScanResult`. Additionally, we will _not_ add `**/*` (Auto Source Detection) by default, you have to be explicit about this.
The gist of it is that we now are dealing with `SourceEntry` instead of `GlobEntry`. A `SourceEntry` in the `node` crate maps to a `PublicSourceEntry`. This makes it easy for JS APIs to pass in the values. A `PublicSourceEntry` looks like: - `base` — the base path of the source - `pattern` — the glob pattern - `negated` — whether or not the pattern should be negated Internally we will map this to a `SourceEntry`, this is a proper enum that looks like this: ```rs pub enum SourceEntry { Auto { base: PathBuf }, IgnoredAuto { base: PathBuf }, Pattern { base: PathBuf, pattern: String }, IgnoredPattern { base: PathBuf, pattern: String }, } ``` Before we construct a Scanner, we will also make sure to optimize these patterns. Some optimization steps: 1. Each `pattern` will be brace-expanded and a new `SourceEntry` will be created. This allows us to always deal with simple patterns. 2. The `base` of each `SourceEntry` will be canonicalized so we are dealing with the real paths and symlinks are resolved. 3. When patterns include static parts such as `/src/*.html`, then the static part (in this case `src`) will be moved to the `base`.
This isn't strictly necessary for this PR, but noticed missing values when updating the `globs` -> `sources` and made TypeScript happy by adding this.
These tests will reflect 2 big changes: 1. New tests in general 2. Tests where we used auto source detection and `.gitignore` files, will now favor the `@source '…'` rules over the `.gitignore` rules. Note: If you _don't_ want this behavior, then you can use `@source not '…'` to override rules.
We still emit `globs` for `@tailwindcss/postcss`, but we also added better normalized globs based on your `@source` directives. We returned them as a new property to stay backward compatible, but this also means that the `@tailwindcss/cli` can make use of this.
Some people don't have `node_modules` in the `.gitignore` file so in some scenario's this will always be scanned. Another example is if you use the Vercel CLI to deploy your project because `.gitignore` files are not pushed... This is technically a breaking change, but now that we have `@source not '…'` you can include files/folders from `node_modules` even though the folder is ignored by default.
d1d31ef
to
97f9505
Compare
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This PR adds a new source detection feature:
@source not "…"
. It can be used to exclude files specifically from your source configuration without having to think about creating a rule that matches all but the requested file:While working on this feature, we noticed that there are multiple places with different heuristics we used to scan the file system. These are:
@source "./my-dir"
)@source "./**/*.bin"
)Because of the different heuristics, we were able to construct failing cases (e.g. when you create a new file into
my-dir
that would be thrown out by auto-source detection, it'd would actually be scanned). We were also leaving a lot of performance on the table as the file system is traversed multiple times for certain problems.To resolve these issues, we're now unifying all of these systems into one
ignore
crate walker setup. We also implemented features like auto-source-detection and thenot
flag as additional gitignore rules only, avoid the need for a lot of custom code needed to make decisions.High level, this is what happens after the now:
@source
rules into a list of roots (that is the source directory for this rule) and optional globs (that is the actual rules for files in this file). For custom sources (i.e with a customglob
), we add an allowlist rule to the gitignore setup, so that we can be sure these files are always included.@source
rule, we create respective ignore rules.So, consider the following setup:
This creates a git ignore file that (simplified) looks like this:
We then use this information on top of your existing
.gitignore
setup to resolve files (i.e so if your.gitignore
contains rules e.g.dist/
this line is going to be added before any of the rules lined out in the example above. This allows negative rules to allow-list your.gitignore
rules.To implement this, we're rely on the
ignore
crate but we had to make various changes, very specific, to it so we decided to fork the crate. All changes are prefixed with a// CHANGED:
block but here are the most-important ones:.gitignore
rulesBehavioral changes
@source
now wins over your.gitignore
file and the auto-content rules.node_modules
and.git
folders as well as the.gitignore
file are now ignored by default (but can be overridden by an explicit@source
rule).@source not "…"
to negate any previous rules.content
rules in your legacy JavaScript configuration (e.g.content: ['!./src']
) now work with v4.@source
definitions matter now, because you can technically include or negate previous rules. This is similar to your.gitingore
file.Combining with other features
Note that the
not
flag is also already compatible with@source inline(…)
added in an earlier commit:Next steps
We are not finished yet with our quest to improve the
@source
directive. We still plan to address #16765 and resolve symlink-related issues.Test plan
@source not "…"
specific examples and updated the existing tests to match the subtle behavior changes[ci-all]
that, when added to the description of a PR, causes the PR to run unit and integration tests on all operating systems.[ci-all]