Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix!: correctly detect if file is outside base path on Windows #59

Open
wants to merge 12 commits into
base: main
Choose a base branch
from

Conversation

fasttime
Copy link
Member

@fasttime fasttime commented Jun 17, 2024

Prerequisites checklist

What is the purpose of this pull request?

Fix logic that detects when a file is outside the base path so it works well with Windows-style paths. Also fixed an edge case with paths that start with ".." but are not external like "..file.js".

What changes did you make? (Give an overview)

Windows paths have some unique features not shared by Unix paths. For one, absolute local paths start with a drive letter followed by a colon like "C:". The fact that the drive letter must be the first character in a local path means that a relative path cannot specify a drive. In Node.js, path.relative() will just return the absolute path of the second argument when it cannot calculate a relative path across different drives, so for example with path.relative("C:\\dir", "D:\\file.js").

Besides local paths, Windows uses UNC paths in the form \\Server\Volume\dir1\dir2\file to specify files on a network. Those paths are always absolute.

Both config-array and ESLint use relative paths to check if a file is inside a base directory: if the file's relative path from the directory starts with "..", then the file is not in the directory. This check is currently failing when the base directory and the file were located on different drives or servers, because as previously mentioned path.relative() doesn't handle those cases.

The fix I've come up with consists in calculating what Node.js calls namespace-prefixed paths of the base directory and the file and calling path.relative() on those two. Namespaced paths are prefixed with \\?\ for local paths or with \\?\UNC for network paths. When the provided arguments are namespaced paths on different drives, path.relative() will return a path in the form ..\..\C:\dir1\dir2\file.js. While such a path is not valid as it contains a drive letter in middle position, it can be readily recognized as for being outside the base path because of the leading .. and discarded as external.

Related Issues

refs eslint/eslint#18575

Is there anything you'd like reviewers to focus on?

Instead of using toNamespacedPath, we could manually parse the absolute path, normalize the drive letter or server specification and compare the path components to the corresponding values of the base path. This would make the check more explicit and possibly easier to maintain. The downside is that we would end up reinventing (part of) the wheel.

@eslint-github-bot eslint-github-bot bot added the bug Something isn't working label Jun 17, 2024
@fasttime fasttime changed the title fix: correctly detect is file is outside base path on Windows fix: correctly detect if file is outside base path on Windows Jun 17, 2024
eslint.config.js Outdated Show resolved Hide resolved
Comment on lines -355 to -358
if (relativeFilePath.startsWith("..")) {
return true;
}
Copy link
Member Author

@fasttime fasttime Jun 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check is already performed in getConfigWithStatus().


//-----------------------------------------------------------------------------
// Helpers
//-----------------------------------------------------------------------------

// calculate base path using import.meta
const basePath = path.dirname(new URL(import.meta.url).pathname);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On Windows, this was giving a path like /c:/foo/bar/rewrite/packages/config-array/tests, where the c: is considered a top-level directory rather than a drive specifier. This was producing incorrect results with path.resolve(basePath) like C:\c:\foo\bar\rewrite\packages\config-array\tests.

@@ -2860,6 +3010,30 @@ describe("ConfigArray", () => {
);
});

it("should return true when the directory is the parent of the base path", () => {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to ensure that a relative path like ".." without a slash is recognized as external.

packages/config-array/src/config-array.js Outdated Show resolved Hide resolved
packages/config-array/src/config-array.js Outdated Show resolved Hide resolved
packages/config-array/src/config-array.js Outdated Show resolved Hide resolved
@fasttime fasttime marked this pull request as ready for review June 23, 2024 20:18
packages/config-array/src/config-array.js Outdated Show resolved Hide resolved
packages/config-array/src/config-array.js Outdated Show resolved Hide resolved
packages/config-array/src/to-namespaced-path.js Outdated Show resolved Hide resolved
}

// On non-Windows systems, `toNamespacedPath` returns the argument as is.
export default sep === "\\" ? win32ToNamespacedPath : arg => arg;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem like a safe check. Again, because we aren't going to know what format a string is in when it's passed in a browser environment, I don't think we can assume path.sep is accurate for the data we'll be getting.

It seems safer to check for the first instance of \ and / in the string? Something like:

const isWinPath = Boolean(filePath.indexOf("/") - file.Path.indexOf("\\"));

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The input file path could be relative, so there may not be enough indication to determine if it's a Unix or a Windows path. Another complication is that path.relative and path.dirname have different implementations for Unix and Windows (the path separator / vs. \ is not the only difference). These functions will return different results depending on the platform where Node.js runs even for the same input.

If ConfigArray needs to work with both Windows and Unix paths on any platform including browsers, and not depend on cwd at all, it would be better to avoid using node:path functions to manipulate paths and implement the whole logic internally.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If ConfigArray needs to work with both Windows and Unix paths on any platform including browsers, and not depend on cwd at all, it would be better to avoid using node:path functions to manipulate paths and implement the whole logic internally.

This was actually the intent. Originally, the only reference to path I had was path.relative, which I intended to replace with an inline function in order to avoid that dependency. My goal going forward was to make sure we weren't starting to use even more of path to ensure everything would work in browsers correctly.

Whatever we do, I just don't want to add even deeper dependency on the path module. If we need to rethink things a bit, then let's do that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've started working on generic path handling helpers to handle both Unix and Windows style paths regardless of the current platform. I'm planning to show a prototype in this PR to get some feedback.

@nzakas
Copy link
Member

nzakas commented Aug 8, 2024

We may want to consider using the Deno standard path library, which is not runtime or OS specific:
https://jsr.io/@std/path

@fasttime
Copy link
Member Author

We may want to consider using the Deno standard path library, which is not runtime or OS specific: https://jsr.io/@std/path

It looks like that library contains different implementations for windows and posix, and by default, it selects the implementation for the current platform. That is similar to how Node.js node:path works.

It would be useful to find a library that can be installed via npm so we don't need to copy the source files manually.

@nzakas
Copy link
Member

nzakas commented Aug 13, 2024

npx jsr add @std/path works. 😄

@fasttime
Copy link
Member Author

Finally made it work as expected using only the Deno standard library for path handling and without considering the current directory. This fixes the way external files are matched on Windows by including drive letters or UNC server names when resolving relative paths.

An important note: external files are files located outside the base path, but the base path is an optional parameter. If no base path is specified, files will never be considered external. This changes the current behavior where the base path defaults to be the current directory if not specified (in ESLint, the base path is always specified).

@fasttime fasttime requested a review from nzakas August 19, 2024 08:14
@nzakas
Copy link
Member

nzakas commented Aug 20, 2024

An important note: external files are files located outside the base path, but the base path is an optional parameter. If no base path is specified, files will never be considered external. This changes the current behavior where the base path defaults to be the current directory if not specified (in ESLint, the base path is always specified).

This is actually the way I thought it would work when I first implemented it. 😄 I don't think this will affect users because we always pass basePath, but we may want to mark this as breaking for this package.

nzakas
nzakas previously approved these changes Aug 20, 2024
Copy link
Member

@nzakas nzakas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. As I mentioned, I think we should mark this as breaking.

Would like @mdjermanovic to review before merging.

@mdjermanovic mdjermanovic changed the title fix: correctly detect if file is outside base path on Windows fix!: correctly detect if file is outside base path on Windows Aug 25, 2024
@@ -1 +1,2 @@
package-lock = false
@jsr:registry=https://npm.jsr.io
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would end users need to add this to their .npmrc files?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tried using npm pack-ed version of config-array from this branch in eslint/eslint, and npm install fails with:

$ npm i
npm ERR! code E404
npm ERR! 404 Not Found - GET https://registry.npmjs.org/@jsr%2fstd__path - Not found
npm ERR! 404
npm ERR! 404  '@jsr/std__path@^1.0.2' is not in this registry.
npm ERR! 404
npm ERR! 404 Note that you can also install from a
npm ERR! 404 tarball, folder, http url, or git url.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we can expect users to change their npm settings so that they can download packages from JSR. We could distribute @std/path in a separate directory inside config-array and import it from there. Or we could use the URL of the tarball from JSR to specify the dependecy in package.json:

-    "@jsr/std__path": "^1.0.2",
+    "@jsr/std__path": "https://npm.jsr.io/~/11/@jsr/std__path/1.0.2.tgz",

Either way, we will lose the magic of the caret ^ to pick newer versions automatically. I checked the JSR docs at https://jsr.io/docs/using-packages but I didn't find any recommendations on how to use their packages with npm apart from registering JSR in .npmrc.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can one of you open an issue on JSR for this? I can't imagine they didn't encounter this before, but before we spin our wheels, it would be nice to verify if we're doing things the right way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opened a discussion here: jsr-io/jsr#701

@nzakas
Copy link
Member

nzakas commented Sep 3, 2024

I'd say let's just use the URL for now. We can always update the approach if/when they respond to your question.

@fasttime
Copy link
Member Author

fasttime commented Sep 4, 2024

The build is now failing because the generated CommonJS module is trying to require() @jsr/std__path, which is an ESM-only package. In packages/config-array/dist/cjs/index.cjs I can see:

var posixPath = require('@jsr/std__path/posix');
var windowsPath = require('@jsr/std__path/windows');

I'm not sure why there was no build error in my previous commit. The error doesn't seem connected to the way @jsr/std__path is declared in package.json. Maybe we need to bundle @jsr/std__path in the CommonJS module so it gets transpiled.

@mdjermanovic
Copy link
Member

Maybe we need to bundle @jsr/std__path in the CommonJS module so it gets transpiled.

If we're going to bundle @jsr/std__path in CJS, then we could bundle it in ESM as well, to avoid having a URL depedency (which I think is not a common practice for packages published to npm).

@fasttime
Copy link
Member Author

fasttime commented Sep 8, 2024

Maybe we need to bundle @jsr/std__path in the CommonJS module so it gets transpiled.

If we're going to bundle @jsr/std__path in CJS, then we could bundle it in ESM as well, to avoid having a URL depedency (which I think is not a common practice for packages published to npm).

Thanks! I tried to bundle @jsr/std__path in dist/cjs/index.cjs and dist/esm/index.js, but then tsc started to complain because of the checkJs option in tsconfig.json:

dist/esm/index.js:316:46 - error TS1313: The body of an 'if' statement cannot be the empty statement.

316       if (lastSlash === i - 1 || dots === 1) ; else if (lastSlash !== i - 1 && dots === 2) {
                                                 ~

dist/esm/index.js:516:15 - error TS2339: Property 'Deno' does not exist on type 'typeof globalThis'.

516       const { Deno } = globalThis;
                  ~~~~

dist/esm/index.js:1148:5 - error TS2353: Object literal may only specify known properties, and 'extended' does not exist in type '{ globstar?: boolean; }'.

1148     extended,
         ~~~~~~~~

dist/esm/index.js:1590:46 - error TS1313: The body of an 'if' statement cannot be the empty statement.

1590       if (lastSlash === i - 1 || dots === 1) ; else if (lastSlash !== i - 1 && dots === 2) {
                                                  ~

dist/esm/index.js:1970:13 - error TS2339: Property 'Deno' does not exist on type 'typeof globalThis'.

1970     const { Deno } = globalThis;
                 ~~~~

dist/esm/index.js:2742:5 - error TS2353: Object literal may only specify known properties, and 'extended' does not exist in type '{ globstar?: boolean; }'.

2742     extended,
         ~~~~~~~~


Found 6 errors in the same file, starting at: dist/esm/index.js:316

So I decided to bundle @jsr/std__path/posix and @jsr/std__path/windows in separate modules that tsc would not check. This seems to be working well and results in the following files being generated:

dist/esm/types.d.ts
dist/esm/index.js
dist/esm/types.ts
dist/esm/std__path/posix.js
dist/esm/std__path/windows.js
dist/esm/index.d.ts
dist/cjs/index.d.cts
dist/cjs/types.ts
dist/cjs/index.cjs
dist/cjs/std__path/posix.cjs
dist/cjs/std__path/windows.cjs

I will update this PR shortly with the new solution, and hopefully the CI build will pass this time.

@@ -133,7 +134,7 @@ describe("@eslint/backcompat", () => {
const linter = new Linter();
const code = "var foo = () => 123; function bar() { return 123; }";
const messages = linter.verify(code, config, {
filename: "test.js",
filename: path.resolve("test.js"),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the filename parameter is specified, it will be passed as-is to configArray.getConfig(), and getConfig() expects a full path:

* @param {string} filePath The complete path of a file to get a config for.

@fasttime
Copy link
Member Author

fasttime commented Sep 8, 2024

Checks are all green now, but I found another issue. In ESLint, when a Linter method like verify() or verifyAndFix() is called without a filename parameter, configArray.getConfig() gets called with argument "__placeholder__" instead of a full path. And if filename is a relative path, the relative path will be passed unchanged. Here is the relevant code (https://github.com/eslint/eslint/blob/343f99216096f1db955766870e35d92d5a121448/lib/linter/linter.js#L2010-L2014):

        const filename = options.filename || "__placeholder__.js";

        // Store the config array in order to get plugin envs and rules later.
        internalSlotsMap.get(this).lastConfigArray = configArray;
        const config = configArray.getConfig(filename);

Relative paths will no longer work when this PR is merged, and I don't think everybody is using the Linter with full paths, because that isn't currently necessary. So we should discuss how to update the Linter code as well when we upgrade config-array in ESLint.

@nzakas
Copy link
Member

nzakas commented Sep 9, 2024

To make sure I'm understanding correctly: relative paths will no longer work when passed to Linter because they would previously be resolved relative to process.cwd()?

@fasttime
Copy link
Member Author

fasttime commented Sep 9, 2024

To make sure I'm understanding correctly: relative paths will no longer work when passed to Linter because they would previously be resolved relative to process.cwd()?

Correct. path.relative() uses the current directory to resolve relative paths. We are using that logic in multiple places in packages/config-array/src/config-array.js.

@nzakas
Copy link
Member

nzakas commented Sep 10, 2024

And again, just for clarity, the path.relative() we're using in this PR does not use process.cwd()?

@fasttime
Copy link
Member Author

fasttime commented Sep 11, 2024

And again, just for clarity, the path.relative() we're using in this PR does not use process.cwd()?

I see the point. path.relative() in @std/path calls Deno.cwd() to resolve relative paths. The Deno global only exists in Deno (I think), and when it's not defined, relative paths are not resolved and instead an error is thrown. This is the relevant code:

      if (typeof Deno?.cwd !== "function") {
        throw new TypeError("Resolved a drive-letter-less path without a current working directory (CWD)");
      }
      path = Deno.cwd();
    } else {
      if (typeof Deno?.env?.get !== "function" || typeof Deno?.cwd !== "function") {
        throw new TypeError("Resolved a relative path without a current working directory (CWD)");
      }

Which is compiled from this source: https://github.com/denoland/std/blob/f1d3885994953eae8a843fb658952c9a6e6e2001/path/windows/resolve.ts#L36-L48.

The error messages "Resolved a drive-letter-less path without a current working directory (CWD)" and "Resolved a relative path without a current working directory (CWD)" is what a user of config-array would get to see if they pass in a relative path.

For a better error message, and also to stay safe in case the Deno global is defined for some reason, we could validate path arguments preemptively in config-array to make sure that they are absolute paths.

@nzakas
Copy link
Member

nzakas commented Sep 11, 2024

For a better error message, and also to stay safe in case the Deno global is defined for some reason, we could validate path arguments preemptively in config-array to make sure that they are absolute paths.

That seems like a good idea...although how would we end up passing an absolute path when ESLint#lintText() is called and we use a placeholder? Maybe the placeholder should be /__placeholder__.js instead?

@fasttime
Copy link
Member Author

I've added logic to throw an error when a path is not absolute in a783ba1.

@fasttime
Copy link
Member Author

how would we end up passing an absolute path when ESLint#lintText() is called and we use a placeholder? Maybe the placeholder should be /__placeholder__.js instead?

In order to keep the current behavior unchanged we could resolve the filename passed to getConfig() using path.resolve():

- const config = configArray.getConfig(filename);
+ const config = configArray.getConfig(path.resolve(filename));

Also, we should manually resolve the cwd option used to create a Linter, because cwd is used as a basePath for a ConfigArray, and we don't want to throw an error when a relative path is passed:

function normalizeCwd(cwd) {
    if (cwd) {
-       return cwd;
+       return path.resolve(cwd);
    }
    if (typeof process === "object") {
        return process.cwd();
    }

I tried it and with those two changes, all Linter tests were passing. I also had to update RuleTester to make sure that linter.verify() was being called with a full path, and fixed a couple of expected paths in the unit tests. And at that point, all tests passed, so I'm confident that this solution will work.

We could also change the way the Linter works so it doesn't depend on process.cwd() to resolve relative paths, or we could just ban relative paths in the cwd option so that process.cwd() is not used. Both would be breaking changes I think, but maybe that is where we want to go in ESLint v10?

@nzakas
Copy link
Member

nzakas commented Sep 16, 2024

Which file is that first code snippet from?

I don't think we want to create a dependency on path inside of Linter, as that's a Node-specific API.

Do we ever pass a relative cwd anywhere right now?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
accepted breaking bug Something isn't working
Projects
Status: Second Review Needed
Development

Successfully merging this pull request may close these issues.

3 participants