Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Local file path check error when starting with "file://..." #194

Closed
wants to merge 1 commit into from
Closed

fix: Local file path check error when starting with "file://..." #194

wants to merge 1 commit into from

Conversation

SEWeiTung
Copy link

@SEWeiTung SEWeiTung commented Jul 16, 2021

At L24 of your original code, we use something like fs.existsSync to check a file's existance or not:

const targetURL = new URL(node.url, currentFileURL);
if (targetURL.protocol === "file:" && !fs.existsSync(targetURL))

That's the BUG! Because 'currentFileURL' will be an absolute file pointing to a certain doc on the local disc, so the result should be something like:

WrongFileCheck

However, this will always return you false.

Futher more, a clear example is:
image

Just see what's output? If you put the file into this project to have a play with, you will find the 1st result is "true", while the next is "false".

# Quick solution:

Just check 'currentFileURL' is enough.

'targetURL' is made up of "file:///....", however it cannot find file
even the file exists locally. So a quick fixture for it is to just
check 'currentFileURL' instead of targetURL.
@SEWeiTung
Copy link
Author

@aduh95 & @Trott , Plz have a review :)

@SEWeiTung SEWeiTung changed the title fix: Local file path check fix: Local file path check error when starting with "file://..." Jul 16, 2021
@aduh95
Copy link
Contributor

aduh95 commented Jul 16, 2021

The code in your screen capture is buggy: fs.existsSync('file:///dev/null') is not intepreted as file: URL, but as a relative path (equivalent to fs.existsSync('./file:/dev/null')). Instead, you need to provide a URL instance: fs.existsSync(new URL('file:///dev/null')). The "best practice" is to use require('url').pathToFileURL(currentFilePath) which handles special URL char (if the file path contains % or what not).
(BTW, using a screen capture to share code is really not practical, I'd advise to paste the code in the comment body)

Copy link
Contributor

@aduh95 aduh95 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

currentFileURL is guaranteed to exist, it completely defeats the purpose of the check – which is to detect broken links (if targetUrl leads to a file that exists, it's not broken, and vice-versa).

@nschonni
Copy link
Member

Another approach would be to allow some re-writing of URLs like https://github.com/tcort/markdown-link-check does so, some patterns can be re-written prior to checking

@@ -21,7 +21,7 @@ function validateLinks(tree, vfile) {
for (const node of getLinksRecursively(tree)) {
if (node.url[0] !== "#") {
const targetURL = new URL(node.url, currentFileURL);
if (targetURL.protocol === "file:" && !fs.existsSync(targetURL)) {
Copy link
Author

@SEWeiTung SEWeiTung Jul 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aduh95: I find there's something interesting, maybe it will cause something bad:

If I use "console.log" to print the path out to check whether these path files are really not existing, code like this:

 if (targetURL.protocol === "file:" && !fs.existsSync(targetURL)) {
        console.log(`targetURL Not Exist: ${targetURL}`);   //Add this to print out
        vfile.message("Broken link", node);
      } 

And here's the result (some part of it):
image

This experiment is done locally on my PC of folked node.org project, however, the pdf files really exist:

image

There're also many files meeting with this problem.

You can have a try at node.org (local project on windows by adding the log), and then check the outputted result.

Copy link
Author

@SEWeiTung SEWeiTung Jul 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another problem is: We're usually referring the relative paths in our md file for each other, for example in the node.org project, we have a reference like this:

image

This link, will be generated to "http" by build.js in nodejs.org project, however it has a "file:///" as the URL at local disc, this means in remark-lint-nodejs-links.js, the actual path of the file would be:

image

Of course, it doens't exist, but it EXISTS really when deploying onto the server with an "https://nodejs.org/en/docs/guides/getting-started-guide/".

Copy link
Contributor

@aduh95 aduh95 Jul 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On nodejs/node repo, we forbid any path that starts with a /, we expect every path to be relative.
If nodejs/nodejs.org uses absolute paths, the way forward would be to add an option to specify what is the absolute base. The implementation would look something like that:

const targetURL = (options.absoluteBase != null && node.url[0] === "/" && node.url[1] !== '/') ?
  new URL('.' + node.url, options.absoluteBase) :
  new URL(node.url, currentFileURL);

This experiment is done locally on my PC of folked node.org project, however, the pdf files really exist:

I suspect the static folder you show in the screen shot is not at D:\static, right? Instead, I suspect it is located at D:\Projects\node_proj\nodejs.org\static or something like that.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect the static folder you show in the screen shot is not at D:\static, right? Instead, I suspect it is located at D:\Projects\node_proj\nodejs.org\static or something like that.

@aduh95:
Yes, that's the big problem.

BTW:I'm not sure how many kinds of paths we have now in these *.md files, sometimes we will have a path with a "#", pointing at a certain pharagraphu in another md file, and sometimes we have something like :/en/about, but it points at /en/about/index.md....

Too complicated to think...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may be missing some, but I think there are only 5 types of URL:

  • Fully specified URL, e.g. https://nodejs.org/en/#anchor (currently supported, ignored if using a protocol other than file:).
  • Protocol relative URL, e.g. //nodejs.org/en/#anchor (not supported, but also not used AKAICT).
  • Origin-relative URL, e.g. /en/#anchor (not supported, that's the issue you are referring to).
  • Relative URL, e.g. ./en/#anchor (supported).
  • Document-relative URL, e.g. #anchor (supported).

If we want to support origin-relative URLs, we need to either:

  • try to resolve them from a root folder (provided by the configuration or the environment).
  • ignore them, so not try to find out if it targets a file that exists or not.

In either case, it needs to come from a configuration, as we wouldn't want to have an origin-relative URL in the nodejs/node repo.

sometimes we have something like :/en/about, but it points at /en/about/index.md....

I don't think that's going to be a problem as long as en/about exists on the file system (it doesn't matter if it's a directory or a markdown file).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your suggestions in detail, and after deep thought for a while, I don't it's good to mix your corrent checking with this specific kind of md checks. What's more, considering there's a problem of ASCII ordering and I've disabled the check. So maybe I'll consider making another tool or package from npm instead of directly writing in your package :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants