Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fileExists() is called multiple times during module resolution on directories that do not exist #5601

Closed
ToddThomson opened this issue Nov 11, 2015 · 14 comments
Labels
Fixed A PR has been merged for this issue Help Wanted You can do this Suggestion An idea for TypeScript

Comments

@ToddThomson
Copy link
Contributor

When trying to resolve external modules, each module extension/location ( mod.ts, mod.tsx, mod.d.ts, mod/package.json, mod/index.ts, mod/index.tsx, mod/index.d.ts ), is tried even when the the base directory does not exist.

@mhegazy
Copy link
Contributor

mhegazy commented Nov 11, 2015

This behaves as i would expect it to. what is the issue/question?

@mhegazy mhegazy added the Needs More Info The issue still hasn't been fully clarified label Nov 11, 2015
@ToddThomson
Copy link
Contributor Author

Candidate directories are generated during node module resolution which are passed to loadNodeModuleFormFile ( or Directory ). If the candidate directory does not exist then 7 calls to fileExists() will be made.
This results in a lot of needless calls. It's not a big issue, just thought you might like to know. With a host that caches fileExists() the number of lookups/searches would be much smaller.

@mhegazy
Copy link
Contributor

mhegazy commented Nov 11, 2015

tsc uses a caching host implementation. the language services caches resolution results if they did not change, so this should not happen often. your host implementation can add optimizations to check if part of the path does not exist.

@mhegazy mhegazy added Question An issue which isn't directly actionable in code and removed Needs More Info The issue still hasn't been fully clarified labels Nov 11, 2015
@ToddThomson
Copy link
Contributor Author

@mhegazy Not sure why you wouldn't want to put a ts.sys.directoryExists() call on candidate directories to totally eliminate the chatter. Yes, I could optimize non-existing directories out of my caching host, but why not fix the root cause? It is all well and good that LS works well, but module resolution in program and compiler host should be efficient too.

@ToddThomson
Copy link
Contributor Author

To improve the efficiency of node module resolution, you should consider starting the search at the process.cwd ( not the containing directory of the source file ). In any event, It is unlikely that node_modules will be found anywhere in the source sub-tree.
In a general gulp based build pipeline node_modules and typings subdirectories are usually found at the root of the "project/Solution".

@vladima
Copy link
Contributor

vladima commented Nov 11, 2015

I think there are two items to discuss:

  • using process.cwd() instead of getDirectory(containingFile) - this optimization offers better performance by sacrificing correctness since the fact that node_modules mostly often appear at the root to the project does not mean that they cannot appear anywhere else, i.e. like in case below module2 should be searched relatively to the location of containing file, otherwise it cannot be resolved
root
  |---node_modules
       |---module1
            |---node_modules
                 |---module2.d.ts
            |---index.d.ts // contains `import {foo} from "module2"`

In these circumstances I don't think that it is safe to apply this optimization.

  • adding directoryExists into the CompilerHost so module resolution logic can omit checks for individual files if containing folder does not exist seems like a legit feature request. One note though: this method should be made optional, otherwise it will be API breaking change.

@vladima vladima added Suggestion An idea for TypeScript Help Wanted You can do this and removed Question An issue which isn't directly actionable in code labels Nov 11, 2015
@vladima
Copy link
Contributor

vladima commented Nov 11, 2015

Marking as Suggestion and Accepting PRs for a "add optional 'directoryExists' function into the CompilerHost" feature request

@ToddThomson
Copy link
Contributor Author

I'd be happy to make a few changes to this area. I've already added a directoryExists optimization to my base compilerHost class. It will most likely be the first week in Dec before I can do the work to issue the PR however. I'd also like to work on the overall algorithm for external module resolution in program (I'll address your concerns above ). Please let me know if this is something you want.

@ToddThomson
Copy link
Contributor Author

@vladima While setting up caching for module resolution in the host, I've run across module names that differ only in casing. What is the rule to use when module names only differ by case? Should host.useCaseSensitiveFileNames() apply?

@ToddThomson
Copy link
Contributor Author

@vladima Module resolution in program should cache resolved modules. From looking at the code in LSHost this is the case. I understand that LS needs to be more performant, but why not put the same logic in program?
It appears that LS also addresses module name casing.

@vladima
Copy link
Contributor

vladima commented Nov 12, 2015

the reason for this is current design of VS integration. In VS module resolution is done on the completely different instance of JS engine so it does not have access to the old instance of the program.

@ToddThomson
Copy link
Contributor Author

@vladima Your idea to use baseUrl for module resolution is a good one ( issue #5039 ). I was going to make the suggestion that baseUrl be just the base directory of tsconfig.json and then have all module references be absolute or relative to the containing file or from the baseUrl. I was beginning to think this through, but it looks like you've got it figured out.
No need for me to make changes now to goodness coming in 1.8. directoryExists() and module name casing could still be addressed, but perhaps these will come in 1.8 anyway. Prob no need for me to issue a PR.

@mprobst
Copy link
Contributor

mprobst commented Feb 17, 2016

During a single compile run of a small app, we see one fstat per parent directory per file. I.e. if you compile an application that has 200 files with an average directory depth of 6 or so, you'll end up doing 200 * 6 = 1200 fstats. If an fstat takes 1 ms, that's 1.2 seconds spent just on fstats.

The total number of possible node_modules locations during one build (assuming the file system doesn't change under our feet, which would break things anyway) is of course only the number of folders in the project, plus their parent folders.

If TSC would cache the absence of a folder instead of fstating over and over again, this could easily shave off a second of a 200 file compile. Though of course it's tricky to invalidate that cache.

This problem is exacerbated in our particular setup as some folders are much more expensive to fstat than others (up to 100 ms).

@vladima
Copy link
Contributor

vladima commented Feb 17, 2016

Note: in 1.8 ModuleResolutionHost can optionally expose directoryExists method. If it is provided then compiler will try to check the presence of directory before trying to access individual files

@mhegazy mhegazy added this to the TypeScript 1.8.2 milestone Feb 20, 2016
@mhegazy mhegazy added the Fixed A PR has been merged for this issue label Feb 20, 2016
@mhegazy mhegazy closed this as completed Feb 20, 2016
@microsoft microsoft locked and limited conversation to collaborators Jun 19, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Fixed A PR has been merged for this issue Help Wanted You can do this Suggestion An idea for TypeScript
Projects
None yet
Development

No branches or pull requests

4 participants