Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Perf: Optimize ignoreModule calls in no-cycle #2612

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from
Draft
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
45 changes: 40 additions & 5 deletions src/rules/no-cycle.js
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,16 @@ import moduleVisitor, { makeOptionsSchema } from 'eslint-module-utils/moduleVisi
import docsUrl from '../docsUrl';

const traversed = new Set();
/**
* Cache of resolved paths for faster subsequent lookups
* @type {{ [importPath: string]: string }}
*/
const resolvedPaths = {};
/**
* Cache of external module evaluations for faster subsequent lookups
* @type {{ [resolvedPath: string]: boolean }}
*/
const externalModules = {};

Comment on lines +13 to 23
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these caches need to be cleared in the same place traversed is

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Program:exit hook is called at the exit of each file, so clearing the caches there cut the number of cache hits in our repo by about 50% (although it only added ~45s to our repo test case)

The high level attempt at this update was to quick filter down to "maybe external" modules like 'react', '@reduxjs/toolkit' or '@/some/alias/path' - where those resolved paths can be safely cached across files 🤔

Is that possible to accomplish with any of the existing data structures?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, hmm - you're right, it's less about clearing it after the file and more about clearing it after the "run" - since in editors, eslint can be long-running.

I think that in every file, there could be a different resolver, so the cache would have to be on the resolver and the resolver config and the fully normalized file path, i think?

module.exports = {
meta: {
Expand Down Expand Up @@ -52,11 +62,36 @@ module.exports = {

const options = context.options[0] || {};
const maxDepth = typeof options.maxDepth === 'number' ? options.maxDepth : Infinity;
const ignoreModule = (name) => options.ignoreExternal && isExternalModule(
name,
resolve(name, context),
context,
);

const ignoreModule = (name) => {
// Note: This fn is optimized to eliminate call time because it may be
// called hundreds of thousands of times during a full repository lint run
// and can add minutes to the total time in large codebases

// Low cost evaluation; do not ignore if not opted in or module starts
// with a relative path
// NOTE: skipping further resolve codepath also ensures that only unique
// import paths are set in resolvedPaths, eg "./index.js", or "../file.js"
// should not be stored because there could be duplicates that resolve to
// different paths
if (!options.ignoreExternal || name[0] === '.') return false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

couldn't a resolver choose to configure a relative path as external?

Copy link
Author

@DHedgecock DHedgecock Dec 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 I suppose so, there's probably some monorepo setup out there like that...

I think removing this assumption means that we can't confidently cache the resolved module path we looked up because then there could be duplicate import paths that resolve to different files (eg import paths of './index.js').

Trying to approach this from a different angle: The current time to determine if a module is "external" is the performance bottleneck and this PR is attempting to short circuit that whenever possible.

Do you have suggestions on an approach?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the no-cycle rule option ignoreExternal could be a boolean or a regex, and if it's a regex it's used for this short circuit?

Copy link
Author

@DHedgecock DHedgecock Dec 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what the best configuration fields are, but doing a quick test with a regex short circuit has about the same performance improvement as the cache implementation.

Alternative approach tested:

    const ignoreModule = (name) => {
      if (!options.ignoreExternal) return false;

      if (options.ignoreExternalRegex) {
        return options.ignoreExternalRegex.test(name);
      }

      const path = resolve(name, context);

      let isExternal = externalModules[path];
      if (isExternal === undefined) {
        isExternal = isExternalModule(name, path, context);
        externalModules[path] = isExternal;
      }
      
      return isExternal;
    };

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the better question is, why is isExternalModule slow?

Perhaps the node and webpack resolvers can add a cache there, rather than inside the main plugin?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't we doing a bunch of regex matching in isExternalModule?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generally that's fast, but if it can be sped up, let's!


// High cost evaluation; cache evaluations of resolve() and
// isExternalModule() calls for faster subsequent lookups
let path = resolvedPaths[name];
if (path === undefined) {
path = resolve(name, context);
resolvedPaths[name]= path;
DHedgecock marked this conversation as resolved.
Show resolved Hide resolved
}

let isExternal = externalModules[path];
if (isExternal === undefined) {
isExternal = isExternalModule(name, path, context);
externalModules[path] = isExternal;
}

return isExternal;
};

function checkSourceValue(sourceNode, importer) {
if (ignoreModule(sourceNode.value)) {
Expand Down