Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimization of algorithms for complex patterns #156

Closed
keenwon opened this issue Feb 13, 2019 · 15 comments
Closed

Optimization of algorithms for complex patterns #156

keenwon opened this issue Feb 13, 2019 · 15 comments

Comments

@keenwon
Copy link

keenwon commented Feb 13, 2019

Environment

  • OS Version: Manjaro Linux x86_64
  • Node.js Version: v11.8.0

Actual behavior

(node:22393) UnhandledPromiseRejectionWarning: Error: EACCES: permission denied

Steps to reproduce and Code sample

.
├── dir
│   ├── one
│   │   ├── a
│   │   │   └── 1.js
│   │   └── b
│   │       └── 2.js
│   └── two [error opening dir]
├── dir2
│   └── one
│       ├── a
│       │   └── 1.js
│       └── b
│           └── 2.js
├── fast-glob.js
├── package.json
└── package-lock.json
// fast-glob.js
const fg = require('fast-glob');
fg(['+(dir|dir2)/one/**/*']).then((entries) => console.log(entries));
$ node fast-glob.js
(node:22859) UnhandledPromiseRejectionWarning: Error: EACCES: permission denied, scandir '/home/keenwon/Test/fast-glob-test/dir/two'
(node:22859) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 1)
(node:22859) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.

current user doesn't have permission to access dir/two:

total 8.0K
drwxr-xr-x 4 keenwon keenwon 4.0K 2月  13 17:37 one
dr-x------ 2 root    root    4.0K 2月  13 17:54 two
@mrmlnc mrmlnc self-assigned this Feb 14, 2019
@mrmlnc mrmlnc pinned this issue Feb 14, 2019
@mrmlnc mrmlnc added this to the 2.3.0 milestone Feb 20, 2019
@mrmlnc mrmlnc modified the milestones: 2.3.0, 3.0.0 Apr 4, 2019
@mrmlnc
Copy link
Owner

mrmlnc commented Apr 19, 2019

Sorry, I originally put the wrong label. Why do you think that is incorrect behavior? In your code sample you have not specified an error handler.

@mrmlnc mrmlnc removed this from the 3.0.0 milestone Apr 19, 2019
@keenwon
Copy link
Author

keenwon commented Apr 20, 2019

@mrmlnc the strange thing is why access to the directory that should not be accessed

@mrmlnc
Copy link
Owner

mrmlnc commented Apr 20, 2019

So, In version 3.0.0, you will be able to ignore errors.

@mrmlnc mrmlnc added this to the 3.0.0 milestone Apr 20, 2019
@mrmlnc mrmlnc changed the title Error: EACCES: permission denied The ability to ignore errors Apr 20, 2019
@keenwon
Copy link
Author

keenwon commented Apr 22, 2019

@mrmlnc It may not be clear enough. In fact, the key point of the problem is not to report the error or not, but why to access the directory that should not be accessed.

Also, I found that node-glob is working properly.

var glob = require('glob')

glob('+(dir|dir2)/one/**/*', {}, function (error, files) {
  if (error) {
    console.error(error)
    return;
  }

  console.log(files);
})

result:

[ 'dir/one/a',
  'dir/one/a/1.js',
  'dir/one/b',
  'dir/one/b/2.js',
  'dir2/one/a',
  'dir2/one/a/1.js',
  'dir2/one/b',
  'dir2/one/b/2.js' ]

@mrmlnc
Copy link
Owner

mrmlnc commented Apr 22, 2019

I'm sorry, I think I misunderstood you.

I think this related to the mechanism of creating tasks. Further basis on this tasks we crawl directories. In this case, we start reading directories from the root directory for +(dir1|dir2). This also slows down the following patterns: (a|b|c)/*.js, {file.js,directory/**/*}.

Most likely the node-glob package begins at once to read two directories instead of start from root directory.

So, I think we need to fix it.

@mrmlnc mrmlnc changed the title The ability to ignore errors Optimization of algorithms for complex patterns Apr 22, 2019
@mrmlnc mrmlnc unpinned this issue May 18, 2019
@jonschlinkert
Copy link

I re-created the folder structure described in the first issue, and this is what I get with fast-glob:

[
  'dir/one/a/1.js',
  'dir/one/b/2.js',
  'dir2/one/a/1.js',
  'dir2/one/b/2.js'
]

I'm not sure how to reproduce the bug. But I'm still looking into it.

@keenwon
Copy link
Author

keenwon commented May 30, 2019

Is the dir/two permission like this?

total 8.0K
drwxr-xr-x 4 keenwon keenwon 4.0K 2月  13 17:37 one
dr-x------ 2 root    root    4.0K 2月  13 17:54 two

@mrmlnc mrmlnc modified the milestones: 3.0.0, 3.1.0 May 30, 2019
@mrmlnc
Copy link
Owner

mrmlnc commented May 30, 2019

Moving to the next milestone, because we are waiting for some staff from micromatch (micromatch/micromatch#165). In any case, we will invest more in the problem.

@mrmlnc mrmlnc modified the milestones: 3.1.0, 3.x.x Jun 23, 2019
@mrmlnc
Copy link
Owner

mrmlnc commented Dec 22, 2019

Just another try to split pattern into segments. This works for most of the patterns from the micromatch tests (random selection).

'use strict';

const mm = require('../micromatch');
const { Minimatch } = require('minimatch');
const utils = require('../fast-glob/out/utils');

const PATTERN = 'root/+(dir1|dir2)/one/**/*';

const mn = new Minimatch(PATTERN);

console.dir(mn, { colors: true });

const buildSegments = (tokens) => {
  const segments = [];
  let segment = '';

  const state = {
    parens: 0
  };

  let index = 0;

  while (index < tokens.length) {
    const token = tokens[index];

    if (token.type === 'paren') {
      if (token.value === '(') {
        state.parens++;
      }
      if (token.value === ')') {
        state.parens--;
      }
    }

    if ((token.type === 'slash' && state.parens === 0)) {
      segments.push(segment);
      segment = '';
    } else {
      segment += token.value;
    }

    index++;
  }

  segments.push(segment);

  return segments;
};

const makeSegments = (pattern) => {
  const [result] = mm.parse(pattern);

  if (pattern === '') {
    return [];
  }

  // mm.parse returns [bos] for aa*.txt
  if (result.tokens.length === 1 && result.tokens[0].type === 'bos') {
    return [pattern];
  }

  return buildSegments(result.tokens);
};

const convertPatternsToSegments = (patterns) => patterns.map(makeSegments);

const prepare = (pattern) => {
  return mm.braceExpand(PATTERN);
};

const makeRe = (segments) => {
  return segments.map(segment => {
    return segment.map(part => utils.pattern.isDynamicPattern(part) ? mm.makeRe(part) : part);
  });
};

const patterns = prepare(PATTERN);
const segments = convertPatternsToSegments(patterns);
const final = makeRe(segments);

console.dir(final, { colors: true });

const parts = [
  [
    'root',
    /^(?:(?=.)(?:dir1|dir2)+)$/,
    'one',
    /^(?:(?!\.)(?:(?:(?!(?:^|[\\/])\.).)*?)[\\/]?)$/,
    /^(?:(?!\.)(?=.)[^\\/]*?[\\/]?)$/
  ]
];

const files = [
  'root/dir1/a/1.js', // +, +, -
  'root/dir1/b/2.js', // +, +, -
  'root/dir1/two', // +, +, -
  'root/dir2/one/a/1.js', // +, +, +, +, +
  'root/dir2/one/b/2.js',  // +, +, +, +, +
  'root/fast-glob.js', // +, -
  'root/package.json' // +, -
];

@mrmlnc
Copy link
Owner

mrmlnc commented Jan 21, 2020

After several approaches, I got a working version. We will use partial matching in a deep filter, using positive and negative patterns. The entry filter will be use a full pattern matcher.

I’m target this feature to the 3.2.0 milestone. Currently blocked by micromatch/picomatch#58.

@alexander-akait
Copy link

@mrmlnc Can we prioritize the issue, it is blocker for prettier and we want to use fast-glob for 2.0 version? Big thanks

@mrmlnc
Copy link
Owner

mrmlnc commented Feb 3, 2020

@evilebottnawi,

Thank you for your interest! When do you plan to release a new version of prettier?

I planned to release a new version on February 15/16. Is this option suitable for you? I can release the beta version earlier so that you can adapt your processes.

The current implementation works, but there is one unpleasant point — patterns like {a..z} will slow down the package.

@alexander-akait
Copy link

@mrmlnc February, original issue - prettier/prettier#6776, I think the beta release would be good, we could check if everything works as we expect

@mrmlnc
Copy link
Owner

mrmlnc commented Feb 4, 2020

@evilebottnawi,

You can use fast-glob@3.2.0-beta.

@mrmlnc
Copy link
Owner

mrmlnc commented Feb 9, 2020

Will be available with fast-glob@3.2.0 (#251).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants