Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite this library #405

Closed
isaacs opened this issue Nov 9, 2019 · 11 comments
Closed

Rewrite this library #405

isaacs opened this issue Nov 9, 2019 · 11 comments

Comments

@isaacs
Copy link
Owner

isaacs commented Nov 9, 2019

JavaScript has come a long way since I wrote this, and while it's used by a lot of people, it has some definite shortcomings that make it hard to improve and modify.

  • Rather than the "pile of functions" approach, with everything re-implemented for sync globbing, this should use a class-based approach, where all fs ops are abstracted into methods that can be overridden in a sync child class, like how the internals of node-tar or npm-packlist work.

  • Consider not relying on regular expressions or minimatch. Ie, write a parser that can take a glob pattern (after splitting up the {,} sections) and a path, and return the following possible states:

    • NO: definitely not a match, and cannot match children, stop
    • NO, BUT: not a match, but child entries may be a match. Return the portions of the pattern that may match child entries. Eg a/**/x/y against a/x would return **/x/y and y as patterns that could match against child entries. If the entry is not a dir, this is a NO response.
    • YES AND: a match, and may match child entries (eg **/x would match both a/x and a/x/x, so comparing it to the a/x path should say "yes, and"). Return value should include the bits that can match child entries. So, a/**/x/** against a/x would return a "yes and" response with **/x/** and x/** as remaining patterns. If the entry is not a dir, this is a YES BUT response.
    • YES BUT: a match, but cannot match child entries
    • YES IF: a match, but only if it's a dir (eg a/*/ matches the dir a/x/, but not the file a/x)

    For all of these, walk the string and compare with a finite state machine rather than converting to a regexp.

  • Fix all the windows stuff. Not sure what to use as a reference implementation here. Mingw bash? LSW?

  • Consider using a WASM'ed implementation of glob written in rust? Especially if we're not reliant on JS regexps, this makes a lot of sense.

  • Create a custom error type

I will set aside some vacation time to do this if a handful of people decide to fund this library.

@isaacs isaacs pinned this issue Nov 9, 2019
@bensampaio
Copy link

Would this also result in the use of promises instead of callbacks?

@isaacs
Copy link
Owner Author

isaacs commented Dec 6, 2019

Yes.

@valango
Copy link

valango commented Jan 23, 2020

Could it be something like this?

@IonelLupu
Copy link

@isaacs Do you consider using typescript?

@cdhowie
Copy link

cdhowie commented Feb 24, 2020

In addition to a promises API, I'd love to see an async iterator implementation. This would allow async traversal of huge trees without buffering the entire result set in memory, and be easier to use (for await ... of) than the current event/pause/resume interface.

In fact, this would make a promises API redundant; there are already libraries that will iterate an async iterator and collect the results in a promise for an array (such as async-iterator-to-array).

@dandv
Copy link

dandv commented Apr 4, 2021

Has tiny-glob done most of the work? It's written in TypeScript, uses Promises.

@isaacs
Copy link
Owner Author

isaacs commented Apr 5, 2021

I am not intending to use typescript for this. Nothing against TypeScript, but it's not my preference.

I have not investigated whether tiny-glob has the same adherence to Bash 4's glob semantics. It's no slight against tiny-glob if it does not, as there are several places where this necessitates arguably less than optimal implementation choices, but there is a value in having a library with a constraint, so I would not suggest that as a replacement unless it does.

I figured it would make sense to implement this as an objectMode:true minipass stream, so that Promise and async iterator behavior would be included without any additional work, since minipass streams are async iterators, and have a .collect() method which returns the entire contents in one array, and easily support synchronous operation as well.

@isaacs
Copy link
Owner Author

isaacs commented Jan 17, 2023

Turns out I am using TypeScript for this. #489

Still using regexps though. Gonna close this since the PR covers the changes more correctly.

@isaacs isaacs closed this as completed Jan 17, 2023
@Tofandel
Copy link

Tofandel commented Feb 28, 2023

Do you have a changelog for v9? Would be nice to have a list of breaking changes so we can upgrade, without having to read through every single commit

Thanks for the awesome lib

@mrmlnc
Copy link

mrmlnc commented Feb 28, 2023

@Tofandel
Copy link

@mrmlnc Thanks, I think I need to clean my eyes, I looked for it, but I guess because it's not in CAPS not in the right place :p

@isaacs isaacs unpinned this issue Mar 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants