Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add stopDir on recursive search #18

Closed
wants to merge 1 commit into from
Closed

Add stopDir on recursive search #18

wants to merge 1 commit into from

Conversation

CVarisco
Copy link

referred to #17

@sindresorhus
Copy link
Owner

This feels a bit fragile. What if you set the stopDir to foo, and then suddenly deeper down in the tree you create another foo directory. Now it changes where it stops without you realizing. Maybe it should instead be an absolute path? That would solve your use-case right? Since you want it to stop at some root directory. Or maybe we should make it more flexible and have a shouldContinue that accepts a function that's called for every directory step.

@sholladay
Copy link
Collaborator

sholladay commented Jan 21, 2018

I have some thoughts on the code, but let's address the design points first.

... changes where it stops without you realizing

As in a race condition? Definitely possible, but as it is, our search across levels of the tree is just as susceptible to that, unfortunately.

I am slightly against the shouldContinue() approach.

  • It would bypass a very large percentage of what's in this module (both in terms of functionality and lines of code), so adding it here doesn't seem high value.
  • It doesn't feel like a nice abstraction to me. What would its signature be like? We could be helpful and pass it a directory listing from fs.readdir(), but that may have non-trivial performance implications, especially in the worst case (fs.readdir() ought to return a stream due to the slowness of enumerating large directories, but it doesn't). On the other hand, if we chose not to do a directory listing, then the user almost might as well just use p-locate or locate-path directly. At that point, all we're doing is exploding the provided filepaths into their distinct levels. I would rather see that as its own separate module and have the user compose those pieces themselves.

I'm also still curious as to whether adding stopDir is merely a performance optimization or if it's an actual issue of safety for some application. I'm generally against adding options for performance without proof of how severe the problem is in a real world scenario.

}

const splittedPath = dir.split('/');
const lastFolder = splittedPath[splittedPath.length - 1];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very important to use path.basename() here to grab the last path component, instead of splitting like this.

  1. Hard coding / as a separator will not work with Windows-style paths.
  2. Even if you use path.sep, the path may end with a separator, which is quite common in practice. That will cause the final array element to be an empty string ''. Can't use that as a directory name!

Both of the above issues will be taken care of by path.basename().

There is another problem, though. What if dir is the root directory? You will need to take care of that separately, as even path.basename() will return an empty string in that case.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your advice 👍

const splittedPath = dir.split('/');
const lastFolder = splittedPath[splittedPath.length - 1];

if (lastFolder === compareDir) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simpler...

return lastDir === compareDir;

const filenames = [].concat(filename);

return new Promise(resolve => {
(function find(dir) {
locatePath(filenames, {cwd: dir}).then(file => {
if (file) {
resolve(path.join(dir, file));
} else if (isLastDirEqualTo(dir, opts.stopDir)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conceptually, this is the same as the next else if; the point is we are stopping and returning null because we have reached the end. I would prefer to combine these in the form of else if (foo || bar).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to path.resolve(opts.stopDir), otherwise comparisons against it will fail if the user passes a relative path like ../../.

@sindresorhus
Copy link
Owner

As in a race condition? Definitely possible, but as it is, our search across levels of the tree is just as susceptible to that, unfortunately.

That too, but I was thinking more of developer mistake. It's easy to have same-named directories in various places of a deep directory tree. Seems like matching the absolute path on where to stop would solve his use-case.

@sindresorhus
Copy link
Owner

I am slightly against the shouldContinue() approach.

We would only give it the information we already have, the current path. This option would solve many potential use-cases. Users could, for example, use it to find the first package.json file up the tree that does not contain a certain string.

I like it because it's flexible, as it can enable the user to solve all the use-cases themselves that we don't care about.

It would bypass a very large percentage of what's in this module (both in terms of functionality and lines of code), so adding it here doesn't seem high value.

They get the upwards directory traversal, which is not that easy to do manually. I've seen so many examples of it being done incorrectly when I did research for this module.

@CVarisco
Copy link
Author

@sholladay and @sindresorhus thank a lot for your comments.
I know that my approach couldn't be the best (and of course the code also).
I appreciate your interest in this "feature", thank you 🙂

Anyway, I like a lot the shouldContinue() because like @sindresorhus sad the user can manage himself all the behaviour that he wants and from library part, the developers don't care about it.

So, What I need to do to help you with this feature? Can I help with something?

@sholladay
Copy link
Collaborator

Okay, let's do it!

  • A new PR would be nice, just for clarity. 😎
  • The signature is shouldContinue(dir) => boolean.
  • Be sure to document it in README.md.

Let's start off with these semantics: shouldContinue() will only be called if locatePath() did not find any matches AND we are not in the root directory.

Something like this rough sketch...

if (file) {
	resolve(path.join(dir, file));
} else if (dir === root || shouldContinue(dir) === false) {
	resolve(null);
} else {
	find(path.dirname(dir));
}

@CVarisco I hope this is clear enough. Happy to help if you need it. Thanks for working on this. 👍

@sindresorhus
Copy link
Owner

There are two use-cases I'd like to solve.

  1. Deciding when to stop:
findUp('unicorn.png', {
	shouldContinue: dir => dir !== rootDir
});
  1. Custom matching logic. For example, here we want to stop in the directory with both a foo and bar file.
findUp({
	shouldContinue: dir => !(path.existsSync(path.join(dir, 'foo')) && path.existsSync(path.join(dir, 'bar')))
});

Maybe it would make more sense to just have a stopDir option for the first use-case, and for the second option, just let the filename argument be a function:

findUp(dir => {
	return !(path.existsSync(path.join(dir, 'foo')) && path.existsSync(path.join(dir, 'bar')));
});

But here we would like to return the foo filename from findUp, so maybe the semantics should be to stop if true is returned, then the dirname is used, or if a string is return, that is used.

findUp(dir => {
	return path.existsSync(path.join(dir, 'foo')) && path.existsSync(path.join(dir, 'bar')) && path.join(dir, 'foo');
});

@CVarisco
Copy link
Author

Hi @sindresorhus 🙂

Do you think that for the user could be simple to have multiple ways to use the lib?
Because I think that is easiest for the user to have only the shouldContinue function (He can handle every case) without changing the input type params.

But maybe I don't understand your point of view.
(And also for the maintainability of the library, because you don't need to write different logic per different case)

@sindresorhus
Copy link
Owner

Because I think that is easiest for the user to have only the shouldContinue function (He can handle every case) without changing the input type params.

Well, we could, but then your use-case would require more code on your part, as it would be up to you to check whether the file exists. The type params would be different regardless, as there wouldn't be any filename string input for use-case 2.

@CVarisco
Copy link
Author

I think we are right 🙂
I mean, the use case isn't objective in this case.
Because there are devs with different necessity and different way to operate with libraries.

Anyway, How you can proceed?

@sindresorhus
Copy link
Owner

Let's wait for feedback from @sholladay.

@sholladay
Copy link
Collaborator

Those look pretty good to me @sindresorhus. 👍

I'll propose a slight variation for sake of discussion. Something like this...

opts = Object.assign({
    match: locatePath
}, opts);

// ...

(function find(dir) {
	Promise.resolve(opts.match(filenames, {cwd: dir})).then(file => {
		if (file) {
			resolve(path.join(dir, file));
		} else if (dir === root || opts.shouldContinue && shouldContinue(dir) === false) {
			resolve(null);
		} else {
			find(path.dirname(dir));
		}
	});
})(startDir);

Usage examples:

  1. Search for and return unicorn.png, but give up if we can't find it after searching up to, and including, a specific level of the tree.

    findUp('unicorn.png', {
    	shouldContinue: dir => dir !== rootDir
    });
  2. Return the first directory with both package.json and node_modules children.

    findUp(['package.json', 'node_modules'], {
    	match: (filenames, opts) => filenames.every(fp => path.existsSync(path.join(opts.cwd, fp))) && '.'
    });
  3. Return package.json, but only if it has node_modules as a sibling.

    findUp(['package.json', 'node_modules'], {
    	match: (filenames, opts) => filenames.every(fp => path.existsSync(path.join(opts.cwd, fp))) && filenames[0]
    });

Of course, they can also be combined, too.

What is different?

  • Just use . to return the current directory. I don't think we should support doing so with true. The intent would be unclear in situations like my 2nd example, if I left off && '.'.
  • No new types for the existing arguments. Just new options with simple defaults. Not that big of a deal, but I like that you see the name of the function this way. That will be especially helpful for callers who use both functions at once.
  • By passing down the filenames as arguments, more matchers could be functionally pure and thus maybe a little easier to test.

@CVarisco
Copy link
Author

Guys? What do we need to do?

@sholladay
Copy link
Collaborator

I think we're on the same page about what needs to be implemented. There's just some room for creativity / improvement. Are you up for doing something like the designs we showed above, @CVarisco? If so, I think it would be fine to move forward with a new PR based on one of those approaches. If you want to wait for Sindre to reply, that is fine, too. We will probably have another round of feedback either way, which we will be easier when we see the actual code in practice. We'll get it merged quickly once the tests are passing and it feels right. :)

@sindresorhus
Copy link
Owner

By passing down the filenames as arguments, more matchers could be functionally pure and thus maybe a little easier to test.

I'm not super excited about this as there are many potential use-cases for this feature when it's not as simple as just using an array of filenames, so having special support for this, means just yet another way of doing things. They could still be kinda pure, by the user creating a factory function that accepts an array of filenames and returns a matcher.

Just use . to return the current directory. I don't think we should support doing so with true. The intent would be unclear in situations like my 2nd example, if I left off && '.'.

Yeah, I agree true is bad, but I'm not sure about . either. Maybe it could be a constant which is a Symbol.

findUp(dir => {
	return path.existsSync(path.join(dir, 'foo')) && path.existsSync(path.join(dir, 'bar')) && findUp.currentDir;
});

Where currentDir is a Symbol that signifies us to use the current directory.

@sholladay
Copy link
Collaborator

Works for me. 👍

We're good to go @CVarisco. To recap: if typeof filenames === 'function', call it in each directory, passing it the directory path. If it returns a truthy string, join it with the directory and resolve the promise, else keep searching. The existing logic should continue to behave the same if filenames is a string or array.

You can, if you feel like it, implement the extra sugar to support returning true or a findUp.currentDir Symbol. I'm equally happy to do that in a follow up PR myself.

@CVarisco
Copy link
Author

Hi guys!

I'm sorry but in the next days, I'll be a little bit busy.
I can work on it in the next week maybe.
Otherwise, you can work on it when you want, no problem.

Really sorry about that 🙁

@sindresorhus
Copy link
Owner

You can, if you feel like it, implement the extra sugar to support returning true or a findUp.currentDir Symbol.

Not true, only the Symbol.

@sindresorhus
Copy link
Owner

@CVarisco No worries. There's no rush :)

This was referenced Mar 17, 2018
@sholladay
Copy link
Collaborator

Between the small merge conflict and the amount of discussion in here, I think a new PR is going to be cleaner and easier for everyone, so going to close this.

I opened #21 to reflect what we want to implement. If you are still up for doing that @CVarisco, it would be super appreciated. Either way, thank you for contributing. You definitely helped us figure out a good path forward. ❤️

@sholladay sholladay closed this Mar 17, 2018
@CVarisco
Copy link
Author

@sholladay thanks a lot for your comment.
I'm really sorry about that but now I'm busy 👍
If someone else can do it before that I can work on it, it would be great!

Sorry again, thanks for your amazing job that improve the community ❤️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants