Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add relativePath proc and test #8166

Closed
wants to merge 7 commits into from
Closed

Add relativePath proc and test #8166

wants to merge 7 commits into from

Conversation

demotomohiro
Copy link
Contributor

Add a getRelPathFromAbs proc in ospaths module that convert given absolute path to a relative path.

@Araq
Copy link
Member

Araq commented Jun 30, 2018

It should be named getRelativePathFromAbsolute.

@Varriount
Copy link
Contributor

@Araq What about just "makeRelative"? Or "convertToRelative"?

@demotomohiro
Copy link
Contributor Author

@Varriount I have proc getRelPathFromRel (According to Araq, it should be named getRelativePathFromRelative) that calculate relative path from 2 relative paths and a path.

proc getRelPathFromRel(path, baseDir, curDir: string): string = 
  ## Convert 'path' to a relative path from baseDir.
  ##
  ## Both 'path' and 'baseDir' must be relative paths from 'curDir'.
  ## This proc never read filesystem.
  ## 'baseDir' is always assumed to be a directory even if that path is actually a file.
  runnableExamples:
    doAssert getRelPath("a/b", "../a", "/x/y") == "../y/a/b"

I'm not sure whether this proc should be added to standard library.
But without "FromAbsolute" suffix, distinguishing getRelativePathFromRelative and getRelativePathFromAbsolute will be hard.

@timotheecour
Copy link
Member

timotheecour commented Jul 17, 2018

/cc @demotomohiro
I don't understand why we need to distinguish getRelativePathFromRelative vs getRelativePathFromAbsolute
We just need the exact same as

I don't really care for the name though, but relativePath is simple

@demotomohiro
Copy link
Contributor Author

If this pull request is merged, I will send pull request to add getRelativePath or relativePath procedure that can take relative or absolute paths and return a relative path.
Such a procedure checks whether given path is relative or absolute and call getRelativePathFromRelative or getRelativePathFromAbsolute.
If you knew paths are absolute, you can skip the branch by calling getRelativePathFromAbsolute.

getRelativePathFromRelative procedure require 3 paths.
e.g.

  #path "a" and ".." are relative to "/x/y"
  doAssert getRelativePathFromRelative("a", "..", "/x/y") == "y/a"

calculate a relative path of "a" relative to "..".
That require parent directory name of path "a".
In pathlib in python or boost::filesystem in C++, such case throw exception or return "".
If you knew your paths are absolute, you don't need to pass third parameter to getRelativePathFromAbsolute.
If both relative path and absolute path were distinct type, overloaded relativePath procedures would work.

@dom96
Copy link
Contributor

dom96 commented Jul 19, 2018

I like relativePath for a name. I'd rather not have three different procedures just to avoid a branch.

@demotomohiro
Copy link
Contributor Author

I commited relativePath that convert both absolute and relative path to relative path.
It passed tests.
But runnableExamples of relativePath is ugly because there are many unixToNativePath call.
I think runnableExamples should be writen simpler way.
Should I put these code under when defined(posix): or define following procedure under runnableExample and rewrite examples using it?

  proc relativePathHelper(path, baseDir, curDir: string = ""): bool =
    return relativePath(path.unixToNativePath,
                        baseDir.unixToNativePath,
                        curDir.unixToNativePath)

path.unixToNativePath("a"),
baseDir.unixToNativePath("a"),
curDir.unixToNativePath("a"))
#debugEcho r
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of leaving debugging comments, how about:

if r != res.unixToNativePath:
  print whatever you want
return r == res.unixToNativePath

@timotheecour
Copy link
Member

timotheecour commented Oct 15, 2018

relativePath(dir) => same as relativePath(dir, cwd)
doAssert relativePath("foo/bar", "foo") == "bar"
doAssert relativePath("/foo/bar", "/foo") == "bar"

doAssert relativePath("/foo/bar", "foo") => throws
doAssert relativePath("foo/bar", "/foo") => throws

user is free to use these to avoid throwing in last 2 cases where one path is rel the other abs:

doAssert relativePath("/foo/bar".relativePath, "foo")
doAssert relativePath("foo/bar", "/foo".relativePath)

that way, interface is cleaner and more compositional;

it's also more similar to what's done in other languages, eg:
https://dlang.org/library/std/path/relative_path.html
https://docs.python.org/3/library/os.path.html
https://www.boost.org/doc/libs/1_61_0/libs/filesystem/doc/reference.html#op-relative
which all take 1 or 2 args path arguments, not 2 or 3 path arguments.

  • this PR needs to be rebased against devel

  • sorry if review process is taking so long

  • is there a way to reduce number of new symbols added?

@timotheecour
Copy link
Member

timotheecour commented Oct 18, 2018

@demotomohiro +1 for your suggested relativePathHelper ; right now it's indeed a bit ugly; @Araq are you ok with that? (so you won't need to do the work then undo it)

actually since tospaths.nim does that, maybe just using when defined(posix) is good enough in runnableExamples

@demotomohiro
Copy link
Contributor Author

@timotheecour Thank you for viewing my PR.

3rd param in relativePath is used not only when one of path and baseDir is a relative and another is a absolute path, but also when both of them are relative path.
I added comments that explains requirements of parameters.

For example, relativePath in this PR returns "a" in following code.
Result of this code requires parent directory name of the path ".".

echo relativePath(".", "..", "a")

I researched relativePath in other langauges.
None of them cannot return relative path "a".

Test Python's os.path.relpath and pathlib.PurePosixPath.relative_to
https://wandbox.org/permlink/dLkyBeGHAnC7OJ0c
os.path.relpath always assumes given paths are absolute paths or relative paths relative to current working directory.
When I want to convert a relative path that is NOT relative to cwd,
I have to convert it to a absolute path or change cwd before calling os.path.relpath.
For example, suppose path "." is converted to absolute path "/foo/a" and ".." to "/foo",
then calling os.path.relpath("/foo/a", "/foo") returns "a".
Or set current workding directory to "/foo/a" so that path "." and ".." are relative to cwd,
then calling os.path.relpath(".", "..") returns "a".

pathlib.PurePosixPath( . ).relative_to( .. ) => ValueError: '.' does not start with '..'

Test C++ boost::filesystem::relative
https://wandbox.org/permlink/4qUq43a9uH2XNN7y
Outputs are same to Python's os.path.relpath.

std.path.relativePath in D:
https://dlang.org/library/std/path/relative_path.html
According to the document, "If path is a relative directory, return it unaltered."

Struct std::path::Path in Rust:
https://doc.rust-lang.org/std/path/struct.Path.html
I don't know much about Rust langauage.
It seems Rust's standard library doesn't have a method that convert a path to a relative path.

Nim compiler already have relativeTo:
compiler/pathutils.nim
It takes only absolute path.

Copy link
Contributor

@krux02 krux02 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like it at all. It adds over 300 lines of code, is very low level everywhere (hard to maintain), and I can't specify at runtime if the path should be treated case sensitive.

proc isSep(c: char): bool {.noSideEffect.} = c in {DirSep, AltSep}

proc cmpCharInPath(a, b: char): bool {.noSideEffect.} =
when FileSystemCaseSensitive:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure if I like FileSystemCaseSensitive as a constant. It is not uncommon to have an ntfs hard drive mounted somewhere in a unix file system (the Windows partition).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's how it's done and it works well. There is no way to get this info in a portable way at runtime.


result = if numUp > 0: ParDir & (DirSep & ParDir).repeat(numUp-1) else: ""
if pos < path.len:
return result / path.substr(pos)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is way too low level. Just normalize the input paths and strip the common prefix.

Copy link
Member

@timotheecour timotheecour Oct 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw if you're gonna go that route (normalize), one way to do it would be to move os.normalizePath to ospaths.normalizePath (as mentioned in TODO here: #6587 (comment)) (but doing so in a separate PR as it's a useful change by itself)

@timotheecour
Copy link
Member

timotheecour commented Oct 24, 2018

I really want a relativePath proc added to stdlib however:

  • I have to agree with what @krux02 said (and I had mentioned it earlier: "is there a way to reduce number of new symbols added?"): this PR adds too much (low level) code which is a problem for future maintainance (if this was code reusable in other context, that would be a different story)

  • the behavior of relativePath should always satisfy:

`c = relativePath(a, b)` implies: `cmpPaths(a,  b / c)`

we need to be able to call relativePath with 1 param (as in other languages):

  • relativePath(abspath) = relativePath(abspath, cwd)

  • relativePath(relpath): for this case I need to think more: according to above rule we should throw; another sensible option is to return relpath

other bugs

  • you're not taking care of .. in 2nd argument
    relativePath("a/b", "c/../") returns ../../a/b => should be a/b

  • you're not taking care of duplicate separators, eg:
    relativePath("a//b", "a") returns /b => should be b

@Araq
Copy link
Member

Araq commented Oct 24, 2018

As it has been said, my implementation in compiler / pathutils also provides a relativeTo.

@timotheecour
Copy link
Member

the other thing I was thinking about before @Araq 's comment was to reuse this:

iterator parentDirs(path: string; fromRoot = false; inclusive = true): string {...}

with fromRoot = true to avoid reinventing the wheel with low-level code, and be robust to issues such as duplicate separators and handling of .. (haven't verified whether that would work though)

or maybe integrating @Araq 's relativeTo is (part of) the solution

@timotheecour timotheecour changed the title Add getRelPathFromAbs proc and test Add relativePath proc and test Oct 25, 2018
@timotheecour timotheecour added the OS (stdlib) Related to OS modules in standard library label Oct 25, 2018
@demotomohiro
Copy link
Contributor Author

Thank you for reviewing my PR.

You don't like my code and there is already relativeTo in compiler/pathutils.
I think improving relativeTo in pathutils is better than fixing code in this PR.
(Tests in this PR can be reused if improved relativeTo takes 3 parameters)

It accepts only absolute path types.

proc relativeTo*(fullPath: AbsoluteFile, baseFilename: AbsoluteDir;
                 sep = DirSep): RelativeFile =

When I give it relative paths, it doesn't work with some relative paths.
For example,
relativeTo("a", "../b") returns "../../a". It should be "../x/a" (x is parent directory of a").
relativeTo("a/b/c", "a/x/../b") returns "../../../b/c". It should be "c".
relativeTo("./a/b", "a/b") returns "../.././a/b" and relativeTo("a/b", "./a/b") returns "../../../a/b". Both results should be "".
Also, it doesn't work with windows absolute path with different drive:
relativeTo("c:\\a", "d:\\b") returns "..\\..\\c:\\a". It should raise error or return "c:\\a".

I think there are 3 ways to make relativeTo supports relative path.

  1. normalize input paths.
    Remove "." and ".." if possible.

  2. Improve PathIter in compiler/pathutils.nim so that it returns each directory name in a path as if input path was normalized.
    Do not need to allocate memory for normalized paths.
    But if there are ".." in anywhere of input paths, proc next(it: var PathIter; x: string) could be complicated because it has to scan whole path before returning first directory name.
    (Such proc next maybe not difficult if it returns directory names in reverse order, but it is useless for relativeTo)

  3. Only accepts paths that relativeTo in compiler/pathutils.nim can correctly convert otherwise raise error.
    Only work when both of inputs are normalized absolute path or relative path.
    Input paths must be normalized before calling relativePath.
    No 3rd parameter.
    Mixing absolute path and relative path is error or return input path as is if it is a relative path.
    If both input are relative path and base has more ".." prefix than full, raise error because result relative path require parent directory name of full but it is not provided.

2 parameters vs 3 parameters

If one of full or base is relative path and other one is absolute path, conversion is required so that both of inputs become absolute or relative.
I think converting relative one to absolute path is easier than converting absolute one to relative.
2 parameters version can do:

  • convert a relative path relative to current working directory.
    Getting current working directory is not available from procedure with noSideEffect.
    boost::filesystem::relative in C++ and os.path.relpath in Python do this.
  • Don't accept such inputs.

3 parameters version convert relative one to absolute path by prepending 3rd parameter.

If both input paths are relative path and base has ".." prefix more than full after normalization, parent directories of path are required to make a correct relative path.
2 parameters version might convert input paths to absolute path by getting current working directory but it is not noSideEffect and input paths have to be relative to CWD.
Or this version raise error for such inputs.
For example, if you want to convert path "a" to a path relative to "../b", you have to convert path "a" to "x/a" and path "../b" to "b" before calling relativePath.

In 3 parameters version, 3rd parameter is a parent directory of full and base.
3rd parameter can be a absolute or relative path.
But it doesn't mean it can always convert correctly.
For example,
relativePath("a", "../../b", "x") raise error because correct result is "../y/x/a" but 3rd parameter doesn't include directory "y".
relativePath("a", "../../b", "y/x") returns "y/x/a".

@demotomohiro
Copy link
Contributor Author

iterator parentDirs(path: string; fromRoot = false; inclusive = true): string {...}

with fromRoot = true to avoid reinventing the wheel with low-level code, and be robust to issues such as duplicate separators and handling of .. (haven't verified whether that would work though)

This code echo "foo", "foo/bar", "foo/bar/abc".

for i in parentDirs("foo/bar/abc", true):
  echo i

It can be used to find common parent directory, but it has order of n^2 time complexity.
Something like cmpPaths procedure that also returns first different directory or filename position would be better.

we need to be able to call relativePath with 1 param (as in other languages):

  • relativePath(abspath) = relativePath(abspath, cwd)
  • relativePath(relpath): for this case I need to think more: according to above rule we should throw; another sensible option is to return relpath

It should be separate procedure defind in os module as getCurrentDir() is defined in os module and noSideEffect proc cannot call it.

@timotheecour
Copy link
Member

timotheecour commented Oct 27, 2018

It can be used to find common parent directory, but it has order of n^2 time complexity.

no, using something like zip it's still O(n)

I think you can do smthg like this (untested)

while true:
  if finished(myIter1): ...
  if finished(myIter2): ...
  let dir1 = myIter1()
  let dir2 = myIter2()
  ...

let me know if smthg like that could work (if it doesn't i can try to look into it more, but I think something like zip can be used on iterator pairs)

@timotheecour
Copy link
Member

It should be separate procedure defind in os module as getCurrentDir() is defined in os module and noSideEffect proc cannot call it.

=> #9523

as for the noSideEffect, it's a seperate issue; we could define an overload instead, or maybe enhance the logic behind noSideEffect, or maybe even drop noSideEffect in this case

@timotheecour
Copy link
Member

btw: I'll reply to your other comments from #8166 (comment) later

@demotomohiro
Copy link
Contributor Author

I don't understand why it is O(n) using zip.
parentDirs("a//b//c", true) returns "a/", "a//b/", "a//b//c".
Only last separator of each returned values are reduced to one separator.
parentDirs("a/../b/../c", true) returns "a/", "a/../", "a/../b/", "a/../b/../", "a/../b/../c".
It doesn't remove "..".

If input paths were normalized, consecutive directory separators are collapsed, "." (current directory) are removed and ".." are removed if possible and ".." appear only in head of the path.
Then, finding common parent directory would be like this:

  let normPath = normalizedPath(path)
  let normBase = normalizedPath(baseDir)
  let m = min(normPath.len, normBase.len)
  var commonParentPos = 0
  for i in 0..<m:
    if not cmpCharInPath(path[i], baseDir[i]):
      break
    if path[i] in {DirSep, AltSep}:
      commonParentPos = i

This is not tested but I don't think iterator parentDirs is needed if input paths are normalized.

@timotheecour
Copy link
Member

This is not tested but I don't think iterator parentDirs is needed if input paths are normalized.

right

using normalizedPath code should be much simpler i hope; still remains issue of #9523 to allow 1 param relativePath

@Araq
Copy link
Member

Araq commented Dec 13, 2018

Merged my own stuff instead. But thank you for your efforts.

@bluenote10
Copy link
Contributor

Where is this functionality now?

@Araq
Copy link
Member

Araq commented Apr 14, 2019

os.relativePath

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
OS (stdlib) Related to OS modules in standard library
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants