tilde expansion in path names #1136

Closed
StefanKarpinski opened this issue Aug 8, 2012 · 39 comments

@StefanKarpinski
Member

@timholy implemented tilde_expand (see file.jl) some time ago and it is now used automatically by a few file-system commands, notably cd(). This concerns me, however, since while it's convenient for some things, it's inconsistently used, and it violates the treatment of strings as just data. What if the name of a file is just ~? We're now in a position where that name needs to be escaped and handled specially in code. On the other hand, tilde expansion is awfully handy.

So here's a possibility: bake tilde expansion into the backtick syntax instead. For example, we could make this work:

julia> run(`tail ~/.julia_history`)

Currently this just looks for a file literally named ~/.julia_history. Obviously, this would be handy for entering commands, but you could also use it in situations where you want shell-like treatment of any string. For example, cd(`~`) would expand to cd(ENV["HOME"]) whereas cd("~") would treat its argument literally. Very importantly, tilde expansion would be done at compile time, not based on the run-time data; this is essential to avoiding nasty surprises and corner cases. In particular, interpolated values are not tilde expanded:

julia> `~`
`/Users/stefan`

julia> file = "~";

julia> `$file`
`~`

One corollary of this approach is that file and directory operations have to work with Cmd objects, which maybe ought to be renamed in light of their broadened use-cases under this proposal. Maybe they should be called ShellString objects since backtick strings are basically strings with shell-like behavior. Except that they're also not strings, so maybe that's not a great name either.
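The literal-versus-interpolated distinction can be sketched in plain Julia. This is only an illustration under assumptions: the `Literal` and `Interp` types and the `expand_word` function are hypothetical names, not part of Base or of the backtick parser, and `homedir()` stands in for the `ENV["HOME"]` lookup mentioned above.

```julia
# Hypothetical sketch: these types and `expand_word` are illustrative only.
# A "word" is modeled as a list of parts that a parser would have produced.
struct Literal      # text typed directly inside the backticks
    text::String
end

struct Interp       # a value that arrived through `$var` interpolation
    value::String
end

# Join the parts of one word, expanding a leading `~` only when it comes
# from literal syntax. Interpolated values are always copied verbatim.
function expand_word(parts::Vector)
    out = IOBuffer()
    for (i, p) in enumerate(parts)
        if p isa Literal
            text = p.text
            if i == 1 && (text == "~" || startswith(text, "~/"))
                # Replace only the tilde; keep whatever follows it.
                print(out, homedir(), text[2:end])
            else
                print(out, text)
            end
        else
            print(out, p.value)
        end
    end
    return String(take!(out))
end
```

With this sketch, `expand_word([Literal("~")])` gives the home directory, while `expand_word([Interp("~")])` gives the one-character string `"~"`, mirroring the two REPL results above.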

@timholy
Member

timholy commented Aug 8, 2012

Just to make sure I understand, you're proposing that cd(`~/juliafunc`) would take me to /home/tim/juliafunc, but that cd("~/juliafunc") would be a literal string and look for a subdirectory of the current directory named "~"?

I kind of like it, it seems to offer control and flexibility. Especially since Windows temporary files seem to start with ~, I had to put an exception in tilde_expand for Windows. It would be good for the Windows folks to comment here, of course. @loladiro and @vtjnash ?

@StefanKarpinski
Member Author

Some interesting data points to consider about the shell's tilde expansion...

It only applies at the beginning of words:

bash-3.2$ echo ~
/Users/stefan
bash-3.2$ echo foo~
foo~
bash-3.2$ echo ~stefan
/Users/stefan
bash-3.2$ echo foo~stefan
foo~stefan

If the username does not exist, no expansion occurs:

bash-3.2$ echo ~foo
~foo

If the user name is interpolated, no expansion occurs:

bash-3.2$ var=stefan
bash-3.2$ echo ~$var
~stefan

I'm not sure we should mimic all of that, but it's worth considering.

@StefanKarpinski
Member Author

Just to make sure I understand, you're proposing that cd(`~/juliafunc`) would take me to /home/tim/juliafunc, but that cd("~/juliafunc") would be a literal string and look for a subdirectory of the current directory named "~"?

Yep, that's exactly what I'm proposing.

@Keno
Member

Keno commented Aug 8, 2012

On Windows, ~ can either denote a short DOS name if used in the middle of a path segment or (as you mentioned) mark temporary files. I am a bit torn as to whether to allow tilde expansion on Windows, though I am leaning towards allowing it.

@pao
Member

pao commented Aug 8, 2012

PowerShell allows ~ as an alias for $env:HOME (it's also able to do some fancier things).

@Keno
Member

Keno commented Aug 8, 2012

Oh, I've actually never noticed that. In that case we should definitely allow it on Windows.

@JeffBezanson
Member

Doing tilde expansion at compile time sounds like a terrible idea! To me that is absolutely wrong.

@StefanKarpinski
Member Author

The tilde expansion doesn't have to be done at compile time, but determining what is tilde expanded and what isn't would be done at compile time. So ~ could be expanded to be the equivalent of $(ENV["HOME"]). The point is that $var where var happens to be "~" should not cause tilde expansion.

@JeffBezanson
Member

Oh, carry on then :)

@timholy
Member

timholy commented Aug 8, 2012

Stefan, I'm sure you already noticed this, but for the record: of your three points, the first matches the current behavior of tilde_expand, the second does not (and it might be better if we changed that, although it does lead to system-dependent results), and the third is essentially "no" but kind of orthogonal since it would be up to the caller.

But I am presuming that a new implementation would be needed with backticks, anyway.

@vtjnash
Member

vtjnash commented Feb 1, 2013

bump. now that the file API cleanup happened [#1782], what is the status of this?

@diegozea
Contributor

diegozea commented Apr 4, 2013

Closely related to this: #2761 (comment)

@sje30
Contributor

sje30 commented May 9, 2013

Bump. I'd really like to see ~ recognised within strings as the home directory;

cd(expanduser("~"))

just seems too clunky.

@kmsquire
Member

When this was first proposed, I thought it sounded like a good idea (expanding tilde only in backticks, that is). But really, it is rather annoying that it doesn't work in most file contexts (or even in backticks yet).

I also don't quite follow the original argument that it "violates the treatment of strings as just data". What's wrong with assigning meaning to data in proper context?

As for filenames named or starting with ~: 1) they're quite unusual in unix, and 2) they could be handled correctly by escaping.

So, to me, at least, the convenience of tilde expansion on normal strings (and backtick strings) in context outweighs any perceived disadvantages...

@StefanKarpinski
Member Author

File names that begin with ~ are rare on UNIX, but not impossible, and that's exactly the point. Julia's shell/file stuff is very carefully designed to make it easy to write fully correct code without escaping. Not sure if you read my two blog posts on the subject:

  1. http://julialang.org/blog/2012/03/shelling-out-sucks/
  2. http://julialang.org/blog/2013/04/put-this-in-your-pipe/

This approach changes running external programs from a brittle, janky, awkward hack that is almost always not-quite-correct, to a completely legitimate and resilient way to write software. In fact, if you don't care about the overhead, using an external program is a great way to perfectly isolate a subtask by letting the OS take care of it.

I'm still ok with doing tilde expansion and glob expansion in backticks, because there it's syntactic, so you know it's intentional. To see the difference, path = "~/foo"; `$path/bar`.exec should produce ["~/foo/bar"] with no tilde expansion done, whereas path = "foo"; `~/$path/bar`.exec should produce ["/Users/stefan/foo/bar"] when run on my machine. The difference is that in the first case, the path just happens to be a string starting with a tilde, whereas in the second case, the tilde is intended.

@StefanKarpinski
Member Author

One of the first things to do here would be to implement correct user home directory lookup – cross-platform, of course. Even for the current user, looking at ENV["HOME"] is kind of a hack. The next step would be to implement a cross-platform globbing function that can just take a glob string and expand it into a list of path names. Then we can figure out how to hook it into the backtick machinery.
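As a point of comparison, here is a minimal sketch of the two lookup strategies, assuming a Julia version where Base provides `homedir()`. The helper names `home_from_env` and `home_from_system` are made up for this illustration.

```julia
# Hypothetical helpers for illustration only.

# Environment-based lookup: returns `nothing` when HOME is not set,
# so callers have to handle the missing case themselves.
home_from_env() = get(ENV, "HOME", nothing)

# Ask Julia's runtime for the current user's home directory instead.
home_from_system() = homedir()

println("ENV lookup:    ", something(home_from_env(), "(HOME not set)"))
println("system lookup: ", home_from_system())
```

The environment variable can be unset or overridden by the caller, which is one reason reading it directly is fragile as a general lookup.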

@vtjnash
Member

vtjnash commented Jul 31, 2015

since homedir support landed recently in libuv (libuv/libuv#350), we may want to consider switching over to that once #12266 is merged

@StefanKarpinski
Member Author

Oh nice! We should definitely do that.

@vtjnash
Member

vtjnash commented Aug 24, 2015

it looks like this proposal would provide a fairly accurate match of the sh semantics (for which tilde expansion precedes variable interpolation). The man page for bash seems to be the most detailed (and the one referenced in the testing above for #1136 (comment)). In particular, shells may differ in their handling of arg=~/path and path=$path:~/path:

$ man bash
Tilde Expansion

If a word begins with an unquoted tilde character ('~'), all of the characters
preceding the first unquoted slash (or all characters, if there is no unquoted
slash) are considered a tilde-prefix. If none of the characters in the
tilde-prefix are quoted, the characters in the tilde-prefix following the tilde
are treated as a possible login name. If this login name is the null string,
the tilde is replaced with the value of the shell parameter HOME. If HOME is
unset, the home directory of the user executing the shell is substituted
instead. Otherwise, the tilde-prefix is replaced with the home directory
associated with the specified login name.

If the tilde-prefix is a '~+', the value of the shell variable PWD replaces the
tilde-prefix. If the tilde-prefix is a '~-', the value of the shell variable
OLDPWD, if it is set, is substituted. If the characters following the tilde in
the tilde-prefix consist of a number N, optionally prefixed by a '+' or a '-',
the tilde-prefix is replaced with the corresponding element from the directory
stack, as it would be displayed by the dirs builtin invoked with the
tilde-prefix as an argument. If the characters following the tilde in the
tilde-prefix consist of a number without a leading '+' or '-', '+' is assumed.

If the login name is invalid, or the tilde expansion fails, the word is
unchanged.

Each variable assignment is checked for unquoted tilde-prefixes immediately
following a : or the first =. In these cases, tilde expansion is also
performed. Consequently, one may use file names with tildes in assignments to
PATH, MAILPATH, and CDPATH, and the shell assigns the expanded value.

My proposal then is to implement this only for a literal ~ inside a backtick at the start of an argument string. Splitting on = or : would run into platform discrepancies. Thoughts?
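A rough sketch of that restricted rule, written as ordinary string functions so it can be tried outside the parser. The names `split_tilde_prefix` and `expand_tilde_word` and the `home` keyword are assumptions made for this example, and only the empty login name is handled; `~user`, `~+`, `~-` and `~N` are left unchanged, in the spirit of the man page's "the word is unchanged" fallback.

```julia
# Hypothetical sketch, not the cmd parser's implementation.

# Split a word into its tilde-prefix (everything before the first slash)
# and the remainder. Returns `nothing` if the word does not start with `~`.
function split_tilde_prefix(word::AbstractString)
    startswith(word, "~") || return nothing
    slash = findfirst('/', word)
    slash === nothing && return (String(word), "")
    return (word[1:prevind(word, slash)], word[slash:end])
end

# Expand only a bare leading `~`. No splitting on `=` or `:` is attempted,
# so a word such as "arg=~/path" passes through untouched.
function expand_tilde_word(word::AbstractString; home::AbstractString = homedir())
    parts = split_tilde_prefix(word)
    parts === nothing && return String(word)
    prefix, rest = parts
    prefix == "~" || return String(word)   # ~user, ~+, ~-, ~N: unchanged here
    return string(home, rest)
end
```

For example, `expand_tilde_word("~/foo"; home = "/home/u")` returns `"/home/u/foo"`, while `expand_tilde_word("foo~")` and `expand_tilde_word("arg=~/path")` return their input unchanged.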

@vtjnash vtjnash self-assigned this Aug 24, 2015
@tkelman
Contributor

tkelman commented Dec 22, 2016

this hasn't had any work on it recently, moving to 1.0

@tkelman tkelman modified the milestones: 1.0, 0.6.0 Dec 22, 2016
@StefanKarpinski
Member Author

Deprecations in #19786 pave the way for this change in 1.0.

@yuyichao yuyichao removed the help wanted label Apr 11, 2017
@StefanKarpinski
Member Author

Seems like we should have this and globbing. Globbing should happen when a command is run, relative to the command's working directory. This is actually a feature since you can't use these special characters in command syntax anymore.

@StefanKarpinski StefanKarpinski modified the milestones: 1.x, 1.0 Jul 27, 2017
@StefanKarpinski StefanKarpinski added help wanted and removed needs decision labels Jul 27, 2017
@tecosaur
Contributor

Hmmm, I hope this isn't unwanted, but as a relatively new user, I thought I'd chime in with my confusion that cd("~/Documents") didn't work as I expected. I see the comments above that this may be undesirable when referring to files, but whenever a directory is being referred to, wouldn't expanding ~ make sense?

@goretkin
Contributor

I see the comments above that this may be undesirable when referring to files, but whenever a directory is being referred to, wouldn't expanding ~ make sense?

There is still the same ambiguity. You could make a directory called ~:

julia> cd("/tmp")

julia> mkpath(joinpath(pwd(), "~", "Documents"))
"/private/tmp/~/Documents"

julia> pwd()
"/private/tmp"

julia> cd("~/Documents")

julia> pwd()
"/private/tmp/~/Documents"

And that's distinct from

julia> cd(expanduser("~/Documents"))

julia> pwd()
"/Users/goretkin/Documents"

@ngharrison

ngharrison commented Dec 2, 2021

Just an idea to help users: if the path doesn't exist and the first character in the string is ~, add a suggestion to the error message about ~ not being allowed for the home directory.
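Something along these lines, as a minimal sketch. The function name `tilde_hint` and the message wording are invented for illustration and are not what Base prints.

```julia
# Hypothetical helper: returns a hint string when a missing path starts
# with `~`, and `nothing` otherwise.
function tilde_hint(path::AbstractString)
    if startswith(path, "~") && !ispath(path)
        return string("`", path, "` was not found. `~` is not expanded in plain strings; ",
                      "try expanduser(\"", path, "\") if the home directory was intended.")
    end
    return nothing
end
```

Because the check only fires when the path does not exist, a directory that really is named `~` keeps working without any extra noise.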

@tecosaur
Contributor

That would be great. This does seem like one of a number of spots where the error message could be much more helpful. Rust would probably be a good inspiration in this regard.

@vtjnash
Member

vtjnash commented Feb 10, 2024

The suggestion has been added to the error message printing for ~, and the cmd parser has its own issue for adding support for more special characters.
