Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restore original path when DO of a script fails #2374

Open
lkppo opened this issue Apr 27, 2019 · 9 comments
Open

Restore original path when DO of a script fails #2374

lkppo opened this issue Apr 27, 2019 · 9 comments

Comments

@lkppo
Copy link

lkppo commented Apr 27, 2019

DO doesn't restore the original path if code being executed in a subfolder fails. This seems inconsistent.

It can be improved by trapping the error, restore the path and propagate the error to the caller.

This bug can be reproduced by creating a buggy script in a subfolder and launching it from a rebol interpreter where the current path is not the subfolder of the buggy script.

  1. Create a folder "subfolder" and put a buggy script myscript.r in that folder with a code that make an error of execution
  2. Launch an interpreter
  3. Let's say if you >> pwd you will get something like == %mytopfolder/
  4. If you >> do %subfolder/myscript.r the buggy script will raise an error
  5. Running >> pwd gives == %mytopfolder/subfolder/

Now the current directory is the subfolder because the script crashed. The context of execution is not restored.

Maybe other things need also to be restored on recovery.

@hostilefork hostilefork changed the title DO doesn't restore the path the code executed fails Restore original path when DO of a script fails Apr 27, 2019
@gchiu
Copy link

gchiu commented Apr 27, 2019

this is an annoying issue so if we decide to fix it, then great.

@Mark-hi
Copy link

Mark-hi commented Apr 27, 2019

NOTE: this comment applies to changes to the current directory that take place from inside the running script. Please see my further comment below for the real story of this issue!

The current directory is part of the process environment, like env-vars, open files, ports, etc.
You might as well ask for files that the script opened (before the error) to be closed, ports closed by the script to be (re-)opened, and environment variables changed by the script to be reset to their original values. Can't be done, and don't think about doing it, because it's wrong.
If you want to do what this issue says, use CALL to spawn another <redbol> process. It will inherit the current directory (among other things), but CDs inside it will not affect the calling environment.

@rgchris
Copy link

rgchris commented Apr 27, 2019

@Mark-hi makes a good point—though I'd say as a temporary change of directory is a feature of DO FILE, perhaps—if a failed script can be detected—it should revert unless the current path has been explicitly changed by that script.

@Mark-hi
Copy link

Mark-hi commented Apr 28, 2019

Thank you for mentioning this @rgchris! My apologies to you and @lkppo, I misunderstood this issue report completely.
I thought it was the script itself that changed its current directory.
If DO changes the current directory before calling the script, that's a different story, and I totally agree that that is super not good.
Ever since Rebol 2 at least, scripts are supposed to have access to the directory they reside in (via system/script/path). If DO changes to that directory before running the script, that means that that would never be necessary, as it would always be the same as what WHAT-DIR returns at script startup.
I just tried it out and you are right, in both Rebol 2 and Rebol 3 Alpha, no matter how you call a script, system/script/path and what-dir are equal when it starts.
Clearly a big bad bug!

@lkppo
Copy link
Author

lkppo commented Apr 28, 2019

No problem. I too have fallen a thousand times in this way. :-))

I was surprised that this bug is also in Rebol 2 (I'm learning the language and Ren-c). Not knowing the whole story I was wondering if the path is the only contextual data affected.

@hostilefork
Copy link
Member

hostilefork commented Apr 28, 2019

Technical point to bring into the discussion: the notion of "current directory" bears some scrutiny.

For instance: if you do http://example.com/some-script.reb, the desire has been to think of the execution "directory" as being http://example.com/ so that other scripts are fetched relative to that URL.

But the filesystem runtime (e.g. Windows SetCurrentDirectory() or POSIX chdir()) can't hold a state that's a URL path.

Ren-C was changed so that the "current directory" reported by WHAT-DIR is stored in a Rebol system variable. (well, @Mark-hi points out that there was a variable before, but it just was wired in somewhere you won't find in the source to WHAT-DIR in particular). Anyway, it is allowed to be a URL! or a FILE!. However, if you ask WHAT-DIR and it's a file path...it currently re-syncs that independent notion with whatever the filesystem happens to think it is. (If the filesystem "extension" is loaded...which it is not in the web build)

I'm not sure what this implies for the notion of rolling things back. But just mentioning it.

@Mark-hi
Copy link

Mark-hi commented Apr 28, 2019

(The following are the meanderings of a mildly addled mind. None of this is canon, or written down anywhere else, yet.)
I like the way Rebol 2 separated two location notions, that of "current directory" and of "script source".
Current directory is a process concept, namely, "where I am executing", for locating (usually data) local files the script or module might read or write, and is exposed through what-dir.
Script source is a language concept, namely, "where I am stored", and is for locating related (usually code) local scripts or modules that might be run or imported, and is exposed through system/script/path.
The code/data distinction is actually unimportant; the important thing to note here is the access requirement, to wit, one may require write access while the other is explicitly limited to the read-only operations of "running" and "importing".
Slightly coincidentally, URLs are read-only.
Lastly, it is difficult for me to imagine a scenario requiring a script or module to be "executing in" a URL directory -- and that's not just considering that it must be read-only, or how easily such a location could be passed in as an argument. Barring a convincing example of such, it is certainly feasible that there be a current directory that is different (and is blank in network-only environments) and that any "related" read-only stuff be run or imported via the script source location.
With all this in mind, I respectfully suggest that system/script/path is the one that needs to be able to be URL! or FILE!, but that what-dir must always be a FILE! (or blank).

@Oldes
Copy link

Oldes commented Apr 29, 2019

In above commit the system/script and directory location is restored in all cases, including an error state.

@hostilefork
Copy link
Member

hostilefork commented Sep 3, 2021

Better late than never when looking at these issues... :-/

TL;DR

DO of a script by FILE! or URL!--including scripts invoked on the command line--should not change the working directory. Instead, a different syntax (TAG!) is used to locate resources relative to the executing script.

(This replaces a prior convention of using tags to lookup modules by shorthand in the module repository, e.g. import <json>. This will now use another notation, e.g. import @json.)

Rationale

Because DO takes arguments from its caller, it needs a way of understanding those arguments in context. And when a user passes relative paths in on the command line, those paths are relative to their working notion of what the directory is.

Information about the caller's notion of working directory can't be thrown out. And the most straightforward way to preserve the information is simply to not change the directory in the first place.

-But- this means a solution is needed for what changing the directory was intended to address: Making it easier for scripts to find resources relative to their own file location. The solution is to use TAG! as a syntax for looking for things relative to system.script.path.

Behavior of IMPORT

The IMPORT command is different, because it loads the module only once...and hence it can't make any particular assumptions about the directory it is being called from.

This might mean that the current directory should be set to NULL, as a way of preventing misconceptions people might have about the relationship between IMPORT and the script that imports it being 1:1

(For the near term, the current behavior of setting the working directory to that of the module being imported can be left as-is.)

Application to Other Functions

Among functions that would want to honor this convention would be LOAD. The loading of data and resources has similar issues: a script may wish to load things the client asked it to load, or it may wish to load data that lives relative to wherever it is.

Hence a service function will be available for any other loading-oriented function that wants to translate TAG!s or shorthands to actual URL! or FILE! paths.

Example

Let's say you're in this situation on the command line:

/home/me$ r3 /usr/tools/script.r ./launchme.r

Or equivalently, this situation inside the interpreter:

>> what-dir
== %/home/me/

>> do/args %/usr/tools/script.r ["./launchme.r"]

Then the contents of %script.r will be processed as such:

Rebol [File: %script.r, Notes: {Located in %/usr/tools/}]

import <helpers/stuff.r>  ; gets %/usr/tools/helpers/stuff.r
data: load <smiley 32x32.png>  ; %"/usr/tools/smiley 32x32.png"

file: as file! system.script.args.1  ; will be %./launchme.r
do file  ; will run %/home/me/launchme.r

hostilefork added a commit to hostilefork/corpus that referenced this issue Sep 3, 2021
As a new rule for Ren-C as a whole, scripts don't change the working
directory when invoked from the command line.  Instead, the feature
of finding resources relative to the script location is done by using
TAG! values as the argument to DO, LOAD, IMPORT:

metaeducation/rebol-issues#2374 (comment)

So long as the note needs changing, this goes ahead and mentions
that the whitespace interpreter is being used as a testbed for the
UPARSE parser that is written entirely in interpreted usermode
code, and is thus glacially slow compared to native code.  The
native code rewrite is being held off until a working stream
parsing design has been achieved.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants