Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue730 #226

Draft
wants to merge 44 commits into
base: main
Choose a base branch
from
Draft

Issue730 #226

wants to merge 44 commits into from

Conversation

FlorianPommerening
Copy link
Member

No description provided.

TABCOMPLETION.md Show resolved Hide resolved
driver/tab_completion.py Outdated Show resolved Hide resolved
driver/tab_completion.py Show resolved Hide resolved
driver/tab_completion.py Outdated Show resolved Hide resolved
driver/tab_completion.py Show resolved Hide resolved
src/search/command_line.cc Outdated Show resolved Hide resolved
build.py Show resolved Hide resolved
build.py Show resolved Hide resolved
build.py Outdated Show resolved Hide resolved
build.py Outdated Show resolved Hide resolved
build.py Show resolved Hide resolved
@maltehelmert
Copy link
Contributor

A few general points that we discussed live:

  • This changes the translator to no longer be fully black box. There are good reasons for and against this, and perhaps we want to discuss this more. If we keep it this way, I think the parts of the translator that are meant to be used as a library from the outside should say this clearly in the py file (if that's not already the case) so that in future changes where we touch these files, we're aware that they are used outside of the regular use of the translator.

  • We are generally happy to convert (const) char * to string implicitly when calling a function that takes a string, so using static_caststd::string(foo) is a bit of a stylistic digression. If you prefer to be explicit, personally I'd be much happier to write this using constructor call syntax "string(foo)" because casts always read a bit like a warning flag, but no warning flag is justified in the relevant places.

  • A few new functions look quite long to me; I wonder if it would make sense to break them up.

  • In the driver, I think it would be good to keep the information about where the various executables are located in one place. The new code replicates this in a few places. (I think we already have a canonical place for this information, but I'd have to search what it is.)

@maltehelmert
Copy link
Contributor

Like we discussed, I added comments for the points I noticed, but this is not a code review because I didn't check most of the larger cases of moved code, I skipped over parts I couldn't easily follow, and I didn't think much about the bigger picture, only/mostly comments that were very local to the code I looked at, such as style things.

@SimonDold SimonDold marked this pull request as draft July 19, 2024 12:16
@SimonDold SimonDold marked this pull request as ready for review July 19, 2024 12:16
@speckdavid speckdavid marked this pull request as draft July 19, 2024 12:17
env["COMP_POINT"] = str(comp_point)
env["_ARGCOMPLETE"] = "1"
env["_ARGCOMPLETE_STDOUT_FILENAME"] = f.name
subprocess.check_call([sys.executable, python_file], env=env)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, this uses a different mechanism to call the translator than the one we use when we otherwise call the translator from the driver.

For "regular" use, we call the translator as an executable, so we use the Python executable found by its shebang line. Here we explicitly use sys.executable instead, so we call Python rather than calling the translator itself. So if the Python executable on the PATH is python3.10 but we manually call python3.11 fast-downward.py, then when we call the translator as a component, we treat it as an executable (which would use Python 3.10), but when we call it in tab completion mode, we would override its shebang line with sys.executable (using Python 3.11).

I think it makes more sense to always treat it as an executable or to always treat it as Python source file not intended as an executable, and I think I would prefer the former because it treats the component as more of a black box, which I think is good.

If someone has a good argument for using sys.executable, we could also use that in both places, but I don't think we should mix.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only other place where the translator is called that I could find was in run_components.py where it is also called with sys.executable (https://github.com/aibasel/downward/blob/main/driver/run_components.py#L65). We cannot simply reuse this code, because it does a lot of additional stuff: setting up resource limits, handling how output is printed, etc. It also takes an args argument that is assumed to have arguments for the call in place. Internally, it uses get_error_output_and_returncode which is meant for user-facing execution (e.g., it prints the call string to the log which would be a problem for us here because it would be interpreted as a completion suggestion). I don't think the overlap is big enough that we have to extract some common functionality here.

As to the point about using the shebang: I thought the shebang is only considered if you use subprocess with shell=True which I thought to be a security red flag. I see the point of overwriting the shebang this way, but I don't think this is necessarily a problem. Say I have a Python venv and call the driver with the Python version from that venv (without activating it). Then the translator keeps using the venv, which I wouldn't find surprising as a user who doesn't know about the internals of Fast Downward. I think what I'm trying to argue is that if sys.executable and the executable on the PATH are not the same, then the user already specified a preference for one over the other in the call to the driver so it is OK to keep using it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I looked one level less deeply into the code; not into the code that calls the translator but the code that finds the translator. Because that function was called "get_executable", I assumed that this is then used as an executable, but actually it isn't any more. I see now that this was changed as part of issue733 in order to be able to do experiments that compare the translator on different Python versions by calling the driver with different Python versions.

No, the shebang and shell=True are not related. (And indeed shell=True needs a lot of care to avoid security issues.) Linux has no notion of "file extensions" inside, a filename is just a string. So if I run a file as an executable, Linux needs some way to figure out how to run it. So it looks inside the file. If the file has a recognized binary format (usually ELF), then the kernel knows how to run ELF binaries. Otherwise it looks at the first line to find out which other executable to use in order to invoke the program. It's an OS mechanism, not a shell mechanism. (This is different from Windows, which registers specific file extensions with specific programs and has no real notion of an executable script.)

I'm OK with continuing to use get_executable, though I do find it surprising behaviour. (I understand the use case from issue733, although it could also have been achieved by setting PATH.) I don't know other programs that behave in this way when calling another Python program that is intended as a separate executable. Because for me it's a very unintuitive thing to do, I don't like encoding this logic twice.

I think as a user I would be surprised if a venv takes precedence over the installed version if I don't activate it. Perhaps this depends on whether you think of the driver and translator as independent programs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(BTW, looks like Linux now also has a registration mechanism that can use file extensions as part of the binfmt_misc infrastructure. Not relevant to what we're doing here, but it means that "Linux has no notion of file extensions" is not true; binfmt_misc does support dispatch based on file extensions, although I haven't seen an example of this in use.)

Copy link
Contributor

@maltehelmert maltehelmert Jul 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In any case, I see why you find this difficult to untangle. In my view, part of the reason for this is that the {_,}get_..._executable functions don't have a very good design with the way they handle errors in two different ways depending on an argument passed in. I think in a cleaner design they would not have this argument and should instead raise an exception in cases where they fail. These exceptions could then be handled in different ways by the different callers and the if downward etc. tests avoided. Then if we want to call the translator as [sys.executable, translator_python_file], I'd have this list be what get_translator_executable (perhaps with a different name; it's not a good name currently as described, but also not a good name if it returns this list; something like get_translator_command could be better when returning a list) returns. And rather than a method for calling generic Python code, I wouldn't parameterize it with the translator source file, but always keep the information on how to call the translator as a unit, so that we can use get_translator_command when we need to construct the command-line, i.e., in the line that all these comments comment on.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We tried to write another version that uses get_*_command() instead of get_*_executable() and trows exceptions there. This now has the bundling of [sys.executable, translate.py] at only one place, but now we repeat some error handling code instead. We also moved the lookup of the executable further inside to _get_completions_from_translator. We didn't move it here (_call_argcomplete), because the surrounding function _get_completions_from_translatoralso needs it to compute the line for which we want to get completions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The commit that addresses these comments is a239859. Does it resolve this comment?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants