Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue resolving generated Python module with source at root #17531

Closed
dlsmith opened this issue Nov 11, 2022 · 6 comments · Fixed by #17637
Closed

Issue resolving generated Python module with source at root #17531

dlsmith opened this issue Nov 11, 2022 · 6 comments · Fixed by #17637
Assignees
Labels

Comments

@dlsmith
Copy link

dlsmith commented Nov 11, 2022

Describe the bug
Python failed to resolve a generated module (in this case, from protobuf) when the project was structured with the Python source at the root of the package. My hypothesis is that this is due to the way Python resolves modules: it looks first in the execution directory, then to PYTHONPATH. Pants links the generated code from the sandbox via the PYTHONPATH, but if it replicates/preserves the semantics of Python, the interpreter may first look in a file tree with the parent module structure but without the generated files, leading to an import error.

I've included a minimal example below that demonstrates the issue and works around it by nesting the source root under a src/ prefix.

Pants version
2.14.0

OS
MacOS

Additional info
pants-codegen-repro.tar.gz

@dlsmith dlsmith added the bug label Nov 11, 2022
@benjyw
Copy link
Contributor

benjyw commented Nov 11, 2022

As @dlsmith intuited, this is because ./pants run on a source file runs the process with the cwd set to the repo root, Python prepends the cwd to the sys.path, and Pex preserves this sys.path entry.

The sandbox dir contains the sources (authored and generated) and is added to the sys.path. And that's where we want them to be loaded from. But if the cwd is also the source root, then Python will see the top-level package in the source tree first, load it from there, and because it's not a namespace package, it will look no further for any imports under that package. And the generated code is not in the source tree, so it won't find it.

This does not happen with Pants processes in general, because they are fully sandboxed. It's just ./pants run that runs the user process (still from a sandbox, but) with cwd set to the repo root.

@benjyw
Copy link
Contributor

benjyw commented Nov 11, 2022

Solution is to override that sys.path[0] setting somehow, looking into it

@benjyw benjyw self-assigned this Nov 11, 2022
@jsirois
Copy link
Contributor

jsirois commented Nov 11, 2022

I think this requires a Pex feature. It's the one non-hermetic affordance Pex makes, and that's due to competing goals:

  1. Isolation from system installed distributions
  2. Behave just like a Python interpreter

Here Pex chooses the Python interpreter behavior of pre-pending . to the sys.path (2) over hermeticity (1).

I think this could be both a Pex build-time option and PEX_* runtime override.

@benjyw
Copy link
Contributor

benjyw commented Nov 11, 2022

@dlsmith how disruptive is this for you?

@dlsmith
Copy link
Author

dlsmith commented Nov 12, 2022

@benjyw Not at all disruptive. I'm happy to adjust the structure of my repo to work around the issue. Though maybe it would be worth adding a warning in the documentation?

@benjyw
Copy link
Contributor

benjyw commented Nov 13, 2022

Yeah that makes sense. The docs are in the repo (under docs/) if you feel like cooking up a quick PR? :)

Long term I'd like to fix this in pex, but the priority is not high if this isn't a big issue for you, and no one else appears to have hit this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants