Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

from ..foo import * glob imports are silently ignored by Rust parser #19248

Closed
huonw opened this issue Jun 5, 2023 · 0 comments · Fixed by #19249
Closed

from ..foo import * glob imports are silently ignored by Rust parser #19248

huonw opened this issue Jun 5, 2023 · 0 comments · Fixed by #19249
Assignees
Labels
backend: Python Python backend-related issues bug
Milestone

Comments

@huonw
Copy link
Contributor

huonw commented Jun 5, 2023

Describe the bug

The new Rust parser doesn't seem to understand a 'plain' parent import, like from ..file1 import *. It does understand from ..file1 import func, though.

This reproducer constructs a directory structure like:

.
├── pants.toml
└── src
    └── a
        ├── BUILD
        ├── file1.py
        └── file2.py

where file2.py has from .file1 import *. It runs pants peek src/a/file2.py with both the python parser (which shows a dependency on src/a/file1.py) and with the Rust parser (which doesn't). Unlike #19173, there's no inference warnings either.

cd $(mktemp -d)

cat > pants.toml <<EOF
[GLOBAL]
pants_version = "2.17.0rc0"

backend_packages = ["pants.backend.python"]

[python]
interpreter_constraints = [">=3.8"]

[python-infer]
use_rust_parser = false
EOF

# set up the file structure:
mkdir -p src/a
touch src/a/file1.py
echo "from .file1 import *" > src/a/file2.py

echo "python_sources()" > src/a/BUILD

tree .

# OK: with Python parser `"dependencies": ["src/a/file1.py"]`
pants peek src/a/file2.py

# flip to the rust parser
sed -i '' 's/use_rust_parser = false/use_rust_parser = true/' pants.toml

# BUG: with Rust parser `"dependencies": []`
pants peek src/a/file2.py

Pants version
2.17.0rc0

OS
macOS

Additional info
#19173

@huonw huonw added bug backend: Python Python backend-related issues labels Jun 5, 2023
@huonw huonw changed the title from ..foo import * are silently ignored by Rust parser from ..foo import * glob imports are silently ignored by Rust parser Jun 5, 2023
@huonw huonw self-assigned this Jun 5, 2023
@huonw huonw added this to the 2.17.x milestone Jun 5, 2023
@huonw huonw closed this as completed in ea97fa8 Jun 6, 2023
github-actions bot pushed a commit that referenced this issue Jun 6, 2023
…ser (#19249)

This fixes #19248 by implementing explicit handling for `*` wildcard
imports in the Rust-based Python dependency inference parser.

The commits are somewhat individually reviewable:

1. switching how the `insert_import` arguments are consumed from the
syntax tree, for `from ... import ...` statements in particular:
   - before: `name` and `module_name` correspond to `import $name`,
     `"$name"`, `from $module_name import $name`
   - after: `base` and `imported` correspond to `import $base`, `"$base"`,
     `from $base import $imported`
   - This is in particular flipping the last one, and changes whether the
     `from ...` part is the optional arg (before) or not (after).
   - The motivation is that there's more functional similarities between
     the `from ...` part and a plain `import ...` statement, than between the
     `import ...` part and a plain `import ...` statement, despite the
     superficial similarities of it! (`from foo import bar` is a little like
     `import foo as __secret; bar = __secret.bar`.)
2. Add some tests that fail
3. Fix the bug
4. (and others) some tweaks, including defence-in-depth against similar
    problems happening in future
huonw pushed a commit that referenced this issue Jun 6, 2023
…ser (Cherry-pick of #19249) (#19255)

This fixes #19248 by implementing explicit handling for `*` wildcard
imports in the Rust-based Python dependency inference parser.

The commits are somewhat individually reviewable:

1. switching how the `insert_import` arguments are consumed from the
syntax tree, for `from ... import ...` statements in particular:
   - before: `name` and `module_name` correspond to `import $name`,
     `"$name"`, `from $module_name import $name`
   - after: `base` and `imported` correspond to `import $base`, `"$base"`,
     `from $base import $imported`
   - This is in particular flipping the last one, and changes whether the
     `from ...` part is the optional arg (before) or not (after).
   - The motivation is that there's more functional similarities between
     the `from ...` part and a plain `import ...` statement, than between the
     `import ...` part and a plain `import ...` statement, despite the
     superficial similarities of it! (`from foo import bar` is a little like
     `import foo as __secret; bar = __secret.bar`.)
2. Add some tests that fail
3. Fix the bug
4. (and others) some tweaks, including defence-in-depth against similar
    problems happening in future
huonw added a commit that referenced this issue Jul 5, 2023
This sets up a GHA workflow with a matrix that tests various public
repos against various of pants, using `scie-pants`'s ability to override
`pants.toml`. This is inspired by Rust's
https://github.com/rust-lang/crater ability to test changes and validate
releases across the Rust ecosystem, but is a very much more limited
approach.

This has a few different goals:

- validate releases by running them across public repos
- catch bugs in new features like the Rust Python parser, by forcibly
activating them and running them globally
- get a sense of how people use pants

I'm imagining that a motivated individual might run each release through
this and then skim for problems and file bugs, and/or someone that's
implemented a new feature might activate it (using environment
variables) to look for bugs.

Specifically, this workflow checks out the `pantsbuild/example-...`
repos plus a handful of repos from
https://github.com/search?q=path%3Apants.toml&type=code, and then runs
basic goals like `pants lint ::`, both with the repo's default
configuration, and then with an explicit version/options override.
Running the repo's default configuration as a baseline ensures that we
can compare existing behaviour to new behaviour.


![image](https://github.com/pantsbuild/pants/assets/1203825/d3625e18-9bfd-4d03-9d74-f87ca0db1fd4)

For instance, https://github.com/huonw/pants/actions/runs/5377555807 is
a run against 2.17.0rc1. It shows a few things:

- scie-pants or maybe pants itself crashes on `pants version`, in some
cases:
https://github.com/huonw/pants/actions/runs/5377555807/jobs/9756174075
(this reproduces reliably in these testing runs, but I haven't
investigated)
- several repositories (including `example-docker`) will need fixes for
2.17:
- removing `pants.backend.python.goals.setup_py` breaks several
repositories that have custom plugins that use it
- `example-docker` uses a deprecated feature that's removed in 2.17 (and
indeed, the baseline shows that there's a warning about it)
- `OpenSaMD` seems to have restrictions on the black version that are no
longer satisfied in 2.17?
  - `treb` fails due to the black upgrade changing formatting

https://github.com/huonw/pants/actions/runs/5377729654 is a run against
2.17.0a1, with the Rust parser enabled
(`PANTS_PYTHON_INFER_USE_RUST_PARSER=true`), which contains some bugs
(#19248, #19173) that I found by running that version on my work repo.
This is thus simulating that sort of exploratory run. AFAICT, not many
people use relative imports, so it doesn't actually find those bugs, but
it does find #19174 (e.g.
https://github.com/huonw/pants/actions/runs/5377729654/jobs/9756587004#step:16:16,
https://github.com/huonw/pants/actions/runs/5377729654/jobs/9756585394#step:10:18).

This is a proof of concept. I think there's a few decision points:

1. Experiment in or out of tree:
1. Leave it out of tree, and let me remember to run it occasionally, and
report the results
   2. Land it in tree, and others can run it too
2. How much effort to put into getting random external repos to run?
Lots of the repositories don't have all their goals run automatically,
e.g. `pants tailor ::` or `pants fmt ::` make changes, pants test ::` or
`pants package ::` require external libs to be installed, or `pants test
::` requires external processes to be already running:
1. Keep our configuration up to date, for the moving target of each
external repository
2. Define a protocol that repositories can opt-in to, e.g. two aliases
that get run in succcession:
1.`pants pants-public-repository-testing-set-up`, expected to be `run
shell.sh` or similar to install any external dependencies/start any
services
2. `pants pants-public-repository-testing-run`, expected to be any goals
that are good for testing, e.g. `tailor --check lint check package ::`,
or whatever
4. Implement more features to allow more of these to work automatically
(quite hard, I imagine)
5. Related to that, how to maintain the list of public repositories? If
we did the alias protocol idea, we could have a semi-automated search
for repositories that opted in. Do we have expectations on repositories?
E.g. their baselines should pass without warnings?
4. Maybe this should be running automatically after each release?

I think there's enough approvals to say 2, land it for experimentation.

Fixes #249.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend: Python Python backend-related issues bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant