The Pex CLI can fail to find pex.bin. #1272

Closed
jsirois opened this issue Mar 15, 2021 · 0 comments · Fixed by #1270
jsirois commented Mar 15, 2021

As originally reported in pantsbuild/pants#11211, if the timing is right and the right set of interpreters conforming to Pex's interpreter constraints is available, the interpreter identification / selection that runs in parallel during the PEX bootstrap phase can apparently cause the Pex PEX to fail to run.

Interpreter selection is started here in maybe_reexec_pex:
https://github.com/pantsbuild/pex/blob/1111cae72e4eea2c1681745925e50c38ec7cb3e8/pex/pex_bootstrapper.py#L471-L474

If the current interpreter:

  1. Conforms to Pex interpreter constraints (i.e.: Has a version in ">=2.7,!=3.0.*,!=3.1.*,!=3.2.*,!=3.3.*,!=3.4.*,<3.10").
  2. Is not the 1st interpreter on the PEX_PYTHON_PATH.

Then this code will short circuit some time after the 1st interpreter is yielded (i.e.: on the second interpreter or greater):
https://github.com/pantsbuild/pex/blob/1111cae72e4eea2c1681745925e50c38ec7cb3e8/pex/pex_bootstrapper.py#L158-L175
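
To make the short circuit concrete, here is a minimal sketch of the pattern, not the actual Pex code at the link above. The function name, the interpreter paths, and the fallback of returning the first candidate are all assumptions made for illustration. It shows that once the current interpreter is yielded, the remaining candidates are never requested from the lazy iterator.

```python
from typing import Iterable, Iterator, List, Optional


def select_interpreter(candidates: Iterable[str], current: str) -> Optional[str]:
    """Prefer the current interpreter; stop consuming candidates once it is seen."""
    others: List[str] = []
    for candidate in candidates:  # candidates is consumed lazily
        if candidate == current:
            return current  # short circuit: later candidates are never requested
        others.append(candidate)
    # Assumption for this sketch only: fall back to the first candidate seen.
    return others[0] if others else None


consumed: List[str] = []


def lazy_candidates() -> Iterator[str]:
    # Hypothetical interpreter paths, in PEX_PYTHON_PATH order.
    for path in ["/usr/bin/python3.8", "/usr/bin/python3.9", "/usr/bin/python3.7"]:
        consumed.append(path)
        yield path


chosen = select_interpreter(lazy_candidates(), current="/usr/bin/python3.9")
print(chosen, consumed)
```

Because the current interpreter is second in the list, the loop returns after pulling two items and the third path is never pulled from the generator.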

Since the short circuit happens on the second or later interpreter, this execute_parallel code is engaged by the lazy iterator and it spawns a single thread to manage concurrent shell-outs:
https://github.com/pantsbuild/pex/blob/1111cae72e4eea2c1681745925e50c38ec7cb3e8/pex/pex_bootstrapper.py#L103-L108
https://github.com/pantsbuild/pex/blob/1111cae72e4eea2c1681745925e50c38ec7cb3e8/pex/interpreter.py#L776-L796
https://github.com/pantsbuild/pex/blob/1111cae72e4eea2c1681745925e50c38ec7cb3e8/pex/jobs.py#L384-L404
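
The behavior described here can be sketched with a generator backed by one worker thread. This is an illustration under stated assumptions, not the execute_parallel implementation: `iter_identified`, `slow_identify`, and the thread name `identify-worker` are hypothetical. The point is that the thread only starts when the iterator is first advanced, and it keeps working after the consumer stops asking for results.

```python
import queue
import threading
import time
from typing import Callable, Iterator, List


def iter_identified(
    paths: List[str], identify: Callable[[str], str], processed: List[str]
) -> Iterator[str]:
    """Yield results lazily while a single background thread does the work."""
    results: "queue.Queue" = queue.Queue()
    done = object()  # sentinel marking the end of the work

    def worker() -> None:
        for path in paths:
            results.put(identify(path))
            processed.append(path)
        results.put(done)

    # The thread is started on the first next() call, not when the generator is created.
    threading.Thread(target=worker, name="identify-worker", daemon=True).start()
    while True:
        item = results.get()
        if item is done:
            return
        yield item


def slow_identify(path: str) -> str:
    time.sleep(0.05)  # stands in for shelling out to an interpreter
    return path.upper()


processed: List[str] = []
results_iter = iter_identified(["a", "b", "c", "d"], slow_identify, processed)
taken = [next(results_iter), next(results_iter)]  # the consumer stops after two results

# Nothing cancelled the worker, so it runs to completion in the background.
for thread in threading.enumerate():
    if thread.name == "identify-worker":
        thread.join(timeout=5)
print(taken, processed)
```

The consumer only ever sees two results, yet all four inputs end up processed, which mirrors background identification of interpreters that will never be picked.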

That iterator neither auto-cancels nor is cancelled by this Pex bootstrap code. The upshot is that when the code continues on from maybe_reexec_pex to PEX.execute(), background work can still be going on to identify candidate interpreters that will never be picked:
https://github.com/pantsbuild/pex/blob/1111cae72e4eea2c1681745925e50c38ec7cb3e8/pex/pex_bootstrapper.py#L471-L474

This is fine in and of itself, but it appears it can cause an import of pex to interleave between the unimports here and the sys.path re-arrangement below:
https://github.com/pantsbuild/pex/blob/1111cae72e4eea2c1681745925e50c38ec7cb3e8/pex/bootstrap.py#L36-L52

This causes pex to be re-imported from the .bootstrap/ instead of from the pex distribution for the Pex PEX, which is disastrous since the .bootstrap/ only contains enough of pex to bootstrap a PEX file and not the whole CLI. The result is something like:

Traceback (most recent call last):
  File "/home/ziyadedher/.cache/pants/named_caches/pex_root/unzipped_pexes/42d4f75d046397e5f014ec8afd1a39b3ba8787c6/.bootstrap/pex/pex.py", line 446, in execute
    exit_code = self._wrap_coverage(self._wrap_profiling, self._execute)
  File "/home/ziyadedher/.cache/pants/named_caches/pex_root/unzipped_pexes/42d4f75d046397e5f014ec8afd1a39b3ba8787c6/.bootstrap/pex/pex.py", line 378, in _wrap_coverage
    return runner(*args)
  File "/home/ziyadedher/.cache/pants/named_caches/pex_root/unzipped_pexes/42d4f75d046397e5f014ec8afd1a39b3ba8787c6/.bootstrap/pex/pex.py", line 409, in _wrap_profiling
    return runner(*args)
  File "/home/ziyadedher/.cache/pants/named_caches/pex_root/unzipped_pexes/42d4f75d046397e5f014ec8afd1a39b3ba8787c6/.bootstrap/pex/pex.py", line 508, in _execute
    return self.execute_entry(self._pex_info.entry_point)
  File "/home/ziyadedher/.cache/pants/named_caches/pex_root/unzipped_pexes/42d4f75d046397e5f014ec8afd1a39b3ba8787c6/.bootstrap/pex/pex.py", line 610, in execute_entry
    return runner(entry_point)
  File "/home/ziyadedher/.cache/pants/named_caches/pex_root/unzipped_pexes/42d4f75d046397e5f014ec8afd1a39b3ba8787c6/.bootstrap/pex/pex.py", line 625, in execute_pkg_resources
    runner = entry.resolve()
  File "/home/ziyadedher/.cache/pants/named_caches/pex_root/unzipped_pexes/42d4f75d046397e5f014ec8afd1a39b3ba8787c6/.bootstrap/pex/vendor/_vendored/setuptools/pkg_resources/__init__.py", line 2481, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)
ModuleNotFoundError: No module named 'pex.bin'
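
The suspected failure mode can be reproduced in miniature. The sketch below is not Pex code: `demo_pkg`, the `bootstrap` and `full` directories, and the helper functions are hypothetical stand-ins. It does the interleaved import on a single thread to show what happens when a package is re-imported after the unimport but before the sys.path re-arrangement.

```python
import importlib
import os
import sys
import tempfile
from typing import List


def make_package(root: str, name: str, submodules: List[str]) -> None:
    """Create a package directory with an empty __init__.py and the given submodules."""
    pkg_dir = os.path.join(root, name)
    os.makedirs(pkg_dir)
    open(os.path.join(pkg_dir, "__init__.py"), "w").close()
    for sub in submodules:
        with open(os.path.join(pkg_dir, sub + ".py"), "w") as fp:
            fp.write("VALUE = 'ok'\n")


def unimport(name: str) -> None:
    """Drop a package and all of its submodules from sys.modules."""
    for mod in [m for m in sys.modules if m == name or m.startswith(name + ".")]:
        del sys.modules[mod]


workdir = tempfile.mkdtemp()
bootstrap_dir = os.path.join(workdir, "bootstrap")  # partial copy, no `bin` submodule
full_dir = os.path.join(workdir, "full")  # complete copy
make_package(bootstrap_dir, "demo_pkg", [])
make_package(full_dir, "demo_pkg", ["bin"])
importlib.invalidate_caches()

sys.path.insert(0, bootstrap_dir)
importlib.import_module("demo_pkg")  # initially loaded from the partial copy
unimport("demo_pkg")

# The interleaved import: the partial copy is still first on sys.path.
importlib.import_module("demo_pkg")

# The sys.path re-arrangement arrives too late to help.
sys.path.remove(bootstrap_dir)
sys.path.insert(0, full_dir)

try:
    importlib.import_module("demo_pkg.bin")
    error = None
except ModuleNotFoundError as e:
    error = str(e)
print(error)

# Without the interleaved import, the complete copy is found as intended.
unimport("demo_pkg")
recovered = importlib.import_module("demo_pkg.bin").VALUE
sys.path.remove(full_dir)
```

The submodule lookup goes through the `__path__` of the package object already in sys.modules, so re-ordering sys.path afterwards has no effect on it.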

Note that in the chain of events described above, the critical event, the re-import of pex from a background thread, is neither observed when instrumenting imports with a sys.path_importer_hook (only a single import of encodings.ascii by encodings via the __import__ function is observed), nor does it appear by inspection that this should ever happen. In other words, it appears all pex imports happen before the thread is ever spawned.
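
For anyone wanting to repeat this kind of instrumentation, one possible approach is a logging finder on sys.meta_path that records which thread triggered each import. This is an assumption about how one might do it, not the hook used in the investigation above. `ImportLogger` and the thread name are hypothetical, and the stdlib `colorsys` module merely stands in for the module of interest.

```python
import importlib
import sys
import threading
from typing import List, Tuple


class ImportLogger:
    """A meta path finder that records import attempts and never handles them."""

    def __init__(self) -> None:
        self.records: List[Tuple[str, str]] = []

    def find_spec(self, fullname, path=None, target=None):
        # Only called for modules that are not already in sys.modules.
        self.records.append((fullname, threading.current_thread().name))
        return None  # defer to the regular finders


logger = ImportLogger()
sys.meta_path.insert(0, logger)
try:
    sys.modules.pop("colorsys", None)  # force a fresh import for the demonstration
    thread = threading.Thread(
        target=importlib.import_module, args=("colorsys",), name="background-import"
    )
    thread.start()
    thread.join()
finally:
    sys.meta_path.remove(logger)

print(logger.records)
```

A finder like this only sees imports that miss sys.modules, so a module that is already cached will not show up in the log.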

@jsirois jsirois added the bug label Mar 15, 2021
@jsirois jsirois self-assigned this Mar 15, 2021
jsirois added a commit that referenced this issue Mar 15, 2021
Although there is currently no identified way this can happen, this fix
is known to work for one victim of #1272.

Fixes #1272