Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression in _populate_deps(?) for --venv PEXes #2088

Closed
huonw opened this issue Mar 8, 2023 · 1 comment · Fixed by #2090
Closed

Regression in _populate_deps(?) for --venv PEXes #2088

huonw opened this issue Mar 8, 2023 · 1 comment · Fixed by #2090
Assignees

Comments

@huonw
Copy link
Collaborator

huonw commented Mar 8, 2023

It seems f723302 was a significant performance regression for --venv PEXes,especially for PEXes with many small files. Potentially it's a correctness fix for which we just have to eat the performance impact. In addition to a regression, this time doesn't seem to be bracketed by any of the tracing spans, and so isn't obvious in logs.

The table compares the cold-cache start-up time for two packages that are both fairly large, but with drastically different numbers of files. The table compares 2.1.119 and 2.1.120, but I separately bisected the regression to f723302.

package wheel size number of files time 2.1.119 (s) time 2.1.120 (s)
semgrep==1.14.0 27MB 123 1.4 1.8
plotly==5.13.1 15MB ~12k 2.5 6.6

Reproducer, using ts -i to record the incremental timings in text (leading number on each line of the output):

# 2.1.119
curl -L https://github.com/pantsbuild/pex/releases/download/v2.1.119/pex -o pex-2.1.119

python pex-2.1.119 --venv -m platform semgrep==1.14.0 -o semgrep_via_119.pex
time PEX_VERBOSE=1 PEX_ROOT=$(mktemp -d) python semgrep_via_119.pex 2>&1 | ts -i '%.s'

python pex-2.1.119 --venv -m platform plotly==5.13.1 -o plotly_via_119.pex
time PEX_VERBOSE=1 PEX_ROOT=$(mktemp -d) python plotly_via_119.pex 2>&1 | ts -i '%.s'

# 2.1.120
curl -L https://github.com/pantsbuild/pex/releases/download/v2.1.120/pex -o pex-2.1.120

python pex-2.1.120 --venv -m platform semgrep==1.14.0 -o semgrep_via_120.pex
time PEX_VERBOSE=1 PEX_ROOT=$(mktemp -d) python semgrep_via_120.pex 2>&1 | ts -i '%.s'

python pex-2.1.120 --venv -m platform plotly==5.13.1 -o plotly_via_120.pex
time PEX_VERBOSE=1 PEX_ROOT=$(mktemp -d) python plotly_via_120.pex 2>&1 | ts -i '%.s'

Output for plotly, with 2.1.119:

...
0.158249 pex: Laying out PEX zipfile /Users/huon/tmp/pex-perf-regression/plotly_via_119.pex
0.000613 pex: Laying out PEX zipfile /Users/huon/tmp/pex-perf-regression/plotly_via_119.pex :: Installing /Users/huon/tmp/pex-perf-regression/plotly_via_119.pex to /private/var/folders/sv/vd266m4d4lvctgs2wpnhjs9w0000gn/T/tmp.K8DBvXx7/unzipped_pexes/9b545f5764cfad9084f31d8fa169a9d3b12acb9d
1.749525 pex: Laying out PEX zipfile /Users/huon/tmp/pex-perf-regression/plotly_via_119.pex: 1750.2ms
0.000070 pex:   Installing /Users/huon/tmp/pex-perf-regression/plotly_via_119.pex to /private/var/folders/sv/vd266m4d4lvctgs2wpnhjs9w0000gn/T/tmp.K8DBvXx7/unzipped_pexes/9b545f5764cfad9084f31d8fa169a9d3b12acb9d: 1749.7ms
0.001437 pex: Executing installed PEX for /Users/huon/tmp/pex-perf-regression/plotly_via_119.pex at /private/var/folders/sv/vd266m4d4lvctgs2wpnhjs9w0000gn/T/tmp.K8DBvXx7/unzipped_pexes/9b545f5764cfad9084f31d8fa169a9d3b12acb9d
0.242556 pex: Testing /Users/huon/.pyenv/versions/3.9.10/bin/python3.9 can resolve PEX at /private/var/folders/sv/vd266m4d4lvctgs2wpnhjs9w0000gn/T/tmp.K8DBvXx7/unzipped_pexes/9b545f5764cfad9084f31d8fa169a9d3b12acb9d
0.008913 pex:   Testing /Users/huon/.pyenv/versions/3.9.10/bin/python3.9 can resolve PEX at /private/var/folders/sv/vd266m4d4lvctgs2wpnhjs9w0000gn/T/tmp.K8DBvXx7/unzipped_pexes/9b545f5764cfad9084f31d8fa169a9d3b12acb9d: 8.9ms
0.052860 pex: Hashing pex
0.010083 pex: Hashing pex: 10.1ms
0.000098 pex: Isolating pex
0.000277 pex: Isolating pex :: Extracting pex to /private/var/folders/sv/vd266m4d4lvctgs2wpnhjs9w0000gn/T/tmp.K8DBvXx7/isolated/f2df3e7415f34f911478882907d6c86e4750ccaa
0.064839 pex: Isolating pex: 65.1ms
0.000036 pex:   Extracting pex to /private/var/folders/sv/vd266m4d4lvctgs2wpnhjs9w0000gn/T/tmp.K8DBvXx7/isolated/f2df3e7415f34f911478882907d6c86e4750ccaa: 64.7ms
0.149548 pex: Re-writing /private/var/folders/sv/vd266m4d4lvctgs2wpnhjs9w0000gn/T/tmp.K8DBvXx7/venvs/9b545f5764cfad9084f31d8fa169a9d3b12acb9d/547bc0ed92f40a50fbeafded4a247435b07c2533.lck.work/bin/activate
0.000239 pex: Re-writing /private/var/folders/sv/vd266m4d4lvctgs2wpnhjs9w0000gn/T/tmp.K8DBvXx7/venvs/9b545f5764cfad9084f31d8fa169a9d3b12acb9d/547bc0ed92f40a50fbeafded4a247435b07c2533.lck.work/bin/activate.csh
0.000226 pex: Re-writing /private/var/folders/sv/vd266m4d4lvctgs2wpnhjs9w0000gn/T/tmp.K8DBvXx7/venvs/9b545f5764cfad9084f31d8fa169a9d3b12acb9d/547bc0ed92f40a50fbeafded4a247435b07c2533.lck.work/bin/activate.fish
0.042920 macOS-12.5.1-arm64-arm-64bit
PEX_VERBOSE=1 PEX_ROOT=$(mktemp -d) python plotly_via_119.pex 2>&1  1.07s user 1.25s system 92% cpu 2.503 total
...

Output for plotly with 2.1.120 is broadly the same, except for the last few lines:

...
0.000250 pex: Re-writing /private/var/folders/sv/vd266m4d4lvctgs2wpnhjs9w0000gn/T/tmp.dnHmgqeE/venvs/d3179052216c59d19f39c057259d9b6f94635d28/547bc0ed92f40a50fbeafded4a247435b07c2533.lck.work/bin/activate.csh
0.000249 pex: Re-writing /private/var/folders/sv/vd266m4d4lvctgs2wpnhjs9w0000gn/T/tmp.dnHmgqeE/venvs/d3179052216c59d19f39c057259d9b6f94635d28/547bc0ed92f40a50fbeafded4a247435b07c2533.lck.work/bin/activate.fish
4.125079 macOS-12.5.1-arm64-arm-64bit
PEX_VERBOSE=1 PEX_ROOT=$(mktemp -d) python plotly_via_120.pex 2>&1  1.52s user 4.61s system 93% cpu 6.568 total
...

In particular there's 4.1s of extra work happening silently (even at PEX_VERBOSE=9) after the last "Re-writing" line.

A py-spy flame graph of a similar PEX (built with 2.1.125) hints at some possibilities for the location of the regression. Unfortunately, I haven't been able to get this recorded reliably, so it's not the PEX in the reproducer above, nor do I have a trace of 2.1.119 to compare.

17565-2023-03-07T13-27-24+11-00

(As always, thanks for PEX; it is a reliable workhorse in our deploys. 👍 )

@huonw huonw changed the title Regression in _populate_deps for --venv PEXes Regression in _populate_deps(?) for --venv PEXes Mar 8, 2023
@jsirois
Copy link
Member

jsirois commented Mar 8, 2023

You bisected the right change. As you may notice, that change purports to be a fix for loose layout PEXes - which you don't use. So likely the Pex misidentifes itself.

diff --git a/pex/pex.py b/pex/pex.py
index eebab84..feb50d4 100644
--- a/pex/pex.py
+++ b/pex/pex.py
@@ -172,6 +172,7 @@ class PEX(object):  # noqa: T000
         # type: () -> Layout.Value
         if self._layout is None:
             self._layout = Layout.identify(self._pex)
+        print(">>> I'm {} and my layout is: {}".format(self._pex, self._layout), file=sys.stderr)
         return self._layout

     def pex_info(self, include_env_overrides=True):

Shows:

$ python -mpex cowsay -c cowsay -ocowsay.pex --venv
>>> I'm /home/jsirois/.pex/pip-20.3.4-patched.pex/3a5d8c60b769730f991f150aaa45975637d8ae30 and my layout is: loose
$ rm -rf ~/.pex
$ ./cowsay.pex 'Moo!'
>>> I'm /home/jsirois/.pex/unzipped_pexes/33d2aea7a351e1778d02fa3b66ec67f811445092 and my layout is: loose
  ____
| Moo! |
  ====
    \
     \
       ^__^
       (oo)\_______
       (__)\       )\/\
           ||----w |
           ||     ||

I'm ~afk through March 14th and will fix this then if no-one else has. But that's the issue.

@jsirois jsirois self-assigned this Mar 9, 2023
This was referenced Mar 9, 2023
jsirois added a commit to jsirois/pex that referenced this issue Mar 9, 2023
The change in pex-tool#2033 that safeguarded loose `--venv` PEXes such that they
could be executed, moved and continue to execute properly from their new
location. That fix silently broke zipapp and packed layout `--venv` PEX
symlinking. Although `--venv` PEXes for those layouts continued to work,
they used copying to build their venvs instead of symlinking even when
symlinking (the default for `--venv` PEXes) was the intended style.

Fixes pex-tool#2088
jsirois added a commit that referenced this issue Mar 9, 2023
The change in #2033 that safeguarded loose `--venv` PEXes such that they
could be executed, moved and continue to execute properly from their new
location, silently broke zipapp and packed layout `--venv` PEX
symlinking. Although `--venv` PEXes for those layouts continued to work,
they used copying to build their venvs instead of symlinking even when
symlinking (the default for `--venv` PEXes) was the intended style.

Fixes #2088
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants