Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Show basename of hashing URL in debug logs #1113

Merged
merged 3 commits into from
May 23, 2020

Conversation

atugushev
Copy link
Member

@atugushev atugushev commented Apr 21, 2020

Before:

Generating hashes:
  django
    Hashing https://files.pythonhosted.org/packages/a9/4f/8a247eee2958529a6a805d38fbacd9764fd566462fa0016aa2a2947ab2a6/Django-3.0.5-py3-none-any.whl
    Hashing https://files.pythonhosted.org/packages/ed/52/1f281f39fe38d10c6c73e1c1d26a0aad5406be1108bf5f50423751ea8aa3/Django-3.0.5.tar.gz

After:

Generating hashes:
  django
    Hashing Django-3.0.5-py3-none-any.whl
    Hashing Django-3.0.5.tar.gz

Steps to test the PR:

$ docker run --rm -it python:3.8 /bin/bash
$ pip install git+https://github.com/atugushev/pip-tools@reduce-hashing-url-in-logs#egg=pip-tools
$ pip download future numpy
$ echo torch===1.5.0 | pip-compile - -v -o /tmp/requirements.txt \
    --find-links=. \
    -f https://download.pytorch.org/whl/torch_stable.html 
    --generate-hashes \
    --pip-args='--no-index' 

Changelog-friendly one-liner: Show basename of a hashing URL in pip-compile's debug logs.

Contributor checklist
  • Provided the tests for the changes.
  • Gave a clear one-line description in the PR (that the maintainers can add to CHANGELOG.md on release).
  • Assign the PR to an existing or new milestone for the target version (following Semantic Versioning).

@atugushev atugushev added the enhancement Improvements to functionality label Apr 21, 2020
@codecov
Copy link

codecov bot commented Apr 21, 2020

Codecov Report

Merging #1113 into master will not change coverage.
The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #1113   +/-   ##
=======================================
  Coverage   99.49%   99.49%           
=======================================
  Files          36       36           
  Lines        2752     2752           
  Branches      326      326           
=======================================
  Hits         2738     2738           
  Misses          8        8           
  Partials        6        6           
Impacted Files Coverage Δ
piptools/repositories/pypi.py 95.41% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4db0eb0...6ec9274. Read the comment docs.

@atugushev atugushev added the logging Related to log or console output label Apr 21, 2020
@atugushev atugushev requested a review from AndydeCleyre April 22, 2020 14:09
@atugushev atugushev force-pushed the reduce-hashing-url-in-logs branch from 3ecb3b0 to f41eee3 Compare April 22, 2020 14:10
@AndydeCleyre
Copy link
Contributor

I'm missing something here trying to replicate either the Before or After samples above; how should I be getting that?

$ rm -f requirements.txt
$ printf '%s\n' django plumbum | pip-compile -v --generate-hashes -o requirements.txt - 2>&1 | grep -A 5 'Generating'
Generating hashes:
  plumbum
  asgiref
  django
  sqlparse
  pytz

Same as on master.

@atugushev
Copy link
Member Author

Ha-ha 😄 Now hashes generates in a second because of #1109!

@atugushev
Copy link
Member Author

atugushev commented Apr 22, 2020

Could be tested with echo package_from_findlinks | pip-compile - -vo- --generate-hashes --find-links=wheels --pip-args="--no-index".

@atugushev
Copy link
Member Author

@AndydeCleyre

Steps to test the PR:

$ docker run --rm -it python:3.8 /bin/bash
$ pip install git+https://github.com/atugushev/pip-tools@reduce-hashing-url-in-logs#egg=pip-tools
$ pip download future numpy
$ echo torch===1.5.0 | pip-compile - -v -o /tmp/requirements.txt \
    --find-links=. \
    -f https://download.pytorch.org/whl/torch_stable.html 
    --generate-hashes \
    --pip-args='--no-index' 

@AndydeCleyre
Copy link
Contributor

AndydeCleyre commented Apr 23, 2020

Thanks very much!

I don't know if this will be a problem in practice, but pip's Link.url_without_fragment and Link.show_url can parse quite differently. Here's what show_url does:

return posixpath.basename(self._url.split('#', 1)[0].split('?', 1)[0])

For (a pretty bad) example:

In [1]: from pip._internal.models.link import Link                             

In [2]: l = Link('https://www.python.org/dev/peps/pep-0508/#environment-markers')                                                                              

In [3]: l.url_without_fragment         
Out[3]: 'https://www.python.org/dev/peps/pep-0508/'

In [4]: l.show_url                     
Out[4]: ''

In [5]: l = Link('https://www.python.org/dev/peps/pep-0508/')                  

In [6]: l.url_without_fragment         
Out[6]: 'https://www.python.org/dev/peps/pep-0508/'

In [7]: l.show_url                     
Out[7]: ''

So the question is: will we ever be generating hashes from links with # or ? where we don't want to exclude from the log msg everything after the first occurrence of # or ?? Or links that for some reason look like directories, ending in /?

Unrelated question: why does generating hashes (e.g. for torch) download and hash wheels for all platforms, instead of just the current one?

EDIT: I guess I have my answer to that unrelated question here.

@atugushev
Copy link
Member Author

atugushev commented Apr 24, 2020

@AndydeCleyre

So the question is: will we ever be generating hashes from links with # or ? where we don't want to exclude from the log msg everything after the first occurrence of # or ?? Or links that for some reason look like directories, ending in /?

That shouldn't be possible. But we could be more accurate with:

if link.netloc == PyPI.file_storage_domain:
    url = link.show_url
else:
    url = link.url_without_fragment

logged_url = redact_auth_from_url(url)

See related pip code.

Unrelated question: why does generating hashes (e.g. for torch) download and hash wheels for all platforms, instead of just the current one?
EDIT: I guess I have my answer to that unrelated question here.

Yeah, to be able to install requirements.txt with required hashes on all platforms.

@AndydeCleyre
Copy link
Contributor

Alright, I tried to break it and couldn't, so LGTM 👍 .


I tried making an evil folder with an evil build tag:

$ mkdir 'right#here?'
$ cd 'right#here?'
$ pip download plumbum
$ mv plumbum-1.6.9-py2.py3-none-any.whl 'plumbum-1.6.9-1#?-py2.py3-none-any.whl'
$ echo plumbum | pip-compile - -o /dev/null --generate-hashes -v --find-links=. --pip-args --no-index
<snip>
    Hashing plumbum-1.6.9-1%23%3F-py2.py3-none-any.whl
<snip>

Then I tried making an evil serve.py:

import hug

@hug.get('/plumbum-1.6.9-1#?-py2.py3-none-any.whl', output=hug.output_format.file)
def wheel(nonsense: hug.types.text):
    return '/home/andy/Code/right#here?/plumbum-1.6.9-1#?-py2.py3-none-any.whl'
$ pip install hug
$ hug -f serve.py
$ echo 'plumbum @ http://localhost:8000/plumbum-1.6.9-1%23%3F-py2.py3-none-any.whl?nonsense=bad' | pip-compile - -o /dev/null --generate-hashes -v
<snip>
  Hashing plumbum-1.6.9-1%23%3F-py2.py3-none-any.whl
<snip>

@atugushev atugushev added this to the 5.2.0 milestone May 23, 2020
@atugushev atugushev merged commit e09f9ee into jazzband:master May 23, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Improvements to functionality logging Related to log or console output
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants