Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Permit pipenv and other tools to report itself as the downloader in pip's user agent #5424

Closed
wants to merge 1 commit into from

Conversation

theacodes
Copy link
Member

This will allow pipenv to report itself as the downloader, which in turn, will allow us to measure pipenv adoption. This is marked as "trivial" because it's an internal-only change and the only project we want to use this is pipenv.

Context: pypa/packaging-problems#58 (comment)

@ncoghlan
Copy link
Member

ncoghlan commented May 20, 2018

I was initially wondering if it may be better to structure this as two environment variables: PIP_USER_AGENT_INSTALLER_NAME and PIP_USER_AGENT_INSTALLER_VERSION

That way other pip-scripting clients like pip-compile and pip-sync could also set it, without everyone needing to be aware of the name/version string formatting that PyPI expects.

However, I then realised that there'd be a major downside of implementing this as a full override: it would mean we lost the information about which version of pip is being scripted, and hence about how well supported new sdist formats (etc) are actually going to be.

What if we instead added the wrapper name in square brackets, so that what you got on the PyPI side looked like:

pip[pipenv]/10.0.1

That should work without any Warehouse updates, or changes to folks Big Query queries, while still being able to provide some more granular info on which pip wrapper tools are seeing broad adoption.

@ncoghlan
Copy link
Member

I decided to check my assumption about that square-bracket based format being compatible with linehaul already, and that turns out to be incorrect: https://github.com/pypa/linehaul/blob/master/linehaul/user_agents.py

The linehaul UA parser has a defined set of formats that it understands, and neither pipenv/ nor pip[pipenv]/ would match any of them.

Given that, I think we'd have to tackle this by:

  1. Defining a "pip wrapper" format that appends extra info to the regular pip UA string (e.g. "pip/10.0.1 via pipenv/2018.5.18")
  2. Add support for that format to linehaul
  3. Add environment variable based support for that format to pip
  4. Actually start setting those environment variables in wrapper tools

@theacodes
Copy link
Member Author

theacodes commented May 20, 2018 via email

@pradyunsg
Copy link
Member

pradyunsg commented May 29, 2018

"pip/10.0.1 via pipenv/2018.5.18"

I like this format. On the BigQuery side of things, if we can, it should probably result in mirroring details.installer's schema, adding something for the wrapper tool -- details.wrapper maybe?

@dstufft is likely the best guy for commenting on this. :)

@ncoghlan
Copy link
Member

ncoghlan commented May 30, 2018

I'd actually forgotten that the UA in modern pip versions included a JSON blob after the initial "pip/version" UA marker.

Given that, we shouldn't actually need to modify the UA prefix at all - instead, we'd submit a PR to update linehaul's UserAgent schema to include an extra optional wrapper = pyrsistent.field(type=Installer, factory=Installer.create) field, then after that had been deployed, update the JSON data in pip's UA to include the extra subfield with the wrapper info based on the PIP_USER_AGENT_WRAPPER_NAME and PIP_USER_AGENT_WRAPPER_VERSION environment variables.

@dstufft Does ^^^ sound like a workable approach to you?

@pradyunsg pradyunsg added type: enhancement Improvements to functionality state: needs discussion This needs some more discussion labels May 30, 2018
@conspicuousClockwork
Copy link
Contributor

There is also utility for those who need to modify the user-agent for edge cases. For example, all GET requests on a proxy I am currently using requires the header to contain a specific string anywhere in it.

Would it be plausible to add an extra subfield to the JSON, so the user may add in their own, even if their wrapper of preference is already utilizing PIP_USER_AGENT_WRAPPER_NAME and PIP_USER_AGENT_WRAPPER_VERSION?

PIP_USER_AGENT_PROVISIONAL_STRING Maybe?

@ncoghlan if this sounds okay to you I'd like to open a seperate issue and PR since this one seems to require changes across both linehaul and pip and is more for the use of wrappers instead of users.

@BrownTruck
Copy link
Contributor

Hello!

I am an automated bot and I have noticed that this pull request is not currently able to be merged. If you are able to either merge the master branch into this pull request or rebase this pull request against master then it will be eligible for code review and hopefully merging!

@BrownTruck BrownTruck added the needs rebase or merge PR has conflicts with current master label Feb 25, 2019
@theacodes
Copy link
Member Author

Hey all, is this still useful? I'm happy to rebase and such but I want to make sure it doesn't die on the vine again.

@pradyunsg
Copy link
Member

I think so. @dstufft is likely the best person for this. Maybe pinging him over email would help.

@theacodes theacodes force-pushed the user-agent-override branch 2 times, most recently from 8ccbde8 to b050bdf Compare February 26, 2019 06:55
@pypa-bot pypa-bot removed the needs rebase or merge PR has conflicts with current master label Feb 26, 2019
@theacodes
Copy link
Member Author

Okay, this is rebased. @dstufft please take a look when you can?

@cjerdonek
Copy link
Member

cjerdonek commented Feb 26, 2019

Being able to add context like this does seem useful, but I'd like to hear thoughts on @ncoghlan's idea above to include the info in the JSON instead (what he is calling the "wrapper" info).

One concern I'd have with letting people set the leading characters to an arbitrary string is that it would break the guarantee that pip's user agent string will be parseable. Also, even if the info about pip is in the JSON, linehaul only knows to parse the JSON if the string starts with pip/ in the first place:
https://github.com/pypa/linehaul/blob/420354cf789b064f0d38ce02573f6af51aa0306a/linehaul/ua/parser.py#L43-L54

It looks like linehaul has to add custom code for each leading prefix it recognizes (with special logic for each), so it seems like letting the leading prefix vary would break that model:
https://github.com/pypa/linehaul/blob/420354cf789b064f0d38ce02573f6af51aa0306a/linehaul/ua/parser.py#L92

@theacodes
Copy link
Member Author

theacodes commented Feb 26, 2019 via email

@BrownTruck
Copy link
Contributor

Hello!

I am an automated bot and I have noticed that this pull request is not currently able to be merged. If you are able to either merge the master branch into this pull request or rebase this pull request against master then it will be eligible for code review and hopefully merging!

@BrownTruck BrownTruck added the needs rebase or merge PR has conflicts with current master label Mar 31, 2019
@pradyunsg
Copy link
Member

@cjerdonek let's close this, given #5549?

@cjerdonek
Copy link
Member

@pradyunsg This is a different request / different purpose. The string in PR #5549 is meant to be ignored by linehaul / statistics reports, whereas this PR is meant to make pipenv visible to the stats...

@cjerdonek
Copy link
Member

In my mind, the next step on this PR is to follow what @ncoghlan suggested above, namely--

Given that, we shouldn't actually need to modify the UA prefix at all - instead, we'd submit a PR to update linehaul's UserAgent schema to include an extra optional wrapper = pyrsistent.field(type=Installer, factory=Installer.create) field,

The reason is that for the additional "wrapper" info to be useful, we need to have buy-in from linehaul that it will actually parse and report this info.

@cjerdonek cjerdonek changed the title Add ability to override pip's user agent with PIP_USER_AGENT_INSTALLER_OVERRIDE Permit pipenv and other tools to report itself as the downloader in pip's user agent Apr 2, 2019
@cjerdonek
Copy link
Member

FYI, I updated this issue's title to reflect the underlying goal of the issue rather than one possible implementation. This also helps to reduce the superficial similarity with PR #5549.

@chrahunt
Copy link
Member

chrahunt commented Oct 13, 2019

I created #7186 for implementation of the solution proposed by @ncoghlan above, and pypi/linehaul#50 for the first step. I will close this PR and we can do any necessary followup there.

@chrahunt chrahunt closed this Oct 13, 2019
@lock lock bot added the auto-locked Outdated issues that have been locked by automation label Nov 12, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Nov 12, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
auto-locked Outdated issues that have been locked by automation needs rebase or merge PR has conflicts with current master state: needs discussion This needs some more discussion type: enhancement Improvements to functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants