autover (or package using autover) reports incorrect version when installed inside a git repository #54

ceball · 2018-06-28T16:49:46Z

I have a problem where asking an installed param package for its version is returning version info from an unrelated git repository instead of from the .version file in the installed param package.

I'll try to reproduce it (the problem is happening while pip is building a package that has param as a build time dependency), but my guess is it's something along the lines of...

If autover (or param) is installed as a package inside a git repository (i.e. installed at something like git_repo/some/path/to/lib/python/autover/), autover (or param) reports its own version as that of git_repo, rather than the version in git_repo/some/path/to/lib/python/autover/.version.

I.e. something related to https://github.com/pyviz/autover/blob/master/autover/version.py#L259 maybe? It looks suspicious to me (won't it match any repository?), although I don't fully understand what's happening/supposed to be happening there (e.g. not sure why the .version file isn't being used), so I might be totally wrong.

jlstevens · 2018-06-28T17:23:40Z

You are right that https://github.com/pyviz/autover/blob/master/autover/version.py#L259 is supposed to prevent this but I have no idea what is happening when param is a build time dependency.

ceball · 2018-06-28T18:26:35Z

I also have no idea what pip is doing (will debug it), but it seems wrong that an installed param would ever report a git repository's info (let alone one that's not param's) instead of its .version info.

jlstevens · 2018-06-28T18:57:03Z

That is by design: version control information is generally less stale than what you can find from a static file.

ceball · 2018-06-28T21:50:50Z

If autover (or param) is installed as a package inside a git repository (i.e. installed at something like git_repo/some/path/to/lib/python/autover/), autover (or param) reports its own version as that of git_repo, rather than the version in git_repo/some/path/to/lib/python/autover/.version.

I think that's what it is...

$ mkdir t123
$ cd t123
$ git init
$ echo test > test.txt
$ git add test.txt 
$ git commit -m "init"
$ git tag -a v0.0.1 -m "test"
$ conda create -n xyz python=3.6
$ conda activate xyz
$ pip install param -t .
Collecting param
  Cache entry deserialization failed, entry ignored
  Using cached https://files.pythonhosted.org/packages/6d/f1/a9feccdbc333813f02a54c808b330a75210b98699ef8199e0250763b8e14/param-1.7.0-py2.py3-none-any.whl
Installing collected packages: param
Successfully installed param-1.7.0
$ python
>>> import param
>>> param.__version__
'0.0.1'
$ rm -rf .git
$ python
>>> import param
>>> param.__version__
'1.7.0'

I used the path . above, but it could be any path inside the git repo:

$ rm -rf param* numbergen
$ python
>>> import param
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'param'
$ pip install param -t ./some/other/path
Collecting param
  Using cached https://files.pythonhosted.org/packages/6d/f1/a9feccdbc333813f02a54c808b330a75210b98699ef8199e0250763b8e14/param-1.7.0-py2.py3-none-any.whl
Installing collected packages: param
Successfully installed param-1.7.0
$ PYTHONPATH=./some/other/path python
>>> import param
>>> param.__version__
'0.0.1'
>>> param.__file__
'/tmp/t123/some/other/path/param/__init__.py'
$ cat some/other/path/param/.version 
{"git_describe": "v1.7.0-0-gb75e757", "version_string": "1.7.0"}

(And presumably doesn't apply only to param or autover.)

ceball · 2018-06-29T08:02:37Z

Just to confirm, this applies to anything using autover, and isn't special to pip (i.e. it's only about being inside a git repository):

$ mkdir /tmp/t123 && pushd /tmp/t123
$ git init
$ echo test > test.txt
$ git add test.txt 
$ git commit -m "init"
$ git tag -a v0.0.1 -m "test"
$ conda create -p ./456
$ conda activate /tmp/t123/456
$ python -c "import holoviews; print(holoviews.__version__,holoviews.__file__)"
0.0.1 /tmp/t123/456/lib/python3.6/site-packages/holoviews/__init__.py
$ cat /tmp/t123/456/lib/python3.6/site-packages/holoviews/.version
{"git_describe": "v1.10.5-0-g98b4ea7d", "version_string": "1.10.5"}
$ conda list holoviews
# packages in environment at /tmp/t123/456:
#
# Name                    Version                   Build  Channel
holoviews                 1.10.5                   py36_0

Isn't the '' in repo_matches going to match any repository?

                output = run_cmd([cmd, 'remote', '-v'],
                                 cwd=os.path.dirname(self.fpath))
                repo_matches = ['', # No remote set
                                '/' + self.reponame + '.git' ,
                                # A remote 'server:reponame.git' can also be referred
                                # to (i.e. cloned) as `server:reponame`.
                                '/' + self.reponame + ' ']
                if not any(m in output for m in repo_matches):
                    return self

(I don't quite follow the whole chain to determine the version, though, as I mentioned above...)

jlstevens · 2018-06-29T11:04:46Z

The line '/' + self.reponame + '.git' is supposed to check for the .git directory e.g param.git. This is to make sure it doesn't pick up on any other repository.

ceball · 2018-06-29T16:48:09Z

The line '/' + self.reponame + '.git' is supposed to check for the .git directory e.g param.git. This is to make sure it doesn't pick up on any other repository.

But it doesn't matter, because '' used as above is going to match anything - right?

The intention of '' seems to be to match "no remote". I.e. the whole logic is: "repo matches if there's no remote or if any remote matches the repo name". If that sounds right to you, I could change the above code so that's what happens.

Separately, I'd like to propose that the .version file should always be used if present (instead of looking to git first). This wouldn't work at present, because .version is always generated. I guess I'd like to see it generated and put into packages only. However, this is presumably a larger/riskier change instead of just being a bug fix (which is what I think the suggestion above would be), so I'll open an issue and we could consider it in the future.

jbednar · 2018-06-29T16:59:08Z

I think we should consider having .version override info from git, because it provides a way to manually ensure the version is correct regardless of context. But there are many competing pressures there, and I don't know what the best answer is. E.g. we could also consider having a .version file committed to the repo after a major release but used only as as a fallback. The version in that file would have to be something valid but not actually currently used, indicating that it's at least version X but is not precisely X. E.g. something like 1.7.1post0dirty (not sure if that's legal, but you get the idea), such that even if someone gets a git archive, they at least know it's newer than 1.7.1, and that it's not precisely 1.7.1. I think that would take care of nearly every version issue we've been having.

jlstevens · 2018-06-29T18:00:20Z

I.e. the whole logic is: "repo matches if there's no remote or if any remote matches the repo name". If that sounds right to you, I could change the above code so that's what happens.

From what I remember, that was the intention, yes. If you could make it behave that way and verify it addresses this issue, that would be really helpful.

I think we should consider having .version override info from git, because it provides a way to manually ensure the version is correct regardless of context.

I might be happy to have a switch to change the priority but I don't want to change the default behavior right now: as you say there are too many competing pressures and I can't reason about all the cases that changes to the default behavior might affect. However, offering the option to change the priority is ok as long as the user accepts potentially unforeseen consequences!

ceball · 2018-06-29T22:13:29Z

The intention of '' seems to be to match "no remote". I.e. the whole logic is: "repo matches if there's no remote or if any remote matches the repo name". If that sounds right to you, I could change the above code so that's what happens.

I've realized that I'd prefer not to mess with that code, because although it seems wrong to me for what it's trying to do, what it's trying to do also seems wrong...but at the same time, I know I don't fully get all the use cases. So I'd prefer to stay away from it! :)

Instead, I've decided to upgrade my 'packages should use the .version file' proposal into a plea :) I strongly think packages (wheel, sdist, conda package, etc) should be versioned in an uncomplicated way. Having a package run git to report its version information just seems completely wrong to me. I think this means that if there's a .version file, it should always be used - and so the .version file should only be put into packages (or other places where the version is supposed to be fixed).

I think we should consider having .version override info from git, because it provides a way to manually ensure the version is correct regardless of context. But there are many competing pressures there

I'm not sure what the competing pressures are for a released package. It's a mystery to me why a released package should ever have anything to do with git. Similarly, I don't see why a git repository would ever need to have a .version file in it (not talking about your suggestion of a committed file containing a version fallback for people who get an untagged github archive - that's a separate issue).

Anyway, the reason this all came up is from checking the version of build dependencies. In the setup.py of a project like parambokeh, I have something like "if param<1.6 raise Error('to build a parambokeh package, install pip>=10 and try again, or pip install param>=1.6 yourself first then try again')". param>=1.6 is listed as a build dependency in pyproject.toml, so the check is actually redundant for pip>=10...but the check completely fails when people do have pip>=10, because the param installed by pip as a build dependency reports its version as parambokeh's git repository version, rather than 1.6 as in the .version file.

Such setup.py version checks are only supposed to be temporary until pip 10 is common. The checks are only for people building parambokeh via setup.py who have pip<10. Those people are not regular users (who'll use released wheels/conda packages) - they are:

packagers who have out of date pip
users who want to use an unreleased version of parambokeh and have out of date pip
packagers/users who want to build parambokeh via setup.py, but who are not using pip (or other pyproject.toml-supporting tool) e.g. maybe they are doing python setup.py build or similar.

So, unless anyone strongly objects, I'll just drop those version checks, because agreeing what we should do for .version vs git is probably going to be difficult, and even if we agree, testing any such changes will take some time (probably with various problems most likely only coming to light after a period of real world usage).

jbednar · 2018-06-30T02:34:09Z

I've realized that I'd prefer not to mess with that code, because although it seems wrong to me for what it's trying to do, what it's trying to do also seems wrong...but at the same time, I know I don't fully get all the use cases. So I'd prefer to stay away from it! :)

I don't think that's defensible; we can't just leave broken code in place, when we know it's broken!

Instead, I've decided to upgrade my 'packages should use the .version file' proposal into a plea :) (a) I strongly think packages (wheel, sdist, conda package, etc) should be versioned in an uncomplicated way. Having a package run git to report its version information just seems completely wrong to me. (b) I think this means that if there's a .version file, it should always be used - and so the .version file should only be put into packages (or other places where the version is supposed to be fixed).

I agree with (a), but (b) doesn't follow directly from it; seems like it's many steps removed from it. Are these the missing steps?

Versioning in released packages should be straightforward; that's the easy (and most important) case [agreed],
We don't want to modify code in the package itself to store the version number [probably agreed]
We're ok with including an extra static file with the version info [agreed]
We need the logic for processing that file not to depend on any external steps or tools that might fail [agreed]

If so I think I agree with all the steps, but I'm not certain there's no other reasonable option for every link in the chain, so it's worth making the argument explicitly.

I'm not sure what the competing pressures are for a released package. It's a mystery to me why a released package should ever have anything to do with git.

The pressures are all the various contexts in which versions are needed. I agree that a released package should never be contacting git.

Similarly, I don't see why a git repository would ever need to have a .version file in it (not talking about your suggestion of a committed file containing a version fallback for people who get an untagged github archive - that's a separate issue).

I don't see how that's a separate issue. There could be a separate solution, e.g. two such files, one .version for released packages (overriding anything if present) and a .version_ref or some such that provides a baseline version for github archive (not overriding anything). But unless we explicitly and actively separate the two issues in that way, I don't see how they are somehow fundamentally separate.

So, unless anyone strongly objects, I'll just drop those version checks, because agreeing what we should do for .version vs git is probably going to be difficult, and even if we agree, testing any such changes will take some time (probably with various problems most likely only coming to light after a period of real world usage).

I don't get it. What does dropping the version checks buy us, compared to simply fixing the immediate problem of an incorrect check for an empty repo name? I'd strongly prefer simply to decide about .version; it's pointless to delay things any further!

ceball · 2018-06-30T21:45:11Z

I fixed the check for no remote. However, it still doesn't report the version from the .version file - it now reports None. As far as I can tell, it only uses the .version file for the version if there's an exception while trying to run git.

… check that remote matches repo name. See #54

ceball · 2018-06-30T22:14:42Z

I've made a PR that includes the above mentioned remote name check, and which also tries to address the problem after that (it checks the .version file for info if git hasn't returned a version).

ceball · 2018-06-30T23:06:53Z

I don't see how that's a separate issue

Ok, I said the .version file should only exist in packages, but yes, I just wanted a way that the full, static version info would take precedence if present. I was suggesting .version file present or absent as the mechanism for that, but I'd be fine with e.g. having a partial version fallback field in .version, and having the .version file always present - as long as the full static version info is used first if present in the .version file. The partial version fallback field would be used last (after checking the full version field, and after attempting to get info from git). So the partial version fallback would be used only for untagged github zip files and other weird situations like a git clone with the .git deleted.

I do think that adding such fallback info should be a new issue, but I agree it means my request to have the .version file only in packages is possibly shortsighted, and maybe there's a better mechanism for ensuring the static version info takes precedence.

I guess the competitor to having the fallback info is my suggestion of raising an exception in setup.py if the version number isn't present and can't be determined (#42 (comment)). I would be ok with either. The fallback suggestion is friendlier but more effort. I believe we previously decided not to do it because of the effort required to keep such info up to date, but it could now be done by pyctdev (I guess you'd have it open a PR after a release).

See holoviz-dev/autover#54

ceball changed the title ~~autover (and hence param) reporting incorrect version of itself when installed inside another git repository?~~ autover (or package using autover) reports incorrect version when installed inside a git repository Jun 29, 2018

ceball added a commit that referenced this issue Jun 30, 2018

If there's no version info from git, try the .version file. Also, fix…

5846835

… check that remote matches repo name. See #54

ceball mentioned this issue Jun 30, 2018

Fix repository name matching #55

Closed

ceball added a commit to holoviz-dev/pyct that referenced this issue Jul 7, 2018

Remove build dependency checks.

79cd502

See holoviz-dev/autover#54

ceball mentioned this issue Jul 7, 2018

Remove build dependency checks. holoviz-dev/pyct#48

Merged

ceball added a commit to holoviz-dev/pyct that referenced this issue Jul 7, 2018

Remove build dependency checks. (#48)

b2b1a3f

See holoviz-dev/autover#54

ceball added a commit to ioam/parambokeh that referenced this issue Jul 7, 2018

Remove build dependency checks.

fcb2a7d

See holoviz-dev/autover#54

ceball added a commit to ioam/parambokeh that referenced this issue Jul 7, 2018

Remove build dependency checks.

385d05a

See holoviz-dev/autover#54

ceball added the bug label Oct 12, 2018

ceball mentioned this issue Dec 5, 2019

[Request] : Prevent git to be used for release packages #58

Open

ceball mentioned this issue Mar 5, 2020

Added try/except for retrieving library version holoviz/pyviz_comms#48

Closed

jlstevens mentioned this issue Jan 9, 2021

Fixed check on remote to avoid issue with nested repos #65

Merged

jlstevens closed this as completed in #65 Jan 9, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

autover (or package using autover) reports incorrect version when installed inside a git repository #54

autover (or package using autover) reports incorrect version when installed inside a git repository #54

ceball commented Jun 28, 2018

jlstevens commented Jun 28, 2018

ceball commented Jun 28, 2018

jlstevens commented Jun 28, 2018

ceball commented Jun 28, 2018 •

edited

Loading

ceball commented Jun 29, 2018 •

edited

Loading

jlstevens commented Jun 29, 2018

ceball commented Jun 29, 2018 •

edited

Loading

jbednar commented Jun 29, 2018

jlstevens commented Jun 29, 2018

ceball commented Jun 29, 2018

jbednar commented Jun 30, 2018

ceball commented Jun 30, 2018

ceball commented Jun 30, 2018

ceball commented Jun 30, 2018

autover (or package using autover) reports incorrect version when installed inside a git repository #54

autover (or package using autover) reports incorrect version when installed inside a git repository #54

Comments

ceball commented Jun 28, 2018

jlstevens commented Jun 28, 2018

ceball commented Jun 28, 2018

jlstevens commented Jun 28, 2018

ceball commented Jun 28, 2018 • edited Loading

ceball commented Jun 29, 2018 • edited Loading

jlstevens commented Jun 29, 2018

ceball commented Jun 29, 2018 • edited Loading

jbednar commented Jun 29, 2018

jlstevens commented Jun 29, 2018

ceball commented Jun 29, 2018

jbednar commented Jun 30, 2018

ceball commented Jun 30, 2018

ceball commented Jun 30, 2018

ceball commented Jun 30, 2018

ceball commented Jun 28, 2018 •

edited

Loading

ceball commented Jun 29, 2018 •

edited

Loading

ceball commented Jun 29, 2018 •

edited

Loading