The project dir name should include the Python version #58

Closed
facundobatista opened this issue Apr 20, 2023 · 10 comments · Fixed by #76

@facundobatista
Owner

Currently, if you run the .pyz with Python X, it will create all directories and install everything. If you run the same .pyz with Python Y, it will reuse that installation, which is wrong, as the virtualenv and the installed dependencies may depend on the Python version to work properly.

So, the project dir name should be composed by the project name and .pyz timestamp (as today) plus the Python version.

This way, in the aforementioned situation, the unpacker would just create a new directory with a clean install and everything would just work.

@sinoroc

sinoroc commented Apr 21, 2023

the project dir name should be composed by the project name and .pyz timestamp (as today) plus the Python version

I would suggest using something like wheel tags in the directory name to identify which virtual environment is usable with which Python interpreter, instead of just the Python version. For example, on Windows you could have Python 3.8 in both 32-bit and 64-bit builds, and a virtual environment is not necessarily compatible with both.
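
To illustrate the 32 vs 64 bit point, here is a minimal sketch, using only the standard library, of values that could tell such interpreter builds apart. The function name `interpreter_tag` and the layout of the tag are made up for illustration; this is neither the wheel tag format nor anything PyEmpaq does.

```python
import platform
import struct
import sysconfig


def interpreter_tag():
    """Return a hypothetical tag identifying the running interpreter build."""
    impl = platform.python_implementation().lower()  # e.g. "cpython"
    major, minor, _patch = platform.python_version_tuple()
    bits = struct.calcsize("P") * 8  # pointer size in bits: 32 or 64
    # Platform string with separators normalized to underscores
    plat = sysconfig.get_platform().replace("-", "_").replace(".", "_")
    return f"{impl}{major}{minor}-{bits}bit-{plat}"


print(interpreter_tag())
```

Two Python 3.8 builds on the same Windows machine would differ in the `bits` part (and typically in the platform string as well), so they would not share a directory.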

@sinoroc

sinoroc commented Apr 21, 2023

the project dir name should be composed by the project name and .pyz timestamp (as today) plus the Python version.

What about a hash of the .pyz file instead of a timestamp?

@facundobatista
Owner Author

I would suggest using something like wheel tags in the directory name to identify which virtual environment is usable with which Python interpreter, instead of just the Python version. For example, on Windows you could have Python 3.8 in both 32-bit and 64-bit builds, and a virtual environment is not necessarily compatible with both.

It's a good idea to have more information to detect incompatibilities. But I don't think that "wheel tags" is the best approach here, because those are designed to select among things that are fixed in our case (like the platform).

Maybe the best approach would be to mimic what Python is doing with the .pyc files (e.g. site.cpython-310.pyc).

Suggestion to work with:

  • project name
  • .pyz timestamp (or hash, see other conversation thread)
  • python implementation (cpython, etc) from platform.python_implementation()
  • major and minor python versions (from platform.python_version_tuple())
  • the python magic number (used to identify bytecode changes in PYC files) from importlib.util.MAGIC_NUMBER

What do you think?
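
For reference, a quick look at the values named in the list, next to the tag Python itself uses in .pyc file names. This is only a sketch of what those calls return; the printed values depend on the interpreter running it.

```python
import importlib.util
import platform
import sys

# Tag that Python embeds in .pyc file names, e.g. "cpython-310"
print(sys.implementation.cache_tag)

# Implementation name, e.g. "CPython" or "PyPy"
print(platform.python_implementation())

# Version as a tuple of strings: (major, minor, patchlevel)
major, minor, _patch = platform.python_version_tuple()
print(major, minor)

# Four bytes; on CPython the last two are always b"\r\n"
print(importlib.util.MAGIC_NUMBER)
```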

@facundobatista
Owner Author

the project dir name should be composed by the project name and .pyz timestamp (as today) plus the Python version.

What about a hash of the .pyz file instead of a timestamp?

The timestamp has the benefit of showing which directory is more recent than the other; the hash would be too obscure.

Are you afraid of having different .pyzs with the same timestamp?

@sinoroc

sinoroc commented Apr 24, 2023

Are you afraid of having different .pyzs with the same timestamp?

I would rather try to avoid having two "installations" for two copies of the same .pyz file, both copies with the exact same content but two different timestamps.
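
A small sketch of this scenario: two copies of the same file with different modification times still share a content hash. The file names and the helper `content_hash` are hypothetical, SHA-256 is used only as an example hash, and a temporary directory stands in for wherever the .pyz files would live.

```python
import hashlib
import os
import shutil
import tempfile
from pathlib import Path


def content_hash(path):
    """Return the SHA-256 hex digest of the file's content."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "something.pyz"
    original.write_bytes(b"fake pyz content")

    duplicate = Path(tmp) / "copy.pyz"
    shutil.copy(original, duplicate)
    # Force a clearly different timestamp on the copy
    os.utime(duplicate, (0, 0))

    same_content = content_hash(original) == content_hash(duplicate)
    same_mtime = original.stat().st_mtime == duplicate.stat().st_mtime

print(same_content, same_mtime)
```

A directory name based on the timestamp would treat these two files as different installations, while one based on the hash would not.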

@sinoroc

sinoroc commented Apr 24, 2023

I don't think that "wheel tags" is the best approach here

I was not suggesting to use wheel tags as is, but rather "something like wheel tags".

I am not too confident about this topic, and I really do not know enough about it to give definitive advice. But if I understood correctly, a good (practical and technical rather than conceptual) suggestion could be to look at what is done in "PEP 711 – PyBI: a standard format for distributing Python Binaries" with its "wheel tags"-like specification for file names.

@facundobatista
Owner Author

Are you afraid of having different .pyzs with the same timestamp?

I would rather try to avoid having two "installations" for two copies of the same .pyz file, both copies with the exact same content but two different timestamps.

Ah, I see now. It's a very good idea!

So, let's recap the rules. The directory name will be a dash-separated join of:

  • project name: for simplicity, to recognize which project this installation corresponds to
  • .pyz hash: to have different installations for different .pyzs; this will be the first 20 chars of sha256's hexdigest
  • Python information: to have different installations for different Pythons; this will be a dot-separated join of:
    • python implementation (cpython, etc): from platform.python_implementation()
    • major and minor python versions: from platform.python_version_tuple()
    • the python magic number (used to identify bytecode changes in PYC files): stripped magic number from importlib.util.MAGIC_NUMBER
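
The rules above could be sketched roughly as follows. This is not PyEmpaq's actual code: the function names, the lowercasing of the implementation name, and rendering the "stripped" magic number as the hex of its first two bytes are all assumptions made for illustration.

```python
import hashlib
import importlib.util
import platform


def python_info():
    """Dot-separated implementation, major/minor version and magic number."""
    impl = platform.python_implementation().lower()
    major, minor, _patch = platform.python_version_tuple()
    # Assumption: "stripped" means dropping the trailing b"\r\n"
    magic = importlib.util.MAGIC_NUMBER[:2].hex()
    return ".".join([impl, major, minor, magic])


def project_dir_name(project_name, pyz_content):
    """Dash-separated join of project name, .pyz hash and Python information."""
    pyz_hash = hashlib.sha256(pyz_content).hexdigest()[:20]
    return "-".join([project_name, pyz_hash, python_info()])


print(project_dir_name("myproject", b"fake pyz content"))
```

One thing this sketch does not handle: a project name that itself contains a dash would make the resulting name ambiguous to split back into its parts.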

@sinoroc

sinoroc commented Apr 25, 2023

  • project name: for simplicity, to recognize which project this installation corresponds to

I am wondering about this... If I "install" Something.pyz, but then later rename the file to SomethingElse.pyz (without changing its content) and run it, what should happen? I think PyEmpaq should ignore the name difference and reuse the already unpacked version, because the hashes should still be equal.

@sinoroc
Copy link

sinoroc commented Apr 25, 2023

  • this will be the first 20 chars of sha256's hexdigest

This, I do not know. I have no knowledge of what is a good choice for the hash in this use case.

@facundobatista
Owner Author

I am wondering about this... If I "install" Something.pyz, but then later rename the file to SomethingElse.pyz (without changing its content) and run it, what should happen? I think PyEmpaq should ignore the name difference and reuse the already unpacked version, because the hashes should still be equal.

Yes, that is what actually happens. The "project name" is retrieved from the metadata, not from the file name, which is more prone to change than one might think (nobody will rename Something.pyz to SomethingElse.pyz, but something like Something-0.3.2.pyz -> Something.pyz will surely happen a lot).
