Discussion on upstreaming cpiofile (merge it back) to python3 upstream? #141

Open
ydirson opened this issue May 15, 2024 · 3 comments
@ydirson
Contributor

ydirson commented May 15, 2024

When I was working on the python3 port, I did some digging around cpiofile.py and found that it is a fork of one of Python's standard modules (zipfile IIRC), and that the structural differences were not so large. So I came up with the idea that the original modules could be extended into something like an archive module, with zipfile and cpiofile being just 2 implementations sharing most of the code. Naturally, seeing recent activity around cpiofile brings this back to mind :)

@bernhardkaindl
Collaborator

bernhardkaindl commented May 15, 2024

@ydirson Ross and I found that cpiofile.py was forked from an early copy of tarfile. You can see it immediately: the author and contributors listed in the credits below the licence match:
https://github.com/Mojang/play/blob/master/python/Lib/tarfile.py

Also the current API of tarfile is still similar:
https://docs.python.org/3/library/tarfile.html

zipfile is similar to tarfile in API, but it has some differences.

cpiofile: merge it back into python3
Do you suggest upstreaming cpiofile to Python3 upstream?

I don't think anyone here would want to sponsor that work, and I think upstream Python would just tell us that they do not want to support that old (and IMHO obsolete) cpio format in upstream Python3.

On the idea of the archive module, with zipfile and cpiofile being just 2 implementations sharing most of the code:

That would also include tarfile, which is also similar in API.

  • The problem is that zipfile and tarfile are in active use and are not new projects.
  • Even if they and cpiofile are largely similar, they differ in some core aspects of their API and which capabilities they offer.
  • While it would possibly have been a good idea to force zipfile and tarfile to implement an identical API and have a common test suite that checks it, that ship sailed many years ago when those modules were written and merged into the Python stdlib. We can't do anything about that now.
  • Making code shared between them also makes the whole archive project more difficult to maintain.
    • It would have been nicer from the user perspective to have them uniform, but this is not what has happened, unfortunately.
  • Because of these differences, I don't think that anyone would be interested in basically rewriting tarfile, zipfile and cpiofile.
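
To make the point about API differences concrete, here is a small sketch that writes, lists and reads the same single member with both stdlib modules. The helper names (`make_tar`, `make_zip`, and so on) are made up for this illustration; only the `tarfile` and `zipfile` calls inside them are real stdlib API.

```python
import io
import tarfile
import zipfile


def make_tar(name, data):
    """Write one member into an in-memory tar archive and return its bytes."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        # tarfile wants an explicit header object plus a file-like payload.
        info = tarfile.TarInfo(name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def make_zip(name, data):
    """Write one member into an in-memory zip archive and return its bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, mode="w") as zf:
        # zipfile takes the name and the bytes directly.
        zf.writestr(name, data)
    return buf.getvalue()


def list_tar(raw):
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r") as tar:
        return tar.getnames()


def list_zip(raw):
    with zipfile.ZipFile(io.BytesIO(raw)) as zf:
        return zf.namelist()


def read_tar(raw, name):
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r") as tar:
        # extractfile() returns a file object for regular members.
        return tar.extractfile(name).read()


def read_zip(raw, name):
    with zipfile.ZipFile(io.BytesIO(raw)) as zf:
        return zf.read(name)
```

The operations are the same, but the entry points (`tarfile.open` vs `zipfile.ZipFile`), the listing methods (`getnames` vs `namelist`) and the way a member is written all differ, which is the kind of mismatch a shared archive module would have to paper over.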

And the upstream developers would refer to the many libarchive wrappers:

In general, when switching to a new API, implementations that are just wrappers for libarchive would be best, like:

https://pypi.org/project/python-libarchive/
https://github.com/Changaco/python-libarchive-c
https://pypi.org/project/libarchive/

For XenServer / xcp-ng:

If the host-upgrade-plugin is the only user of xcp/cpiofile.py, then I guess the host-upgrade-plugin could migrate away from xcp/cpiofile.py:

  • But that's not my decision to make, and the Python3 work on it was done in Nanjing.
  • OTOH, not having a dependency on libarchive may be good too.
  • We have xcp/cpiofile.py working quite well (75% test coverage), so if there is no need to drop it, we can keep it for now.

I'd actually like to close this, as I don't think anyone would be assigned to do such work.

I'd rather migrate away from the cpio format in general if we can, but I also think that it's not important.

@bernhardkaindl changed the title from "cpiofile: merge it back into python3" to "Discussion on upstreaming cpiofile (merge it back) to python3 upstream?" on May 15, 2024
@ydirson
Contributor Author

ydirson commented May 15, 2024

> forked from an early copy of tarfile

Yeah, I was right to be careful with what I recalled 😉

> I think upstream Python would just tell us that they do not want to support that old (and IMHO obsolete) cpio format

The cpio format still has at least 2 major uses that don't seem to be going away anytime soon (Linux initrd, and RPM), so I would not necessarily assume they would refuse it.

My point (from experience writing tests and porting it to python3) was mostly that this is a big piece of code, and maintaining it separately from tarfile does indeed cause extra work that could be better allocated elsewhere.

Maybe it would make sense to use an alternate CPIO implementation, but even if there is no other user for cpiofile than host-upgrade_plugin, it is used to add data to install.img, which, as an initrd, can only be a cpio file. Now, the cpiofile implementation is probably overkill for just generating a simple cpio archive, and possibly we could start by getting host-upgrade_plugin to move away from cpiofile...
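
To give an idea of how little is needed for "just generating a simple cpio archive", here is a minimal sketch of a writer for the "newc" cpio format (the one used for Linux initramfs images). It is not the xcp/cpiofile.py API and not what host-upgrade_plugin does today; the function names are hypothetical, and it only handles regular files owned by uid/gid 0, with no directories, symlinks, hard links or compression.

```python
import io

NEWC_MAGIC = b"070701"   # magic of the "newc" (SVR4, no CRC) cpio format
TRAILER = "TRAILER!!!"   # name of the mandatory end-of-archive entry


def _pad4(n):
    """Number of NUL bytes needed to round n up to a multiple of 4."""
    return (4 - n % 4) % 4


def _newc_entry(name, data=b"", mode=0o100644, ino=0, mtime=0):
    """Return header + name + data for one entry, padded as newc requires."""
    name_bytes = name.encode("utf-8") + b"\0"   # namesize counts the NUL
    fields = (
        ino, mode,
        0, 0,                 # uid, gid
        1,                    # nlink
        mtime, len(data),
        0, 0, 0, 0,           # devmajor, devminor, rdevmajor, rdevminor
        len(name_bytes),
        0,                    # check (unused for magic 070701)
    )
    # 6-byte magic followed by 13 fields of 8 hex digits each: 110 bytes.
    header = NEWC_MAGIC + b"".join(b"%08X" % f for f in fields)
    out = header + name_bytes
    out += b"\0" * _pad4(len(out))      # header + name aligned to 4 bytes
    out += data
    out += b"\0" * _pad4(len(data))     # data aligned to 4 bytes
    return out


def write_newc(fileobj, members):
    """Write (name, bytes) pairs as regular files, then the trailer entry."""
    for ino, (name, data) in enumerate(members, start=1):
        fileobj.write(_newc_entry(name, data, ino=ino))
    fileobj.write(_newc_entry(TRAILER, mode=0))


if __name__ == "__main__":
    buf = io.BytesIO()
    write_newc(buf, [("hello.txt", b"hi\n")])
    print(len(buf.getvalue()), "bytes written")
```

Anything beyond this (directories, file ownership, reading archives back) is where a full implementation like cpiofile or a libarchive wrapper starts to earn its keep.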

@bernhardkaindl
Collaborator

bernhardkaindl commented May 15, 2024

> The cpio format still has at least 2 major uses that don't seem to be going away anytime soon (Linux initrd, and RPM), so I would not necessarily assume they would refuse it.

Yeah, maybe one just has to give it a try.

> My point (from experience writing tests and porting it to python3) was mostly that this is a big piece of code, and maintaining it separately from tarfile does indeed cause extra work that could be better allocated elsewhere.

I also extended and updated the tests and fixed some other bugs found along the way while reviewing it. The Python3 port was quite a lot of work, but that's done now.

  • I think it helps that we now have tests that could be rewritten to use the upstream unittest module, but it's still a big project to do.
  • However, as a hobby project, it might be a nice project to see if it is doable.
  • That project would of course not be a project of python-libs or XenServer (I think), that would be someone's private project.

I just wanted to say that no one is currently assigned to handle issues reported here, nor ideas for the host-upgrade-plugin.

I just try to respond, but to get tasks assigned, it would have to be scheduled by management and I'm no longer in a team that is responsible for these Dom0 components like python-libs and the host-upgrade-plugin.

Also, generally no one reads the issues submitted here. I was just the exception because I had a watch on this project to review pull requests, but I have now reduced the watch to mentions only to get fewer notifications.
