[Feature] Provide Nightly Build to PyPi #872

Open
kevinjqliu opened this issue Jun 29, 2024 · 11 comments

Comments

@kevinjqliu
Contributor

kevinjqliu commented Jun 29, 2024

Feature Request / Improvement

Starting an issue to gather feedback on providing nightly builds for pyiceberg. Resolves #734.
Thanks @syun64 for the pointers and feedback.

PyIceberg Release Process

The current release process is documented in How to release.
To publish a release candidate (RC) to the public:

  • Tag and sign Major/Minor release via git, push to the apache branch
  • Build artifacts with Github action
  • Upload to Apache SVN
  • Upload to PyPi
    • Download artifact from Python release Github action
    • twine upload

Proposed Nightly Build Process

Goals

  • Only PyPi is needed; SVN can be skipped
  • Automate the nightly build using a cron-based GitHub Action
  • Automate the upload to PyPi by having the GitHub Action push directly to PyPi
  • Make sure the PyPi package is uploaded as a pre-release/development version
  • Preferably, make the nightly build installable via pip install pyiceberg --pre
    • Alternatively, install via a new nightly package, i.e. pyiceberg-nightly
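One way to satisfy the pre-release/development version goal is a PEP 440 development release, which pip skips by default and only considers when `--pre` is passed. The sketch below is illustrative only: the helper name `nightly_version` and the base version `0.7.0` are assumptions, not part of the current PyIceberg build.

```python
from datetime import date


def nightly_version(base_version: str, build_day: date) -> str:
    """Return a PEP 440 development release such as '0.7.0.dev20240629'.

    base_version is assumed to be the next unreleased version, so the
    nightly sorts before the final release of that version.
    """
    return f"{base_version}.dev{build_day:%Y%m%d}"


# Hypothetical usage inside a nightly job.
print(nightly_version("0.7.0", date(2024, 6, 29)))
```

Because a `.devN` release sorts before the final release of the same version, a plain `pip install pyiceberg` would keep resolving to the latest stable release.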

Reference

@kevinjqliu
Contributor Author

I've recently received feedback from users that it would be beneficial to have more releases. A faster release cadence might be more desirable than having a nightly build.

Both will require some kind of automation for the release process.

@kevinjqliu
Contributor Author

Based on this tutorial, I was able to publish new versions of the library to PyPi via Github Action.

Here are the relevant steps:

I created an account on Pypi and was able to publish my forked repo of Pyiceberg using the pypi-publish GitHub Action.

To do so, I created .github/workflows/publish.yml file and pushed it to my forked repo's main branch.

I had to change the package name to pyiceberg-kevinliu to not conflict with the existing package.

I set up "Trusted Publisher Management" via the Pypi website for my forked repo.

On the forked repo, I created a new release and tag named "v0.6.1". This kicks off the GitHub Action that publishes to PyPi.

The result is this new package: https://pypi.org/project/pyiceberg-kevinliu/

@kevinjqliu
Contributor Author

I hope we can use some parts of the above to make future releases faster and more automated.

@kevinjqliu
Contributor Author

@Fokko / @HonahX / @syun64
Would love to get your thoughts on this.

@sungwy
Collaborator

sungwy commented Jul 15, 2024

Very exciting to hear that you were already able to get a package published through GitHub Actions! For now, I'm leaning towards this approach of having a separate namespace for nightly builds, like pyiceberg-nightly.

One downside of that approach is that it will create name collision issues if users accidentally install both the pyiceberg and pyiceberg-nightly packages in the same environment.
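As a rough illustration of how that collision could be detected, the sketch below checks which distributions are installed in the current environment using the standard library's `importlib.metadata`. The distribution name `pyiceberg-nightly` is the hypothetical name from this thread, and the helper `installed_distributions` is made up for the example.

```python
from importlib import metadata


def installed_distributions(names):
    """Return the subset of distribution names installed in this environment."""
    found = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            continue
        found.append(name)
    return found


# Assumption: both distributions would ship the same top-level `pyiceberg`
# import package, so having both installed is treated as a conflict.
conflicts = installed_distributions(["pyiceberg", "pyiceberg-nightly"])
if len(conflicts) > 1:
    print("Conflicting installs found: " + ", ".join(conflicts))
```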

But I think there's still a lot to gain by separating out the package namespace of an intentional publication (release candidates, and successful releases) versus an automated nightly publication from main, just in terms of how easy it would be for us to manage our packages. I'm not quite sure what the best way to do this would be, but I would imagine we would want to support the concept of having a retention policy on the nightly package as well, so we clean up packages that were published a year ago, as an example.
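A minimal sketch of the selection half of such a retention policy is shown below. It only decides which nightly versions are older than the cutoff; how they would actually be removed from PyPi is left open. The function name `expired_versions`, the one-year default, and the sample version strings are all assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def expired_versions(upload_times, now, max_age=timedelta(days=365)):
    """Return versions whose upload time is older than max_age, sorted by name.

    upload_times maps a version string to its timezone-aware upload datetime.
    """
    cutoff = now - max_age
    return sorted(v for v, uploaded in upload_times.items() if uploaded < cutoff)


# Hypothetical nightly uploads; real data would have to come from the package index.
uploads = {
    "0.7.0.dev20230601": datetime(2023, 6, 1, tzinfo=timezone.utc),
    "0.7.0.dev20240628": datetime(2024, 6, 28, tzinfo=timezone.utc),
}
print(expired_versions(uploads, now=datetime(2024, 7, 15, tzinfo=timezone.utc)))
```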

Here's a link to some relevant discussion on this topic on a PyPi warehouse discussion thread

@HonahX
Contributor

HonahX commented Jul 16, 2024

@kevinjqliu Thanks for doing the experiments!

But I think there's still a lot to gain by separating out the package namespace of an intentional publication (release candidates, and successful releases)

+1, the ASF release policy also suggests that we should "hide" the nightly build from non-developers as much as possible. We could also consider other ways of separation, for example:

@Fokko
Contributor

Fokko commented Jul 16, 2024

FWIW, Iceberg Java also publishes nightly snapshots: https://repository.apache.org/content/groups/snapshots/org/apache/iceberg/iceberg-core/ But it is hidden quite well for a reason :D

I'm open to it. I'm not sure if a separate package is the best, as you can also set tags on the releases themselves: https://pypi.org/project/pyiceberg/#history You can see the pre-releases there.

Another nice thing is that we would test our release pipelines on a daily basis 💪

@kevinjqliu
Contributor Author

Seems like we have a way forward for nightly build. We can run the Pypi upload on a GitHub action nightly cron.

I want to take a step back and talk about the general release process. I want to figure out how to reduce the burden of the release process so that we can release at a faster cadence.
The release instructions document several steps of the process:

  • set git tag
  • sign files with GPG and upload to SVN
  • upload to Pypi
  • email devlist about new release

Are these steps all necessary to release a new version?
Is there room for automation similar to the Pypi automation above?
Can we use the Github Release process somehow?

@djouallah

Any news on this? Currently pyiceberg is broken with Polaris, and I would like to use the latest update that fixes it.

@kevinjqliu
Contributor Author

@djouallah I'll take another look at the nightly build. But we're in the process of releasing 0.8.0; it's in the voting stage: https://lists.apache.org/thread/0xcw56z1bpldypm7pv92h70fhhq0qgfq

@kevinjqliu
Contributor Author

FYI https://lists.apache.org/thread/oowhcfwv3fcjzdzm76tbn99k5q84mr75
One step closer to nightly build


5 participants