Package registration process #849
Comments
No: unfortunately this requires full write access to the repo: https://developer.github.com/v3/apps/permissions/#permission-on-contents As part of step 2 or 4, you will probably also want to ensure the Project.toml file is up-to-date: it is possible to ask for restricted access to just that file: https://developer.github.com/v3/apps/permissions/#permission-on-single-file |
I think the steps here make sense as a long-term goal to strive for but aiming for all of this as a start feels a bit ambitious. I think we should start off with something as similar as possible to the current workflow we have. That would be the following:
A git tag. That is all.
CIBot runs on the registry PR. The long-term goal is to move this to be report-based.
If CIBot + diff looks ok, merge. Otherwise, delete tag and redo the process. My goal with this is to be able to remove METADATA as quickly as possible and with as little work as possible. I think it is fair to say that we are extremely starved for work when it comes to these things and we should have that in mind when deciding what to do next. |
I know it's early to start bikeshedding, but this sounds like Package version registration process, as opposed to Package registration. To me, the latter implies new packages, no? Or is the idea that there won't be much daylight between registering a new version and registering a new package? It seems like there are a couple of additional steps for a new package, though I suppose a lot of things (like checking for name conflicts etc) are less critical with the new system. Anyway, I agree with @KristofferC w/r/t getting moving as quickly as possible except where moving too quickly blocks paths for future improvement. I continue to love Pkg3, thanks for all the hard (and thoughtful) work! Edit: aaaaand I now see that your message on slack was to specific people, not a general call. Sorry for butting in 😳 |
No worries, I wanted feedback from anyone, but wanted to make sure specific people saw it.
Registering versions of packages was what was intended. Registering a package is just registering the first version of it. We can have a rule like the first version always requires manual registry manager approval, which would give a chance to review the name and license and whatever. |
Personally, once I can see a big picture that makes sense to me and I know what each piece needs to do and be, filling in the parts becomes easier. These parts don't need to be very complex or do very much initially. The key is to get the skeleton in place so that when we want to add features, there's a clear and well-defined place to do it. Getting the shape of the thing wrong for the sake of getting something out there a bit faster strikes me as a false economy.
We already know that tagging first has serious problems. People don't and won't delete tags, and git intentionally makes tag deletion quite difficult. Saying "A git tag. That is all." is also sweeping under the rug things you still need such as the repo in question. A tag is just git's way of associating a symbolic version name with a commit. Why is making an API request with a repo and a tag name simpler than sending an API request with a repo, a version number and a tree hash? They're both equally easy for a bot to deal with; the version number + tree hash can be retried easily if it fails without screwing around with deleting and retagging in git (which you're not really supposed to do: once a tag is public it's supposed to be written in stone, which is why git makes deleting them so annoying and difficult).
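To make the comparison concrete, here is a minimal sketch of what a request carrying a repo, a version number and a tree hash could look like. The class name, field names and the `retry_with_new_tree` helper are hypothetical, chosen only for illustration; nothing here is an existing API, and the 40-character check assumes SHA-1 object ids.

```python
from dataclasses import dataclass, asdict
import json
import re


# Hypothetical request shape; the field names are illustrative only.
@dataclass(frozen=True)
class RegistrationRequest:
    repo: str       # clone URL of the package repository
    version: str    # proposed version number, e.g. "1.2.3"
    tree_hash: str  # git tree hash of the proposed version (assumed SHA-1)

    def validate(self) -> None:
        # Reject anything that is not a plain major.minor.patch version.
        if not re.fullmatch(r"\d+\.\d+\.\d+", self.version):
            raise ValueError(f"not a major.minor.patch version: {self.version!r}")
        # Reject anything that is not a 40-character lowercase hex id.
        if not re.fullmatch(r"[0-9a-f]{40}", self.tree_hash):
            raise ValueError(f"not a 40-character hex tree hash: {self.tree_hash!r}")

    def to_json(self) -> str:
        return json.dumps(asdict(self), sort_keys=True)


def retry_with_new_tree(req: RegistrationRequest, new_tree_hash: str) -> RegistrationRequest:
    """Build a retry for the same repo and version with a different tree hash.

    Nothing in the package repository has to be deleted or rewritten.
    """
    retried = RegistrationRequest(req.repo, req.version, new_tree_hash)
    retried.validate()
    return retried
```

Under these assumptions a failed attempt is retried by sending the same version with a new tree hash, whereas a tag-based request would first need the old tag removed.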
All that a "report" means is "the output of whatever check processes we run". Initially it can just be the output of CIBot. However, in the future when we want to check more things, this is where we put those checks and the report includes the output of whatever verification processes we run. Perhaps you've interpreted my outline as a document of all the things we need to do in the first version? That's not the intention: it's an outline of how all the things we eventually want to do fit into a coherent process. The initial version will be a minimal subset of this outline.
This can be in the form of a PR, the merging of which triggers the rest of the process. The key point is that it should require package maintainer approval and mostly not require any manual intervention by a registry manager. The main simplification of your three-step proposal is by not putting tagging between review and registration. This allows the review to be done on a registration PR, which is nice and is how we do it today. That does seem like a good simplification and collapses the review and register steps into one step since review approval is indicated by merging the PR which is the registration step. I still think it's better to put tagging at the end once we know that a version is valid, correct and registered. I originally had it after registration but then started thinking "but what if the package maintainer doesn't do the tagging correctly?" But that's probably a silly worry. We can give package maintainers the option of either giving the bot write access and letting it create the tag or there can be a manual alternative if they don't want to give write access. If they don't do it right, it can be fixed. Honestly, it hardly even matters if tags get created at all—they really only exist for convenience when working with git tools in a cloned package repo. |
Thinking further out, I think we might want to further separate this into official "channels". For example, common channels might include:
Where the first two imply a different rate of updates, and strictness of criteria—and corresponding reliability differences—and the third largely bypasses step 3 (human review). |
I'm not really sure how that fits into this registration process proposal? |
I think of the tag as the canonical record of the existence of the version, as it is under the direct control of the project. The registration process is simply an external reference to that existing fact (and not the other way around). Similar to how the canonical copy of other artifacts is project-local (url, manifest, dependency list, uuid), but can then also get duplicated across an arbitrary set of other registries, webpages, applications, git repos, etc. |
By "thinking further out", I meant to imply this was a list of non-goals for the current implementation (which is only intended to cover the primary use case—directly replacing the functionality of METADATA.jl—and not yet providing new features). |
Yes, and this is why tagging has to happen after version validation, not before. |
I agree, unfortunately it is kind of hard to fit that into the GitHub model (they really need a "Release Request" workflow). |
FWIW I tried to outline a protocol independently and I came up with essentially the same thing:
|
I think we can have three basic components for this whole thing: Registration bot
Merge bot
Tag bot
The system is already usable with just the registration bot. The merge bot alleviates the need for registry managers to manually merge all registration PRs: ones that pass checks and are approved by the package maintainer are merged automatically. The tag bot is only necessary to eliminate the need for package maintainers to manually tag versions. |
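As a rough illustration of the merge bot's rule (merge automatically when checks pass and the package maintainer has approved), here is a small sketch. The `RegistrationPR` type, its fields and the returned labels are hypothetical. The `is_first_version` branch reflects the rule floated earlier in the thread that a first version could always require manual registry manager approval; treat it as an assumption, not a settled policy.

```python
from dataclasses import dataclass


# Hypothetical summary of a registration PR; not a real GitHub API object.
@dataclass
class RegistrationPR:
    checks_passed: bool
    maintainer_approved: bool
    is_first_version: bool = False  # assumed flag for a brand new package


def merge_decision(pr: RegistrationPR) -> str:
    """Return what the merge bot would do with this PR."""
    if not pr.checks_passed:
        return "blocked"              # automated checks failed
    if not pr.maintainer_approved:
        return "awaiting-maintainer"  # checks fine, no approval yet
    if pr.is_first_version:
        return "manual-review"        # assumed rule: new packages get a human look
    return "auto-merge"
```

Without a merge bot, every PR would simply end up in the hands of a registry manager, which is the "already usable" baseline described above.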
From a maintainer point of view, I think the current AttoBot based workflow is pretty good, the only real place it falls down is having to rewrite git tags. This is definitely an abuse of git and can easily (and by design) result in different users having the same tag pointing to different commits. However, I'd like to point out that deleting or writing new tags with a unique name is not such a big deal. We could drive the workflow off a simple release candidate tag naming convention as follows:
The benefits of this are that it is all "native git", and I believe uses tags as they are meant to be used, unlike our current workflow. It also allows maintainers to have complete control over their own tags which is great for people like myself who like to write release notes in the tag annotation. |
Would the "bots" be Github-specific? Can the process be later generalized to other (public) repository hosting services, like Gitlab or Bitbucket? |
There's no inherent reason the release automation bot would be github specific. For example all of bitbucket, gitlab and gitea have webhooks for git pushes so integration with those systems would presumably be a "simple matter of work". With a purely tag-based workflow, presumably even a plain git server could be integrated with the release bot using a few standard git hooks POSTing to the bot API. |
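For the plain git server case, a sketch of what such a hook might do: a `post-receive` hook receives one line per updated ref on standard input, in the form `<old-id> <new-id> <refname>`, so newly pushed tags can be picked out and POSTed to the bot. The payload shape and the `notify_bot` function are assumptions made for illustration; no bot API with this shape is known to exist.

```python
import json
import urllib.request


def is_null_id(object_id: str) -> bool:
    # git reports an all-zero id when a ref is being created or deleted.
    return set(object_id) == {"0"}


def pushed_tags(hook_input: str) -> list:
    """Parse post-receive input lines of the form '<old-id> <new-id> <refname>'."""
    events = []
    for line in hook_input.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue
        old, new, ref = parts
        if not ref.startswith("refs/tags/") or is_null_id(new):
            continue  # ignore branches and tag deletions
        events.append({
            "tag": ref[len("refs/tags/"):],
            # For an annotated tag this is the tag object id, not the commit id.
            "object": new,
            "created": is_null_id(old),
        })
    return events


def notify_bot(bot_url: str, repo_url: str, events: list) -> int:
    """POST the tag events as JSON to a hypothetical bot endpoint."""
    body = json.dumps({"repo": repo_url, "tags": events}).encode()
    request = urllib.request.Request(
        bot_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

A hook script would read `sys.stdin`, pass the text to `pushed_tags`, and call `notify_bot` only when the list is non-empty.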
The rc-tag based workflow had occurred to me as well but it seems like potentially a lot of unnecessary tags littering the repo. Note also that this workflow requires the bot to maintain state whereas the other workflow does not. It's worth considering but it's a pretty large departure from the PR-based workflow that we have already been using and which is discussed above. Note that if you want to write tag notes, then you can still do so in a PR-based workflow by tagging manually. |
Well, the rc-related tags could be deleted after release by maintainers who don't like them. It's ok to delete tags from a repository, it's just not ok to delete them and replace them by another tag with the same name. I'm not sure where this requires the bot to maintain state? Let me restate the order of events from the bot's point of view:
Seems to me like the PR on the registry can still be used to hold the state. The main difference here is that we'd be driving the process via immutable tags, rather than mutable github releases. I think? |
Why not use a release branch instead of the release tags? Git branches are much more fluid than tags so I think it fits well with the release workflow. This way, the registration bot can watch branches (say) |
That's a pretty good idea, it would be super easy to use as a maintainer and also addresses Stefan's worry about having a lot of useless tags lying around. So this would address the pre-release workflow very nicely. What about finalizing the release and creating the tag? Some ideas:
[edited for clarity] |
It probably is a dumb question, but why would you need git tag or github release? Since the registry is the source of truth, git tag and github release are not required for Pkg.jl to work, right? It certainly is nice to have git tags to be a well-behaving git repository. But wouldn't it make sense to exclude it from the minimal requirement for the registration process, if it's not required? Even if it's excluded from the registration process, I think we can have something like (say) |
The lack of tags could be an issue if we want to support downloading tarballs from arbitrary git servers: by default they restrict the downloading of archives that aren't pointed to by a ref (i.e. a current branch or tag), e.g. https://git-scm.com/docs/git-upload-archive#_security This isn't an issue with GitHub, as it doesn't even support |
You don't, strictly speaking. But you need some way for the package maintainer to signal that they've reviewed the release. And it would be extremely useful if the registration process ensured that versions were reflected natively and reliably in git tags across the ecosystem. |
@simonbyrne Thanks. I didn't know about @c42f Yes, I totally agree that having git tags is a good practice. I'm just suggesting to exclude it from the hard requirements since the permission requirement in github complicates the implementation of the process. Regarding the "way for the package maintainer to signal that they've reviewed the release", I think by requesting the registration the intent is pretty clear (= "please register it if there is no problem"). Is there a reason to have human intervention between the approval (review) and merging the registration PR? The package maintainer has to fix the problem after a rejection, but why not just go ahead and merge the PR if the registry manager (or a bot) says yes? |
Yes, I wondered about that. It would be nice if the registration process could be completely fire-and-forget in the case that all tests pass, given sufficiently good tests. Having the tagging bot do the work could be optional so if you want a fire-and-forget process, you can give the tagging bot permission to do the tag for you once the tests pass. If you don't want to give it permission just write the tag yourself. If you have your own infrastructure you can run your own tagging bot. etc. |
On the other hand, the release bot can provide valuable insight in its report which is not easily available on the developer's local machine when they create the release candidate branch/tag. (eg, interactions with other packages in the ecosystem). |
I had the impression that CIBot is limited by its computing resource at the moment (e.g., by reading https://discourse.julialang.org/t/the-current-metadata-release-process/16672/15). I thought you'd want to avoid people casually invoking CIBot for, e.g., testing an alpha version. (Though it would be great if that's OK.) Thinking about the tag-invoked release model, it's probably not so hard to use if the registration bot directly responds to the push of a tag of the right format. Then re-registration after check/review failure can be done by just a few git commands (instead of manual re-release in github UI). Then creating |
I also started implementing the core process, which I'm calling Server is at the center
HTTP request: PkgDev client ==> register server
register server:
register server ==> GitHub/GitLab API
HTTP reply: register server ==> PkgDev client
register server: queue up registration checks
register server ==> GitHub/GitLab API
register server ==> CI server (request checks)
CI server: run checks
CI server ==> register server (check response)
register server ==> GitHub/GitLab API
register server ==> GitHub/GitLab API
Process that pulls tags from each repo with an open PR
Authentication
Maybe kick off with CheckRunEvent. Or maybe kick off with a PR that changes the version number in the project file?
|
@vchuravy has asked
Which is a good question and one we should keep in mind. |
Possible process for triggering Registrator from discussion on triage:
|
Unless I'm doing it the hard way, currently the process of tagging a new release for one's own private registries is a bit painful and error-prone (e.g., I forget to update the version number in the I'm guessing there's some script somewhere for mapping new METADATA.jl tags to changes to the General registry. Is it possible to get one's hands on that? |
The script that runs in a loop and converts METADATA to General is here: https://github.com/JuliaLang/Pkg.jl/blob/master/bin/loop.sh It calls lots of stuff in the |
At Invenia we do all operations on our METADATA fork and convert to registry format every time we sync with upstream METADATA or perform any operations on our fork. All the steps on the registry are done automatically through GitLab CI pipelines and @StefanKarpinski's code mixed with scripts that @ararslan wrote. |
It is now checked, which can lead to rejections upon tagging new releases. Cf. https://discourse.julialang.org/t/version-of-package-version-of-documentation-version-of-project-toml/20582/3 JuliaLang/Pkg.jl#849 JuliaLang/Pkg.jl#351
WIP implementation of a registration server: https://github.com/JuliaComputing/Registrator.jl |
Seems like this can be closed now? |
Before we can move from METADATA being the source of truth for registered packages to JuliaRegistries/General being the source of truth, we need a process and infrastructure for registering packages. Here's an outline of some of my thoughts on what the process should look like.
1. Request
Who: package maintainer
Package maintainer proposes a new version tag
Possible data to include:
Or maybe just one of patch, minor or major, and version number is automatic based on existing version numbers. This is almost certainly too much data but I wanted to mention all the possible info one might want. Ideally we'd like to automate as much of this as possible from the repo.
2. Check
Who: automated
Automated validation of whether a proposed version tag is acceptable.
Check that:
Produces report of results
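The patch/minor/major idea from step 1 implies that the version number can be derived from the existing ones, which an automated step could compute. A minimal sketch, assuming plain `major.minor.patch` versions with no pre-release or build suffixes, and assuming the bump is always applied to the highest existing version; both function names are hypothetical.

```python
def parse_version(text: str) -> tuple:
    """Turn '1.2.3' into (1, 2, 3); raises ValueError on anything else."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def next_version(existing: list, bump: str) -> str:
    """Compute the next version from the highest existing one.

    Versions are compared as integer tuples, so '0.10.0' sorts above '0.9.0'.
    With no existing versions the starting point is taken to be 0.0.0.
    """
    major, minor, patch = max(
        (parse_version(v) for v in existing), default=(0, 0, 0)
    )
    if bump == "major":
        new = (major + 1, 0, 0)
    elif bump == "minor":
        new = (major, minor + 1, 0)
    elif bump == "patch":
        new = (major, minor, patch + 1)
    else:
        raise ValueError(f"bump must be 'patch', 'minor' or 'major', got {bump!r}")
    return ".".join(str(part) for part in new)
```

Patch releases on older release lines would need a different rule than "bump the highest version", which this sketch deliberately leaves out.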
3. Review
Who:
Maintainer review:
If maintainer accepts, go to registry manager review:
4. Tagging
Who:
Propagate git tag to package repo
Can we use the github API to create tags automatically?
What would the workflow be for non-github packages?
The tagged tree also needs to be publicly accessible but I think tagging guarantees that in git. If the tag does not match the version approved in the review step above then the version is not properly tagged and the rest of the process is blocked until the tag is fixed.
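The "tag matches the approved version" condition can be checked by resolving the tag to its tree and comparing that with the tree hash approved in review. A sketch under the assumption that the package repo is available as a local clone; the function names and status labels are hypothetical, while `git rev-parse --verify --quiet` and the `^{tree}` suffix are standard git.

```python
import subprocess


def tag_tree_hash(repo_dir: str, tag: str):
    """Return the tree hash the tag resolves to, or None if the tag is absent."""
    result = subprocess.run(
        ["git", "-C", repo_dir, "rev-parse", "--verify", "--quiet",
         f"refs/tags/{tag}^{{tree}}"],  # peels the tag down to its tree object
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip()


def tagging_status(approved_tree: str, actual_tree) -> str:
    """Compare the approved tree hash with what the tag points at."""
    if actual_tree is None:
        return "missing"   # tag not created yet; process waits
    if actual_tree == approved_tree:
        return "ok"        # registration may proceed
    return "mismatch"      # blocked until the tag is fixed
```

Comparing tree hashes rather than commit hashes means the check depends only on the tagged content, not on which commit carries it.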
5. Registration
Who: automated
Once a new version has been approved and tagged, it can finally be registered. This consists of making the appropriate updates to the registry repo. This will be completely automatic.