Description
As part of an effort to improve the reliability and reduce the complexity of Go releases, we should automate more of the release process.
A possible solution is a hosted release management tool (relui) that is responsible for scheduling and operating the release, much as releasebot handles this process on the CLI.
We can improve the observability of the process by using a persistent web UI for coordinating releases.
As a proof of concept, relui acts as the coordinator (and currently the task executor) for release tasks.
Rough Architecture
For the proof of concept, there will be a single application for both starting and running workflows. However, careful API boundaries should be maintained between the scheduler process and the runner processes to allow for separation in the future.
In general, the runner process should only know how to execute specific tasks without broader context, and report back to the scheduler. Tasks should have an associated type, and the runner should decide whether or not it has an implementation for a given task type.
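The runner behavior described above can be sketched as a registry keyed by task type. This is a minimal illustration only: the Task, Handler, and Runner names, and the "SignRelease" type, are hypothetical and not taken from relui.

```go
package main

import (
	"errors"
	"fmt"
)

// Task is a hypothetical, minimal view of a task as seen by the runner.
type Task struct {
	Type string
	Name string
}

// Handler executes one task and returns the URL of the artifact it produced.
type Handler func(t Task) (artifactURL string, err error)

// ErrUnsupported reports that the runner has no implementation for a task type.
var ErrUnsupported = errors.New("unsupported task type")

// Runner maps task types to their implementations.
type Runner struct {
	handlers map[string]Handler
}

// NewRunner returns a Runner with no registered task types.
func NewRunner() *Runner {
	return &Runner{handlers: make(map[string]Handler)}
}

// Register associates a task type with its implementation.
func (r *Runner) Register(taskType string, h Handler) {
	r.handlers[taskType] = h
}

// Run executes t if its type is registered, and returns ErrUnsupported otherwise.
func (r *Runner) Run(t Task) (string, error) {
	h, ok := r.handlers[t.Type]
	if !ok {
		return "", fmt.Errorf("task %q (type %q): %w", t.Name, t.Type, ErrUnsupported)
	}
	return h(t)
}

func main() {
	r := NewRunner()
	r.Register("FetchGitSource", func(t Task) (string, error) {
		// Placeholder implementation: a real handler would fetch and tar the source.
		return "file:///tmp/" + t.Name + ".tar.gz", nil
	})

	url, err := r.Run(Task{Type: "FetchGitSource", Name: "fetch_source"})
	fmt.Println(url, err)

	_, err = r.Run(Task{Type: "SignRelease", Name: "sign"})
	fmt.Println(errors.Is(err, ErrUnsupported))
}
```

Keeping the lookup inside the runner means the scheduler never needs to know which runner implements which task type, which fits the API boundary described above.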
Tasks, or BuildableTasks, have a shared configuration. All tasks have a type, a name, an optional task name to depend on, and an artifact URL. If a task depends on another named task, it will be provided with that task's artifact URL upon starting.
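As a rough sketch of that shared configuration, the struct below carries the four fields listed above, and a helper resolves the artifact URL of the task being depended on. The field names, the InputURL helper, the "MakeBash" type, and the example bucket URL are assumptions for illustration.

```go
package main

import "fmt"

// BuildableTask mirrors the shared task configuration described above.
// Field names are illustrative and not taken from relui.
type BuildableTask struct {
	Type        string
	Name        string
	DependsOn   string // optional: name of the task this one depends on
	ArtifactURL string // set once the task has produced its output
}

// InputURL returns the artifact URL that t should be started with.
// It returns "" for a task with no dependency, and an error if the
// dependency is unknown or has not produced an artifact yet.
func InputURL(t BuildableTask, byName map[string]BuildableTask) (string, error) {
	if t.DependsOn == "" {
		return "", nil
	}
	dep, ok := byName[t.DependsOn]
	if !ok {
		return "", fmt.Errorf("task %q depends on unknown task %q", t.Name, t.DependsOn)
	}
	if dep.ArtifactURL == "" {
		return "", fmt.Errorf("task %q: dependency %q has no artifact yet", t.Name, dep.Name)
	}
	return dep.ArtifactURL, nil
}

func main() {
	tasks := map[string]BuildableTask{
		"fetch_source": {
			Type:        "FetchGitSource",
			Name:        "fetch_source",
			ArtifactURL: "gs://example-bucket/go-src.tar.gz", // hypothetical location
		},
		"make_bash": {Type: "MakeBash", Name: "make_bash", DependsOn: "fetch_source"},
	}
	in, err := InputURL(tasks["make_bash"], tasks)
	fmt.Println(in, err)
}
```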
Finally, workflows should be described through a configuration, in order to allow us to share steps between workflows, and separate concerns between the implementation and the definition of all steps that must be completed for a release. This is especially important for steps that may need to run on different platforms (outside of our tooling on GCP, such as the signing process).
Rough initial workflow for a local workstation release:
Clone repo @ ref
In:
- Repo
- Ref
Out:
- Tarball of Go src
- Tarball URI
Run make.bash
In:
- Tarball URI of Go src
Out:
- Tarball URI of Go src after make
Build Race Detector
In:
- Tarball URI of Go src after make
Out:
- Tarball URI of Go src after race build
Clean (version.cache, pkg/bootstrap, race for other GOOS/GOARCH, pkg/GOOS_GOARCH/cmd)
In:
- Tarball URI of Go src after race build
Out:
- Tarball URI of Go src after cleanup
Run all.bash
In:
- Tarball URI of Go src after cleanup
Out:
- Tarball URI of Go src after all.bash
Finalize
In:
- Tarball URI of Go src after cleanup (not all.bash)
Out:
- Tarball URI of binary release
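One way the workflow above could be expressed as configuration is a list of steps, each naming the step whose output it consumes. The sketch below is an assumption-heavy illustration: the Step type, the step names, and every task type other than FetchGitSource are hypothetical. It also shows that Finalize depends on the cleanup step rather than on all.bash.

```go
package main

import (
	"errors"
	"fmt"
)

// Step is a hypothetical configuration entry for one workflow step.
type Step struct {
	Name      string
	Type      string
	DependsOn string // name of the step whose artifact is this step's input
}

// localRelease describes the local workstation release workflow.
var localRelease = []Step{
	{Name: "clone", Type: "FetchGitSource"},
	{Name: "make", Type: "MakeBash", DependsOn: "clone"},
	{Name: "race", Type: "BuildRaceDetector", DependsOn: "make"},
	{Name: "clean", Type: "Clean", DependsOn: "race"},
	{Name: "all", Type: "AllBash", DependsOn: "clean"},
	{Name: "finalize", Type: "Finalize", DependsOn: "clean"},
}

// Order returns step names in an order where every step comes after the
// step it depends on. It fails on a missing dependency, a duplicate name,
// or a dependency cycle.
func Order(steps []Step) ([]string, error) {
	done := make(map[string]bool, len(steps))
	order := make([]string, 0, len(steps))
	for len(order) < len(steps) {
		progressed := false
		for _, s := range steps {
			if done[s.Name] {
				continue
			}
			if s.DependsOn == "" || done[s.DependsOn] {
				done[s.Name] = true
				order = append(order, s.Name)
				progressed = true
			}
		}
		if !progressed {
			return nil, errors.New("workflow has a missing dependency, a duplicate name, or a cycle")
		}
	}
	return order, nil
}

func main() {
	order, err := Order(localRelease)
	fmt.Println(order, err)
}
```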
Tasks (remaining as of 2020-10-05)
- Subscribe to a PubSub Topic
- Publish status upon accepting a task
- Run FetchGitSource task
- Create Status API
- Datastore Integration
relworker
(bootstrap worker) Subscribe to a PubSub Topic
A worker can connect to PubSub and subscribe to the configured topic. Messages should be consumed using the Receive API, which spawns goroutines to handle messages and automatically extends the Ack deadline while a message is being processed.
For now, we can just log that we received the message and whether we can handle it, then Ack it.
See: https://pkg.go.dev/cloud.google.com/go/pubsub#Subscription.Receive
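A minimal sketch of such a handler follows. To keep it runnable without GCP credentials, it is written against a small acker interface instead of the real client; *pubsub.Message has an Ack method and a Data field, so it could be passed in directly. The JSON payload shape and all names here are assumptions, not relui's actual message format.

```go
package main

import (
	"encoding/json"
	"log"
)

// acker is the subset of *pubsub.Message that this handler needs.
type acker interface {
	Ack()
}

// taskMessage is a hypothetical JSON payload for a task message.
type taskMessage struct {
	Type string `json:"type"`
	Name string `json:"name"`
}

// handleMessage logs the received task and whether this worker supports its
// type, then acks the message. It reports whether the task type is supported.
//
// With the real client it could be wired up roughly as:
//
//	err := sub.Receive(ctx, func(ctx context.Context, m *pubsub.Message) {
//		handleMessage(m.Data, m, supported)
//	})
func handleMessage(data []byte, m acker, supported map[string]bool) bool {
	var t taskMessage
	if err := json.Unmarshal(data, &t); err != nil {
		// A malformed payload will not improve on redelivery, so ack it.
		log.Printf("acking malformed message: %v", err)
		m.Ack()
		return false
	}
	ok := supported[t.Type]
	log.Printf("received task %q (type %q); can handle: %t", t.Name, t.Type, ok)
	m.Ack()
	return ok
}

// fakeMessage stands in for a Pub/Sub message in this local demo.
type fakeMessage struct{ acked bool }

func (f *fakeMessage) Ack() { f.acked = true }

func main() {
	m := &fakeMessage{}
	payload := []byte(`{"type":"FetchGitSource","name":"fetch_source"}`)
	ok := handleMessage(payload, m, map[string]bool{"FetchGitSource": true})
	log.Printf("supported=%t acked=%t", ok, m.acked)
}
```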
Publish status on accepting a task
When the worker picks up a task, it should update the status of the task in relui as started.
Run FetchGitSource task
The FetchGitSource task should fetch the specified Git repo at the configured ref. The task should tar the source, report the artifact URL to relui, and mark itself as complete.
Gitiles provides a +archive URL that can be used for this task:
Web: https://go.googlesource.com/go/+archive/refs/heads/master
Archive: https://go.googlesource.com/go/+archive/refs/heads/master.tar.gz
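The sketch below shows one way to build the archive URL in the form listed above and download it with net/http. The function names are hypothetical, and the demo in main downloads from a local test server rather than from go.googlesource.com so that it runs offline.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"os"
	"path/filepath"
	"strings"
)

// archiveURL builds a Gitiles +archive URL for a repo and a full ref name,
// for example "refs/heads/master".
func archiveURL(repoURL, ref string) string {
	return strings.TrimSuffix(repoURL, "/") + "/+archive/" + ref + ".tar.gz"
}

// fetchArchive downloads url into dest and returns the number of bytes written.
func fetchArchive(ctx context.Context, client *http.Client, url, dest string) (int64, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return 0, err
	}
	resp, err := client.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return 0, fmt.Errorf("fetching %s: unexpected status %s", url, resp.Status)
	}
	f, err := os.Create(dest)
	if err != nil {
		return 0, err
	}
	n, err := io.Copy(f, resp.Body)
	// Report a close error only if the copy itself succeeded.
	if cerr := f.Close(); err == nil {
		err = cerr
	}
	return n, err
}

func main() {
	fmt.Println(archiveURL("https://go.googlesource.com/go", "refs/heads/master"))

	// Local stand-in for the Gitiles server, so the demo needs no network access.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, "fake tarball bytes")
	}))
	defer srv.Close()

	dest := filepath.Join(os.TempDir(), "relworker-demo-src.tar.gz")
	defer os.Remove(dest)
	n, err := fetchArchive(context.Background(), srv.Client(), srv.URL, dest)
	fmt.Println(n, err)
}
```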
Handle non-transient errors on FetchGitSource
If a permanent error occurs when executing a task, the message should be Acked to prevent retries, and a terminal status for the task should be reported back to relui.
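A minimal sketch of that decision follows, assuming task implementations mark non-retryable failures by wrapping them in a dedicated error type. The Permanent wrapper, the status names, and the example errors are all hypothetical. The message interface matches the Ack and Nack methods that *pubsub.Message provides.

```go
package main

import (
	"errors"
	"fmt"
)

// message is the subset of *pubsub.Message needed to settle a delivery.
type message interface {
	Ack()
	Nack()
}

// permanentError marks a failure that retrying cannot fix.
type permanentError struct{ err error }

func (e *permanentError) Error() string { return e.err.Error() }
func (e *permanentError) Unwrap() error { return e.err }

// Permanent wraps err so that IsPermanent reports true for it.
func Permanent(err error) error { return &permanentError{err: err} }

// IsPermanent reports whether err, or any error it wraps, is permanent.
func IsPermanent(err error) bool {
	var p *permanentError
	return errors.As(err, &p)
}

// Hypothetical status values reported back to relui.
const (
	StatusComplete = "complete"
	StatusFailed   = "failed"
	StatusRetry    = "retrying"
)

// Example errors used by the demo below.
var (
	errRefNotFound = errors.New("ref not found")
	errTimeout     = errors.New("timeout talking to git host")
)

// settle acks or nacks m based on the task result and returns the status to report.
func settle(m message, err error) string {
	switch {
	case err == nil:
		m.Ack()
		return StatusComplete
	case IsPermanent(err):
		m.Ack() // ack so that the message is not redelivered
		return StatusFailed
	default:
		m.Nack() // transient failure: ask for redelivery
		return StatusRetry
	}
}

// fakeMessage records how a delivery was settled.
type fakeMessage struct{ acked, nacked bool }

func (f *fakeMessage) Ack()  { f.acked = true }
func (f *fakeMessage) Nack() { f.nacked = true }

func main() {
	m := &fakeMessage{}
	err := fmt.Errorf("fetching source: %w", Permanent(errRefNotFound))
	fmt.Println(settle(m, err), m.acked, m.nacked)
}
```

Using errors.As means a permanent error is still recognized after callers add context with %w.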
relui
Create Status API (gRPC server)
A worker can communicate the status of a task back to the coordinator as it progresses through the task. The initial API should at least be able to mark a task as started.
Options for hosting the gRPC server:
- Host gRPC and HTTPS on the same port (with some caveats)
- Host gRPC on same service but different port as HTTPS relui web
- Use separate instances of relui for gRPC
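Independent of which hosting option is chosen, the coordinator-side logic behind a "mark as started" call could look roughly like the sketch below. It is transport-agnostic: a gRPC method would delegate to it. The StatusService type, the status values, and the choice to make MarkStarted idempotent (so that a redelivered message does not cause an error) are assumptions.

```go
package main

import (
	"fmt"
	"sync"
)

// Status is a hypothetical task status tracked by the coordinator.
type Status string

const (
	StatusPending Status = "pending"
	StatusStarted Status = "started"
)

// StatusService holds task statuses in memory and is safe for concurrent use.
type StatusService struct {
	mu    sync.Mutex
	tasks map[string]Status
}

// NewStatusService returns a service that tracks the named tasks as pending.
func NewStatusService(taskNames ...string) *StatusService {
	s := &StatusService{tasks: make(map[string]Status, len(taskNames))}
	for _, n := range taskNames {
		s.tasks[n] = StatusPending
	}
	return s
}

// MarkStarted records that a worker has picked up the named task.
// Calling it again for the same task is a no-op; unknown tasks are an error.
func (s *StatusService) MarkStarted(name string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, ok := s.tasks[name]; !ok {
		return fmt.Errorf("unknown task %q", name)
	}
	s.tasks[name] = StatusStarted
	return nil
}

// Status returns the current status of the named task, if it is known.
func (s *StatusService) Status(name string) (Status, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	st, ok := s.tasks[name]
	return st, ok
}

func main() {
	s := NewStatusService("fetch_source")
	fmt.Println(s.MarkStarted("fetch_source"))
	fmt.Println(s.Status("fetch_source"))
	fmt.Println(s.MarkStarted("not_a_task"))
}
```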
Datastore integration
Currently, relui commits state to disk. It should instead use a persistent database, such as Datastore, so that multiple instances can share state.