x/build/cmd/relui: create proof of concept for release automation #40279

Closed
@toothrot

As part of an effort to improve the reliability and reduce the complexity of Go releases, we should automate more of the release process.

A possible solution is a hosted release management tool (relui) that is responsible for the scheduling and operating of the release, much as releasebot handles this process on the CLI. We can improve the observability of the process by using a persistent web UI for coordinating releases.

As a proof of concept, relui acts as the coordinator (and currently the task executor) for release tasks.

Rough Architecture

For the proof of concept, there will be a single application for both starting and running workflows. However, careful API boundaries should be maintained between the scheduler process and the runner processes to allow for separation in the future.

In general, the runner process should only know how to execute specific tasks without broader context, and report back to the scheduler. Tasks should have an associated type, and the runner should decide whether or not it has an implementation for a given task type.
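A minimal sketch of how a runner might decide whether it can execute a task, assuming task types are plain strings. The `Task`, `Runner`, and `ErrUnsupported` names are hypothetical and are not taken from relui.

```go
package main

import (
	"errors"
	"fmt"
)

// Task is a hypothetical, minimal task description. Only Type matters here.
type Task struct {
	Type string
	Name string
}

// ErrUnsupported reports that the runner has no implementation for a task type.
var ErrUnsupported = errors.New("unsupported task type")

// Runner maps task types to implementations. It holds no workflow context.
type Runner struct {
	impls map[string]func(Task) error
}

// NewRunner returns a Runner with no registered implementations.
func NewRunner() *Runner {
	return &Runner{impls: make(map[string]func(Task) error)}
}

// Register associates an implementation with a task type.
func (r *Runner) Register(taskType string, fn func(Task) error) {
	r.impls[taskType] = fn
}

// CanRun reports whether the runner has an implementation for t's type.
func (r *Runner) CanRun(t Task) bool {
	_, ok := r.impls[t.Type]
	return ok
}

// Run executes t, or returns an error wrapping ErrUnsupported.
func (r *Runner) Run(t Task) error {
	fn, ok := r.impls[t.Type]
	if !ok {
		return fmt.Errorf("%w: %q", ErrUnsupported, t.Type)
	}
	return fn(t)
}

func main() {
	r := NewRunner()
	r.Register("FetchGitSource", func(t Task) error {
		fmt.Println("running", t.Name)
		return nil
	})
	fmt.Println(r.CanRun(Task{Type: "FetchGitSource"}))
	fmt.Println(r.Run(Task{Type: "Sign", Name: "sign"}))
}
```

A runner that cannot handle a task type can decline it and leave the decision about what to do next to the scheduler.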

Tasks, or BuildableTasks, have a shared configuration. All tasks have a type, a name, an optional task name to depend on, and an artifact URL. If a task depends on another named task, it will be provided with that task's artifact URL upon starting.

Finally, workflows should be described through a configuration, in order to allow us to share steps between workflows, and separate concerns between the implementation and the definition of all steps that must be completed for a release. This is especially important for steps that may need to run on different platforms (outside of our tooling on GCP, such as the signing process).
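As a rough illustration of the shared task configuration and a configuration-driven workflow, here is a sketch that decodes a workflow from JSON and checks its dependencies. The JSON encoding, the field names, the `Workflow` type, and the task type strings are assumptions made for this example; the issue does not specify a configuration format.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// BuildableTask mirrors the shared task configuration described above.
// The JSON field names are assumptions made for this sketch.
type BuildableTask struct {
	Type        string `json:"type"`
	Name        string `json:"name"`
	DependsOn   string `json:"depends_on,omitempty"`
	ArtifactURL string `json:"artifact_url,omitempty"`
}

// Workflow is a hypothetical named list of tasks.
type Workflow struct {
	Name  string          `json:"name"`
	Tasks []BuildableTask `json:"tasks"`
}

// ParseWorkflow decodes a workflow and checks that task names are unique
// and that every dependency names a task in the same workflow.
func ParseWorkflow(data []byte) (*Workflow, error) {
	var w Workflow
	if err := json.Unmarshal(data, &w); err != nil {
		return nil, fmt.Errorf("decoding workflow: %w", err)
	}
	names := make(map[string]bool, len(w.Tasks))
	for _, t := range w.Tasks {
		if t.Type == "" || t.Name == "" {
			return nil, fmt.Errorf("task %+v needs both a type and a name", t)
		}
		if names[t.Name] {
			return nil, fmt.Errorf("duplicate task name %q", t.Name)
		}
		names[t.Name] = true
	}
	for _, t := range w.Tasks {
		if t.DependsOn != "" && !names[t.DependsOn] {
			return nil, fmt.Errorf("task %q depends on unknown task %q", t.Name, t.DependsOn)
		}
	}
	return &w, nil
}

// example uses made-up task type names.
const example = `{
  "name": "local-release",
  "tasks": [
    {"type": "FetchGitSource", "name": "fetch"},
    {"type": "MakeBash", "name": "make", "depends_on": "fetch"}
  ]
}`

func main() {
	w, err := ParseWorkflow([]byte(example))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, t := range w.Tasks {
		fmt.Printf("%s (%s) depends on %q\n", t.Name, t.Type, t.DependsOn)
	}
}
```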

Rough initial workflow for a local workstation release:

Clone repo @ ref
In:

  • Repo
  • Ref

Out:

  • Tarball of Go src
  • Tarball URI

Run make.bash
In:

  • Tarball URI of Go src

Out:

  • Tarball URI of Go src after make

Build Race Detector
In:

  • Tarball URI of Go src after make

Out:

  • Tarball URI of Go src after race build

Clean (version.cache, pkg/bootstrap, race for other GOOS/GOARCH, pkg/GOOS_GOARCH/cmd)
In:

  • Tarball URI of Go src after race build

Out:

  • Tarball URI of Go src after cleanup

Run all.bash
In:

  • Tarball URI of Go src after cleanup

Out:

  • Tarball URI of Go src after all.bash

Finalize
In:

  • Tarball URI of Go src after cleanup (not all.bash)

Out:

  • Tarball URI of binary release
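The chain of steps above can be sketched as a list of tasks in which each task receives the artifact URI of the task it depends on. The task names, the `run` helper, and the `exec` callback are hypothetical. Finalize takes the output of the cleanup step rather than of all.bash, as the workflow describes.

```go
package main

import "fmt"

// task is a hypothetical workflow step with at most one dependency.
type task struct {
	name      string
	dependsOn string
}

// localRelease lists the steps of the workflow in execution order.
var localRelease = []task{
	{name: "clone"},
	{name: "make.bash", dependsOn: "clone"},
	{name: "race", dependsOn: "make.bash"},
	{name: "clean", dependsOn: "race"},
	{name: "all.bash", dependsOn: "clean"},
	{name: "finalize", dependsOn: "clean"}, // not all.bash
}

// run executes tasks in list order, handing each one the artifact URI of
// its dependency. exec stands in for the real work: it receives the task
// name and input URI and returns the URI of the artifact it produced.
func run(tasks []task, exec func(name, inputURI string) (string, error)) (map[string]string, error) {
	artifacts := make(map[string]string)
	for _, t := range tasks {
		input := ""
		if t.dependsOn != "" {
			uri, ok := artifacts[t.dependsOn]
			if !ok {
				return nil, fmt.Errorf("task %q: dependency %q has not produced an artifact", t.name, t.dependsOn)
			}
			input = uri
		}
		out, err := exec(t.name, input)
		if err != nil {
			return nil, fmt.Errorf("task %q: %w", t.name, err)
		}
		artifacts[t.name] = out
	}
	return artifacts, nil
}

func main() {
	_, err := run(localRelease, func(name, in string) (string, error) {
		fmt.Printf("%-10s in=%q\n", name, in)
		return "file:///tmp/" + name + ".tar.gz", nil
	})
	if err != nil {
		fmt.Println("error:", err)
	}
}
```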

Tasks (remaining as of 2020-10-05)

  • Subscribe to a PubSub Topic
  • Publish status upon accepting a task
  • Run FetchGitSource task
  • Create Status API
  • Datastore Integration

relworker

(bootstrap worker) Subscribe to a PubSub Topic

A worker can connect to PubSub and subscribe to the configured topic. Messages should be received using the Receive API, which spawns goroutines to handle messages and automatically extends the Ack deadline while a message is being processed.

For now, we can just log that we received the message and can handle it, then Ack it.

See: https://pkg.go.dev/cloud.google.com/go/pubsub#Subscription.Receive
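The shape of such a handler can be sketched without a PubSub connection. In the sketch below, `message`, `receiver`, and `fakeSub` are hypothetical local stand-ins: `receiver` copies the callback shape of `Subscription.Receive` (a context plus a function called per message), but uses a local message type, so a real subscription would need a small adapter. The fake runs each callback in its own goroutine to imitate the concurrent delivery described above.

```go
package main

import (
	"context"
	"fmt"
	"log"
	"sync"
)

// message is a hypothetical stand-in for *pubsub.Message, reduced to the
// parts this sketch uses.
type message struct {
	ID   string
	Data []byte
	ack  func()
}

// Ack acknowledges the message.
func (m *message) Ack() { m.ack() }

// receiver has the same callback shape as Subscription.Receive, but with
// the local message type.
type receiver interface {
	Receive(ctx context.Context, f func(context.Context, *message)) error
}

// fakeSub delivers a fixed set of messages concurrently, then returns.
type fakeSub struct {
	msgs []*message
}

func (s *fakeSub) Receive(ctx context.Context, f func(context.Context, *message)) error {
	var wg sync.WaitGroup
	for _, m := range s.msgs {
		wg.Add(1)
		go func(m *message) {
			defer wg.Done()
			f(ctx, m)
		}(m)
	}
	wg.Wait()
	return ctx.Err()
}

// listen logs every message it receives and acks it.
func listen(ctx context.Context, r receiver) error {
	return r.Receive(ctx, func(ctx context.Context, m *message) {
		log.Printf("received message %s: %s", m.ID, m.Data)
		m.Ack()
	})
}

// ackedCount runs listen against a fake subscription holding n messages
// and returns how many were acked.
func ackedCount(n int) int {
	var mu sync.Mutex
	acked := 0
	sub := &fakeSub{}
	for i := 0; i < n; i++ {
		sub.msgs = append(sub.msgs, &message{
			ID:   fmt.Sprint(i),
			Data: []byte("task"),
			ack:  func() { mu.Lock(); acked++; mu.Unlock() },
		})
	}
	if err := listen(context.Background(), sub); err != nil {
		log.Fatal(err)
	}
	return acked
}

func main() {
	fmt.Println("acked:", ackedCount(3))
}
```

Because callbacks run concurrently, any state they share (here, the counter) needs its own synchronization.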

Publish status on accepting a task

When the worker picks up a task, it should update the status of the task in relui as started.
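A minimal sketch of this step, assuming the worker talks to relui through a small interface. `StatusReporter`, `memoryCoordinator`, and `accept` are hypothetical names. The in-memory coordinator stands in for relui so the example runs on its own.

```go
package main

import (
	"fmt"
	"sync"
)

// Status is a hypothetical task status.
type Status int

const (
	StatusPending Status = iota
	StatusStarted
)

// StatusReporter is a hypothetical worker-side view of relui's status API.
type StatusReporter interface {
	MarkStarted(taskName string) error
}

// memoryCoordinator is an in-memory stand-in for relui.
type memoryCoordinator struct {
	mu     sync.Mutex
	status map[string]Status
}

func newMemoryCoordinator(taskNames ...string) *memoryCoordinator {
	c := &memoryCoordinator{status: make(map[string]Status)}
	for _, n := range taskNames {
		c.status[n] = StatusPending
	}
	return c
}

// MarkStarted records that a known task has started. Calling it twice for
// the same task is harmless, which suits redelivered messages.
func (c *memoryCoordinator) MarkStarted(taskName string) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	if _, ok := c.status[taskName]; !ok {
		return fmt.Errorf("unknown task %q", taskName)
	}
	c.status[taskName] = StatusStarted
	return nil
}

// StatusOf returns the recorded status of a task.
func (c *memoryCoordinator) StatusOf(taskName string) Status {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.status[taskName]
}

// accept reports the task as started before doing any work on it.
func accept(r StatusReporter, taskName string, work func() error) error {
	if err := r.MarkStarted(taskName); err != nil {
		return fmt.Errorf("reporting start of %q: %w", taskName, err)
	}
	return work()
}

func main() {
	c := newMemoryCoordinator("fetch-source")
	err := accept(c, "fetch-source", func() error {
		fmt.Println("working")
		return nil
	})
	fmt.Println(err, c.StatusOf("fetch-source") == StatusStarted)
}
```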

Run FetchGitSource task

The FetchGitSource task should fetch the specified Git repo at the configured ref. The worker should tar the source, report the artifact URL to relui, and mark the task as complete.

Gitiles provides a +archive URL that serves this purpose:

Web: https://go.googlesource.com/go/+archive/refs/heads/master

Archive: https://go.googlesource.com/go/+archive/refs/heads/master.tar.gz
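A sketch of how the task might build and download such an archive URL, following the pattern of the Archive link above. The `archiveURL` and `fetchArchive` helpers are hypothetical, and reporting the artifact URL back to relui is left out.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"strings"
)

// archiveURL builds a gitiles +archive URL for a repo and ref, following
// the pattern of the Archive link above.
func archiveURL(repo, ref string) string {
	return strings.TrimSuffix(repo, "/") + "/+archive/" + ref + ".tar.gz"
}

// fetchArchive downloads url into a new file at dst and returns the number
// of bytes written.
func fetchArchive(client *http.Client, url, dst string) (int64, error) {
	resp, err := client.Get(url)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return 0, fmt.Errorf("fetching %s: unexpected status %s", url, resp.Status)
	}
	f, err := os.Create(dst)
	if err != nil {
		return 0, err
	}
	n, err := io.Copy(f, resp.Body)
	if cerr := f.Close(); err == nil {
		err = cerr
	}
	return n, err
}

func main() {
	url := archiveURL("https://go.googlesource.com/go", "refs/heads/master")
	fmt.Println(url)
	// The download is left commented out so the example does not touch
	// the network:
	//   n, err := fetchArchive(http.DefaultClient, url, "go-src.tar.gz")
}
```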

Handle non-transient errors on FetchGitSource

If a permanent error occurs when executing a task, the message should be ACK’d to prevent retries, and a terminal status for the task should be reported back to relui.
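One way to sketch this decision is to mark permanent errors with a wrapper type and map each result to an Ack decision plus a status to report. The `Permanent` helper, the `outcome` type, the sample errors, and the status strings are assumptions for illustration. Nacking transient failures so they are redelivered is also an assumption, since the issue only specifies the permanent case.

```go
package main

import (
	"errors"
	"fmt"
)

// permanentError marks an error that retrying will not fix.
type permanentError struct{ err error }

func (e *permanentError) Error() string { return e.err.Error() }
func (e *permanentError) Unwrap() error { return e.err }

// Permanent wraps err so that IsPermanent reports true for it.
func Permanent(err error) error { return &permanentError{err: err} }

// IsPermanent reports whether err, or any error it wraps, is permanent.
func IsPermanent(err error) bool {
	var p *permanentError
	return errors.As(err, &p)
}

// outcome describes what the worker does with the message and what it
// reports back to relui. The status strings are made up for this sketch.
type outcome struct {
	ack    bool   // true: Ack; false: Nack so the message is redelivered
	status string // status to report; empty if nothing is reported yet
}

// decide maps a task result to an outcome.
func decide(err error) outcome {
	switch {
	case err == nil:
		return outcome{ack: true, status: "complete"}
	case IsPermanent(err):
		// Ack to prevent retries, and report a terminal status.
		return outcome{ack: true, status: "failed"}
	default:
		// Assumed transient: leave the message unacked for a retry.
		return outcome{ack: false}
	}
}

// Sample errors used to exercise decide.
var (
	errBadRef  = errors.New("ref does not exist")
	errTimeout = errors.New("connection timed out")
)

func main() {
	fmt.Printf("%+v\n", decide(nil))
	fmt.Printf("%+v\n", decide(Permanent(errBadRef)))
	fmt.Printf("%+v\n", decide(errTimeout))
}
```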

relui

Create Status API (gRPC server)

A worker can communicate the status of a task back to the coordinator as it progresses on a task. The initial API should at least be able to mark a task as started.

Options for hosting the gRPC server:

  1. Host gRPC and HTTPS on the same port (with some caveats)
    1. See the caveats in grpc/grpc-go#2662 (comment): "document what the server's ServeHTTP is missing compared to the Serve method"
  2. Host gRPC on the same service as the HTTPS relui web UI, but on a different port
  3. Use separate instances of relui for gRPC
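For the first option, a common pattern is to route by protocol version and content type in a single `http.Handler`. The sketch below uses only the standard library, with plain handlers standing in for both sides. In a real server the gRPC side would be a `*grpc.Server`, which implements `http.Handler` through its `ServeHTTP` method, subject to the caveats in the linked issue. The `mux` and `route` helpers are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// mux sends HTTP/2 requests with a gRPC content type to grpcHandler and
// everything else to webHandler.
func mux(grpcHandler, webHandler http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.ProtoMajor == 2 && strings.HasPrefix(r.Header.Get("Content-Type"), "application/grpc") {
			grpcHandler.ServeHTTP(w, r)
			return
		}
		webHandler.ServeHTTP(w, r)
	})
}

// route reports which handler serves a request with the given protocol
// major version and content type. It exercises mux without a network.
func route(protoMajor int, contentType string) string {
	h := mux(
		http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { fmt.Fprint(w, "grpc") }),
		http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { fmt.Fprint(w, "web") }),
	)
	req := httptest.NewRequest(http.MethodPost, "/", nil)
	req.ProtoMajor = protoMajor
	req.Header.Set("Content-Type", contentType)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Body.String()
}

func main() {
	fmt.Println(route(2, "application/grpc")) // prints "grpc"
	fmt.Println(route(1, "text/html"))        // prints "web"
}
```

This routing only works when the listener actually negotiates HTTP/2, which with the standard library usually means serving over TLS.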

Datastore integration

Currently, relui commits state to disk. It should instead use a persistent database, such as Datastore, so that multiple instances can share state.

/cc @dmitshur @cagedmantis @andybons

Metadata

Labels: Builders, FrozenDueToAge, NeedsFix

Status: Done