
cmd/go: add GOEXPERIMENT=cacheprog to let a child process implement the internal action/output cache #59719

Closed
@bradfitz


The cmd/go tool has great caching support. Unfortunately, it only supports filesystem-based caching.

I'd like to do things like hook into GitHub's native caching system at a lower level (instead of the inefficient thing people do now: untarring/tarring GOCACHE archives on every CI run, which is often slower than the CI action itself) and support things like a P2P cache gossip protocol between [trusted] coworkers within a company.

Clearly neither of those examples is realistic to add to cmd/go itself. So instead:

I propose that cmd/go support a GOCACHEPROG=/path/to/program environment variable (akin to GOCACHE=/path/to/dir) where the GOCACHEPROG is run as a child process and cmd/go speaks to it over stdin/stdout, translating the Go tool's internal cache interface, and then the GOCACHEPROG can do whatever caching mechanism/policy it wants.

I talked to @rsc about this once and he didn't seem opposed so I went off and implemented it and it's looking like it's going to be pretty awesome. (demo programs)

Thoughts, objections, etc?

(And preemptively: I have a soft spot for FUSE, but FUSE is not an answer; it doesn't work in enough environments, like CI test runner environments, and it's finicky on basically every platform, Linux included.)


The protocol (from the code linked above) is currently:

// ProgCmd is a command that can be issued to a child process.
//
// If the interface needs to grow, we can add new commands or new versioned
// commands like "get2".
type ProgCmd string

const (
	cmdGet   = ProgCmd("get")
	cmdPut   = ProgCmd("put")
	cmdClose = ProgCmd("close")
)

// ProgRequest is the JSON-encoded message that's sent from cmd/go to
// the GOCACHEPROG child process over stdin. Each JSON object is on its
// own line. A ProgRequest with Command "put" and BodySize > 0 will be followed
// by a line containing a base64-encoded JSON string literal of the body.
type ProgRequest struct {
	// ID is a unique number per process across all requests.
	// It must be echoed in the ProgResponse from the child.
	ID int64

	// Command is the type of request.
	// The cmd/go tool will only send commands that were declared
	// as supported by the child.
	Command ProgCmd

	// ActionID is non-nil for gets and puts.
	ActionID []byte `json:",omitempty"` // or nil if not used

	// ObjectID is set for Command "put" and "output-file".
	ObjectID []byte `json:",omitempty"` // or nil if not used

	// Body is the body for "put" requests. It's sent after the JSON object
	// as a base64-encoded JSON string when BodySize is non-zero.
	// It's sent as a separate JSON value instead of being a struct field
	// sent in this JSON object so large values can be streamed in both directions.
	// The base64 string body of a ProgRequest will always be written
	// immediately after the JSON object and a newline.
	Body io.Reader `json:"-"`

	// BodySize is the number of bytes of Body. If zero, the body isn't written.
	BodySize int64 `json:",omitempty"`
}

// ProgResponse is the JSON response from the child process to cmd/go.
//
// With the exception of the first protocol message that the child writes to its
// stdout with ID==0 and KnownCommands populated, these are only sent in
// response to a ProgRequest from cmd/go.
//
// ProgResponses can be sent in any order. The ID must match the request they're
// replying to.
type ProgResponse struct {
	ID  int64  // that corresponds to ProgRequest; they can be answered out of order
	Err string `json:",omitempty"` // if non-empty, the error

	// KnownCommands is included in the first message that the cache helper
	// program writes to stdout on startup (with ID==0). It includes the
	// ProgRequest.Command types that are supported by the program.
	//
	// This lets us extend the protocol gracefully over time (adding "get2",
	// etc), or fail gracefully when needed. It also lets us verify the program
	// wants to be a cache helper.
	KnownCommands []ProgCmd `json:",omitempty"`

	// For Get requests.

	Miss      bool   `json:",omitempty"` // cache miss
	OutputID  []byte `json:",omitempty"`
	Size      int64  `json:",omitempty"`
	TimeNanos int64  `json:",omitempty"` // TODO(bradfitz): document

	// DiskPath is the absolute path on disk of the ObjectID corresponding
	// to a "get" request's ActionID (on cache hit) or a "put" request's
	// provided ObjectID.
	DiskPath string `json:",omitempty"`
}
