fixup! Add support for transferring compressed blobs via ByteStream
restructure
mostynb committed Sep 4, 2020
1 parent d342599 commit ed21a18
Showing 1 changed file with 66 additions and 53 deletions.
119 changes: 66 additions & 53 deletions build/bazel/remote/execution/v2/remote_execution.proto
@@ -193,46 +193,51 @@ service ActionCache {
//
// For small file uploads the client should group them together and call
// [BatchUpdateBlobs][build.bazel.remote.execution.v2.ContentAddressableStorage.BatchUpdateBlobs].
//
// For large uploads, the client must use the
// [Write method][google.bytestream.ByteStream.Write] of the ByteStream API.
// [Write method][google.bytestream.ByteStream.Write] of the ByteStream API.
//
// For uncompressed data, the `WriteRequest.resource_name` is of the following form:
// `{instance_name}/uploads/{uuid}/blobs/{hash}/{size}{/optional_metadata}`
//
// Where:
// * `instance_name` is an identifier, possibly containing multiple path
// segments, used to distinguish between the various instances on the server,
// in a manner defined by the server. If it is the empty path, the leading
// slash is omitted, so that the `resource_name` becomes
// `uploads/{uuid}/blobs/{hash}/{size}{/optional_metadata}`.
// To simplify parsing, a path segment cannot equal any of the following
// keywords: `blobs`, `uploads`, `actions`, `actionResults`, `operations`,
// `capabilities` or `compressed-blobs`.
// * `uuid` is a version 4 UUID generated by the client, used to avoid
// collisions between concurrent uploads of the same data. Clients MAY
// reuse the same `uuid` for uploading different blobs.
// * `hash` and `size` refer to the [Digest][build.bazel.remote.execution.v2.Digest]
// of the data being uploaded.
// * `optional_metadata` is implementation specific data, which clients MAY omit.
// Servers MAY ignore this metadata.
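As a concrete illustration, the uncompressed upload form above can be assembled as follows. This is a minimal Python sketch, not part of the API: `upload_resource_name` is a hypothetical helper, and SHA-256 is assumed as the digest function (the actual digest function is negotiated via server capabilities).

```python
import hashlib
import uuid

def upload_resource_name(instance_name, data, optional_metadata=""):
    # `hash` and `size` form the Digest of the blob being uploaded.
    digest_hash = hashlib.sha256(data).hexdigest()
    size = len(data)
    # A version 4 UUID avoids collisions between concurrent uploads of the
    # same data; a client MAY reuse the same uuid for different blobs.
    segments = ["uploads", str(uuid.uuid4()), "blobs", digest_hash, str(size)]
    # When the instance name is the empty path, the leading slash is omitted.
    if instance_name:
        segments.insert(0, instance_name)
    # Trailing metadata (e.g. a filename) MAY be appended; servers MAY ignore it.
    if optional_metadata:
        segments.append(optional_metadata)
    return "/".join(segments)

print(upload_resource_name("main", b"hello", "foo/bar/baz.cc"))
```

The uuid component makes each call produce a distinct name, so the printed value varies from run to run while the digest segments stay fixed.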
//
// Data can alternatively be uploaded in compressed form, with the following
// `WriteRequest.resource_name` form:
// `{instance_name}/uploads/{uuid}/compressed-blobs/{compressor}/{uncompressed_hash}/{uncompressed_size}{/optional_metadata}`
//
// Where:
// * `instance_name`, `uuid` and `optional_metadata` are defined as above.
// * `compressor` is a lowercase string form of a `Compressor.Value` enum
// other than `identity`, which is supported by the server and advertised in
// [CacheCapabilities.supported_compressor][build.bazel.remote.execution.v2.CacheCapabilities.supported_compressor].
// * `uncompressed_hash` and `uncompressed_size` refer to the
// [Digest][build.bazel.remote.execution.v2.Digest] of the data being
// uploaded, once uncompressed. Servers MUST verify that these match
// the uploaded data once uncompressed, and MUST return an
// `INVALID_ARGUMENT` error in the case of mismatch.
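The compressed upload form differs only in the `compressed-blobs/{compressor}` segments, and in that the digest refers to the data before compression. A sketch under the same assumptions as above (`compressed_upload_resource_name` is hypothetical; `"zstd"` stands in for a compressor the server actually advertises):

```python
import hashlib

def compressed_upload_resource_name(instance_name, upload_id, compressor,
                                    data, optional_metadata=""):
    # The digest always describes the *uncompressed* data, even though the
    # bytes written over the wire are compressed.
    uncompressed_hash = hashlib.sha256(data).hexdigest()
    uncompressed_size = len(data)
    segments = [instance_name] if instance_name else []
    segments += ["uploads", upload_id, "compressed-blobs", compressor,
                 uncompressed_hash, str(uncompressed_size)]
    if optional_metadata:
        segments.append(optional_metadata)
    return "/".join(segments)
```

For example, `compressed_upload_resource_name("main", "my-uuid", "zstd", b"abc")` yields `main/uploads/my-uuid/compressed-blobs/zstd/{sha256 of "abc"}/3`.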
//
// Uploads of the same data MAY occur concurrently in any form, compressed or
// uncompressed.
//
// Clients SHOULD NOT use gRPC-level compression for ByteStream API `Write`
// calls of compressed blobs, since this would compress already-compressed data.
//
// When attempting an upload, if another client has already completed the upload
// (which may occur in the middle of a single upload if another client uploads
// the same blob concurrently), the request will terminate immediately with
@@ -243,28 +248,36 @@ service ActionCache {
// `INVALID_ARGUMENT` error will be returned. In either case, the client should
// not attempt to retry the upload.
//
// Small downloads can be grouped and requested in a batch via
// [BatchReadBlobs][build.bazel.remote.execution.v2.ContentAddressableStorage.BatchReadBlobs].
//
// For large downloads, the client must use the
// [Read method][google.bytestream.ByteStream.Read] of the ByteStream API.
//
// For uncompressed data, the `ReadRequest.resource_name` is of the following form:
// `{instance_name}/blobs/{hash}/{size}`
// Where `instance_name`, `hash` and `size` are defined as for uploads.
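The read form can be sketched the same way as the upload forms, a hypothetical helper assuming a hex digest string:

```python
def read_resource_name(instance_name, hash_hex, size):
    # `{instance_name}/blobs/{hash}/{size}`; the leading slash is omitted
    # when the instance name is the empty path.
    segments = ["blobs", hash_hex, str(size)]
    if instance_name:
        segments.insert(0, instance_name)
    return "/".join(segments)

print(read_resource_name("main", "ab" * 32, 11))
```

Unlike the upload forms, no uuid is needed: reads of the same blob are naturally idempotent.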
//
// Data can alternatively be downloaded in compressed form, with the following
// `ReadRequest.resource_name` form:
// `{instance_name}/compressed-blobs/{compressor}/{uncompressed_hash}/{uncompressed_size}`
//
// Where:
// * `instance_name` and `compressor` are defined as for uploads.
// * `uncompressed_hash` and `uncompressed_size` refer to the
// [Digest][build.bazel.remote.execution.v2.Digest] of the data being
// downloaded, once uncompressed. Clients MUST verify that these match
// the downloaded data once uncompressed, and take appropriate steps in
// the case of failure such as retrying a limited number of times or
// surfacing an error to the user.
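The client-side verification step above amounts to: decompress, then compare size and digest against the values from the `resource_name`. A minimal sketch, with `verify_compressed_download` a hypothetical helper, SHA-256 assumed as the digest function, and zlib standing in for whichever advertised compressor the server actually used:

```python
import hashlib
import zlib

def verify_compressed_download(compressed_data, uncompressed_hash, uncompressed_size):
    # The digest in the resource name describes the decompressed bytes,
    # so decompress before checking anything.
    data = zlib.decompress(compressed_data)
    if len(data) != uncompressed_size:
        return False
    return hashlib.sha256(data).hexdigest() == uncompressed_hash

payload = b"some blob contents"
blob = zlib.compress(payload)
ok = verify_compressed_download(blob, hashlib.sha256(payload).hexdigest(), len(payload))
print(ok)  # → True
```

On a mismatch the client would then retry a limited number of times or surface an error, as described above.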
//
// Servers MAY use any compression level they choose, including different
// levels for different blobs (e.g. choosing a level designed for maximum
// speed for data known to be incompressible).
//
// Servers MUST be able to provide data for all recently advertised blobs in
// each of the compression formats that the server supports, as well as in
// uncompressed form.
//
// Clients SHOULD NOT use gRPC-level compression on ByteStream API `Read`
// requests for compressed blobs, since this would compress already-compressed
// data.
