fixup! Add support for transferring compressed blobs via ByteStream
restructure
mostynb committed Sep 4, 2020
1 parent d342599 commit ed21a18
Showing 1 changed file with 66 additions and 53 deletions.
119 changes: 66 additions & 53 deletions build/bazel/remote/execution/v2/remote_execution.proto
@@ -193,46 +193,51 @@ service ActionCache {
//
// For small file uploads the client should group them together and call
// [BatchUpdateBlobs][build.bazel.remote.execution.v2.ContentAddressableStorage.BatchUpdateBlobs].
//
// For large uploads, the client must use the
// [Write method][google.bytestream.ByteStream.Write] of the ByteStream API.
// [Write method][google.bytestream.ByteStream.Write] of the ByteStream API.
//
// For uncompressed data, the `WriteRequest.resource_name` is of the following form:
// `{instance_name}/uploads/{uuid}/blobs/{hash}/{size}{/optional_metadata}`
//
// Where:
// * `instance_name` is an identifier, possibly containing multiple path
// segments, used to distinguish between the various instances on the server,
// in a manner defined by the server. If it is the empty path, the leading
// slash is omitted, so that the `resource_name` becomes
// `uploads/{uuid}/blobs/{hash}/{size}{/optional_metadata}`.
// To simplify parsing, a path segment cannot equal any of the following
// keywords: `blobs`, `uploads`, `actions`, `actionResults`, `operations`,
// `capabilities` or `compressed-blobs`.
// * `uuid` is a version 4 UUID generated by the client, used to avoid
// collisions between concurrent uploads of the same data. Clients MAY
// reuse the same `uuid` for uploading different blobs.
// * `hash` and `size` refer to the [Digest][build.bazel.remote.execution.v2.Digest]
// of the data being uploaded.
// * `optional_metadata` is implementation specific data, which clients MAY omit.
// Servers MAY ignore this metadata.
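As a concrete illustration, the uncompressed upload form above can be assembled as follows. This is a minimal Python sketch, not part of the API: `upload_resource_name` is a hypothetical helper, and SHA-256 is assumed as the digest function (the actual digest function is negotiated via server capabilities).

```python
import hashlib
import uuid

def upload_resource_name(instance_name, data, optional_metadata=""):
    # `hash` and `size` form the Digest of the blob being uploaded.
    digest_hash = hashlib.sha256(data).hexdigest()
    size = len(data)
    # A version 4 UUID avoids collisions between concurrent uploads of the
    # same data; a client MAY reuse the same uuid for different blobs.
    segments = ["uploads", str(uuid.uuid4()), "blobs", digest_hash, str(size)]
    # When the instance name is the empty path, the leading slash is omitted.
    if instance_name:
        segments.insert(0, instance_name)
    # Trailing metadata (e.g. a filename) MAY be appended; servers MAY ignore it.
    if optional_metadata:
        segments.append(optional_metadata)
    return "/".join(segments)

print(upload_resource_name("main", b"hello", "foo/bar/baz.cc"))
```

The uuid component makes each call produce a distinct name, so the printed value varies from run to run while the digest segments stay fixed.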
//
// Data can alternatively be uploaded in compressed form, with the following
// `WriteRequest.resource_name` form:
// `{instance_name}/uploads/{uuid}/compressed-blobs/{compressor}/{uncompressed_hash}/{uncompressed_size}{/optional_metadata}`
//
// Where:
// * `instance_name`, `uuid` and `optional_metadata` are defined as above.
// * `compressor` is a lowercase string form of a `Compressor.Value` enum
// other than `identity`, which is supported by the server and advertised in
// [CacheCapabilities.supported_compressor][build.bazel.remote.execution.v2.CacheCapabilities.supported_compressor].
// * `uncompressed_hash` and `uncompressed_size` refer to the
// [Digest][build.bazel.remote.execution.v2.Digest] of the data being
// uploaded, once uncompressed. Servers MUST verify that these match
// the uploaded data once uncompressed, and MUST return an
// `INVALID_ARGUMENT` error in the case of mismatch.
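The compressed upload form differs only in the `compressed-blobs/{compressor}` segments, and in that the digest refers to the data before compression. A sketch under the same assumptions as above (`compressed_upload_resource_name` is hypothetical; `"zstd"` stands in for a compressor the server actually advertises):

```python
import hashlib

def compressed_upload_resource_name(instance_name, upload_id, compressor,
                                    data, optional_metadata=""):
    # The digest always describes the *uncompressed* data, even though the
    # bytes written over the wire are compressed.
    uncompressed_hash = hashlib.sha256(data).hexdigest()
    uncompressed_size = len(data)
    segments = [instance_name] if instance_name else []
    segments += ["uploads", upload_id, "compressed-blobs", compressor,
                 uncompressed_hash, str(uncompressed_size)]
    if optional_metadata:
        segments.append(optional_metadata)
    return "/".join(segments)
```

For example, `compressed_upload_resource_name("main", "my-uuid", "zstd", b"abc")` yields `main/uploads/my-uuid/compressed-blobs/zstd/{sha256 of "abc"}/3`.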
//
// Uploads of the same data MAY occur concurrently in any form, compressed or
// uncompressed.
//
// Clients SHOULD NOT use gRPC-level compression for ByteStream API `Write`
// calls of compressed blobs, since this would compress already-compressed data.
//
// When attempting an upload, if another client has already completed the upload
// (which may occur in the middle of a single upload if another client uploads
// the same blob concurrently), the request will terminate immediately with
@@ -243,28 +248,36 @@ service ActionCache {
// `INVALID_ARGUMENT` error will be returned. In either case, the client should
// not attempt to retry the upload.
//
// Small downloads can be grouped and requested in a batch via
// [BatchReadBlobs][build.bazel.remote.execution.v2.ContentAddressableStorage.BatchReadBlobs].
//
// For large downloads, the client must use the
// [Read method][google.bytestream.ByteStream.Read] of the ByteStream API.
//
// For uncompressed data, the `ReadRequest.resource_name` is of the following form:
// `{instance_name}/blobs/{hash}/{size}`
// Where `instance_name`, `hash` and `size` are defined as for uploads.
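The read form can be sketched the same way as the upload forms, a hypothetical helper assuming a hex digest string:

```python
def read_resource_name(instance_name, hash_hex, size):
    # `{instance_name}/blobs/{hash}/{size}`; the leading slash is omitted
    # when the instance name is the empty path.
    segments = ["blobs", hash_hex, str(size)]
    if instance_name:
        segments.insert(0, instance_name)
    return "/".join(segments)

print(read_resource_name("main", "ab" * 32, 11))
```

Unlike the upload forms, no uuid is needed: reads of the same blob are naturally idempotent.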
//
// Data can alternatively be downloaded in compressed form, with the following
// `ReadRequest.resource_name` form:
// `{instance_name}/compressed-blobs/{compressor}/{uncompressed_hash}/{uncompressed_size}`
//
// Where:
// * `instance_name` and `compressor` are defined as for uploads.
// * `uncompressed_hash` and `uncompressed_size` refer to the
// [Digest][build.bazel.remote.execution.v2.Digest] of the data being
// downloaded, once uncompressed. Clients MUST verify that these match
// the downloaded data once uncompressed, and take appropriate steps in
// the case of failure such as retrying a limited number of times or
// surfacing an error to the user.
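The client-side verification step above amounts to: decompress, then compare size and digest against the values from the `resource_name`. A minimal sketch, with `verify_compressed_download` a hypothetical helper, SHA-256 assumed as the digest function, and zlib standing in for whichever advertised compressor the server actually used:

```python
import hashlib
import zlib

def verify_compressed_download(compressed_data, uncompressed_hash, uncompressed_size):
    # The digest in the resource name describes the decompressed bytes,
    # so decompress before checking anything.
    data = zlib.decompress(compressed_data)
    if len(data) != uncompressed_size:
        return False
    return hashlib.sha256(data).hexdigest() == uncompressed_hash

payload = b"some blob contents"
blob = zlib.compress(payload)
ok = verify_compressed_download(blob, hashlib.sha256(payload).hexdigest(), len(payload))
print(ok)  # → True
```

On a mismatch the client would then retry a limited number of times or surface an error, as described above.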
//
// Servers MAY use any compression level they choose, including different
// levels for different blobs (e.g. choosing a level designed for maximum
// speed for data known to be incompressible).
//
// Servers MUST be able to provide data for all recently advertised blobs in
// each of the compression formats that the server supports, as well as in
// uncompressed form.
//
// Clients SHOULD NOT use gRPC-level compression on ByteStream API `Read`
// requests for compressed blobs, since this would compress already-compressed
// data.
