Implementation (but not plumbing) of the gRPC remote downloader #10914
When Bazel downloads an external file (via `ctx.download()` or similar), it supports the concept of a "canonical ID". This ID is used to disambiguate download requests when the content checksum is unknown and the URL doesn't change between fetched resources.

Links:
* Design: https://github.com/bazelbuild/proposals/blob/master/designs/2019-04-29-cache.md
* Implementation PR: bazelbuild#5144
* API doc: https://docs.bazel.build/versions/master/skylark/lib/repository_ctx.html#download

This field was properly plumbed into the `Downloader` interface when it was added by PR bazelbuild#10547, but an ad-hoc change during import caused it to get lost. We need to put it back, or remote downloaders won't be able to do correct cache lookups for these resources.
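To illustrate why the canonical ID matters for cache lookups, here is a minimal sketch. The `DownloadKey` class and its fields are hypothetical and are not Bazel's actual cache-key scheme; the sketch only shows that two requests for the same URL with no known checksum stay distinguishable when the canonical ID is part of the key.

```java
import java.util.Objects;
import java.util.Optional;

/** Hypothetical lookup key for a download request; not Bazel's real implementation. */
final class DownloadKey {
  private final String url;
  private final Optional<String> checksum; // empty when the checksum is unknown
  private final String canonicalId;        // "" when the caller supplies none

  DownloadKey(String url, Optional<String> checksum, String canonicalId) {
    this.url = Objects.requireNonNull(url);
    this.checksum = Objects.requireNonNull(checksum);
    this.canonicalId = Objects.requireNonNull(canonicalId);
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (!(o instanceof DownloadKey)) {
      return false;
    }
    DownloadKey other = (DownloadKey) o;
    return url.equals(other.url)
        && checksum.equals(other.checksum)
        && canonicalId.equals(other.canonicalId);
  }

  @Override
  public int hashCode() {
    return Objects.hash(url, checksum, canonicalId);
  }
}
```

If the canonical ID were dropped before the request reached the downloader, the two keys in this sketch would collapse into one, which is the incorrect lookup the commit message warns about.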
These fields are not registered as options yet; they exist only so that code depending on them can be merged before all the RemoteModule plumbing is worked out.
All in all this looks good, a few things I'd like clarification on.
 * data will be written to the underlying stream regardless of whether it matches
 * the expected checksum.
 *
 * <p>This class is not thread safe, but it is safe to message pass this object
Is there a meaning to "message pass", or is this just mis-pasted?
It's copied from HashInputStream.java. I don't know what "message pass" means in the context of Java.
Ah, I didn't realize the source. Thanks for clarifying.
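The behavior described in the Javadoc excerpt above (bytes reach the underlying stream whether or not the checksum ends up matching) can be sketched as follows. This is not the class from the PR: it uses the JDK's `MessageDigest` instead of Guava's `Hasher`, the name `VerifyingOutputStream` is hypothetical, and it assumes the comparison happens in `close()`.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical checksum-verifying stream; a sketch, not the PR's class. */
final class VerifyingOutputStream extends OutputStream {
  private final OutputStream delegate;
  private final MessageDigest digest = newSha256();
  private final byte[] expected;
  private byte[] actual; // computed once, on the first close()

  VerifyingOutputStream(OutputStream delegate, byte[] expectedSha256) {
    this.delegate = delegate;
    this.expected = expectedSha256.clone();
  }

  static MessageDigest newSha256() {
    try {
      return MessageDigest.getInstance("SHA-256");
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e); // SHA-256 is required on every JDK
    }
  }

  static byte[] sha256(byte[] data) {
    return newSha256().digest(data);
  }

  @Override
  public void write(int b) throws IOException {
    digest.update((byte) b);
    delegate.write(b); // forwarded before any verification is possible
  }

  @Override
  public void write(byte[] buf, int off, int len) throws IOException {
    digest.update(buf, off, len);
    delegate.write(buf, off, len);
  }

  @Override
  public void close() throws IOException {
    delegate.close();
    if (actual == null) {
      actual = digest.digest(); // digest() resets state, so cache the result
    }
    if (!MessageDigest.isEqual(actual, expected)) {
      throw new IOException("Checksum mismatch");
    }
  }
}
```

Because the digest is only complete once all bytes have been written, a mismatch can be reported no earlier than `close()`; callers that need atomicity would have to write to a temporary location and discard it on failure.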
private final OutputStream delegate;
private final Hasher hasher;
private final HashCode code;
It looks like "code" is the passed-in code this checks against, and "actual" is the computed checksum to compare with? Add some comments to clarify this (and to explain why this needs to be "volatile", because I'm not sure why).
These are also copied from HashInputStream.java. I have no context on why these names were chosen, or what the purpose of volatile is for the Hash*Stream classes.
To clarify: I'm happy to change these to be something else, but I don't have the ability to explain why existing Bazel code is the way it is.
Fair point. Let's leave as is, then.
if (checksum.isPresent()) {
  requestBuilder.addQualifiers(
      Qualifier.newBuilder()
          .setName("checksum.sri")
Should "checksum.sri" be a constant somewhere? Do we have remote-exec specific constants anywhere?
Changed to a constant. I can't find any common shared constants file, so I just put it at the top of this one.
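For context on what a `checksum.sri` value looks like, here is a minimal sketch assuming the qualifier carries a Subresource Integrity string of the form `<algorithm>-<base64 digest>`. The class name `SriQualifier`, the constant name, and the helper method are hypothetical; only the string `"checksum.sri"` comes from the diff.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

/** Hypothetical helper; only the qualifier name is taken from the diff. */
final class SriQualifier {
  // Hoisted to a constant, as the review suggested (the name is an assumption).
  static final String QUALIFIER_CHECKSUM_SRI = "checksum.sri";

  private SriQualifier() {}

  /** Formats the SHA-256 of {@code content} as an SRI string. */
  static String sha256Sri(byte[] content) {
    try {
      byte[] digest = MessageDigest.getInstance("SHA-256").digest(content);
      // SRI uses standard (not URL-safe) base64 with padding.
      return "sha256-" + Base64.getEncoder().encodeToString(digest);
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e); // SHA-256 is required on every JDK
    }
  }
}
```

In the real code the digest would come from the checksum Bazel already knows for the download rather than from hashing content on the spot.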
  return;
}
cacheClient.close();
channel.release();
This is handling the release, but where is the corresponding retain call?
It gets retained by the caller of new GrpcRemoteDownloader. See the implementation and lifecycle of GrpcCacheClient.java for the code I copied from.
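The retain/release split discussed here can be sketched as follows. Both classes are hypothetical stand-ins, not Bazel's `ReferenceCountedChannel` or `GrpcRemoteDownloader`, and the idempotent `close()` guard is an assumption about what precedes the `return;` in the diff. The sketch shows the caller retaining the channel before handing it over, and the downloader releasing exactly once when closed.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical reference-counted resource; starts with one reference held by its creator. */
final class RefCountedResource {
  private final AtomicInteger refs = new AtomicInteger(1);
  private final Runnable onFree;

  RefCountedResource(Runnable onFree) {
    this.onFree = onFree;
  }

  /** Adds a reference; fails if the resource was already freed. */
  RefCountedResource retain() {
    while (true) {
      int current = refs.get();
      if (current <= 0) {
        throw new IllegalStateException("already freed");
      }
      if (refs.compareAndSet(current, current + 1)) {
        return this;
      }
    }
  }

  /** Drops a reference; returns true if this call freed the resource. */
  boolean release() {
    int remaining = refs.decrementAndGet();
    if (remaining < 0) {
      throw new IllegalStateException("released more often than retained");
    }
    if (remaining == 0) {
      onFree.run();
      return true;
    }
    return false;
  }
}

/** Hypothetical consumer that releases its reference exactly once. */
final class SketchDownloader implements AutoCloseable {
  private final RefCountedResource channel;
  private final AtomicBoolean closed = new AtomicBoolean();

  /** The caller is expected to have called retain() on {@code channel} already. */
  SketchDownloader(RefCountedResource channel) {
    this.channel = channel;
  }

  @Override
  public void close() {
    if (closed.getAndSet(true)) {
      return; // second and later calls are no-ops
    }
    channel.release();
  }
}
```

A caller would write `new SketchDownloader(channel.retain())`, keeping its own reference until it is done with the channel; the underlying resource is freed only after both sides have released.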
Thanks for the explanations. I'll get this imported today.
Turns out I also need to sync open source https://source.bazel.build/bazel/+/master:third_party/remoteapis/ with the internal version. This may slow things down, my apologies.
Extracted from #10622

Per discussion on that PR, there are still some unanswered questions about how exactly we plumb the new `Downloader` type into `RemoteModule`. And per #10742 (comment), it is unlikely that even heroic effort from me will get the full end-to-end functionality into v3.0.

Given this, to simplify the review, I'm taking some of the bits the reviewer is happy with and moving them to a separate PR. After merger, `GrpcRemoteDownloader` and its tests will exist in the source tree, but will not yet be available as CLI options.

R: @michajlo
CC: @adunham-stripe @dslomov @EricBurnett @philwo @sstriker

Closes #10914.

PiperOrigin-RevId: 299908615
@jmillikin-stripe With apologies for the thread necromancy: I am trying to understand the rationale for including the authorization headers in the qualifiers for the

My understanding of the spec [2][3] is that qualifiers serve to disambiguate resources residing at the same URI. However, it seems to me that authorization should only control access to a resource and would (should?) not affect its contents.

Do you recall why we decided to do this? Would you object to an incompatible Bazel change to drop authorization headers from the qualifier?

[1] https://cs.opensource.google/bazel/bazel/+/master:src/main/java/com/google/devtools/build/lib/remote/downloader/GrpcRemoteDownloader.java;l=199;drc=1ccc0a378a65ad6d7a9b6f1117871fdeda5c26e8