-
Notifications
You must be signed in to change notification settings - Fork 119
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
REv3 idea: CAS.ExpandTree()? #140
Comments
That seems to be predicated on the assumption that Bazel is not going to allow addressing individual files in the future. That assumption does not seem correct to me. I'm not sure if there are features in Bazel around this right now, but it seems like a perfectly safe bet that there will be, even if Bazel team isn't currently planning to add them. The reason for this is simple: adding unnecessary input files to an action is bad for build performance (reduces caching & increases action costs). This is particularly true the larger the output directory tree is, i.e., the more you're trying to save network bandwidth with this feature, the worse performance gets. Therefore, I would bet on Bazel adding features to allow addressing individual files or filtering files or something else along these lines that requires Bazel to have a local list of files. |
No, that's not correct. It is predicated around the fact that in many cases directories declared using Even in the case where a Tree needs to be carved up, it might still be preferable to call |
My position is that rules will (should?) ~always use a subset of larger trees in order to avoid the performance costs. Maybe that's overly optimistic, but even so, if that's the recommended best practice, why spend significant effort on optimizing the other case? If the tree needs to be carved up, why would you have |
So you're proposing that GetActionResult() is made slightly stronger, in that it only returns results if all directories stored in a Tree referenced by the ActionResult are also stored in the CAS as separate Directory objects? That sounds reasonable. |
I've thought about suggesting something along these lines for a v3 proposal before, or at least sketching out the general problem of the asymmetry between the messages describing outputs & inputs and the work required to convert one to the other when another action wants to consume them. So generally in favour, but would also question whether there is something we could do to make the two more similar (although that might be too big a change even for v3?). |
I assumed that this was already required, but looking at the GetActionResult() comment again now it is slightly ambiguous. |
For a bit of background, the Tree handling in the API is a known weak spot
- we tried to clean it up once - get rid of GetTree and move it into
top-level protos so the CAS API had no knowledge of trees - but that
removal was rolled back and we stayed in an awkward partial state. I'd
definitely be open to improving it.
For merkle trees as inputs, the general properties we care about are:
- Recursively defined, so that sharing trees in inputs doesn't require
operating on a whole tree each time
- Parallelly uploadable, so that it doesn't add unnecessary round-trips
on the order of the depth of the tree.
Of course, this runs into problems for output trees - you would *not* want
to expand a recursively-defined merkle tree by making an RPC for each level
one-by-one - that'd be terribly performing.
So we started by adding GetTree to do a server-side expansion, then
realized that for most use-cases we don't actually need the CAS to have
special knowledge of trees at all. Interpretation of protos can be left in
the Execution API, with the CAS operating on blobs only. Which led to the
inclusion of OutputDirectory as a "flattened tree" - a separately
serialized proto so that we don't have to worry about a whole tree fitting
into one fixed-size RPC response, but still flattened under the assumption
that clients will have to interact with the whole tree at some point.
But I see your point here - even with all directory nodes in the CAS
individually (which is my interpretation of the comments as well, but
slightly ambiguous as you say), you still have to download the serialized
tree just to get to the root directory node, if that's all you want.
To that end, we might instead consider updating OutputDirectory (*Edit*: mistakenly said DirectoryNode) to have both
the digest of the tree (for access to the flattened form) *and* the digest
of the directory node itself (for reuse as an input)? Then it'd be clear
that if the server gives you a directory root digest back, all nodes must
be in the CAS too by definition, and you can reuse it as-is. This could be
optional in V2; upgraded to Required in V3?
(I'd also love to get rid of GetTree entirely; can't remember offhand why
it was added back though and if it still has a purpose).
… —
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#140 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AABREW5PJDMMKBI6QWGBTZLRUPF5HANCNFSM4NPPEWOQ>
.
[image: image.gif]
|
👍 That sounds like a good idea.
👍 to that as well. I can imagine that something like GetTree() makes sense to have on a worker where you might want to gather a full input root. Still, there's no need to let that be part of the build client -> cluster protocol. |
Could it make sense to unify This would provide consistency across all operations and we should no longer need A possible disadvantage is that the flexibility would no longer provide a single canonical representation of a directory hierarchy. However, I'm not sure whether this would be a big issue in practice as long as individual clients use deterministic rules to decide embedding of subdirectories. This would also address #165 and #170. Any thoughts? Are there significant disadvantages I'm overlooking? |
I like it - the canonical digest of a |
The downside of such an approach is that there will no longer be a unique representation of a directory hierarchy. You can construct multiple identical directory layouts, but encode them differently:
It sacrifices the idea that a node in the Merkle tree of given contents has a unique digest. Couldn't this lead to lower AC hit rates if not all of the tools are in full agreement when to (un)pack directories in input roots? Another issue with this approach is that the decision when to (un)pack is made unilaterally. The worker decides whether everything needs to be stored separately or together, whereas the client may have a different opinion on what would have been smart, looking at the build overall. For example, when Builds without the Bytes is turned off, it may be more preferable to receive a Tree-like object. When Builds without the Bytes is turned on, a plain Directory would have been sufficed (and be preferable).
I would like us to go in a different direction here. If we ever want to support encryption (#133), it already makes sense to decompose CAS objects into two parts:
In such a model it would make a lot of sense to always store objects separate; never use something like Tree. The GetTree() call could be replaced by a generic GetTransitiveClosure() call to expand a full tree of CAS objects. |
I think this is fine as long as the digest algorithm is well specified. If there's a canonical form for encoding them for digest, and effectively we're just hiding hint information in a separate un-digested field, these worries go away. (Of course, some new worries around correctness of implementation now appear!) |
Let's assume there is a canonical form for digest. Now let's assume a worker gives us an output directory of which we don't know whether it is in canonical form. Wouldn't that require the client to download the full output directory hierarchy, so that it may be canonicalised by the client? The entire goal of this PR was to introduce a mechanism where clients do not need to download full directory output hierarchies. |
I'd like to break this anyways, personally - see e.g. #141 (comment) . Fundamentally, what I've realized we actually need is approximately "For clients producing overlapping directory trees, they can easily get high cache hit rates" . But that's more about clients being internally-deterministic and well behaved - I don't think we actually see ~any cross-tool caching, and I don't think it actually matters. So we should optimize for e.g. allowing Bazel to produce trees in a way efficient for it to compose, and reasonably compact for expanders. And we should allow BuildBarn to do the same. But we don't really benefit from those trees being the same as far as I can tell, and I don't think most consumers of trees care if they're in canonical form or not? At most, they might care that the same tree from the same source is deterministic, to e.g. easily say "this output tree has not changed". But that's a lot more limited than what we enforce now, and requiring a fully-expanded non-overlapping tree seems to be holding us back in a lot of contexts. To Daniel's point, there is still a canonical form that can optionally be used where needed. But I expect that's actually the exception and not the norm. |
I'm generally in favor of a more lenient approach and also in favor of unifying the two tree representations. I wonder how difficult it would be for a RE system to detect cases where a client sends the same tree in different representations. Such a mechanism could be used to detect misbehaving clients. |
Servers could canonicalize a sampling of trees, and report on rates of collisions? (E.g. X client produced tree Y which canonicalized to Z; different from previously seen tree Y' from client X that also canonicalized to Z). Might be a bit hard to expose in an actionable way, but shouldn't be too bad I don't think as the server could reasonably track the invocation ID / action ID from this sampling and be able to point directly back to examples where it was concretely observed. But one thing we'd have to learn observationally is what a "reasonable" rate of collisions is. |
Build systems like Bazel currently treat directory outputs as being opaque. There is no support for addressing individual files contained within.
If Bazel wants to pass on a directory output from one action to another, it currently needs to download a Tree object from the CAS and expand it into individual Directory objects. These then need to be reuploaded into the CAS.
This roundtrip could be avoided by adding a dedicated RPC of the shape:
In this case, the argument corresponds to the digest of the Tree object. The return value corresponds to the digest of the root directory contained within the Tree object. The call should also bump the TTL on all resulting CAS objects.
Do we consider this to be enough of a problem that it makes sense to add such an RPC?
The text was updated successfully, but these errors were encountered: