-
Notifications
You must be signed in to change notification settings - Fork 119
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Asset API: Use of Directory instead of Tree for PushDirectory #165
Comments
Thanks for engaging @Qinusty.
The Remote Asset API Push/FetchDirectory calls are deliberately operating on directory digests. The reasoning here is that the API is very likely used for inputs (read: source trees). Source trees tend to leave large portions of a tree untouched when observed through time, which could mean quite a bit of duplication if all were stored as a Tree, rather than a hierarchy of Directories. This is in line with the input_root_digest in Action.
Implementations can choose to optimize the verification path under the hood. Going from directory to tree is but a CAS.GetTree() call away, and an implementation may choose to cache that.
Clients need to be able to deal with cases where blobs disappear from CAS. I would say that not doing so leads to a poor user experience. That said, Servers SHOULD ensure that referenced files are present in the CAS at the time of the response. For what it's worth, I would welcome a suggestion to inline the Blob or Tree in the FetchBlob / FetchDirectory responses based on a to-be-defined request flag. The downside to that is that it would require a higher complexity in service implementations, rather than that being optional. I think that #144 and #159 are tangentially related to this issue. |
How is that different for directory outputs that use Trees? To make a bit more clear what I mean, let me take the sentence above and change 'Source trees' with 'Directory outputs of build actions'.
But it is not in line with tree_digest in OutputDirectory. In the case of REv2:
The reasoning behind this is somewhat obvious:
The Remote Asset API is not in line with REv2 in that sense. And that's a pity, because it means there can be less reuse of code. If there were concerns about performance/handling of large subtrees, it would have been better to discuss that separately. For example, this was already discussed to a certain extent in #140. Note how various people in that discussion were in agreement that GetTree should likely be removed. |
The use of the Directory instead of Tree for Push/Fetch Directory leads to additional overhead on each Fetch which could instead be performed once by the client when pushing the Directories to the CAS, or once by the Fetch Service when performing an optional remote fetch.
This overhead is made clear when attempting to ensure that the full directory Tree under the root_directory_digest is present with the CAS before responding to a FetchDirectory request.
Whilst verifying the existence of these files is not a requirement within the spec, I have encountered issues with clients falling over following a Fetch request which returned a digest to an item which isn't in the CAS. Is this something clients should handle, or servers should ensure the digests within ActionResults exist within the CAS.
It appears that a similar change occurred during the early days of the RE spec. f85ddf0#diff-23d388eca6738e626128af49c6c8f0e5R745-R750
The text was updated successfully, but these errors were encountered: