-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
exp: show: Include additional info in --json. #7690
exp: show: Include additional info in --json. #7690
Conversation
19fc2f7
to
0b60702
Compare
0b60702
to
cf8e636
Compare
fs_path
and use_cache
in --json
.cf8e636
to
53f2507
Compare
53f2507
to
1da3517
Compare
209baee
to
33ecb88
Compare
33ecb88
to
48fe304
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM assuming this works for the vscode side
"hash": out.hash_info.value, | ||
"size": out.meta.size, | ||
"nfiles": out.meta.nfiles, | ||
"use_cache": out.use_cache, | ||
"is_data_source": out.stage.is_data_source, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Q] Do I have everything I need to work this out:
for dep in res[“deps”].values(): for out in res[“outs”].values(): if dep[“fs_path”] == out[“fs_path”]: # THIS IS AN INTERMEDIATE ARTIFACT dep["intermediate"] = True out["intermediate"] = True if out[“use_cache”]: # DEP IS DVC TRACKED dep["use_cache"] = True else: # DEP IS GIT-TRACKED dep["use_cache"] = False
The answer is yes
Given that the rules about deciding what files should be included appear to be currently unclear and product-dependent I would prefer not to include this logic in DVC.
This does make sense to me. However, there will be much less chance of providing consistency across dependencies if we push this logic out to the consumers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mattseddon where does this snippet come from?
I see that is_data_source
is included which is what we need I believe, @daavoo could you remind please what is the definition of the out.stage.is_data_source
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct code is in the description?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be enough for us right now as we're working on implementing:
The first step is to get all of the deps from exp show shown in the columns tree and experiments webview.
as per iterative/vscode-dvc#1183 (comment)
I think we should revisit when we get to:
Once extra information has been added to the DVC output we will provide it to the user (in some way to be determined) so they can easily filter out what they need.
and/or when we redesign the data that the extension needs
@daavoo thanks for addressing this!
I'm not sure that example is 100% correct. What happens if I have only a few |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please review the comment regarding the .dvc
files ... at least we should fix an example.
In general - why would I need that double cycle at all btw, why can't I just take all data source outs (I hope it includes imports) ... and for final outputs - take outs that don't have deps.
The example is from the current DVC perspective where VSCode has access to both
Reverse the loop. Iterate on
You can, as per the example above. dvc/tests/func/experiments/test_show.py Lines 691 to 716 in 48fe304
You kind of will still double looping because of And without also iterating on |
Got it, thanks @daavoo ! |
Use relpath as keys in deps and outs.
Add
use_cache
andis_data_source
for outs.This allows differentiating git-tracked dependencies and
intermediate outputs (pseudo-code bellow).
Closes #7575
Closes #7790
Example code on how to use the new fields: