-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. Weβll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Outputs as target supporting for dvc status
#4433
Conversation
fix iterative#4191 1. Add a related test which would fail on current version.
I marked the documentation checkbox as this is already supposed to be the case and mentioned in https://dvc.org/doc/command-reference/status. Thanks |
dvc status
dvc status
1. add deps to the tests. 2. make change to pass the tests.
Only show outputs in a rough way, need to be rearranged and summarized in a new PR related to #2180. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks @karajan1001!
tests/func/test_status.py
Outdated
assert main(["status", "alice_bob"]) == 0 | ||
assert "alice_bob:" in caplog.text | ||
assert "changed outs:" in caplog.text | ||
assert "modified: alice" in caplog.text | ||
assert "modified: bob" in caplog.text | ||
caplog.clear() | ||
|
||
assert main(["status", "alice"]) == 0 | ||
assert "modified: alice" in caplog.text | ||
assert "modified: bob" not in caplog.text |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it would be easier to check the status dictionary (like in test above that one, test_status_recursive
), instead of caplog messages?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pared
Already Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @karajan1001 for the PR. Looks good to me, although, have a few suggestions.
if not filter_info: | ||
status_info.update(stage.status(check_updates=True)) | ||
else: | ||
for out in stage.filter_outs(filter_info): | ||
status_info.update(out.status()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we push this to Stage::status()
or Stage::_status_outs
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, we should keep {stage_name: stage_status}
format, to make it in line with --show-json
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @karajan1001 for the PR. Looks good to me, although, have a few suggestions.
@skshetry , Thanks for your suggestions.
Also, we should keep
{stage_name: stage_status}
format, to make it in line with--show-json
.
Here, I just follow the format ofdvc status -c [outputs]
. Besides, if we show
"alice_bob":
"changed outs":
"alice": "modified"
We give misinformation that there is only one output "alice" in stage "alice_bob", here we need more emphasis on outputs than stages.
Should we push this to
Stage::status()
orStage::_status_outs
?
It has a different format from the currentStage::status()
andStage::_status_outs
now. So we can't reuse them here.
Maybe we can discuss the output format more in #2180?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@karajan1001, better to be consistent here. If we need better CLI output, we can handle that in dvc/commands
. But, let's wait for others, let's see what they'll say.
It has a different format from the current Stage::status() and Stage::_status_outs now
Regarding status
, we could add filter_info
to it so that you could just do
if not filter_info: | |
status_info.update(stage.status(check_updates=True)) | |
else: | |
for out in stage.filter_outs(filter_info): | |
status_info.update(out.status()) | |
status_info.update(stage.status(filter_outs=filter_info)) |
But this would make sense if we made result consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@karajan1001, better to be consistent here. If we need better CLI output, we can handle that in
dvc/commands
. But, let's wait for others, let's see what they'll say.It has a different format from the current Stage::status() and Stage::_status_outs now
Regarding
status
, we could addfilter_info
to it so that you could just doBut this would make sense if we made result consistent.
@skshetry
The question is dvc/commands
didn't know targets are stages or outputs.
Actually I think the most elegant way is that collect_granular
returns a list of outputs
or stages
or files
, and they share the same API .status()
which returns the result. But we need another object FilePathSlot
which is at a finer granularity than outputs to get rid of filter_info
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would agree with @skshetry here,
"alice_bob":
"changed outs":
"alice": "modified"
In that case we inform user that only one out has changed, but that does not mean that this particular stage has only this output. I would say that status
primary function is to inform about changes, and there is no need to list all of its outputs, if they did not change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would agree with @skshetry here,
"alice_bob": "changed outs": "alice": "modified"
In that case we inform user that only one out has changed, but that does not mean that this particular stage has only this output. I would say that
status
primary function is to inform about changes, and there is no need to list all of its outputs, if they did not change.
@pared Thank you.
How about the status -c
? Do they also need to follow this format, or just keep the current one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@karajan1001 This is a good question.
So in case of our current test, if we commit modified alice
we will get Data and pipelines up to date
on dvc status
and we will get new: alice
on dvc status -c
. As this issue is about filtering the status
I would keep output of dvc status -c
as is and create issue discussing whether we should have consistent output format for both dvc status
and dvc status -c
.
Co-authored-by: Saugat Pachhai <suagatchhetri@outlook.com>
@karajan1001 is there anything that you would like to add here? Or can we drop the |
@pared |
@karajan1001 no problem, it just felt like the PR is complete, thats why I asked :) |
dvc status
dvc status
Would have been better to have a consistent format: #4433 (comment). π€·ββοΈ |
@skshetry sorry, my bad. Let's finish the discussion here and prepare follow up PR? |
fix #4191
β I have followed the Contributing to DVC checklist.
π If this PR requires documentation updates, I have created a separate PR (or issue, at least) in dvc.org and linked it here.
Thank you for the contribution - we'll try to review it as soon as possible. π