Outputs as target supporting for dvc status #4433

Merged
merged 4 commits into from
Aug 25, 2020

Conversation

karajan1001
Contributor

@karajan1001 karajan1001 commented Aug 20, 2020

fix #4191

  1. Add a related test which would fail on current version.

Thank you for the contribution - we'll try to review it as soon as possible. 🙏

fix iterative#4191
1. Add a related test which would fail on current version.
@jorgeorpinel
Contributor

I marked the documentation checkbox as this is already supposed to be the case and mentioned in https://dvc.org/doc/command-reference/status. Thanks

@efiop efiop changed the title Outputs as target supporting for dvc status [WIP] Outputs as target supporting for dvc status Aug 20, 2020
1. add deps to the tests.
2. make change to pass the tests.
@karajan1001
Contributor Author

This only shows outputs in a rough way; the output needs to be rearranged and summarized in a new PR related to #2180.

@karajan1001 karajan1001 requested a review from pared August 24, 2020 10:59
Contributor

@pared pared left a comment

LGTM, thanks @karajan1001!

Comment on lines 133 to 142
assert main(["status", "alice_bob"]) == 0
assert "alice_bob:" in caplog.text
assert "changed outs:" in caplog.text
assert "modified: alice" in caplog.text
assert "modified: bob" in caplog.text
caplog.clear()

assert main(["status", "alice"]) == 0
assert "modified: alice" in caplog.text
assert "modified: bob" not in caplog.text
Contributor

Maybe it would be easier to check the status dictionary (as in the test just above this one, test_status_recursive) instead of the caplog messages?

Contributor Author

@pared
Already Done.
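
A minimal sketch of the suggestion above: asserting on a status dictionary rather than on captured log text. The `fake_status` helper and its flat `{output: state}` return shape are assumptions made purely for illustration; they are not DVC's actual API or the code from this PR.

```python
# Hypothetical stand-in for a status call; NOT DVC's real API.
# Both outputs of the (assumed) stage "alice_bob" are marked as modified.
CHANGED_OUTS = {"alice": "modified", "bob": "modified"}


def fake_status(targets):
    """Return {output: state} for the changed outputs selected by targets.

    Passing the stage name "alice_bob" selects every output of the stage.
    """
    if "alice_bob" in targets:
        return dict(CHANGED_OUTS)
    return {out: state for out, state in CHANGED_OUTS.items() if out in targets}


# A dictionary comparison pins down the exact result, including what is absent,
# so no separate "not in caplog.text" check is needed.
assert fake_status(["alice_bob"]) == {"alice": "modified", "bob": "modified"}
assert fake_status(["alice"]) == {"alice": "modified"}
```

Comparing whole dictionaries also avoids false positives from substring matches in log output.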

Member

@skshetry skshetry left a comment

Thanks @karajan1001 for the PR. Looks good to me, although I have a few suggestions.

Comment on lines +22 to +26
if not filter_info:
    status_info.update(stage.status(check_updates=True))
else:
    for out in stage.filter_outs(filter_info):
        status_info.update(out.status())
Member

Should we push this to Stage::status() or Stage::_status_outs?

Member

Also, we should keep {stage_name: stage_status} format, to make it in line with --show-json.

Contributor Author

@karajan1001 karajan1001 Aug 25, 2020

> Thanks @karajan1001 for the PR. Looks good to me, although, have a few suggestions.

@skshetry, thanks for your suggestions.

> Also, we should keep {stage_name: stage_status} format, to make it in line with --show-json.

Here, I just followed the format of dvc status -c [outputs]. Besides, if we show

"alice_bob":
        "changed outs":
                "alice": "modified"

we give the misleading impression that stage "alice_bob" has only one output, "alice". Here we need to put more emphasis on outputs than on stages.

> Should we push this to Stage::status() or Stage::_status_outs?

The result here has a different format from the current Stage::status() and Stage::_status_outs, so we can't reuse them.

Maybe we can discuss the output format more in #2180?

Member

@karajan1001, it's better to be consistent here. If we need better CLI output, we can handle that in dvc/commands. But let's wait and see what others say.

> It has a different format from the current Stage::status() and Stage::_status_outs now

Regarding status, we could add filter_info to it so that you could just do

Suggested change
- if not filter_info:
-     status_info.update(stage.status(check_updates=True))
- else:
-     for out in stage.filter_outs(filter_info):
-         status_info.update(out.status())
+ status_info.update(stage.status(filter_outs=filter_info))

But this would only make sense if we made the result consistent.
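
To make the idea in this comment concrete, here is a rough sketch of a stage-level status call that accepts an output filter and still returns the `{stage_name: stage_status}` shape. The `Stage` and `Output` classes below are toy stand-ins, `filter_outs` is treated as a plain list of output paths, and the nested dictionary layout follows the snippet quoted in this thread; none of this is DVC's real implementation.

```python
class Output:
    """Toy output: a path plus a flag saying whether it changed (assumption)."""

    def __init__(self, path, changed):
        self.path = path
        self.changed = changed

    def status(self):
        # Report only changed outputs; unchanged ones contribute nothing.
        return {self.path: "modified"} if self.changed else {}


class Stage:
    """Toy stage holding a name and a list of outputs (assumption)."""

    def __init__(self, name, outs):
        self.name = name
        self.outs = outs

    def status(self, filter_outs=None):
        """Return {stage_name: {"changed outs": {...}}}, or {} if nothing changed.

        filter_outs is a hypothetical parameter: when given, only outputs
        whose path is listed are checked.
        """
        changed = {}
        for out in self.outs:
            if filter_outs is None or out.path in filter_outs:
                changed.update(out.status())
        if not changed:
            return {}
        return {self.name: {"changed outs": changed}}


stage = Stage("alice_bob", [Output("alice", True), Output("bob", True)])
print(stage.status())                       # both outputs, keyed by stage name
print(stage.status(filter_outs=["alice"]))  # same shape, only "alice" listed
```

With this shape the caller needs no branch on `filter_info`; the filtered and unfiltered results differ only in which outputs appear under the stage name.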

Contributor Author

@karajan1001 karajan1001 Aug 25, 2020

> @karajan1001, better to be consistent here. If we need better CLI output, we can handle that in dvc/commands. But, let's wait for others, let's see what they'll say.
>
> Regarding status, we could add filter_info to it so that you could just do
>
> But this would make sense if we made result consistent.

@skshetry
The problem is that dvc/commands doesn't know whether the targets are stages or outputs.

Actually, I think the most elegant way is for collect_granular to return a list of outputs, stages, or files that all share the same API, .status(), which returns the result. But we would need another object, FilePathSlot, at a finer granularity than outputs, to get rid of filter_info.
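
The shared-API idea described here could look roughly like the following. This is only a sketch under stated assumptions: `HasStatus`, `OutputTarget`, `StageTarget`, and `collect_status` are hypothetical names invented for illustration, and the proposed `FilePathSlot` object is not modeled at all.

```python
from typing import Dict, List, Optional, Protocol


class HasStatus(Protocol):
    """Anything a collector could return, as long as it exposes .status()."""

    def status(self) -> Dict[str, object]: ...


class OutputTarget:
    """Hypothetical output-level target; state is None when unchanged."""

    def __init__(self, path: str, state: Optional[str]) -> None:
        self.path = path
        self.state = state

    def status(self) -> Dict[str, object]:
        return {self.path: self.state} if self.state else {}


class StageTarget:
    """Hypothetical stage-level target that aggregates its outputs."""

    def __init__(self, name: str, outs: List[OutputTarget]) -> None:
        self.name = name
        self.outs = outs

    def status(self) -> Dict[str, object]:
        changed: Dict[str, object] = {}
        for out in self.outs:
            changed.update(out.status())
        return {self.name: {"changed outs": changed}} if changed else {}


def collect_status(targets: List[HasStatus]) -> Dict[str, object]:
    """Merge the status of every target without checking its concrete type."""
    result: Dict[str, object] = {}
    for target in targets:
        result.update(target.status())
    return result


targets = [
    StageTarget("alice_bob", [OutputTarget("alice", "modified")]),
    OutputTarget("carol", "deleted"),
]
print(collect_status(targets))
```

Note that this sketch also shows the formatting disagreement in the thread: stage targets and output targets produce differently shaped entries unless one format is agreed on.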

Contributor

I would agree with @skshetry here,

"alice_bob":  
        "changed outs": 
                "alice": "modified" 

In that case we inform the user that only one out has changed, but that does not mean that this particular stage has only this output. I would say that the primary function of status is to inform about changes, and there is no need to list all of a stage's outputs if they did not change.

Contributor Author

@karajan1001 karajan1001 Aug 26, 2020

> I would agree with @skshetry here,
>
> "alice_bob":
>         "changed outs":
>                 "alice": "modified"
>
> In that case we inform user that only one out has changed, but that does not mean that this particular stage has only this output. I would say that status primary function is to inform about changes, and there is no need to list all of its outputs, if they did not change.

@pared Thank you.
What about status -c? Does it also need to follow this format, or should it keep the current one?

Contributor

@karajan1001 This is a good question.
In the case of our current test, if we commit the modified alice, we will get "Data and pipelines up to date" from dvc status and "new: alice" from dvc status -c. As this issue is about filtering the status, I would keep the output of dvc status -c as is and create an issue to discuss whether we should have a consistent output format for both dvc status and dvc status -c.

karajan1001 and others added 2 commits August 25, 2020 09:08
Co-authored-by: Saugat Pachhai <suagatchhetri@outlook.com>
@pared
Contributor

pared commented Aug 25, 2020

@karajan1001 is there anything that you would like to add here? Or can we drop the [WIP]? :)

@karajan1001
Contributor Author

> @karajan1001 is there anything that you would like to add here? Or can we drop the [WIP]? :)

@pared
I'm sorry, I didn't know what [WIP] meant and googled it a moment ago. Now I can say yes.

@pared
Contributor

pared commented Aug 25, 2020

@karajan1001 no problem, it just felt like the PR is complete, that's why I asked :)

@pared pared changed the title [WIP] Outputs as target supporting for dvc status Outputs as target supporting for dvc status Aug 25, 2020
@pared pared merged commit fcca636 into iterative:master Aug 25, 2020
@skshetry
Member

skshetry commented Aug 25, 2020

Would have been better to have a consistent format: #4433 (comment). 🤷‍♂️

@pared
Contributor

pared commented Aug 25, 2020

@skshetry sorry, my bad. Let's finish the discussion here and prepare a follow-up PR?

Labels
enhancement (Enhances DVC)
feature (is a feature)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

status: support outputs as targets [qa]
4 participants