Example for dvc add --to-remote #2172

Merged 5 commits on Feb 28, 2021


@isidentical (Contributor) commented Feb 9, 2021

You may disregard these recommendations if you used the Edit on GitHub button from dvc.org to improve a doc in place.

❗ Please read the guidelines in the Contributing to the Documentation list if you make any substantial changes to the documentation or JS engine.

🐛 Please make sure to mention Fix #issue (if applicable) in the description of the PR. This causes GitHub to close it automatically when the PR is merged.

Please choose to allow us to edit your branch when creating the PR.

Thank you for the contribution - we'll try to review it as soon as possible. 🙏

Resolves #2161


@jorgeorpinel (Contributor) left a comment

Left some comments and suggested copy edits, but there's a big question here that would be great to answer first, found in #2172 (comment); see the discussion in iterative/dvc/issues/5445. Thanks

UPDATE: This seems resolved (leaving things as-is for now).

Comment on lines +335 to +341
## Example: Transfer to remote storage

When you have a large dataset in an external location, you may want to track it
as if it was in your project, but without downloading it locally (for now). The
`--to-remote` option lets you do so, while storing a copy
[remotely](/doc/command-reference/remote) so it can be
[pulled](/doc/command-reference/pull) later. Let's initialize a DVC project,

@jorgeorpinel (Contributor) commented Feb 22, 2021

We're still explaining this almost exactly as in https://dvc.org/doc/command-reference/import-url#example-transfer-to-remote-storage but they're supposed to be completely different use cases. Is "you may want to track it as if it was in your project" clear and different enough?

How should we best differentiate them based on the discussions in iterative/dvc/issues/5445? Cc @shcheklein

Contributor Author

Perhaps we can defer this discussion. I am awaiting this PR to be merged so that I can work on to-cache docs.

Member

Yeah, I would write a better intro in general here:

By default, when you run import-url, DVC downloads the data into the workspace so that it can be saved to the cache, and later to remote storage. Preserving the data this way matters because we want to keep the project reproducible. In some situations, though, you might not have enough space on the machine where you are running import-url, but you still want the data saved to remote storage and accessible through regular commands like dvc pull (e.g. to run the pipeline on another machine that has enough space in its cache, or when a large shared cache is being used, etc.). In those cases, to "bootstrap" the project, it's handy to use --to-remote ....

@jorgeorpinel we can take it over and rewrite it a bit, but let's not block @isidentical. If we do this, let's try to do it ASAP, please, or even as a separate PR.
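
The workflow described in the suggested intro above can be sketched as a short shell helper. This is only an illustration and not text from the PR: the function name `transfer_to_remote`, the remote name `myremote`, and the remote path `/tmp/dvc-storage` are placeholder assumptions. The commands are wrapped in a function so nothing runs until it is called, and calling it requires `git` and `dvc` to be installed.

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): track external data and send it
# straight to remote storage without keeping a local copy.
# Assumed names: remote "myremote", remote path /tmp/dvc-storage.
transfer_to_remote() {
    url="$1"   # external location of the dataset
    out="$2"   # name the data should have inside the project

    # A DVC project normally lives inside a Git repository.
    git init && dvc init || return 1

    # Register a default remote that will receive the data.
    dvc remote add -d myremote /tmp/dvc-storage || return 1

    # Track the data and transfer it to the remote; only the small
    # "$out.dvc" file is expected to appear in the workspace.
    dvc add "$url" -o "$out" --to-remote || return 1

    # Later, possibly on a machine with enough space, the data can be
    # fetched with a regular command:
    #   dvc pull "$out.dvc"
}
```

A call might look like `transfer_to_remote https://example.com/data.xml data.xml`, where the URL is a placeholder. `dvc import-url` accepts the same `--to-remote` flag for the import use case discussed in this thread.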

@jorgeorpinel (Contributor) commented Feb 28, 2021

> defer this discussion. I am awaiting this PR to be merged
> let's not block @isidentical

I did approve the PR along with my comment... (#2172 (review))


Contributor

P.S. Also added a checkbox to #2121 for now (will come back to that one).


Contributor

P.S. Guys, I finally got around to rewriting these examples a bit in #2302. Feel free to check it out.

@jorgeorpinel (Contributor) left a comment

Left one last comment ☝️ but also approving since it has been discussed quite a bit.

Development

Successfully merging this pull request may close these issues.

cmd: update add --to-remote similar to import-url and...
3 participants