Example for dvc add --to-remote #2172
Left some comments and suggested copy edits, but there's a big question here that would be great to answer first: see #2172 (comment) and the discussion in iterative/dvc/issues/5445. Thanks
UPDATE: This seems resolved (leaving things as-is for now).
ec80304 to 0cc63e0
0cc63e0 to 32bf48c
## Example: Transfer to remote storage

When you have a large dataset in an external location, you may want to track it
as if it was in your project, but without downloading it locally (for now). The
`--to-remote` option lets you do so, while storing a copy
[remotely](/doc/command-reference/remote) so it can be
[pulled](/doc/command-reference/pull) later. Let's initialize a DVC project,
We're still explaining this almost exactly as in https://dvc.org/doc/command-reference/import-url#example-transfer-to-remote-storage but they're supposed to be completely different use cases. Is "you may want to track it as if it was in your project" clear and different enough?
How should we best differentiate them based on the discussions in iterative/dvc/issues/5445? Cc @shcheklein
Perhaps we can defer this discussion. I am awaiting this PR to be merged so that I can work on to-cache docs.
Yeah, I would write a better intro in general here:
> By default, when you run `import-url`, DVC downloads data into the workspace so that it can be saved into the cache, and later into remote storage. That's important for preserving it, since we want to keep the project reproducible. In some situations, though, you might not have enough space on the machine where you are running `import-url`, but you still want this data to be saved into remote storage and to be accessible through regular commands like `dvc pull` (e.g. to run the pipeline on another machine that has enough space in cache, or when a large shared cache is being used, etc.). In those cases, to "bootstrap" the project it's handy to use `--to-remote` ....

@jorgeorpinel we can take it over and rewrite it a bit, but let's not block @isidentical, and if we do this, let's try to do it asap please, or even as a separate PR.
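For reference, the "bootstrap" flow described above, applied to `dvc add` (the command this PR documents), might look roughly like the sketch below. The URL, the remote name `myremote`, and the bucket path are hypothetical placeholders, and the script assumes it runs inside a fresh Git repository. The `run` helper is only an illustration device: it prints each command by default (`DRY_RUN=1`), so nothing touches DVC unless you opt in.

```shell
#!/bin/sh
# Sketch only: URL, remote name, and bucket are hypothetical placeholders.
# With DRY_RUN=1 (the default) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

to_remote_demo() {
    # Assumes the current directory is already a Git repository.
    run dvc init
    # Configure a default remote so the transfer has a destination.
    run dvc remote add -d myremote s3://example-bucket/dvc-store
    # Track the external file and send it straight to the remote,
    # without keeping a copy in the workspace for now.
    run dvc add https://example.com/data.xml -o data.xml --to-remote
    # Later, on a machine with enough space, fetch it as usual.
    run dvc pull data.xml.dvc
}

to_remote_demo
```

Running the script with `DRY_RUN=0` would execute the commands for real, which requires DVC to be installed and the placeholder remote and URL to be replaced with ones you can actually access.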
> defer this discussion. I am awaiting this PR to be merged

> let's not block @isidentical

I did approve the PR along with my comment... (#2172 (review))
p.s. Also added checkbox to #2121 for now (will come back to that one)
p.s. guys I finally got to rewriting these examples a bit in #2302. Feel free to check it out.
Left one last comment ☝️ but also approving since it has been discussed quite a bit.
❗ Please read the guidelines in the Contributing to the Documentation list if you make any substantial changes to the documentation or JS engine.
🐛 Please make sure to mention `Fix #issue` (if applicable) in the description of the PR. This causes GitHub to close it automatically when the PR is merged.
Please choose to allow us to edit your branch when creating the PR.
Thank you for the contribution - we'll try to review it as soon as possible. 🙏
Resolves #2161