Add docs regarding --to-remote option for add/import-url #2091
Merged
Commits
f64f601
Initial pre-texts regarding straight to remote
isidentical 98cf237
Add an import-url example
isidentical 114ba8d
More mentions to --to-remote
isidentical cbdf546
More description regarding --to-remote
isidentical 820cbd6
checkout => pull
isidentical 4c83bdf
Address some reviews
isidentical aaa0273
Reference to the example in the docs
isidentical 02f9ade
remove brackets
isidentical 66c8710
-j for import-url/add
isidentical c11ef07
apply suggestions from jorge
isidentical 0b79d10
Reorder parameters according to the core
isidentical 2ea1f22
Apply a bunch more suggestions
isidentical 3ff3d01
Update content/docs/command-reference/add.md
jorgeorpinel a4cbe61
Update content/docs/command-reference/add.md
jorgeorpinel 4fb63eb
Update content/docs/command-reference/import-url.md
jorgeorpinel d07166d
Update content/docs/command-reference/add.md
jorgeorpinel c249ee6
Update content/docs/command-reference/add.md
jorgeorpinel b16d407
Update content/docs/command-reference/import-url.md
jorgeorpinel 570f38c
Update content/docs/command-reference/import-url.md
jorgeorpinel 6c8a592
Update content/docs/command-reference/import-url.md
jorgeorpinel 5737bd2
Restyled by prettier
restyled-commits 96d767f
proper initalization
isidentical 133a939
suggestions
isidentical 6c7f65a
rebase
isidentical 194a764
Update content/docs/command-reference/import-url.md
jorgeorpinel 0dd63c7
Update content/docs/command-reference/import-url.md
jorgeorpinel 8e66b2b
Update content/docs/command-reference/import-url.md
jorgeorpinel a473848
Update content/docs/command-reference/import-url.md
jorgeorpinel e5b9d4e
Update content/docs/command-reference/import-url.md
jorgeorpinel d7ca231
Update content/docs/command-reference/import-url.md
jorgeorpinel 65ce340
Update content/docs/command-reference/import-url.md
jorgeorpinel c6351f3
Update content/docs/command-reference/import-url.md
jorgeorpinel ee24963
Update content/docs/command-reference/import-url.md
jorgeorpinel 89c1bb9
changes
isidentical f32473e
sync with master
isidentical 1d5ef74
Update content/docs/command-reference/import-url.md
jorgeorpinel 25b0cdf
Update content/docs/command-reference/import-url.md
jorgeorpinel c036a07
Update content/docs/command-reference/add.md
jorgeorpinel 46b5164
Update content/docs/command-reference/import-url.md
jorgeorpinel d58af5b
Update content/docs/command-reference/import-url.md
jorgeorpinel
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,8 +1,8 @@ | ||
# import-url | ||
|
||
Download a file or directory from a supported URL (for example `s3://`, | ||
`ssh://`, and other protocols) into the <abbr>workspace</abbr>, and track it (an | ||
import `.dvc` file is created). | ||
Track a file or directory found in an external location (`s3://`, `/local/path`, | ||
etc.), and download it to the local project, or make a copy in | ||
[remote storage](/doc/command-reference/remote). | ||
|
||
> See `dvc import` to download and track data/model files or directories from | ||
> other <abbr>DVC repositories</abbr> (e.g. hosted on GitHub). | ||
|
@@ -11,6 +11,7 @@ import `.dvc` file is created). | |
|
||
```usage | ||
usage: dvc import-url [-h] [-q | -v] [--file <filename>] [--no-exec] | ||
[--to-remote] [-r <name>] [-j <number>] | ||
[--desc <text>] | ||
|
||
url [out] | ||
|
||
|
@@ -22,8 +23,9 @@ positional arguments: | |
## Description | ||
|
||
In some cases it's convenient to add a data file or directory from an external | ||
location into the workspace, such that it can be updated later, if/when the | ||
external data source changes. Example scenarios: | ||
location into the workspace (or to | ||
[remote storage](/doc/command-reference/remote)), such that it can be updated | ||
later, if/when the external data source changes. Example scenarios: | ||
|
||
- A remote system may produce occasional data files that are used in other | ||
projects. | ||
|
@@ -37,6 +39,12 @@ external data source changes. Example scenarios: | |
having to manually copy files from the supported locations (listed below), which | ||
may require installing a different tool for each type. | ||
|
||
When you don't want to store the target data in your local system, you can still | ||
create an import `.dvc` file while transferring a file or directory directly to | ||
remote storage, by using the `--to-remote` option. See the | ||
[Import straight to remote](#example-transfer-to-remote-storage) example for more | ||
details. | ||
|
||
The `url` argument specifies the external location of the data to be imported. | ||
The imported data is <abbr>cached</abbr>, and linked (or copied) to the current | ||
working directory with its original file name e.g. `data.txt` (or to a location | ||
|
@@ -131,6 +139,13 @@ $ dvc run -n download_data \ | |
finish the operation(s)); or if the target data already exist locally and you | ||
want to "DVCfy" this state of the project (see also `dvc commit`). | ||
|
||
- `--to-remote` - Import an external target, but don't move it into the | ||
|
||
workspace, nor cache it. [Transfer](#example-transfer-to-remote-storage) it | ||
directly to remote storage instead. Use `dvc pull` to get the data locally. | ||
|
||
|
||
- `-r <name>`, `--remote <name>` - name of the | ||
[remote storage](/doc/command-reference/remote) | ||
|
||
|
||
- `--desc <text>` - user description of the data (optional). This doesn't | ||
affect any DVC operations. | ||
|
||
|
@@ -336,3 +351,46 @@ $ dvc repro | |
Running stage 'prepare' with command: | ||
python src/prepare.py data/data.xml | ||
``` | ||
|
||
## Example: Transfer to remote storage | ||
|
||
When you have a large dataset in an external location, you may want to import it | ||
to your project without downloading it to the local file system (for using it | ||
later/elsewhere). The `--to-remote` option lets you skip the download, while | ||
storing the imported data [remotely](/doc/command-reference/remote). Let's | ||
initialize a DVC project, and set up a remote: | ||
Comment on lines +361 to +365: And copy over the Example as well (will need some adapting). Thanks
|
||
```dvc | ||
$ mkdir example # workspace | ||
$ mkdir /tmp/dvc-storage | ||
$ cd example | ||
$ git init | ||
$ dvc init | ||
$ dvc remote add myremote /tmp/dvc-storage | ||
|
||
``` | ||
|
||
Now let's create an import `.dvc` file without downloading the target data, | ||
transferring it directly to remote storage instead: | ||
|
||
``` | ||
|
||
$ dvc import-url https://data.dvc.org/get-started/data.xml data.xml \ | ||
--to-remote -r myremote | ||
|
||
... | ||
``` | ||
|
||
When you run `dvc import-url` with `--to-remote`, you pass the external URL and | ||
the output file name as usual. If you haven't set a default | ||
[remote](/doc/command-reference/remote) yet, also pass the name of the remote | ||
with the `-r`/`--remote` option. DVC then starts the transfer and leaves an | ||
import `.dvc` file as the only change in your workspace (everything else happens | ||
in remote storage). | ||
|
||
|
||
Whenever anyone wants to actually download the imported data (for example from a | ||
system that can handle it), they can use `dvc pull` as usual: | ||
|
||
``` | ||
$ dvc pull data.xml.dvc -r myremote
|
||
A data.xml | ||
1 file added and 1 file fetched | ||
``` |
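The import-then-pull flow in the example above could be wrapped in two small shell helpers. This is only a sketch: the function names `import_to_remote` and `fetch_import` are hypothetical (they are not part of DVC), and it assumes a remote such as `myremote` was already created with `dvc remote add`, as in the example. It uses only the options shown in this diff (`--to-remote` and `-r`).

```shell
# Hypothetical helpers (names are not part of DVC); they only wrap the
# commands shown in the example above.

# import_to_remote URL OUT REMOTE
# Creates an import .dvc file for OUT and transfers the data straight to
# REMOTE, without downloading it into the workspace.
import_to_remote() {
    dvc import-url "$1" "$2" --to-remote -r "$3"
}

# fetch_import OUT REMOTE
# Downloads previously imported data into the workspace when it is needed.
fetch_import() {
    dvc pull "$1.dvc" -r "$2"
}
```

For instance, `import_to_remote https://data.dvc.org/get-started/data.xml data.xml myremote`, followed later by `fetch_import data.xml myremote`, mirrors the two commands in the example.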
So does this option have any effect or throw an error if `--external` isn't used, @isidentical? Maybe we need an example after all.

Yeah, I just tried this and `--external` seems to be needed. Should be mentioned here at least?

What do you mean by `--external` is needed?

I tried `dvc add --to-remote /external/path` (a default remote exists) and it failed. You have to combine `--external` too, right? Unless I did something wrong. If I'm right, let's mention that requirement here ^

And maybe we should consider automatically applying `--external` in `--to-remote` so there's no need to type both?
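
One way to check the behavior discussed in this thread on a given DVC version is a small probe like the one below. It is a sketch only: the function name `try_add_to_remote` is hypothetical, the target path stands for any external location, and a default remote is assumed to exist. Whether DVC accepts `--external` together with `--to-remote` is exactly the open question here, so the second attempt may simply be rejected.

```shell
# try_add_to_remote TARGET
# Hypothetical probe (not part of DVC): first tries the plain command from
# the thread, then the combination with --external, and reports which worked.
# Note: a successful attempt really transfers TARGET to the default remote
# and creates a .dvc file in the current directory.
try_add_to_remote() {
    target=$1
    if dvc add --to-remote "$target"; then
        echo "plain --to-remote worked"
    elif dvc add --external --to-remote "$target"; then
        echo "needed --external"
    else
        echo "both variants failed"
        return 1
    fi
}
```

Running `try_add_to_remote /external/path` inside an initialized project prints which variant, if any, succeeded.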