From 3a485da00c3a5890ac30dab7ad99f4f4ea350aee Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Wed, 3 Feb 2021 12:05:57 -0600 Subject: [PATCH 1/5] config: better s3.configpath explanation (after #2140) --- content/docs/command-reference/cache/dir.md | 6 +++--- content/docs/command-reference/config.md | 8 ++++---- content/docs/command-reference/remote/modify.md | 11 ++++++++--- 3 files changed, 15 insertions(+), 10 deletions(-) diff --git a/content/docs/command-reference/cache/dir.md b/content/docs/command-reference/cache/dir.md index d689c40806..fbc4579c8d 100644 --- a/content/docs/command-reference/cache/dir.md +++ b/content/docs/command-reference/cache/dir.md @@ -29,10 +29,10 @@ cache directory. ## Options -- `--global` - modify a global config file (e.g. `~/.config/dvc/config`) instead - of the project's `.dvc/config`. +- `--global` - modify the global config file (e.g. `~/.config/dvc/config`) + instead of the project's `.dvc/config`. -- `--system` - modify a system config file (e.g. `/etc/dvc/config`) instead of +- `--system` - modify the system config file (e.g. `/etc/dvc/config`) instead of `.dvc/config`. - `--local` - modify a local [config file](/doc/command-reference/config) diff --git a/content/docs/command-reference/config.md b/content/docs/command-reference/config.md index b39ee0731a..539692f3be 100644 --- a/content/docs/command-reference/config.md +++ b/content/docs/command-reference/config.md @@ -61,11 +61,11 @@ multiple projects and users, respectively: need to specify private config option values that you don't want to track and share with Git (credentials, private locations, etc). -- `--global` - modify a global config file (e.g. `~/.config/dvc/config`) instead - of the project's `.dvc/config`. Useful to apply config options to all your - projects. +- `--global` - modify the global config file (e.g. `~/.config/dvc/config`) + instead of the project's `.dvc/config`. Useful to apply config options to all + your projects. 
-- `--system` - modify a system config file (e.g. `/etc/dvc/config`) instead of +- `--system` - modify the system config file (e.g. `/etc/dvc/config`) instead of `.dvc/config`. Useful to apply config options to all the projects (all users) in the machine. May require superuser access e.g. `sudo dvc config --system ...` (Linux). diff --git a/content/docs/command-reference/remote/modify.md b/content/docs/command-reference/remote/modify.md index 6c1705e41c..d82d905fd3 100644 --- a/content/docs/command-reference/remote/modify.md +++ b/content/docs/command-reference/remote/modify.md @@ -134,14 +134,19 @@ these parameters, you could use the following options. $ dvc remote modify myremote credentialpath /path/to/creds ``` -- `configpath` - path to the AWS config file. The location defaults to - `~/.aws/config`. It supports S3-specific - [configuration values](https://docs.aws.amazon.com/cli/latest/topic/s3-config.html#configuration-values): +- `configpath` - path to the + [AWS CLI config file](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html). + The default AWS CLI config file path (e.g. `~/.aws/config`) is used if this + parameter isn't set. ```dvc $ dvc remote modify myremote --local configpath /path/to/config ``` + > Note that only the S3-specific + > [configuration values](https://docs.aws.amazon.com/cli/latest/topic/s3-config.html#configuration-values) + > are used. + - `endpointurl` - endpoint URL to access S3: ```dvc From c965762ca06e0c152f747f2863ac3c813b903206 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Wed, 3 Feb 2021 13:25:36 -0600 Subject: [PATCH 2/5] remote: review ssh:// URLs and auth info. 
per https://github.com/iterative/dvc.org/pull/1801#discussion_r495416929 --- content/docs/command-reference/pull.md | 4 ++-- content/docs/command-reference/push.md | 4 ++-- .../docs/command-reference/remote/modify.md | 7 ++++--- .../docs/user-guide/external-dependencies.md | 19 ++++++++++--------- 4 files changed, 18 insertions(+), 16 deletions(-) diff --git a/content/docs/command-reference/pull.md b/content/docs/command-reference/pull.md index eb47e16993..cc1d31908b 100644 --- a/content/docs/command-reference/pull.md +++ b/content/docs/command-reference/pull.md @@ -229,9 +229,9 @@ already set up and you can use `dvc remote list` to check them. To remember how it's done, and set a context for the example, let's define a default SSH remote: ```dvc -$ dvc remote add -d r1 ssh://_username_@_host_/path/to/dvc/remote/storage +$ dvc remote add -d r1 ssh://user@example.com/path/to/dvc/remote/storage $ dvc remote list -r1 ssh://_username_@_host_/path/to/dvc/remote/storage +r1 ssh://user@example.com/path/to/dvc/remote/storage ``` > DVC supports several diff --git a/content/docs/command-reference/push.md b/content/docs/command-reference/push.md index 89a36dbb4a..e295b6fec2 100644 --- a/content/docs/command-reference/push.md +++ b/content/docs/command-reference/push.md @@ -116,7 +116,7 @@ To use `dvc push` (without options), a default ```dvc $ dvc remote add --default r1 \ - ssh://_username_@_host_/path/to/dvc/cache/directory + ssh://user@example.com/path/to/dvc/cache/directory ``` > For existing projects, remotes are usually already set up. 
You @@ -124,7 +124,7 @@ $ dvc remote add --default r1 \ > > ```dvc > $ dvc remote list -> r1 ssh://_username_@_host_/path/to/dvc/cache/directory +> r1 ssh://user@example.com/path/to/dvc/cache/directory > ``` Push entire data cache from the current workspace to diff --git a/content/docs/command-reference/remote/modify.md b/content/docs/command-reference/remote/modify.md index d82d905fd3..9be2872327 100644 --- a/content/docs/command-reference/remote/modify.md +++ b/content/docs/command-reference/remote/modify.md @@ -515,7 +515,8 @@ more information. ### Click for SSH - `url` - remote location, in a regular - [SSH format](https://tools.ietf.org/id/draft-salowey-secsh-uri-00.html#sshsyntax): + [SSH format](https://tools.ietf.org/id/draft-salowey-secsh-uri-00.html#sshsyntax). + Note that this can already the `user` parameter, embedded into the URL: ```dvc $ dvc remote modify myremote url \ @@ -528,7 +529,7 @@ more information. > Note that your server's SFTP root might differ from its physical root (`/`). -- `user` - username to access the remote. +- `user` - username to access the remote: ```dvc $ dvc remote modify --local myremote user myuser @@ -539,7 +540,7 @@ more information. 1. `user` parameter set with this command (found in `.dvc/config`); 2. User defined in the URL (e.g. `ssh://user@example.com/path`); 3. User defined in `~/.ssh/config` for this host (URL); - 4. Current user + 4. Current system user - `port` - port to access the remote. diff --git a/content/docs/user-guide/external-dependencies.md b/content/docs/user-guide/external-dependencies.md index 76381e088b..91566f75cd 100644 --- a/content/docs/user-guide/external-dependencies.md +++ b/content/docs/user-guide/external-dependencies.md @@ -37,6 +37,9 @@ certain `dvc remote` types. Currently, the following protocols are supported: Let's take a look at defining and running a `download_file` stage that simply downloads a file from an external location, on all the supported location types. 
+> See the [Remote alias example](#example-using-dvc-remote-aliases) for info. on +> using remote locations that require manual authentication setup. +
### Click for Amazon S3 @@ -88,7 +91,7 @@ $ dvc run -n download_file \ $ dvc run -n download_file \ -d ssh://user@example.com/path/to/data.txt \ -o data.txt \ - scp user@example.com:/path/to/data.txt data.txt + scp ssh://user@example.com:/path/to/data.txt data.txt ``` ⚠️ DVC requires both SSH and SFTP access to work with remote SSH locations. @@ -144,10 +147,9 @@ $ dvc run -n download_file \ ## Example: Using DVC remote aliases You may want to encapsulate external locations as configurable entities that can -be managed independently. This is useful if multiple dependencies (or stages) -reuse the same location, or if its likely to change in the future. And if the -location requires authentication, you need a way to configure it in order to -connect. +be managed independently. This is useful if the connection requires +authentication, if multiple dependencies (or stages) reuse the same location, or +if the URL is likely to change in the future. [DVC remotes](/doc/command-reference/remote) can do just this. You may use `dvc remote add` to define them, and then use a special URL with format @@ -157,12 +159,11 @@ dependency. Let's see an example using SSH. First, register and configure the remote: ```dvc -$ dvc remote add myssh ssh://myserver.com -$ dvc remote modify --local myssh user myuser -$ dvc remote modify --local myssh password mypassword +$ dvc remote add myssh ssh://user@example.com +$ dvc remote modify --local myssh password 'mypassword' ``` -> Please refer to `dvc remote add` for more details like setting up access +> Please refer to `dvc remote modify` for more details like setting up access > credentials for the different remote types. 
Now, use an alias to this remote when defining the stage: From 1e08ff8fe4912dad2eb57342c73c1289d9e2d2be Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Wed, 3 Feb 2021 13:46:32 -0600 Subject: [PATCH 3/5] cmd: note ~/ paths are for Linux in remote modify --- content/docs/command-reference/remote/modify.md | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/content/docs/command-reference/remote/modify.md b/content/docs/command-reference/remote/modify.md index 9be2872327..c844ff66ff 100644 --- a/content/docs/command-reference/remote/modify.md +++ b/content/docs/command-reference/remote/modify.md @@ -539,7 +539,8 @@ more information. 1. `user` parameter set with this command (found in `.dvc/config`); 2. User defined in the URL (e.g. `ssh://user@example.com/path`); - 3. User defined in `~/.ssh/config` for this host (URL); + 3. User defined in the SSH config file (e.g. `~/.ssh/config`) for this host + (URL); 4. Current system user - `port` - port to access the remote. @@ -552,7 +553,8 @@ more information. 1. `port` parameter set with this command (found in `.dvc/config`); 2. Port defined in the URL (e.g. `ssh://example.com:1234/path`); - 3. Port defined in `~/.ssh/config` for this host (URL); + 3. Port defined in the SSH config file (e.g. `~/.ssh/config`) for this host + (URL); 4. Default SSH port 22 - `keyfile` - path to private key to access the remote. @@ -657,8 +659,8 @@ by HDFS. Read more about by expanding the WebHDFS section in - `hdfscli_config` - path to a `HdfsCLI` cfg file. WebHDFS access depends on `HdfsCLI`, which allows the usage of a configuration file by default located - in `~/.hdfscli.cfg`. In the file, multiple aliases can be set with their own - connection parameters, like `url` or `user`. If using a cfg file, + in `~/.hdfscli.cfg` (Linux). In the file, multiple aliases can be set with + their own connection parameters, like `url` or `user`. If using a cfg file, `webhdfs_alias` can be set to specify which alias to use. 
```dvc From c67ec08f9adcb587517efcadfc8a53eb0cb17dfe Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Thu, 11 Feb 2021 08:26:43 -0600 Subject: [PATCH 4/5] typo in start/pipelines --- content/docs/start/data-pipelines.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/docs/start/data-pipelines.md b/content/docs/start/data-pipelines.md index 863808c75f..8abda6710f 100644 --- a/content/docs/start/data-pipelines.md +++ b/content/docs/start/data-pipelines.md @@ -222,7 +222,7 @@ This should be a good point to commit the changes with Git. These include ## Reproduce -The whole point of creating this `dvc.yaml` pipeline file is an ability to +The whole point of creating this `dvc.yaml` pipelines file is an ability to reproduce the pipeline: ```dvc From eac96f86799f7fd7920bccb36f77d027297379d1 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Mon, 1 Mar 2021 06:02:01 -0600 Subject: [PATCH 5/5] typos --- content/docs/command-reference/remote/modify.md | 3 ++- content/docs/user-guide/external-dependencies.md | 2 +- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/content/docs/command-reference/remote/modify.md b/content/docs/command-reference/remote/modify.md index 2dd03de0fa..cb32f4a7a5 100644 --- a/content/docs/command-reference/remote/modify.md +++ b/content/docs/command-reference/remote/modify.md @@ -506,7 +506,8 @@ more information. - `url` - remote location, in a regular [SSH format](https://tools.ietf.org/id/draft-salowey-secsh-uri-00.html#sshsyntax). 
- Note that this can already the `user` parameter, embedded into the URL: + Note that this can already include the `user` parameter, embedded into the + URL: ```dvc $ dvc remote modify myremote url \ diff --git a/content/docs/user-guide/external-dependencies.md b/content/docs/user-guide/external-dependencies.md index 91566f75cd..1d471955df 100644 --- a/content/docs/user-guide/external-dependencies.md +++ b/content/docs/user-guide/external-dependencies.md @@ -91,7 +91,7 @@ $ dvc run -n download_file \ $ dvc run -n download_file \ -d ssh://user@example.com/path/to/data.txt \ -o data.txt \ - scp ssh://user@example.com:/path/to/data.txt data.txt + scp user@example.com:/path/to/data.txt data.txt ``` ⚠️ DVC requires both SSH and SFTP access to work with remote SSH locations.