Add a new data remote.
Depending on your storage type, you may also need dvc remote modify to provide credentials and/or configure other remote parameters.
See also the default, list, modify, and remove commands to manage data remotes.
usage: dvc remote add [-h] [--global] [--system] [--local] [-q | -v]
                      [-d] [-f] name url

positional arguments:
  name    Name of the remote.
  url     URL. (See supported URLs in the examples below.)
name and url are required. url specifies a location to store your data. It can point to a cloud storage service, an SSH server, network-attached storage, or even a directory in the local file system. (See all the supported remote storage types in the examples below.) If url is a relative path, it will be resolved against the current working directory, but saved relative to the config file location (see the LOCAL example below). Whenever possible, DVC will create the remote directory if it doesn't exist yet. (It won't create an S3 bucket though, and will rely on default access settings.)
If you installed DVC via pip and plan to use cloud services as remote storage, you might need to install these optional dependencies: [s3], [azure], [gdrive], [gs], [oss], [ssh]. Alternatively, use [all] to include them all. The command should look like this: pip install "dvc[s3]". (This example installs the boto3 library along with DVC to support S3 storage.)
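If you plan to use more than one type of remote storage, pip extras can be combined in a single command (standard pip syntax; choose the extras matching your storage types):

$ pip install "dvc[s3,ssh]"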
This command creates a section in the DVC project's config file and optionally assigns a default remote in the core section if the --default option is used:
['remote "myremote"']
url = /tmp/dvc-storage
[core]
remote = myremote
DVC supports the concept of a default remote. For the commands that accept a --remote option (dvc pull, dvc push, dvc status, dvc gc, dvc fetch), the default remote is used if that option is not given. Use dvc config to unset/change the default remote like so: dvc config -u core.remote.
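For example, the following sketch shows how the default remote can be set, printed, and unset with dvc config (remote name taken from the example above; dvc config prints the current value when no new value is given):

$ dvc config core.remote myremote
$ dvc config core.remote
myremote
$ dvc config -u core.remote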
- --global - save remote configuration to the global config (e.g. ~/.config/dvc/config) instead of .dvc/config (see the sketch after this list).
- --system - save remote configuration to the system config (e.g. /etc/dvc.config) instead of .dvc/config.
- --local - modify a local config file instead of .dvc/config. It is located in .dvc/config.local and is Git-ignored. This is useful when you need to specify private config options in your config that you don't want to track and share through Git (credentials, private locations, etc.).
- -d, --default - commands that require a remote (such as dvc pull, dvc push, dvc fetch) will use this remote by default to upload or download data (unless their -r option is used).
- -f, --force - overwrite existing remote with new url value.
- -h, --help - prints the usage/help message, and exits.
- -q, --quiet - do not write anything to standard output. Exit with 0 if no problems arise, otherwise 1.
- -v, --verbose - displays detailed tracing information.
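As the sketch referenced in the --global item above, the following stores the remote in the user-wide config so that every DVC project on the machine can use it (name and URL are placeholders):

$ dvc remote add --global myremote ssh://user@example.com/path/to/dir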
The following are the types of remote storage (protocols) supported:
💡 Before adding an S3 remote, be sure to Create a Bucket.
$ dvc remote add myremote s3://bucket/path
By default, DVC expects that your AWS CLI is already configured. DVC will use the default AWS credentials file to access S3. To override some of these settings, use the parameters described in dvc remote modify.
We use the boto3 library to communicate with AWS. The following API methods are performed:
- list_objects_v2, list_objects
- head_object
- download_file
- upload_file
- delete_object
- copy
So, make sure you have the following permissions enabled:
- s3:ListBucket
- s3:GetObject
- s3:PutObject
- s3:DeleteObject
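For reference, a minimal IAM policy granting these permissions could look like the sketch below (bucket name and path are placeholders; adjust the resources to your setup):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": "arn:aws:s3:::bucket"
        },
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
            "Resource": "arn:aws:s3:::bucket/path/*"
        }
    ]
}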
To communicate with a remote object storage that supports an S3-compatible API (e.g. MinIO, DigitalOcean Spaces, IBM Cloud Object Storage, etc.) you must explicitly set the endpointurl in the configuration.
For example:
$ dvc remote add myremote s3://mybucket/path/to/dir
$ dvc remote modify myremote endpointurl https://object-storage.example.com
See dvc remote modify for a full list of S3 API parameters.
S3 remotes can also be configured entirely via environment variables:
$ export AWS_ACCESS_KEY_ID="<my-access-key>"
$ export AWS_SECRET_ACCESS_KEY="<my-secret-key>"
$ dvc remote add myremote "s3://bucket/myremote"
For more information about the variables DVC supports, please visit the boto3 documentation.
$ dvc remote add myremote azure://my-container-name/path
$ dvc remote modify --local myremote connection_string "my-connection-string"
The connection string grants access to the data and is inserted into the .dvc/config file. Therefore, it is safer to add the connection string with the --local option, forcing it to be written to a Git-ignored config file. See dvc remote modify for a full list of Azure storage parameters.
The Azure Blob Storage remote can also be configured entirely via environment variables:
$ export AZURE_STORAGE_CONNECTION_STRING="<my-connection-string>"
$ export AZURE_STORAGE_CONTAINER_NAME="my-container-name"
$ dvc remote add myremote "azure://"
For more information on configuring Azure Storage connection strings, see the Azure Storage documentation.
- connection string - this is the connection string to access your Azure Storage Account. If you don't already have a storage account, you can create one following these instructions. The connection string can be found in the "Access Keys" pane of your Storage Account resource in the Azure portal. 💡 Make sure the value is quoted to prevent the shell from misprocessing the command.
- container name - this is the top-level container in your Azure Storage Account under which all the files for this remote will be uploaded. If the container doesn't already exist, it will be created automatically.
Please check out Setup a Google Drive DVC Remote for a full guide on configuring Google Drives for use as DVC remote storage, including obtaining the necessary credentials, and how to form gdrive:// URLs.
$ dvc remote add -d myremote gdrive://root/path/to/folder
$ dvc remote modify myremote gdrive_client_id <client ID>
$ dvc remote modify myremote gdrive_client_secret <client secret>
Note that GDrive remotes are not "trusted" by default. This means that the verify option is enabled on this type of storage, so DVC recalculates the checksums of files upon download (e.g. dvc pull), to make sure that these haven't been modified.
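If you trust such a remote and want to skip the recalculation, the verify parameter can in principle be disabled per remote, as sketched below; check dvc remote modify to confirm that your DVC version honors this for Google Drive:

$ dvc remote modify myremote verify false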
$ dvc remote add myremote gs://bucket/path
See also dvc remote modify for a full list of Google Cloud Storage parameters.
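For example, instead of relying on default application credentials, a service account key file can be supplied via the credentialpath parameter described under dvc remote modify (the file path below is a placeholder):

$ dvc remote modify myremote credentialpath /path/to/service-account-key.json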
First, set up OSS storage on Aliyun Cloud. Then use an S3-style URL for the OSS storage and configure the endpoint value. An example is shown below:
$ dvc remote add myremote oss://my-bucket/path
To set the key ID, key secret and endpoint (or any other OSS parameter), use dvc remote modify. Example usage is shown below. Make sure to use the --local option to avoid committing your secrets to Git:
$ dvc remote modify myremote --local oss_key_id my-key-id
$ dvc remote modify myremote --local oss_key_secret my-key-secret
$ dvc remote modify myremote oss_endpoint endpoint
You can also configure the same settings via the following environment variables:
$ export OSS_ACCESS_KEY_ID="my-key-id"
$ export OSS_ACCESS_KEY_SECRET="my-key-secret"
$ export OSS_ENDPOINT="endpoint"
Testing your OSS storage using Docker
Start a container running an OSS emulator, and set up the environment variables, for example:
$ git clone https://github.com/nanaya-tachibana/oss-emulator.git
$ docker image build -t oss:1.0 oss-emulator
$ docker run --detach -p 8880:8880 --name oss-emulator oss:1.0
$ export OSS_BUCKET='my-bucket'
$ export OSS_ENDPOINT='localhost:8880'
$ export OSS_ACCESS_KEY_ID='AccessKeyID'
$ export OSS_ACCESS_KEY_SECRET='AccessKeySecret'
The emulator uses a default key ID and key secret when they are not given, which grants read access to public-read buckets and public buckets.
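With the emulator running and the variables above exported, you could point a remote at it to verify the setup end to end (a sketch; the bucket path is a placeholder):

$ dvc remote add myremote oss://my-bucket/path
$ dvc remote modify myremote oss_endpoint localhost:8880
$ dvc push -r myremote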
$ dvc remote add myremote ssh://user@example.com/path/to/dir
See also dvc remote modify for a full list of SSH parameters.
⚠️ DVC requires both SSH and SFTP access to work with remote SSH locations. Please check that you are able to connect both ways with tools like ssh and sftp (GNU/Linux).
Note that your server's SFTP root might differ from its physical root (/). (On Linux, see the ChrootDirectory config option in /etc/ssh/sshd_config.) In these cases, the path component in the SSH URL (e.g. /path/to/dir above) should be specified relative to the SFTP root instead. For example, on some Synology NAS drives, the SFTP root might be in directory /volume1, in which case you should use path /path/to/dir instead of /volume1/path/to/dir.
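As a sketch, connection details such as the user name, port, or a private key file can be set per remote with parameters documented under dvc remote modify (the values below are placeholders):

$ dvc remote modify myremote user myuser
$ dvc remote modify myremote port 2222
$ dvc remote modify --local myremote keyfile /home/myuser/.ssh/id_rsa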
$ dvc remote add myremote hdfs://user@example.com/path/to/dir
See also dvc remote modify for a full list of HDFS parameters.
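For instance, the Hadoop user to connect as can be overridden via the user parameter documented under dvc remote modify (the value is a placeholder):

$ dvc remote modify myremote user myuser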
$ dvc remote add myremote https://example.com/path/to/dir
Functionality of HTTP(S) remotes is limited to downloading data, through the following commands:
- dvc pull and dvc fetch - as a DVC remote
- dvc import-url and dvc get-url - as an external dependency
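If the server requires authentication, HTTP remotes document auth, user, and password parameters under dvc remote modify. A sketch for basic auth, with --local keeping the secret out of Git:

$ dvc remote modify myremote auth basic
$ dvc remote modify --local myremote user myuser
$ dvc remote modify --local myremote password mypassword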
A "local remote" is a directory in the machine's file system.
While the term may seem contradictory, it doesn't have to be. The "local" part refers to the machine where the project is stored, so it can be any directory accessible to the same system. The "remote" part refers specifically to the project/repository itself: the storage is external to it. See "local, but external" storage.
Using an absolute path (recommended):
$ dvc remote add myremote /tmp/my-dvc-storage
$ cat .dvc/config
...
['remote "myremote"']
url = /tmp/my-dvc-storage
...
Note that the absolute path /tmp/my-dvc-storage is saved as is.
Using a relative path:
$ dvc remote add myremote ../my-dvc-storage
$ cat .dvc/config
...
['remote "myremote"']
url = ../../my-dvc-storage
...
Note that ../my-dvc-storage has been resolved relative to the .dvc/ dir, resulting in ../../my-dvc-storage.
Add an Amazon S3 remote as the default (via the -d option), and modify its region.
💡 Before adding an S3 remote, be sure to Create a Bucket.
$ dvc remote add -d myremote s3://mybucket/myproject
Setting 'myremote' as a default remote.
$ dvc remote modify myremote region us-east-2
The project's config file (.dvc/config) now looks like this:
['remote "myremote"']
url = s3://mybucket/myproject
region = us-east-2
[core]
remote = myremote
The list of remotes should now be:
$ dvc remote list
myremote s3://mybucket/myproject
You can overwrite existing remotes using -f with dvc remote add:
$ dvc remote add -f myremote s3://mybucket/mynewproject
List remotes again to view the updated remote:
$ dvc remote list
myremote s3://mybucket/mynewproject