Description
Currently, dstack does not let users create volumes that are guaranteed to attach to multiple instances. Multi-attach only works implicitly on backends where it is enabled by default, such as RunPod. The proposal is to let users specify multi-attach as a requirement in the volume configuration:
type: volume
name: my-multi-attach-volume
backend: aws
region: eu-central-1
size: 500GB
multiattach: true
dstack would then create a volume of an appropriate type and set it up so that attaching the volume to multiple instances in read-write mode is guaranteed to work.
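To make the proposed field concrete, here is a minimal Python sketch of how such a configuration could be parsed and validated. `VolumeConfig` and `parse_volume_config` are hypothetical names used for illustration only; they are not dstack's actual models, and the default of `False` is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VolumeConfig:
    """Hypothetical stand-in for a volume configuration (not dstack's real model)."""

    name: str
    backend: str
    region: str
    size: str
    multiattach: bool = False  # proposed field; assumed to be off by default


def parse_volume_config(raw: dict) -> VolumeConfig:
    """Build a VolumeConfig from an already-parsed YAML mapping."""
    if raw.get("type") != "volume":
        raise ValueError("expected a configuration with type: volume")
    multiattach = raw.get("multiattach", False)
    if not isinstance(multiattach, bool):
        raise ValueError("multiattach must be true or false")
    return VolumeConfig(
        name=raw["name"],
        backend=raw["backend"],
        region=raw["region"],
        size=str(raw["size"]),
        multiattach=multiattach,
    )


# The example configuration from the proposal, as a parsed mapping.
config = parse_volume_config(
    {
        "type": "volume",
        "name": "my-multi-attach-volume",
        "backend": "aws",
        "region": "eu-central-1",
        "size": "500GB",
        "multiattach": True,
    }
)
```

Keeping `multiattach` optional means existing volume configurations would stay valid without changes.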
Backend support for multi-attach
Multi-attach is supported in most clouds that provide network storage. Block storage services (EBS, GCP Disks, etc.) have limited multi-attach capabilities and require cluster management software such as Pacemaker and a cluster file system such as GFS2. Regular file systems (XFS, ext4) on a shared disk may lead to data or file system corruption, even for read-only access. Major clouds offer managed network file systems as a more general-purpose storage for multi-attach (AWS EFS, AWS FSx for Lustre, GCP Filestore, Azure NetApp Files). EFS is slower than EBS, but others, such as GCP Filestore, are comparable to block storage. dstack can eventually support both multi-attach block storage and network file systems, but network file systems seem more suitable as the default implementation for multi-attach volumes.
- AWS. Supports EBS Multi-Attach for io1 (only in three regions) and io2 (in all regions). Offers EFS (general-purpose NFS) and Amazon FSx for Lustre (high-performance).
- GCP. Persistent Disk can be attached to multiple VMs in read-only mode. Read-write multi-attach is supported, but for two VMs at most. Filestore (NFS) offers a versatile alternative to Persistent Disk with comparable performance. It's the only viable read-write storage for multi-device TPU Pods.
- Azure. Multi-attach is supported via Shared Disks. Azure NetApp Files is an NFS service comparable to Filestore.
- OCI. Block Volumes can be attached to multiple instances. OCI File Storage is an NFS service comparable to Filestore.
- RunPod. Supports multi-attach by default; this is already available in dstack.
- Lambda. Supports multi-attach by default.
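
The survey above could be encoded as a small lookup that decides how a `multiattach: true` volume might be provisioned on each backend. This is only a sketch: the function name, the strategy labels, and the lowercase backend identifiers are assumptions for illustration, not existing dstack code, and it covers only the network file system default, not multi-attach block storage.

```python
# Managed network file systems listed in the survey, per backend.
NETWORK_FS_OPTIONS = {
    "aws": ["EFS", "FSx for Lustre"],
    "gcp": ["Filestore"],
    "azure": ["Azure NetApp Files"],
    "oci": ["OCI File Storage"],
}

# Backends whose volumes support multi-attach by default.
IMPLICIT_MULTI_ATTACH = {"runpod", "lambda"}


def plan_multi_attach(backend: str) -> tuple[str, list[str]]:
    """Return a (strategy, candidate services) pair for a multi-attach volume.

    Hypothetical helper: "implicit" means regular volumes already work,
    "network_fs" means a managed network file system would be created.
    """
    if backend in IMPLICIT_MULTI_ATTACH:
        return ("implicit", [])
    if backend in NETWORK_FS_OPTIONS:
        return ("network_fs", NETWORK_FS_OPTIONS[backend])
    raise ValueError(f"multi-attach is not supported for backend: {backend}")
```

For example, `plan_multi_attach("gcp")` returns `("network_fs", ["Filestore"])`, while a backend missing from both tables is rejected up front, so the guarantee fails at configuration time rather than at attach time.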