Performance of Jailed space #111

thien-lm · 2024-10-13T12:37:59Z

In theory, seems that the jailed space will have poor performance.
Did anyone face that issue when the number of workers in Slurm cluster inrease ?

rdjjke · 2024-10-16T10:46:20Z

Hello thien-lm,

Thank you for the question!

The impact on performance largely depends on the storage solution you're using.

In general, shared storage tends to have higher latencies for I/O operations compared to non-shared options, though it can offer much higher overall throughput.

We tested three shared storage solutions in practice:

NFS share
Nebius shared filesystem
GlusterFS

Here’s a breakdown based on two common usage scenarios:

Single-threaded access to many small files (< several MiB) with frequent metadata operations
Examples: Installing apt packages, running executables, cloning small git repositories, editing files, etc.
While there’s no severe lag with any of these solutions, these operations tend to be slower than on non-shared storage.
- NFS share: Provides the lowest latency; in small clusters (up to ~20 nodes), the experience is almost identical to local disk usage. However, latency increases linearly with larger clusters.
- Nebius shared filesystem: Latency is noticeably higher, especially when installing large packages or cloning sizable repositories, but it remains usable. Importantly, performance stays consistent regardless of cluster size. Basic tasks like editing files or executing Slurm commands are handled smoothly.
- GlusterFS: Similar to the Nebius shared filesystem, though it generally offers slightly lower latency and faster metadata operations.
Multi-threaded access to large files (at least several MiB in size)
Examples: Downloading and reading datasets, checkpointing processes, and loading from checkpoints.
- NFS share: Performance is poor when handling concurrent reads/writes on the same files from multiple processes or hosts. Clusters with over 50-100 nodes may struggle to maintain stability.
- Nebius shared filesystem: Excels in handling parallel access, providing strong throughput (up to ~24Gb/s per host and up to 800Gb/s overall).
- GlusterFS: Similar to Nebius shared filesystem, though with somewhat lower throughput (sorry, I don't know exact numbers).

Since Soperator is primarily designed for ML model training, distributed filesystems like GlusterFS or the Nebius shared filesystem are well-suited for the most demanding tasks such as checkpointing and dataset loading. These operations benefit the most from high throughput, while tasks like installing software are typically less frequent and it's not a big deal if they take 2-3 times longer. If you use PyTorch, you can also set higher num_workers and prefetch_factor for its DataLoader to make it work really fast.

Additionally, Soperator allows for flexible storage customization. For example, the "Jail" storage can be an NFS share, while "Jail submounts" can be backed by distributed filesystems. These submounts can leverage any storage type supported by your Kubernetes cluster (e.g., ephemeral or persistent, local or shared, disk-based or in-memory, S3 or OCI) to meet specific use cases.

Some files and directories are non-shared by default: all virtual filesystems including /tmp, /var/run, some low-level GPU libraries, and some config files (though these ones are identical).

Some links:

Comparison of import pytorch performance on NFS vs. distributed filesystems as the cluster size scales: https://www.semianalysis.com/p/ai-neocloud-playbook-and-anatomy?open=false#§drivers-user-experience-and-software
Nebius shared filesystem specifications: https://docs.nebius.com/compute/storage/types/#filesystems

asteny added the docker Pull requests that update Docker code label Nov 12, 2024

asteny assigned rdjjke Nov 12, 2024

asteny added documentation Improvements or additions to documentation and removed docker Pull requests that update Docker code labels Nov 12, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Performance of Jailed space #111

Performance of Jailed space #111

thien-lm commented Oct 13, 2024

rdjjke commented Oct 16, 2024

Performance of Jailed space #111

Performance of Jailed space #111

Comments

thien-lm commented Oct 13, 2024

rdjjke commented Oct 16, 2024