Performance of Jailed space #111

Open
thien-lm opened this issue Oct 13, 2024 · 1 comment

@thien-lm

In theory, it seems that the jailed space would have poor performance.
Did anyone face this issue when the number of workers in the Slurm cluster increases?

@rdjjke (Collaborator) commented Oct 16, 2024

Hello thien-lm,

Thank you for the question!

The impact on performance largely depends on the storage solution you're using.

In general, shared storage tends to have higher latencies for I/O operations compared to non-shared options, though it can offer much higher overall throughput.

We tested three shared storage solutions in practice:

  1. NFS share
  2. Nebius shared filesystem
  3. GlusterFS

Here’s a breakdown based on two common usage scenarios:

  1. Single-threaded access to many small files (< several MiB) with frequent metadata operations
    Examples: Installing apt packages, running executables, cloning small git repositories, editing files, etc.
    While there’s no severe lag with any of these solutions, these operations tend to be slower than on non-shared storage.
    • NFS share: Provides the lowest latency; in small clusters (up to ~20 nodes), the experience is almost identical to local disk usage. However, latency increases linearly with larger clusters.
    • Nebius shared filesystem: Latency is noticeably higher, especially when installing large packages or cloning sizable repositories, but it remains usable. Importantly, performance stays consistent regardless of cluster size. Basic tasks like editing files or executing Slurm commands are handled smoothly.
    • GlusterFS: Similar to the Nebius shared filesystem, though it generally offers slightly lower latency and faster metadata operations.
  2. Multi-threaded access to large files (at least several MiB in size)
    Examples: Downloading and reading datasets, checkpointing processes, and loading from checkpoints.
    • NFS share: Performance is poor when handling concurrent reads/writes on the same files from multiple processes or hosts. Clusters with over 50-100 nodes may struggle to maintain stability.
    • Nebius shared filesystem: Excels in handling parallel access, providing strong throughput (up to ~24Gb/s per host and up to 800Gb/s overall).
    • GlusterFS: Similar to Nebius shared filesystem, though with somewhat lower throughput (sorry, I don't know exact numbers).

Since Soperator is primarily designed for ML model training, distributed filesystems like GlusterFS or the Nebius shared filesystem are well-suited for the most demanding tasks such as checkpointing and dataset loading. These operations benefit the most from high throughput, while tasks like installing software are typically infrequent, so it's not a big deal if they take 2-3 times longer. If you use PyTorch, you can also set higher num_workers and prefetch_factor for its DataLoader to speed up dataset loading considerably.
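
For illustration, here is a minimal sketch of such a DataLoader configuration (the dataset, batch size, and worker counts are illustrative assumptions, not measured recommendations):

```python
import torch
from torch.utils.data import DataLoader, Dataset


class SharedStorageDataset(Dataset):
    """Toy dataset standing in for samples read from the shared jail filesystem."""

    def __init__(self, num_samples: int = 10_000):
        self.num_samples = num_samples

    def __len__(self) -> int:
        return self.num_samples

    def __getitem__(self, idx: int) -> torch.Tensor:
        # In a real job this would read one sample from shared storage.
        return torch.randn(3, 224, 224)


loader = DataLoader(
    SharedStorageDataset(),
    batch_size=64,
    num_workers=8,            # several workers hide the higher per-request latency of shared storage
    prefetch_factor=4,        # each worker keeps 4 batches in flight
    persistent_workers=True,  # avoid re-spawning workers every epoch
    pin_memory=True,          # faster host-to-GPU copies
)

for batch in loader:
    ...  # training step goes here
```

The idea is that many concurrent, prefetching workers turn a latency-sensitive access pattern into a throughput-bound one, which is exactly what the distributed filesystems above handle well.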

Additionally, Soperator allows for flexible storage customization. For example, the "Jail" storage can be an NFS share, while "Jail submounts" can be backed by distributed filesystems. These submounts can leverage any storage type supported by your Kubernetes cluster (e.g., ephemeral or persistent, local or shared, disk-based or in-memory, S3 or OCI) to meet specific use cases.
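
As a hedged illustration of that flexibility (not Soperator's actual Helm interface), the snippet below uses the official kubernetes Python client to create a ReadWriteMany PersistentVolumeClaim that a jail submount could be backed by; the claim name, namespace, storage class, and size are hypothetical placeholders:

```python
from kubernetes import client, config

# Assumes a working kubeconfig for the cluster running Soperator.
config.load_kube_config()
core_v1 = client.CoreV1Api()

# Hypothetical PVC that a "Jail submount" (e.g. for datasets or checkpoints) could use.
pvc_manifest = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "jail-submount-datasets"},  # illustrative name
    "spec": {
        "accessModes": ["ReadWriteMany"],            # shared across all Slurm workers
        "storageClassName": "shared-filesystem",     # hypothetical storage class
        "resources": {"requests": {"storage": "1Ti"}},
    },
}

core_v1.create_namespaced_persistent_volume_claim(namespace="soperator", body=pvc_manifest)
```

How such a claim is then wired up as a submount is configured on the Soperator side (e.g. via its Helm values), which is out of scope for this sketch.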

Some files and directories are non-shared by default: all virtual filesystems (including /tmp and /var/run), some low-level GPU libraries, and some config files (though these are identical on every node).


@asteny added the docker label on Nov 12, 2024
@asteny added the documentation label and removed the docker label on Nov 12, 2024