Request to Add Option to Disable mmap in transformers | Loading models takes too much time due to mmap in a storage-over-network case #33366

Closed
2 of 4 tasks
mrrfr opened this issue Sep 7, 2024 · 2 comments
Labels
bug · Core: Modeling (Internals of the library; Models.)

Comments


mrrfr commented Sep 7, 2024

System Info

  • Ubuntu 22.04
  • Torch 2.4.0
  • CUDA 12.4
  • Transformers 4.44.2
  • Python 3.11
  • Diffusers 0.30.2

Who can help?

Mainly @ArthurZucker, but it's a more general issue since it involves the base class of transformers pretrained models.

Here's an explanation of the issue:

I am currently using the transformers library to load CLIPTextModel in a Kubernetes environment where I mount an S3 bucket via the S3 CSI driver as a persistent volume to access models. While accessing large files (around 30 GB), I am experiencing severe performance issues, and after investigating, I believe the root cause is related to the forced usage of mmap when loading model weights.

It seems that the current implementation in this section of the code forces the use of mmap without providing an option to disable it. This behavior is highly problematic in storage-over-network use cases, as each mmap call introduces significant latency and performance bottlenecks due to the overhead of network access.

I think the feature was introduced here => #28331

It would be extremely useful if there were a flag or option to disable mmap usage when loading models, allowing users to load the files directly into memory instead. This would enable users like me to avoid the network-bound performance issues.

I've already tried to work around this by playing with environment variables to disable mmap, but I still lose a lot of performance.
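
As a rough sketch of the kind of workaround I mean (the mount path is a placeholder from my setup, and I haven't verified that this fully bypasses the library's own loading path), one could materialize the checkpoint in RAM and hand it to from_pretrained:

```python
import torch
from transformers import CLIPTextModel

# Hypothetical mount path exposed by the S3 CSI PersistentVolume.
model_dir = "/mnt/s3/clip-text-model"

# Read the .bin checkpoint fully into RAM instead of memory-mapping it over
# the network mount (mmap is a keyword argument of torch.load on PyTorch >= 2.1).
state_dict = torch.load(
    f"{model_dir}/pytorch_model.bin",
    map_location="cpu",
    weights_only=True,
    mmap=False,
)

# Hand the already-materialized weights to from_pretrained so it does not
# need to torch.load the network-mounted file itself.
model = CLIPTextModel.from_pretrained(model_dir, state_dict=state_dict)
```

A built-in flag on from_pretrained would obviously be nicer than juggling the state dict manually like this.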

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

It's quite hard to reproduce, as you need an AWS account and the CSI driver. But I believe this issue can be reproduced with any storage-over-network setup.

Anyway, here is the documentation for the driver I used; if needed, it can be deployed quite quickly on a k8s cluster by following it: https://github.com/awslabs/mountpoint-s3-csi-driver?tab=readme-ov-file

Here you can find a deployment manifest: https://github.com/awslabs/mountpoint-s3-csi-driver/blob/main/examples/kubernetes/static_provisioning/static_provisioning.yaml

To reproduce, you just have to put the models in the S3 bucket and try to load them through CLIPTextModel.from_pretrained, as in the sketch below.
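
A minimal loading sketch, assuming the bucket is mounted at /mnt/s3 (the path will differ depending on your PersistentVolume):

```python
from transformers import CLIPTextModel

# /mnt/s3 is a placeholder for the mount point of the PersistentVolume
# created by the static_provisioning example; the model directory should
# contain config.json plus the pytorch_model.bin checkpoint.
model = CLIPTextModel.from_pretrained("/mnt/s3/clip-text-model")
```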

Expected behavior

Loading should be fast.

mrrfr added the bug label on Sep 7, 2024
LysandreJik added the Core: Modeling (Internals of the library; Models.) label on Sep 9, 2024
@LysandreJik (Member) commented

Hey @mrrfr, this is the case only for files that are saved in the .bin format, which are unsafe. Would it be possible for you to use .safetensors files, which are safer and don't use mmap to load?
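
For example, a one-off conversion along these lines (paths are placeholders; safetensors serialization is the default in recent transformers releases) would re-save the checkpoint as safetensors before uploading it to the bucket:

```python
from transformers import CLIPTextModel

# One-off conversion sketch: load the existing .bin checkpoint from a local
# copy, then re-save it so the S3 bucket holds model.safetensors instead of
# pytorch_model.bin. Paths below are placeholders.
model = CLIPTextModel.from_pretrained("path/to/local-clip-text-model")
model.save_pretrained("path/to/converted-clip-text-model", safe_serialization=True)
```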


github-actions bot commented Oct 8, 2024

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.
