Request to Add Option to Disable mmap in transformers | Loading models takes too long due to mmap on network-mounted storage
#33366
Mainly @ArthurZucker, I would say, but it's a more general issue since it involves the base class of the transformers pretrained model.
Here is an explanation of the issue:
I am currently using the transformers library to load CLIPTextModel in a Kubernetes environment where I mount an S3 bucket via the S3 CSI driver as a persistent volume to access models. While accessing large files (around 30 GB), I am experiencing severe performance issues, and after investigating, I believe the root cause is related to the forced usage of mmap when loading model weights.
It seems that the current implementation in this section of the code forces the use of mmap without providing an option to disable it. This behavior is highly problematic in storage-over-network use cases, as each mmap call introduces significant latency and performance bottlenecks due to the overhead of network access.
It would be extremely useful if there were a flag or option to disable mmap usage when loading models, allowing users to load the files directly into memory instead. This would enable users like me to avoid the network-bound performance issues.
I have already tried a workaround that uses an environment variable to disable mmap, but with it I lose a lot of performance.
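As a rough illustration of what "load the files directly into memory" could look like on the user side, here is a minimal sketch that reads a checkpoint sequentially in large chunks instead of relying on mmap. The helper name `read_into_memory` and the 64 MiB chunk size are my own assumptions, not anything provided by transformers.

```python
import io


def read_into_memory(path, chunk_size=64 * 1024 * 1024):
    """Read a file sequentially into an in-memory buffer (no mmap involved).

    Hypothetical helper: the name and the default chunk size are illustrative.
    """
    buffer = io.BytesIO()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)  # one large sequential read per iteration
            if not chunk:
                break
            buffer.write(chunk)
    buffer.seek(0)  # rewind so a consumer can read from the start
    return buffer
```

The returned buffer could then be passed to a loader that accepts file-like objects, for example `torch.load`. Note that this keeps the whole file in RAM, so a 30 GB checkpoint needs at least that much free memory.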
Information
The official example scripts
My own modified scripts
Tasks
An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
My own task or dataset (give details below)
Reproduction
It's quite hard to reproduce, as you need an AWS account and the CSI driver. But I believe this issue can be reproduced with any network-mounted storage.
Hey @mrrfr, this is the case only for files that are saved in the .bin format, which are unsafe. Would it be possible for you to use .safetensors files, which are safer and don't use mmap to load?
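To check which weight format a given model directory actually contains, a small sketch like the following could help. The function name `weight_formats` is hypothetical, and it only inspects file suffixes in the top level of the directory.

```python
from pathlib import Path


def weight_formats(model_dir):
    """Return the set of weight-file suffixes found directly in model_dir.

    Hypothetical helper: it only looks at file suffixes, not file contents.
    """
    known = {".bin", ".safetensors"}
    return {
        p.suffix
        for p in Path(model_dir).iterdir()
        if p.is_file() and p.suffix in known
    }
```

If the result contains only `.bin`, the directory has no safetensors weights to fall back on.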
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
I think the forced mmap behavior was introduced in #28331.
Here is the documentation for the S3 CSI driver I used; if needed, it can be deployed quickly on a k8s cluster by following it: https://github.com/awslabs/mountpoint-s3-csi-driver?tab=readme-ov-file
A deployment manifest is available here: https://github.com/awslabs/mountpoint-s3-csi-driver/blob/main/examples/kubernetes/static_provisioning/static_provisioning.yaml
To reproduce, you just have to put the models on the S3 bucket and try to load them through CLIPTextModel.from_pretrained.
Expected behavior
Loading should be fast.
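To put numbers on "fast", a small timing wrapper could be used to compare loading from the network mount against loading from a local copy. This is only a sketch: `time_load` is a hypothetical helper, and the mount path in the commented usage is a placeholder.

```python
import time


def time_load(loader, *args, **kwargs):
    """Call loader(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = loader(*args, **kwargs)
    return result, time.perf_counter() - start


# Hypothetical usage; the path below is a placeholder, not a real mount:
# from transformers import CLIPTextModel
# model, seconds = time_load(CLIPTextModel.from_pretrained, "/mnt/s3/models/clip")
# print(f"load took {seconds:.1f}s")
```

Running the same call against a local directory holding the same files would show how much of the time is attributable to the network mount.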