DVC runs faster on system with more cores, how do I replicate this on a system with few cores? #10481
Replies: 3 comments 14 replies
-
@alita-moore are you running it on different cloud instances? can it be IO performance difference? can you check the number of threads running btw? I think |
-
Hi. We only hash one single file at a time, so |
-
@alita-moore, we have done some optimizations on dvc==3.54.0. It would be great if you could try. Also, we have re-added support for |
-
I have a lot of files to save in DVC, and it seems that when I run commands such as `dvc add` on a system with 200 cores / threads, it adds data at about 500-1000 files/second, but if I run the same command on a computer with 16 cores / threads, it adds at about 50 files/second. I tried changing the `core.checksum_jobs` config, but it didn't change anything.
Basically, how can I get the speed to be equivalent or better without having to use a 200-core machine (which is expensive) to run basic DVC commands?
NOTE: the CPU usage is low in both cases, so I'm guessing this has something to do with multi-threading?
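One way to check whether thread count or disk I/O is the limiting factor, independently of DVC, is a small standalone benchmark. The sketch below is not DVC's code and makes no claim about how DVC hashes files internally; `md5_of_file` and `hash_rate` are hypothetical helper names, and the file count, file size, and thread counts are arbitrary assumptions. It hashes a set of small files with MD5 using a configurable number of threads and reports files per second.

```python
import hashlib
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of one file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_rate(paths, jobs):
    """Hash all paths using `jobs` threads; return (digests, files per second)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        digests = list(pool.map(md5_of_file, paths))
    elapsed = time.perf_counter() - start
    rate = len(paths) / elapsed if elapsed > 0 else float("inf")
    return digests, rate


if __name__ == "__main__":
    # Assumption: 2000 files of 4 KiB each stand in for "a lot of small files".
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for i in range(2000):
            path = os.path.join(tmp, f"file_{i}.bin")
            with open(path, "wb") as f:
                f.write(os.urandom(4096))
            paths.append(path)

        # Compare throughput at a few thread counts on the same machine.
        for jobs in (1, 4, 16):
            _, rate = hash_rate(paths, jobs)
            print(f"{jobs:>3} threads: {rate:,.0f} files/s")
```

If the rate barely changes with the thread count on both machines, the difference is more likely to come from storage or filesystem performance than from the number of cores. Note that freshly written temporary files are usually served from the page cache, so pointing `paths` at the real dataset gives a more representative number.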