Slow backup speed (~12MB/s) #102
Encryption is usually not the bottleneck, data compression is. How many CPU cores does your system have? If you have two cores, "50% cpu usage" might mean that Attic uses 100% of one core. Future versions of Attic will hopefully be able to utilize more than one cpu core and also allow disabling data compression. Btw, for comparison, what kind of throughput do you get on your system if you back up your data with zip instead of Attic? |
Just ran a test with zip instead and got about 20-25 MB/s with 100% cpu usage (on one core out of two). It would be nice to have an option to use another compression algorithm (or none), because most of my files are already compressed. Or maybe it would be possible to check whether a file compresses well and, if not, skip compressing it. That way we would have the best of both worlds. |
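A minimal sketch of that "probe first, then decide" idea, assuming a zlib-based compressor; maybe_compress, the sample size, and the cutoff ratio are all hypothetical illustrations, not part of Attic:

import zlib

SAMPLE_SIZE = 64 * 1024   # probe only the first 64 KiB (arbitrary choice)
RATIO_CUTOFF = 0.9        # skip compression if the sample barely shrinks

def maybe_compress(data: bytes):
    """Compress data only if a small sample suggests it is compressible.

    Returns (payload, was_compressed). Hypothetical helper, not Attic API.
    """
    sample = data[:SAMPLE_SIZE]
    if sample and len(zlib.compress(sample)) / len(sample) < RATIO_CUTOFF:
        return zlib.compress(data), True
    return data, False    # already-compressed input: store it as-is

The probe costs one small compression per file, which is cheap next to compressing incompressible data in full.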
Ok, so the cpu is not the bottleneck... Are you able to try the GIT version which includes some performance improvements? |
I'll give it a try but it won't be for a few days. Thanks for the fast replies :) |
I have just tested with the git version as of now (Attic 0.13_8_g21e03af-py3.2-linux-x86_64.egg) |
Weird, I can't really understand what's stopping Attic from using 100% of the cpu when we know that the disk io is fast enough (at least when using zip). Here's a small patch you could try which disables xattr and acl metadata collection, but that's a bit of a long shot. Btw, are you able to re-run your zip benchmark, and this time make sure the page cache is empty by running the following command before you start: sync; echo 3 > /proc/sys/vm/drop_caches |
Okay, just tested with your patch applied and it doesn't look like it has made a difference: write speed is still the same, with about the same cpu usage as in the very first test. I have tried a test on another computer with an SSD; write speed was about 20MB/s and 70% cpu usage on one core. I also ran a test on the same computer backing up to a ramdisk and got 15-25MB/s write and about 90% cpu usage on one core. So even with a ramdisk it doesn't max out the cpu the way zip does. |
I see very similar results. Around 12MB/s through Attic, and zip is around 25-35MB/s. CPU usage on dual core is around 35%. |
Not maxing out 1 cpu core can only mean the source or destination medium makes the cpu wait too much (I/O wait), e.g. due to necessary hdd seeks and rotational latency. HDDs are usually maxed out at ~100 IOPS. Try an SSD? |
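To put rough numbers on that (assuming ~100 IOPS and, hypothetically, one seek per 64 KiB chunk): 100 seeks/s at 64 KiB each is only about 6 MB/s, and even one seek per 128 KiB gets you just ~12 MB/s, suspiciously close to the speeds reported above. Sequential reads avoid most of those seeks, which is part of why dd and zip look so much faster.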
Just as a performance note: My experimental merge-all branch code does 64MB/s with encryption (no compression) and 75MB/s without encryption (and also no compression). On my laptop (i5-4200, SSD). So there's no fundamental performance problem in attic, just parameters need to get set right for the target environment. I could imagine it being faster, though - esp. if the CPU is not maxed out and there is not much latency (due to SSD usage). |
So, I have been playing around with attic to see how it deals with large folders.

$ git rev-parse @
64e8ea72ac26f1c0fdbae8cf652b78e23564fbbc
$ attic init /mnt/attic/jborg
$ /usr/bin/time -v attic create --stats /mnt/attic/jborg::first $BIGFOLDERS

This system has a Core i7-2600K, 16 GB of RAM, and fast disks on both the source and destination side. Watching the system while the backup runs, neither the CPU nor the disks look busy. Running simple performance benchmarks I get:

$ dd if=/dev/zero of=testfile bs=1G count=30 oflag=direct
32212254720 bytes (32 GB) copied, 149.231 s, 216 MB/s
$ dd if=/a/bigfile of=/dev/null
150000000 bytes (150 MB) copied, 0.749974 s, 200 MB/s

Thus, neither my CPU nor my disks are fully utilized, yet the performance is very slow. Doing the same benchmarks on attic/merge-all yields 20 MB/s both read and write. I'm more than happy to run more benchmarks if required. Hope it helps. |
I guess with a single-threaded / single-process attic, there is some I/O wait time when it really just waits for I/O to complete and does nothing else (this happens for read latency as well as for writes when doing an fsync). Also, there is no dispatching of compute intensive stuff to different cores. And it also won't start some I/O while computing some hash, compressing something or encrypting it. I am currently trying to make it multi-threaded to see how fast it gets. Of course Python's GIL limits this a bit, but maybe not as much as one would think: I/O operations and computations in C extensions that release the GIL are not a problem, so we might be lucky... |
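A minimal sketch of why threads can still help despite the GIL, using a thread pool and zlib, whose C code releases the GIL while it runs; this illustrates the approach only and is not Attic's actual code:

import concurrent.futures
import os
import zlib

def compress_file(path):
    # The blocking read and zlib's C compression loop both release the
    # GIL, so several of these calls can genuinely overlap.
    with open(path, "rb") as f:
        return zlib.compress(f.read())

def compress_many(paths):
    # One worker per core is a reasonable starting point here.
    with concurrent.futures.ThreadPoolExecutor(max_workers=os.cpu_count()) as ex:
        return list(ex.map(compress_file, paths))

While one thread waits on a read, another can be compressing, so I/O wait no longer stalls the whole process.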
Using attic 0.16 I can achieve 16 MB/s, using 75% CPU. Copying between the same source and destination with rsync, I can achieve 100 MB/s. I would like to back up 50-100 TB of data, so speed is of course very important. |
@MartinAyla you should probably check out https://github.com/attic/merge/issues/4 (the problem is more severe on ordinary attic). 50-100 TB of data might require quite a bit of RAM. |
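Back-of-envelope, assuming a ~64 KiB average chunk size and on the order of 100 bytes of chunk-index overhead per chunk (both rough guesses, not confirmed figures): 100 TB / 64 KiB is roughly 1.5 billion chunks, i.e. on the order of 150 GB of chunk index. Whatever the exact constants, at that scale the index stops fitting in RAM.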
Are people seeing IO-wait? When attic is cpu-limited, I have an improvement that should avoid IO-wait caused by writes. Or rather, by fsync(). It can save quite a few percentage points. I will submit it some time when I'm not about to go to bed :). fsync() after each segment write is suboptimal! It means you stop (cpu) processing to wait for the physical disk write, and the default segment size is 5MB. Reads should also cause some IO-wait. I think they could be prefetched, although since the backed-up files will vary in size it won't be quite as nice to implement. |
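A minimal sketch of what batching those fsyncs could look like; the batching factor and the helpers are hypothetical, and a real repository would also have to define what a crash between syncs means for transaction safety:

import os

FSYNC_EVERY = 8   # hypothetical batching factor, not Attic's actual value

def write_segments(dirpath, segments):
    # Write each segment, but pay the fsync() stall once per batch
    # instead of once per segment file.
    pending = []
    for i, data in enumerate(segments):
        f = open(os.path.join(dirpath, "%06d" % i), "wb")
        f.write(data)
        f.flush()
        pending.append(f)
        if len(pending) >= FSYNC_EVERY:
            _sync_all(pending)
    _sync_all(pending)   # flush whatever is left over

def _sync_all(files):
    for f in files:
        os.fsync(f.fileno())   # one wait amortized over the whole batch
        f.close()
    files.clear()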
@sourcejedi A while ago I found the same stuff as you did: fread is causing iowait (obviously) and fsync is causing iowait too. In a single-threaded application that time is basically lost, it just sits there waiting; that is why I have been working on multithreaded code for a while (it's not finished yet, though). |
Hey, really nice program you have created. I have just one problem: I back up to an external hard disk but I only get speeds around 12MB/s max. I use encryption for the backup, but I also tried a repo without encryption. I even tried backing up from one internal drive to another. Still get the same speed. My CPU usage is around 50%, so I don't think that's the problem either.
Appreciate any feedback.