Rsync watcher is really slow #3249
I can confirm this - it takes a minute or more to detect a file change before rsync-auto fires for myself and a coworker under OS X 10.9.2. It's also possible to save a file a few times before an rsync has fired and then multiple rsyncs happen in quick succession when it "catches up." It seems that OS X might batch up these events before sending them to rsync-auto? |
Supposedly vagrant watches fsevents for this. You can use fseventer to see how quickly these appear on osx's side. |
Yes - I can confirm this. I added a |
Interestingly, while this is happening ruby from the vagrant rsync-auto process shows up in my iStat Menus and Activity Monitor as using between 98% and 102% CPU. |
dtruss shows that the rsync-auto process is doing an |
This makes sense - it is listen computing md5 hashes for all the files. Would it be possible to try upgrading the bundled version of listen to at least 2.6.0? According to guard/listen#184 (comment), it may lazily load MD5 hashes better past this version. |
That thread is concerning. It seems to struggle with 20k files? I have
|
Hm this is very odd. Upgrading to 2.6.0 will certainly help. I'll take a look and see what parts may be slow. @patrickheeney rsync specifically optimizes file read/write performance at the expense of syncing latency. However, this is a lot of sync latency so I'll take a look. |
In looking at how guard and guard-shell work, I noticed that new files don't get picked up, and it uses the md5 comparison to ensure that file contents actually change before syncing, whereas a tool like http://www.fernlightning.com/doku.php?id=software:fseventer:start picks up file creation and writes in a matter of milliseconds. It would be nice if there were a configuration for Listen:Adapter:Darwin that would call its callbacks unconditionally upon getting fsevents representing file creation or writes to files without checking the checksum. Perhaps Vagrant could then expose that option. |
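The MD5 comparison mentioned above can be illustrated with a small sketch. This is not Listen's actual code: `ContentTracker` and `changed?` are hypothetical names, and the only assumption is that a change is reported when a file's digest differs from the last one recorded for that path.

```ruby
require 'digest/md5'

# Hypothetical helper, not part of Listen: remembers one MD5 digest per path
# and reports whether the file's contents differ from the last recorded digest.
class ContentTracker
  def initialize
    @digests = {}
  end

  # Returns true when the path is seen for the first time or its contents
  # changed since the previous call; always records the new digest.
  def changed?(path)
    digest = Digest::MD5.file(path).hexdigest
    previous = @digests[path]
    @digests[path] = digest
    previous != digest
  end
end
```

Every check reads the whole file, so the cost grows with the number and size of watched files, which is consistent with the CPU usage reported earlier in the thread.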
@mitchellh Yeah, I totally get the syncing latency; I was just not sure what an acceptable range would be. I am somewhat spoiled, it seems, by gulpjs and grunt, where everything happens in less than a second, which is faster than I can alt+tab and refresh. It seems the number of files it watches has a crucial impact, so I definitely hope we can narrow this down and have something configurable. For example my frontend guys are not going to be editing a lot of backend code so doing something like this |
I've had really good luck with a script just like this:

    require 'rb-fsevent'

    options = { :latency => 1.5, :no_defer => false }
    # A regex literal keeps the escaped dot; inside a double-quoted string
    # "\." collapses to "." and would match any character.
    vcs_dirs = /\.(git|idea|svn|hg|cvs)/

    fsevent = FSEvent.new
    fsevent.watch "/path/to/my/project", options do |directories|
      # Skip the sync when every changed directory is VCS or IDE metadata.
      if directories.reject { |dir| dir.match(vcs_dirs) }.empty?
        puts "Discarding these changes: #{directories.inspect}"
      else
        puts "Detected change inside: #{directories.inspect}"
        system("vagrant rsync")
      end
    end
    fsevent.run

I know it isn't as cross-platform as listen, but that works very well with a 28k+ file directory (and also has the side-effect of showing off the amount of fsevents that happen in the various .git directories in a project as I'm working on it.) |
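The `:latency => 1.5` option in the script above means that events arriving close together are delivered as one batch, so a burst of saves triggers a single `vagrant rsync`. The sketch below only illustrates that idea on a list of timestamps; it is a simplified model, not how FSEvents or rb-fsevent work internally, and `batch_events` is a hypothetical name.

```ruby
# Hypothetical model of event coalescing: an event joins the current batch
# when it arrives within `latency` seconds of the first event in that batch;
# otherwise it starts a new batch. Each batch would trigger one sync.
def batch_events(timestamps, latency)
  timestamps.sort.each_with_object([]) do |time, batches|
    if batches.any? && time - batches.last.first <= latency
      batches.last << time
    else
      batches << [time]
    end
  end
end

# Five saves spread over a few seconds collapse into two batches.
batches = batch_events([0.0, 0.5, 1.0, 3.0, 3.2], 1.5)
puts "#{batches.length} syncs instead of 5"
```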
Throwing it out there. Can this problem not be solved with the rsync daemon? Bootstrap the rsync daemon on the box and then run rsync host side for a speedier transfer. I'm not too familiar with the advantages of the daemon so feel free to say if this isn't going to work out. |
As mentioned in #3159, it seems the main issue with Guard/Listen is that it uses a blacklist instead of a whitelist like grunt watch does (see https://gist.github.com/arnaudbreton/9517344), which explains the performance difference when using the latter. The solution offered by @patrickheeney with the path option is a good option, combined with a way to specify these paths directly in the |
@arnaudbreton My solution was the band-aid approach. It should speed things up but it appears there are deeper issues here that @smerrill discovered. Ideally both would be addressed. |
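The whitelist idea discussed in the comments above can be sketched as a plain filter over changed paths. This is not an existing Vagrant or Listen option: `whitelisted`, `root`, and `allowed_dirs` are hypothetical names used only for illustration.

```ruby
# Hypothetical helper: keep only the changed paths that live under one of the
# whitelisted directories, so events elsewhere in the tree never trigger a sync.
def whitelisted(changed_paths, root, allowed_dirs)
  # File.join(dir, "") appends a trailing separator, so "www" does not
  # accidentally match a sibling directory such as "wwwroot".
  prefixes = allowed_dirs.map { |dir| File.join(File.expand_path(dir, root), "") }
  changed_paths.select do |path|
    full = File.expand_path(path, root)
    prefixes.any? { |prefix| full.start_with?(prefix) }
  end
end

changes = ["/proj/www/index.php", "/proj/.git/index", "/proj/wwwroot/x.txt"]
puts whitelisted(changes, "/proj", ["www"]).inspect
```

A filter like this only reduces how often a sync fires; it does not reduce the cost of watching the full tree unless the watcher itself is also restricted to the whitelisted directories.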
@ThePixelDeveloper The problem is on the host side, where monitoring for changes is quite slow. The rsync daemon might help to lower transactional CPU usage, though, so it would probably be worth looking at as a new option for rsynced folders in another issue. |
I work with @smerrill and have tweaked the script a bit to read from my Vagrantfile.

    #!/usr/bin/env ruby
    require 'rb-fsevent'

    options = { :latency => 1.5, :no_defer => false }
    # Matches uncommented synced_folder lines of type "rsync" and captures the host path.
    path_regex = /^(?!\s*#).*config\.vm\.synced_folder\s*"(.*?)"\s*,\s*".*?"\s*,.*?type:\s*"rsync".*/
    vcs_dirs = /\.(git|idea|svn|hg|cvs)/

    paths = []
    File.foreach('Vagrantfile') do |line|
      paths << File.expand_path($1) if path_regex.match(line)
    end
    puts "Watching: #{paths}"

    fsevent = FSEvent.new
    fsevent.watch paths, options do |directories|
      # Sync only when at least one change is outside VCS or IDE metadata.
      unless directories.reject { |dir| dir.match(vcs_dirs) }.empty?
        puts "Detected change inside: #{directories.inspect}"
        system("vagrant rsync")
      end
    end
    fsevent.run

I've dropped this into the directory with my Vagrantfile. I am running it via |
I actually tried using the rsync daemon while trying to fix the latency issues, and for me it didn't seem to matter from benchmarks, with the latency hovering around 1 second over either rsync or rsync+ssh with 35,000 files. Surprisingly, the daemon actually takes a little longer when dealing with 75,000+ files (2-3s vs 1-2s) and I'm not sure why. I ended up using a simple inotify-tools wrapper to watch the current directory and run the rsync command on any change. This works really well for me with 35,000 files and finally solves our NFS woes. I think another bottleneck here is that when the number of files grows, you end up having watchers on all the files; then, when those detect a change, an rsync command runs that tries to detect changes again and sync the entire tree. Something like Unison might work better here on large trees because (someone correct me if I'm wrong) it has the same watches BUT does not try to resync the entire tree on a change but only sends the affected files. |
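One way to avoid rescanning the entire tree on every change is to hand rsync an explicit list of the files that changed, using its `--files-from` option. The sketch below only builds the command and the file list; `targeted_rsync` is a hypothetical helper, and the destination string is a made-up example rather than anything Vagrant generates.

```ruby
require 'pathname'

# Hypothetical helper: build an rsync invocation that transfers only the
# changed files. Returns [argv, stdin_text].
# `dest` is an assumed rsync destination such as "vagrant@guest:/vagrant/".
def targeted_rsync(changed_paths, root, dest)
  root_path = Pathname.new(root)
  relative = changed_paths.map do |path|
    Pathname.new(path).relative_path_from(root_path).to_s
  end
  # --files-from=- makes rsync read the list of files to send from stdin,
  # with each entry interpreted relative to the source directory.
  argv = ["rsync", "-a", "--files-from=-", File.join(root, ""), dest]
  [argv, relative.uniq.join("\n")]
end

argv, file_list = targeted_rsync(
  ["/proj/www/index.php", "/proj/www/css/site.css"], "/proj", "vagrant@guest:/vagrant/"
)
puts argv.join(" ")
puts file_list
```

To actually run it, the list could be passed on stdin, for example with `Open3.capture2e(*argv, stdin_data: file_list)`. Note that a file list like this does not propagate deletions, so an occasional full sync would still be needed.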
Ah ok, glad someone else looked into the daemon. I took a similar solution to many of the other people here. I created a nodejs listener script that listens for changes and pings off a request to a simple server in the VM which does a rebuild of the assets. Works really well, even though it's an icky hack. |
@dougmarcey and I have released an alpha implementation of a lighter-weight rsync-auto command that uses the same rb-fsevent and rb-inotify libraries under the covers: https://github.com/smerrill/vagrant-gatling-rsync . It's been moderately tested on OS X and more lightly tested under Linux. It also outputs copious log messages if you run it with We'd love your feedback if you want to try it out. |
@smerrill Awesome, if this avenue turns out to be much more performant, I may end up wanting to merge your plugin into core! Wouldn't make sense for a 1.5.x so I'll keep continuing to try to improve the |
https://github.com/smerrill/vagrant-gatling-rsync works a lot better than the existing implementation. I'm hoping this can be merged into the core and extended to allow two-way sync (#3062 (comment)). I simply can't get the performance I'm looking for using VirtualBox synced folders or NFS. Hoping that bidirectional rsync is the way forward. Current workflow:
|
On Linux I was struggling with this. Ended up using https://github.com/hollow/inosync instead of rsync-auto which works brilliantly with ~35k files. |
I updated the listen gem requirement to 2.7.1, since that appears to help performance for OS X a bit. |
That's Listen creating a snapshot to detect complex changes, like moving whole trees, e.g. ... so listen compares the directories with its internal record ("snapshot") and generates additional events - the removing of ALL the files that used to be in c:\program files ... and the addition of all the "new" files in "c:\documents and settings"\Program files.
This depends on a lot of things, e.g. what your editor ACTUALLY is doing (e.g. backups, swap files, moving, renaming, deleting, setting the files to read-only, pid files, ...). Later on (in listener.rb) the duplicate events are removed. In short - saving a single file can generate LOTS of file system events, so it depends what you're doing.
Add the logging to listener.rb (_wait_for_changes) - because it's up to that point that the delay happens (so the real delay is between the change happening in change.rb and being forwarded to the callback in listen.rb). I don't know how Vagrant uses rsync exactly, so these may not apply, but you could try:
MD5 hashing is just a Mac workaround, because the file timestamps there are unreliable. So if it's doing MD5 hashing in Windows, it's a bug. |
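The snapshot comparison described in this comment can be sketched as a diff between two maps of path to modification time. This is an illustration under that assumption only, not Listen's implementation; `snapshot` and `diff_snapshots` are hypothetical names.

```ruby
# Hypothetical sketch: record every regular file under `dir` with its mtime.
def snapshot(dir)
  Dir.glob(File.join(dir, "**", "*"))
     .select { |path| File.file?(path) }
     .each_with_object({}) { |path, record| record[path] = File.mtime(path) }
end

# Compare two snapshots and derive the events a watcher would have to emit.
# Moving a whole tree shows up as many removals plus many additions.
def diff_snapshots(before, after)
  {
    added:    (after.keys - before.keys).sort,
    removed:  (before.keys - after.keys).sort,
    modified: (before.keys & after.keys).reject { |path| before[path] == after[path] }.sort
  }
end
```

Because the comparison relies on modification times, it inherits whatever unreliability those timestamps have on a given platform, which is the reason given above for falling back to MD5 hashing on the Mac.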
Just tested listener.rb and the Windows adapter and those seem fine, both logging almost instantly. |
For anyone interested in Listen's performance and status: guard/listen#207 (comment) |
This is fixed in Listen v2.7.7 The "slowness" was caused by frequent task/thread switching with many sleeping mutexes/conditionals - now hundreds of thousands of files should be "indexed" within seconds (given the files are cached, of course). |
This is really great to hear. I'll include that in the next release of Vagrant and we should hear feedback pretty soon! |
👍 |
Thanks all! |
Just to say, you guys are pretty awesome. |
👍 |
@thasmo I have exactly the same problem and it is very annoying. I will tell you my "solution" (it is not a real one, but it is something). I copy the file I save from the IDE (host) to the guest every time I save it, automatically. I use PhpStorm, which has a plugin called "File watcher". I created a custom watcher that has the parameters: Disable immediate synchronization moduleName (=project Name) must map the folder name inside my shared folder This does not handle deletes. In this case you must periodically do a vagrant rsync. You can also try: https://github.com/GM-Alex/vagrant-winnfsd (details here http://www.jankowfsky.com/blog/2013/11/28/nfs-for-vagrant-under-windows/) |
Any idea when the next release is planned? I look forward to this improvement :) |
Having the same issue on Mac. It would be great to know when the next release with some of the improvements mentioned here is planned. |
seems that this works for me:
would be great if vagrant rsync would support a path that should be synced. -.-' |
+1, very small file set (~200 files) and am still seeing ~8 seconds between when I hit save and when rsync-auto picks up and pushes the file over. |
@Taytay You may test it that way: Install Vagrant from source but before installing the gem, try to modify the Listen version via the |
Yeah, I just looked into https://github.com/guard/listen and it seems version 2.7.7 should fix or at the least improve this. Anyone tried it out yet? @mitchellh are there plans to update this for the next vagrant release? Would be really useful. |
@mitchellh Is there a reason this is stalling and neither of the last two releases included the update to listen? |
@Globegitter Just being careful with updates in patch releases. 1.7 will update listen. |
Closing this due to the new version of listen. Please check 1.7 once released, and if it's still slow, let's reopen to discuss. |
The latency is way down but this still keeps the CPU around 100%. gatling-rsync-auto is still the only viable sync tool. |
TL;DR - please report any "slowness" in guard/listen as a new issue - with a full debug output (or at least go through the troubleshooting section). @treyhyde - your case is probably special. Open an issue in https://github.com/guard/listen and link to a Gist with the output from using Listen works pretty much the same way as gatling-rsync-auto, so it's likely a difference in configuration. E.g. you could be watching log files or database files, which makes no sense - but can be exactly the reason you're getting 100% CPU usage. Also, you don't even mention whether you're on OSX or Linux, so I'm assuming you're using OSX (which has a few extra issues to know about). Again, the LISTEN_GEM_DEBUGGING environment variable is the ONLY way we can even attempt to guess what the problem is. |
I'm having slowness and I ran things with debug on. |
@tslater - open a new issue in https://github.com/guard/listen, and use |
Running version 1.5.1. The issue I have now is the rsync watcher is incredibly slow. It takes at least several minutes to detect that a file has changed and initiate rsync.
After about 10 minutes sometimes it goes rsync crazy and does 7 or 8 in a row.
Some others have issues #3159 (comment).
Also it may be helpful to have a whitelist of folders to watch. In my case I only need to watch the www/* folder as the rest are irrelevant:
(My original issue here was the fact that I had a packer .iso file and rsync was not outputting any progress so it looks like it was hanging for 15 minutes.)