Auditbeat's file_integrity module deadlocks under Windows #6864

Closed
adriansr opened this issue Apr 13, 2018 · 1 comment


adriansr commented Apr 13, 2018

When the file_integrity module is used under Windows to monitor large and/or busy filesystems, it can end up in a deadlock.

When using this configuration:

- module: file_integrity
  paths:
  - C:/windows
  - C:/windows/system32
  - C:/Program Files
  - C:/Users
  recursive: true
  hash_types: [sha256]

Auditbeat will not generate any events whatsoever.

The reason is that the Windows implementation of fsnotify uses a single goroutine both to forward events to Auditbeat and to install watches. Communication with this goroutine is done via channels. In this case, the channel used to forward events fills up and the goroutine blocks on the send, so it is unable to install new watches.

This happens because events start arriving as soon as watches are installed, but Auditbeat is not consuming them yet: it installs all the watches first and only then starts the consumer.

Auditbeat's use of fsnotify looks like this:

  # setup phase
  for dir in config.paths:
     installWatch(dir)
  # run phase
  consumeEvents()

The recursive feature makes the problem much more likely to appear, for two reasons:

  • it involves installing a watch for every subdirectory inside a watched directory. That's thousands of watches for C:\windows.
  • it installs watches not only during the setup phase, but also whenever a subdirectory is created inside a watched directory. This means consuming events also involves installing more watches.

The simplest way to fix it requires two changes:

  • Don't install any watches during the setup phase. Use a goroutine inside consumeEvents() so that events are being consumed while watches are installed.
  • Modify fsnotify to use OS support for recursive watches. This makes installing watches for new subdirectories unnecessary and simplifies things.
@adriansr

This caused #6796

andrewkroh pushed a commit that referenced this issue Apr 19, 2018
* Add support for case-insensitive text search in system-tests

Allow system tests to search for text in the beat logfile using
case-insensitive search. This is necessary to match paths in
case-insensitive file systems, where the path logged may have different
capitalisation than the one used in the system test.

* Use custom version of fsnotify with recursion support

* Added system-test for the file_integrity module

* Auditbeat: Fix deadlock in fsnotify under Windows

Under Windows, directories to be watched need to be installed after the
event consumer loop is started. Otherwise there's a chance of a
potential deadlock, as fsnotify under Windows uses a single goroutine to
deliver events and install watches. If the channel it uses to deliver
events is full, it will block and won't be able to install further
watches.

* Auditbeat: Use OS-support for recursive watches

This patch enables OS-supported recursive watching in fsnotify if it is
available. Currently only supported under Windows via our custom
fsnotify fork.

Fixes #6864