Abstract out ext/fsevent_watch code to separate repo #38

Closed
lancejpollard opened this issue Dec 1, 2012 · 6 comments

Comments

@lancejpollard

Any chance you guys would want to abstract out the Xcode project to a standalone repo so it could be used easily with node.js as well? That would be really helpful.

The node.js watcher uses kqueue, which is really slow on large directories, but rb-fsevent seems to be very fast. I am currently just streaming the output from a simple ruby script that gets spawned from node.js.

It would be awesome if your C fsevent code could be used in node.js without requiring ruby. Let me know if there's anything I can do to help make this happen. I don't know C but am definitely willing to learn.

@ttilley
Member

ttilley commented Dec 2, 2012

I was actually maintaining it as a standalone application for a bit. It is an actual bog-standard unix application complete with useful --help docs and multiple output formats depending on what you're attempting to do with it. The ruby code just wraps the subprocess. Check out [path_to_gem]/bin/fsevent_watch and you essentially have what you desire.

My (out of date) standalone repo: https://github.com/ttilley/fsevent_watch

Help output:

fsevent_watch 0.1.2
Compiled Nov 29 2012 04:27:57

A flexible command-line interface for the FSEvents API

Usage: fsevent_watch [OPTIONS]... [PATHS]...

  -h, --help                you're looking at it
  -V, --version             print version number and exit
  -s, --since-when=EventID  fire historical events since ID
  -l, --latency=seconds     latency period (default='0.5')
  -n, --no-defer            enable no-defer latency modifier
  -r, --watch-root          watch for when the root path has changed
  -F, --file-events         provide file level event data
  -f, --format=name         output format (classic, niw, 
                                           tnetstring, otnetstring)

The formats aren't very well documented... classic will get you the least level of detail, niw will get you as much data as one usually needs/wants, and tnetstring/otnetstring return full level of detail (and vary only in that otnetstring format is better optimized for streaming data).

Perhaps the formats would be best described by the code that produces them.

Classic format merely outputs the paths that events have been fired on, each followed by a :, with a newline at the end to signify message completion:

// original output format for rb-fsevent
static void classic_output_format(size_t numEvents,
                                  char** paths)
{
  for (size_t i = 0; i < numEvents; i++) {
    fprintf(stdout, "%s:", paths[i]);
  }
  fprintf(stdout, "\n");
}
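
On the consuming side, a classic-format message can be split back into paths with very little code. The following is only a sketch: split_classic_line is a hypothetical helper, not something fsevent_watch ships, and it assumes the whole message has already been read into a writable buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of fsevent_watch): splits one classic-format
 * line in place and stores a pointer to each path in `out`.
 * Returns the number of paths found (at most `max`). */
static size_t split_classic_line(char *line, char **out, size_t max)
{
  assert(line != NULL && out != NULL);

  size_t count = 0;
  line[strcspn(line, "\n")] = '\0';   /* drop the message terminator */

  char *start = line;
  char *colon;
  while (count < max && (colon = strchr(start, ':')) != NULL) {
    *colon = '\0';                    /* terminate this path */
    out[count++] = start;
    start = colon + 1;
  }
  return count;
}
```

Note that a path which itself contains a : cannot be told apart from two separate paths in this format, which is one reason to prefer the more detailed formats.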

NIW format is named after the fork of rb-fsevent where it was introduced. It outputs event flags, event IDs, and paths as : delimited fields terminated by a newline and followed by a bare newline to signify message completion:

// output format used in the Yoshimasa Niwa branch of rb-fsevent
static void niw_output_format(size_t numEvents,
                              char** paths,
                              const FSEventStreamEventFlags eventFlags[],
                              const FSEventStreamEventId eventIds[])
{
  for (size_t i = 0; i < numEvents; i++) {
    fprintf(stdout, "%lu:%llu:%s\n",
            (unsigned long)eventFlags[i],
            (unsigned long long)eventIds[i],
            paths[i]);
  }
  fprintf(stdout, "\n");
}
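
Reading NIW output back is a matter of peeling two numeric fields off the front of each line. A minimal sketch, assuming one line at a time in a writable buffer; niw_event and parse_niw_line are hypothetical names introduced here for illustration:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type for one parsed NIW line. */
typedef struct {
  unsigned long flags;       /* FSEventStreamEventFlags bitmask */
  unsigned long long id;     /* FSEventStreamEventId */
  const char *path;          /* points into the caller's line buffer */
} niw_event;

/* Parses "flags:id:path\n" in place. Returns 1 on success, 0 for the bare
 * newline that ends a message or for a malformed line. */
static int parse_niw_line(char *line, niw_event *ev)
{
  assert(line != NULL && ev != NULL);

  line[strcspn(line, "\n")] = '\0';
  if (line[0] == '\0') return 0;              /* end-of-message marker */

  char *end;
  ev->flags = strtoul(line, &end, 10);
  if (end == line || *end != ':') return 0;

  char *id_start = end + 1;
  ev->id = strtoull(id_start, &end, 10);
  if (end == id_start || *end != ':') return 0;

  ev->path = end + 1;                         /* rest of line, colons included */
  return 1;
}
```

Because the path is the last field, everything after the second : is taken as the path, so colons inside a path survive.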

tnetstring and otnetstring output serialized objects as tagged netstrings that can be deserialized into native form quite easily. The result of deserialization should be a hash with two keys: events, an array of hashes that each have the keys path, flags, and id, and numEvents, the number of events described by the structure. We don't need to care about signaling the end of a message, because declaring the length of a message in bytes is part of the tagged netstring format. That is also why otnetstring is better for streaming: it contains type information for the data this length describes upfront, rather than requiring you to buffer the entire message before being told what this buffer of data describes.

static void tstring_output_format(size_t numEvents,
                                  char** paths,
                                  const FSEventStreamEventFlags eventFlags[],
                                  const FSEventStreamEventId eventIds[],
                                  TSITStringFormat format)
{
  CFMutableArrayRef events = CFArrayCreateMutable(kCFAllocatorDefault,
                             0, &kCFTypeArrayCallBacks);

  for (size_t i = 0; i < numEvents; i++) {
    CFMutableDictionaryRef event = CFDictionaryCreateMutable(kCFAllocatorDefault,
                                   0,
                                   &kCFTypeDictionaryKeyCallBacks,
                                   &kCFTypeDictionaryValueCallBacks);

    CFStringRef path = CFStringCreateWithBytes(kCFAllocatorDefault,
                       (const UInt8*)paths[i],
                       (CFIndex)strlen(paths[i]),
                       kCFStringEncodingUTF8,
                       false);
    CFDictionarySetValue(event, CFSTR("path"), path);

    CFNumberRef flags = CFNumberCreate(kCFAllocatorDefault, kCFNumberIntType, &eventFlags[i]);
    CFDictionarySetValue(event, CFSTR("flags"), flags);

    CFNumberRef ident = CFNumberCreate(kCFAllocatorDefault, kCFNumberLongLongType, &eventIds[i]);
    CFDictionarySetValue(event, CFSTR("id"), ident);

    CFArrayAppendValue(events, event);

    CFRelease(event);
    CFRelease(path);
    CFRelease(flags);
    CFRelease(ident);
  }

  CFMutableDictionaryRef meta = CFDictionaryCreateMutable(kCFAllocatorDefault,
                                0,
                                &kCFTypeDictionaryKeyCallBacks,
                                &kCFTypeDictionaryValueCallBacks);
  CFDictionarySetValue(meta, CFSTR("events"), events);

  CFNumberRef num = CFNumberCreate(kCFAllocatorDefault, kCFNumberCFIndexType, &numEvents);
  CFDictionarySetValue(meta, CFSTR("numEvents"), num);

  CFDataRef data = TSICTStringCreateRenderedDataFromObjectWithFormat(meta, format);
  fprintf(stdout, "%s", CFDataGetBytePtr(data));

  CFRelease(events);
  CFRelease(num);
  CFRelease(meta);
  CFRelease(data);
}
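
To illustrate the length-prefix framing, here is a rough sketch of reading a single tnetstring frame (length, a colon, the payload, then a one-byte type tag) out of a buffer. It covers plain tnetstring only, not otnetstring, and it does not descend into nested values. tnet_frame and tnet_read_frame are hypothetical names, and the 9-digit cap is an arbitrary choice for this sketch to keep the arithmetic from overflowing.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical view onto one tnetstring frame inside a larger buffer. */
typedef struct {
  const char *payload;  /* not NUL-terminated */
  size_t length;        /* payload length in bytes */
  char type;            /* e.g. ',' string, '#' integer, ']' list, '}' dictionary */
} tnet_frame;

/* Returns the number of bytes consumed, or 0 if the buffer does not yet
 * hold a complete, well-formed frame. */
static size_t tnet_read_frame(const char *buf, size_t buflen, tnet_frame *out)
{
  assert(buf != NULL && out != NULL);

  size_t i = 0, len = 0;
  while (i < buflen && isdigit((unsigned char)buf[i])) {
    if (i >= 9) return 0;                /* sketch-only cap on the length field */
    len = len * 10 + (size_t)(buf[i] - '0');
    i++;
  }
  if (i == 0 || i >= buflen || buf[i] != ':') return 0;
  i++;                                   /* skip ':' */

  if (buflen - i < len + 1) return 0;    /* need payload plus type byte */
  out->payload = buf + i;
  out->length = len;
  out->type = buf[i + len];
  return i + len + 1;
}
```

A return value of 0 on a short buffer means the caller can keep appending bytes from the pipe and retry, which is the buffering behaviour described above.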

For more info on otnetstrings, see the original implementation in ruby (and the only other implementation I am aware of): https://github.com/rkh/otnetstring

@ttilley
Member

ttilley commented Dec 2, 2012

The flags, even in tagged netstring format, are returned as a number. This isn't the most helpful I suppose (at least for the netstring object format where things were intended to map directly to a simple structure). The flags are as follows:

/*
 *  FSEventStreamEventFlags
 *  
 *  Discussion:
 *    Flags that can be passed to your FSEventStreamCallback function.
 */
enum {

  /*
   * There was some change in the directory at the specific path
   * supplied in this event.
   */
  kFSEventStreamEventFlagNone   = 0x00000000,

  /*
   * Your application must rescan not just the directory given in the
   * event, but all its children, recursively. This can happen if there
   * was a problem whereby events were coalesced hierarchically. For
   * example, an event in /Users/jsmith/Music and an event in
   * /Users/jsmith/Pictures might be coalesced into an event with this
   * flag set and path=/Users/jsmith. If this flag is set you may be
   * able to get an idea of whether the bottleneck happened in the
   * kernel (less likely) or in your client (more likely) by checking
   * for the presence of the informational flags
   * kFSEventStreamEventFlagUserDropped or
   * kFSEventStreamEventFlagKernelDropped.
   */
  kFSEventStreamEventFlagMustScanSubDirs = 0x00000001,

  /*
   * The kFSEventStreamEventFlagUserDropped or
   * kFSEventStreamEventFlagKernelDropped flags may be set in addition
   * to the kFSEventStreamEventFlagMustScanSubDirs flag to indicate
   * that a problem occurred in buffering the events (the particular
   * flag set indicates where the problem occurred) and that the client
   * must do a full scan of any directories (and their subdirectories,
   * recursively) being monitored by this stream. If you asked to
   * monitor multiple paths with this stream then you will be notified
   * about all of them. Your code need only check for the
   * kFSEventStreamEventFlagMustScanSubDirs flag; these flags (if
   * present) only provide information to help you diagnose the problem.
   */
  kFSEventStreamEventFlagUserDropped = 0x00000002,
  kFSEventStreamEventFlagKernelDropped = 0x00000004,

  /*
   * If kFSEventStreamEventFlagEventIdsWrapped is set, it means the
   * 64-bit event ID counter wrapped around. As a result,
   * previously-issued event ID's are no longer valid arguments for the
   * sinceWhen parameter of the FSEventStreamCreate...() functions.
   */
  kFSEventStreamEventFlagEventIdsWrapped = 0x00000008,

  /*
   * Denotes a sentinel event sent to mark the end of the "historical"
   * events sent as a result of specifying a sinceWhen value in the
   * FSEventStreamCreate...() call that created this event stream. (It
   * will not be sent if kFSEventStreamEventIdSinceNow was passed for
   * sinceWhen.) After invoking the client's callback with all the
   * "historical" events that occurred before now, the client's
   * callback will be invoked with an event where the
   * kFSEventStreamEventFlagHistoryDone flag is set. The client should
   * ignore the path supplied in this callback.
   */
  kFSEventStreamEventFlagHistoryDone = 0x00000010,

  /*
   * Denotes a special event sent when there is a change to one of the
   * directories along the path to one of the directories you asked to
   * watch. When this flag is set, the event ID is zero and the path
   * corresponds to one of the paths you asked to watch (specifically,
   * the one that changed). The path may no longer exist because it or
   * one of its parents was deleted or renamed. Events with this flag
   * set will only be sent if you passed the flag
   * kFSEventStreamCreateFlagWatchRoot to FSEventStreamCreate...() when
   * you created the stream.
   */
  kFSEventStreamEventFlagRootChanged = 0x00000020,

  /*
   * Denotes a special event sent when a volume is mounted underneath
   * one of the paths being monitored. The path in the event is the
   * path to the newly-mounted volume. You will receive one of these
   * notifications for every volume mount event inside the kernel
   * (independent of DiskArbitration). Beware that a newly-mounted
   * volume could contain an arbitrarily large directory hierarchy.
   * Avoid pitfalls like triggering a recursive scan of a non-local
   * filesystem, which you can detect by checking for the absence of
   * the MNT_LOCAL flag in the f_flags returned by statfs(). Also be
   * aware of the MNT_DONTBROWSE flag that is set for volumes which
   * should not be displayed by user interface elements.
   */
  kFSEventStreamEventFlagMount  = 0x00000040,

  /*
   * Denotes a special event sent when a volume is unmounted underneath
   * one of the paths being monitored. The path in the event is the
   * path to the directory from which the volume was unmounted. You
   * will receive one of these notifications for every volume unmount
   * event inside the kernel. This is not a substitute for the
   * notifications provided by the DiskArbitration framework; you only
   * get notified after the unmount has occurred. Beware that
   * unmounting a volume could uncover an arbitrarily large directory
   * hierarchy, although Mac OS X never does that.
   */
  kFSEventStreamEventFlagUnmount = 0x00000080,

  /*
   * These flags are only set if you specified the FileEvents
   * flags when creating the stream.
   */
  kFSEventStreamEventFlagItemCreated = 0x00000100,
  kFSEventStreamEventFlagItemRemoved = 0x00000200,
  kFSEventStreamEventFlagItemInodeMetaMod = 0x00000400,
  kFSEventStreamEventFlagItemRenamed = 0x00000800,
  kFSEventStreamEventFlagItemModified = 0x00001000,
  kFSEventStreamEventFlagItemFinderInfoMod = 0x00002000,
  kFSEventStreamEventFlagItemChangeOwner = 0x00004000,
  kFSEventStreamEventFlagItemXattrMod = 0x00008000,
  kFSEventStreamEventFlagItemIsFile = 0x00010000,
  kFSEventStreamEventFlagItemIsDir = 0x00020000,
  kFSEventStreamEventFlagItemIsSymlink = 0x00040000
};
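
Since the flags arrive as a plain number, a consumer has to test the bits itself. A small sketch of turning that number into readable names follows. The values are copied from the enum above so that it builds without CoreServices; describe_flags and the shortened display names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Bit values copied from the FSEventStreamEventFlags enum; the names are
 * shortened for display. */
static const struct { uint32_t bit; const char *name; } flag_names[] = {
  { 0x00000001, "MustScanSubDirs" },   { 0x00000002, "UserDropped" },
  { 0x00000004, "KernelDropped" },     { 0x00000008, "EventIdsWrapped" },
  { 0x00000010, "HistoryDone" },       { 0x00000020, "RootChanged" },
  { 0x00000040, "Mount" },             { 0x00000080, "Unmount" },
  { 0x00000100, "ItemCreated" },       { 0x00000200, "ItemRemoved" },
  { 0x00000400, "ItemInodeMetaMod" },  { 0x00000800, "ItemRenamed" },
  { 0x00001000, "ItemModified" },      { 0x00002000, "ItemFinderInfoMod" },
  { 0x00004000, "ItemChangeOwner" },   { 0x00008000, "ItemXattrMod" },
  { 0x00010000, "ItemIsFile" },        { 0x00020000, "ItemIsDir" },
  { 0x00040000, "ItemIsSymlink" },
};

/* Hypothetical helper: writes a '|'-separated list of the set flags into
 * `out` and returns how many names were written. */
static int describe_flags(uint32_t flags, char *out, size_t outlen)
{
  assert(out != NULL && outlen > 0);

  int count = 0;
  size_t used = 0;
  out[0] = '\0';
  for (size_t i = 0; i < sizeof flag_names / sizeof flag_names[0]; i++) {
    if (!(flags & flag_names[i].bit)) continue;
    int n = snprintf(out + used, outlen - used, "%s%s",
                     count ? "|" : "", flag_names[i].name);
    if (n < 0 || (size_t)n >= outlen - used) {
      out[used] = '\0';                /* drop the partially written name */
      break;
    }
    used += (size_t)n;
    count++;
  }
  return count;
}
```

For example, the value 69888 (0x11100) has the ItemCreated, ItemModified, and ItemIsFile bits set.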

@ttilley
Member

ttilley commented Dec 2, 2012

My suggestion for using it from node.js is to just say fuck it and wrap the subprocess. Creating an extension for node looks like a painful process that requires in-depth knowledge of libuv, v8, node, and various other details.
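
The thread has no node.js code, so here is the same wrap-the-subprocess idea sketched in C with POSIX popen, purely as an illustration of the approach. watch_lines, line_handler, and count_nonempty are hypothetical names, and the 4096-byte line buffer is an assumption.

```c
#define _POSIX_C_SOURCE 200809L  /* for popen/pclose */
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical callback type: receives one output line, newline stripped. */
typedef void (*line_handler)(const char *line, void *ctx);

/* Runs `command` through the shell and feeds each stdout line to `handle`.
 * Returns the number of lines read, or -1 if the process could not start. */
static long watch_lines(const char *command, line_handler handle, void *ctx)
{
  assert(command != NULL && handle != NULL);

  FILE *pipe = popen(command, "r");
  if (pipe == NULL) return -1;

  char line[4096];   /* assumption: longer lines arrive in several pieces */
  long count = 0;
  while (fgets(line, sizeof line, pipe) != NULL) {
    line[strcspn(line, "\n")] = '\0';
    handle(line, ctx);
    count++;
  }
  pclose(pipe);
  return count;
}

/* Example handler: counts non-empty lines. In niw format the bare newline
 * that ends a message shows up as an empty line, so this counts events. */
static void count_nonempty(const char *line, void *ctx)
{
  if (line[0] != '\0') (*(long *)ctx)++;
}
```

With the real binary, the command string would be something along the lines of the fsevent_watch executable followed by --format=niw and the paths to watch. The loop then runs until the watcher exits.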

Updating the standalone repo has been added as a personal TODO item, but I'm not exactly the most reliable sort. Luckily, most of what you're asking for I have already done. ;)

@thibaudgg
Member

Seems definitely like the way to go, having the same fsevent_watch shared by node.js & ruby sounds good!

@ttilley
Member

ttilley commented Dec 3, 2012

@viatropos thoughts? questions?

@lancejpollard
Author

@ttilley For now then I'm just going to download the bin/fsevent_watch executable and add that to a node project like you're suggesting. Thanks for the very informative posts!
