
realpath() And FSEvents

Bryan Jones edited this page Jun 10, 2015 · 15 revisions

Background

FSEvents is an Apple API that notifies apps when files change in specific folders. Back in 2011, we began receiving reports from users that our apps were not responding to file-changes in certain folders.

When we examined the problem, it turned out that FSEvents was not notifying our apps about file-change events in these "broken" folders even though we asked for them. For 4 years, we assumed this was a bug in FSEvents and attempted to get Apple to fix it. Apple refused to acknowledge the problem as a bug.

The situation was made worse because the "broken" folders are rare and change seemingly at random. One day a folder would work, the next it would not. Thus far, we have not determined what, exactly, "breaks" a folder.

However, in May 2015, we discovered that the root cause of the problem is NOT FSEvents itself, but rather a system function deep in OS X called realpath(). FSEvents uses this system function.


Why FSEvents Fails

FSEvents is case-sensitive. If you tell it to watch the path /Users/Folder, but change a file in the path /Users/folder, you will NOT receive a notification about that file-change event because, to FSEvents, these are two separate paths.
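
To make the case-sensitivity concrete, here is a minimal sketch in C of a byte-for-byte path match. It is an illustration only, not FSEvents' actual matching code. The function name path_is_under is hypothetical, and the sketch assumes the watched path has no trailing slash.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: returns true if event_path is the watched folder
   itself or lies somewhere beneath it. The comparison is byte-for-byte,
   so "Folder" and "folder" are treated as different names. */
bool path_is_under(const char *watched, const char *event_path)
{
    assert(watched != NULL && event_path != NULL);

    size_t n = strlen(watched);
    if (strncmp(watched, event_path, n) != 0) {
        return false;
    }
    /* Accept an exact match, or a continuation at a path separator, so
       that "/Users/Folder2" does not count as being under "/Users/Folder". */
    return event_path[n] == '\0' || event_path[n] == '/';
}
```

With this kind of comparison, path_is_under("/Users/Folder", "/Users/folder/file.txt") is false, which is the behavior described above.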

Behind the scenes, the OS X kernel writes file-events to the special output stream at /dev/fsevents. The FSEvents daemon (fseventsd) is responsible for reading this stream of events and notifying listeners about events they care about. The breakdown in FSEvents happens because the kernel writes a path with different capitalization than the FSEvents framework uses.


The Failure Process

Suppose a folder exists on disk at the path: /Users/Folder. When you create an FSEvents stream and pass this path, the FSEvents framework calls a system function named realpath() to "canonicalize" the path. This function resolves symlinks, relative path components, etc.

However, realpath() also walks through each path component and calls a lower-level function named getattrlist to obtain the actual name of the folder. This function, in turn, calls vnop_getattrlist into the non-open-source HFS+ driver. This is the root of the problem.

For some folders, this call returns the wrong capitalization. For example, if you run realpath() on /Users/Folder it might return /users/Folder.

Everything else on OS X returns a path with the same capitalization: the kernel's output to /dev/fsevents, Apple's higher-level path APIs such as NSURL's -initFileURLWithPath:, and NSFileManager's path APIs. Only the path returned by the realpath() function has a different capitalization.

So, the bottom line is that the kernel ends up writing events to /dev/fsevents using the path /Users/Folder, while FSEvents (because it "cleans" the path using the realpath() function) is listening to the path /users/Folder. As a result, FSEvents never sends any events for this path.


The Failure, Illustrated

This is an actual example. Here, the "broken" folder is com.apple.CloudDocs, deep inside my ~/Library folder. The kernel and all of the higher-level path APIs return a capitalized CloudDocs. The realpath() function, however, returns a lowercase clouddocs:

Note: in the image above, the last line of Xcode console output is obtained by calling FSEventStreamCopyPathsBeingWatched() on the stream. The paths reported below the dashed line are coming directly from the FSEvents framework.


Verification

To verify the problem, we looked at Apple's current implementation of the realpath() function here: http://www.opensource.apple.com/source/Libc/Libc-1044.1.2/stdlib/FreeBSD/realpath.c

It differs massively from the old BSD realpath() function, which is here: http://www.opensource.apple.com/source/Libc/Libc-498.1.7/stdlib/FreeBSD/realpath.c

Notice that the BSD version does not call down into the filesystem drivers to verify the name of each path component. It simply resolves ., .., and symlinks.

So, using @rentzsch's "mach_override" library (https://github.com/rentzsch/mach_override), we FORCED the FSEvents framework to use the old BSD version of the realpath() function rather than Apple's current implementation. This solved the problem, proving that it is indeed a failure in realpath() that causes this. (More exactly, it's a failure in one of the underlying calls that realpath() makes. See the next section for details.)


Apple Verifies This Is A Bug

After I explained this situation to Kevin Elliott, a Developer Technical Support Engineer at Apple, he verified that this is a filesystem-corruption bug and gave me additional information:

The Original, Speculated Cause

The HFS+ format stores the names of files and folders in two different structures: HFSPlusCatalogThread and HFSPlusCatalogKey. Some sequence of unknown events caused the letter-casing for these broken folders to differ between the two structures: one has /folder and the other has /Folder. All of Apple's higher-level APIs go through a shared lower-level API layer that accesses one of these storage structures. realpath(), on the other hand, accesses the other through the vnop_getattrlist call, which physically locks the catalog file and reads the folder name directly from disk. This explains why realpath() returns one case but all other APIs (and the kernel itself) return a different case.

The Real Cause

Kevin filed a Radar on this (as did I) and both Radars were closed as duplicates of a deeper bug that Apple has been aware of for some time. The actual problem is not in HFS+, but rather in the VFS layer. The basic idea is that a request for a given name value can succeed but return a different value than the original request. In some cases, the VFS layer is caching the original name passed in from user-space rather than using the correct name returned by the filesystem. HFS+ is one place this occurs, but it's a bigger deal for non-ASCII filenames in UTF-8, where the filesystem can return a completely different binary representation for some filenames.

There is no word on when this issue might be addressed. This is the first time, however, that anyone has shown that the issue breaks FSEvents.


Tools To Diagnose & Debug This Problem

FSLogger

This command-line tool simply logs the exact output from the kernel to /dev/fsevents. It was originally written by Amit Singh. I have made a few edits required to build the project on OS X 10.10. The source code is available here: http://incident57.com/fseventsbug/fslogger.zip Simply build the project with Xcode, then run the resulting command-line tool with sudo (required for access to /dev/fsevents). This will allow you to see the paths that the kernel is writing for each event.

Locating Broken Folders

@andreyvit has a tool that can scan for broken folders, available here: https://github.com/andreyvit/find-fsevents-bugs Download the source, run make and then run the resulting command line tool like this:

find-fsevents-bugs /Users/bdkjones

Replace "bdkjones" with the name of your home folder.

IMPORTANT: You MUST pass a correctly-capitalized path to this tool, or it will report many false positives. Use the capitalization you see in the Finder. The tool works by comparing the output of realpath() with the path returned by resolving an FSRef alias. When a folder name has different capitalization in each case, it is broken.

Note: broken folders are rare and we do not currently know what causes them. Dropbox appears to exacerbate the issue, as does any folder that is frequently written to via network syncing (such as the iCloud folder in the illustrated example above).

Note 2: broken folders also randomly "repair" themselves. Once you find one, be careful about writing to the folder, opening apps that access it, or running disk repair operations. You may lose the broken folder and won't have anything to test against! On the other hand, some folders are very stubbornly broken and the only absolute way to fix them is to rename the broken folder, create a new folder with the old name, then move all the contents from the renamed broken folder into the new folder. We think this works because it creates an entirely new inode and filesystem object on disk.


An Apple-Approved Workaround

Using the low-level rename() function, change the name of the broken folder to anything else, such as folder-broken, and then use the same function to change it back to the original name. To determine the original name with the correct letter-casing, use the high-level APIs such as NSFileManager or NSURL, which go a different route than realpath() to retrieve file information off disk.

Note: You must rename the exact broken folder. In other words, renaming a subfolder of a broken folder will not fix the problem. Example:

Path 1: /Users/john/Documents/project/subfolder
Path 2: /Users/john/documents/project/subfolder

In this case, you must rename Documents. If the broken folder is one that would require privilege-escalation to rename (such as /Users), Apple recommends presenting a message to the user with instructions to do that in the Terminal.

Note: You must rename the broken folder to a temporary name and back. Although rename() will allow you to specify the same path for the to and from parameters and return a "success" result when called this way, no rename operation actually hits the disk and so the filesystem objects are not modified.
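
The rename-and-rename-back step might look like the following in C. This is a minimal sketch under stated assumptions, not code supplied by Apple: the function name rename_round_trip and the "-broken" suffix are hypothetical, correct_path is assumed to be the correctly-cased path obtained from a high-level API, and the sketch refuses to run if the temporary name is already taken.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper: renames current_path to "<correct_path>-broken",
   then renames that to correct_path. Returns 0 on success and -1 on
   failure with errno set. If the second rename fails, the folder is left
   under the temporary name. */
int rename_round_trip(const char *current_path, const char *correct_path)
{
    assert(current_path != NULL && correct_path != NULL);

    static const char suffix[] = "-broken";
    size_t size = strlen(correct_path) + sizeof suffix; /* includes the NUL */
    char *temp_path = malloc(size);
    if (temp_path == NULL) {
        return -1;
    }
    snprintf(temp_path, size, "%s%s", correct_path, suffix);

    int result = -1;
    struct stat st;
    if (lstat(temp_path, &st) == 0) {
        /* Something already lives at the temporary name; do not touch it. */
        errno = EEXIST;
    } else if (rename(current_path, temp_path) == 0) {
        result = rename(temp_path, correct_path);
    }

    int saved_errno = errno;
    free(temp_path);
    errno = saved_errno;
    return result;
}
```

The intermediate name is there because, as noted above, calling rename() with identical from and to paths reports success without modifying anything on disk.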

Determining Which Folders Are Broken

To do this, first retrieve the path of a folder using NSURL's -initFileURLWithPath: method. This method should always supply the "correct" letter-casing. Next, call realpath() on the path you got from NSURL. Split each path into an array of path components separated by /. Then, walk through each component and perform a case-sensitive compare between the component as specified by NSURL and as specified by realpath(). If a component differs, that's a broken folder you must rename.
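
The component-by-component comparison can be sketched in C as follows. The function name first_mismatched_component is hypothetical. The sketch takes both paths as plain strings (one from the high-level API, one from realpath()) and assumes they refer to the same location with symlinks already resolved in both.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns the zero-based index of the first path
   component whose bytes differ between the two paths, or -1 if every
   component matches. If one path has extra components, the first extra
   component counts as a mismatch. */
int first_mismatched_component(const char *reference, const char *resolved)
{
    assert(reference != NULL && resolved != NULL);

    const char *a = reference;
    const char *b = resolved;
    int index = 0;

    for (;;) {
        /* Skip the separator(s) in front of the next component. */
        while (*a == '/') a++;
        while (*b == '/') b++;
        if (*a == '\0' && *b == '\0') {
            return -1; /* both paths ended with no differences */
        }

        size_t len_a = strcspn(a, "/");
        size_t len_b = strcspn(b, "/");
        /* Case-sensitive, byte-for-byte comparison of this component. */
        if (len_a != len_b || memcmp(a, b, len_a) != 0) {
            return index;
        }
        a += len_a;
        b += len_b;
        index++;
    }
}
```

For the two example paths shown earlier (/Users/john/Documents/project/subfolder and /Users/john/documents/project/subfolder), this returns 2, which is the index of the Documents component.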


Another, More Risky Workaround

@andreyvit has created a simple, one-file fix available here: https://github.com/andreyvit/FSEventsFix

This workaround takes Apple's current implementation of realpath(), but skips the part where it consults the low-level filesystem drivers to verify each path component name. It then uses Facebook's Fishhook project (https://github.com/facebook/fishhook) to force FSEvents to use our modified version of realpath() instead of the one that ships as part of OS X.

Kevin, the Apple Engineer, has strongly warned us that he does not recommend replacing realpath() with our own implementation. Among the reasons why:

  1. It is not possible to replace the function for just FSEvents. If we swap it, it's swapped for every system framework. While FSEvents may work correctly, there's no telling what other frameworks may do.

  2. Apple's implementation was designed to handle non-ASCII characters in file/folder names, which may be one of the reasons the function calls down to the filesystem to verify exact names.

  3. HFS+, unlike other filesystems, supports hard links to directories.

In short, there is likely a very good reason Apple's current realpath() implementation calls down to the filesystem and it is not safe to assume we can skip that. There are likely many edge cases where a replacement realpath() would break down.


Credits

Diagnosing this issue has been a 5-year process. You can see all of the original people involved in this thread: https://github.com/thibaudgg/rb-fsevent/issues/10

@andreyvit deserves special credit for writing a workaround. @rentzsch's and Facebook's function-swapping libraries proved invaluable in confirming the problem. And Kevin Elliott, a DTS Engineer at Apple, deserves an enormous Thank You for sticking with me through a dozen-plus emails and finally believing that something was broken in OS X!
