Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Fixed lfs_dir_fetchmatch not understanding overwritten tags
Sometimes small, single line code change hides behind it a complicated story. This is one of those times. If you look at this diff, you may note that this is a case of lfs_dir_fetchmatch not correctly handling a tag that invalidates a callback used to search for some condition, in this case a search for a parent, which is invalidated by a later dir tag overwritting the previous dir pair. But how can this happen? Dir-pair-tags are only overwritten during relocations (when a block goes bad or exceeds the block_cycles config option for dynamic wear-leveling). Other dir operations create new directory entries. And the only lfs_dir_fetchmatch condition that relies on overwrites (as opposed to proper deletes) is when we need to find a directory's parent, an operation that only occurs during a _different_ relocation. And a false _positive_, can only happen if we don't have a parent. Which is really unlikely when we search for directory parents! This bug and minimal test case was found by Matthew Renzelmann. In a unfortunate series of events, first a file creation causes a directory split to occur. This creates a new, orphaned metadata-pair containing our new file. However, the revision count on this metadata-pair indicates the pair is due for relocation as a part of wear-leveling. Normally, this is fine, even though this metadata-pair has no parent, the lfs_dir_find should return ENOENT and continue without error. However, here we get hit by our fetchmatch bug. A previous, unrelated relocation overwrites a pair which just happens to contain the block allocated for a new metadata-pair. When we search for a parent, lfs_dir_fetchmatch incorrectly finds this old, outdated metadata pair and incorrectly tells our orphan it's found its parent. As you can imagine the orphan's dissapointment must be immense. So an unfortunately timed dir split triggers a relocation which incorrectly finds a previously written parent that has been outdated by another relocation. As a solution we can outdate our found tag if it is overwritten by an exact match during lfs_dir_fetchmatch. As a part of this I started adding a new set of tests: tests/test_relocations, for aggressive relocations tests. This is already by appended to by another PR. I suspect relocations is relatively under-tested and is becoming more important due to recent improvements in wear-leveling.
- Loading branch information
Hi Geky, I found a bug about infinite loop, related to this line, and create an issue #624 . If not call lfs_alloc_ack here, this infinite loop will not happen.
My colleague make a pull request #620 , which removes lfs_alloc_ack . But I don't think the repairment is the best. Could I ask if calling lfs_alloc_ack here is necessary for fixing 'fs_dir_fetchmatch not understanding overwritten tags' ? @geky