-
Notifications
You must be signed in to change notification settings - Fork 247
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rework record logic #308
rework record logic #308
Conversation
if [nil, '', '.'].include?(dirname) && basename == '.' | ||
@paths[dir.to_s]['.'] = Hash[children.map { |child| [child, {} ] } ] | ||
else | ||
@paths[dir.to_s][rel_path] ||= {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't think the children here is needed because they'll be iterated during _fast_build
anyway.
Also Directory::scan
would touch each entry populating new directories and their contents.
Also might be a mistake here either way as it only does the children for the root folder?
Lastly the bug with the moving of a folder not showing the old folder in the removed list is still present as the folder itself is not added to its parent directory's hash. This was missing from my patch too. I did a test where in add_dir
it adds the directory hash as well as adding itself to the hash of the parent directory and that seemed to work.
So in essencing adding directory takes two actions:
- Set up hash for contents
- Add it to its parent directory's contents hash
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_fast_build
should set things up so that '.' actually contains entries when using dir_entries
- that's what add_dir
is for here. This is for the previous
to return something in Directory.scan
on the first call.
Also might be a mistake here either way as it only does the children for the root folder?
That's why I marked it with TODOs - it should be part of the initial build only.
Lastly the bug with the moving of a folder not showing the old folder in the removed list is still present
This needs a separate acceptance test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above I feel there's separation of thought between '.' and other directories that I don't understand why need treating differently.
Other folders are not populated here, they are populated as each entry is examined in _fast_try_file for example. So why does '.' need populating here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@driskell - I'm just on the fence about it, because so many other things are wrong here. It simplifies the code, but complicates the internal structure - which can get in the way if Record
is to handle directories. The Record
tests need rework anyway.
Just assume that '.' is fine for now - and I take back what I said about it.
1 similar comment
73bacf3
to
a9e8bb9
Compare
a9e8bb9
to
9281b11
Compare
120f26d
to
67f50e7
Compare
1 similar comment
I haven't forgotten entirely about this. Just very busy and vacation time! I hope to find some time to look through this and give it a trial run soon. Thanks for all the hard work. |
@driskell - I think the next step would be adding directory timestamps. And with TCP out of the picture, it should be easier to make any patches you need (like the ones you already have). |
root['.'] = entries | ||
else | ||
root[rel_path] ||= {} | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fails to add the directory name to the parent folder's entry list.
For example, if we have a simple directory containing one folder, path
, and within that, folder
, and within that a file called README
, the resulting tree would be:
'.' => {'path' => {}}
'path' => {}
'path/folder' => {'README' => {<file-data>}}
It should be:
'.' => {'path' => {}}
'path' => {'folder' => {}}
'path/folder' => {'README' => {<file-data>}}
Otherwise when Directory::scan
is processing deletion for path
, it will never recurse folder
and not actually raise any deletion events, which it should do for README
This seems to work, and I'll look at making a test unless you're able to find it easier?
def add_dir(dir, rel_path)
rel_path = '.' if [nil, '', '.'].include? rel_path
dirname, basename = Pathname(rel_path).split.map(&:to_s)
basename = '.' if [nil, '', '.'].include? basename
root = (@paths[dir.to_s] ||= {})
dirname = '.' if [nil, '', '.'].include?(dirname)
entries = (root[dirname] || {})
entries.merge!(basename => {}) if basename != '.'
root[dirname] = entries
end
I'll leave you with the above for the moment while I test out the recursion processing. I've pulled my own recursion descoping onto the top of your changes and so far so good. Just needs tests I guess! |
I have one question though: Should events for empty directories that are created and removed be produced? I'm not 100% sure if the linux side of things does or if it is meant to. If it's not meant to - then that's another feature for another day. |
Would you advise starting this again on the new 3.x master branch? |
Absolutely! I pretty much want to abandon all active work on 2.x - the upgrade for users should be painless (except for TCP, which has to be redesigned anyway), so there's nothing to lose. Also, refactoring and fixing (including adding specs) should be easier, since each Record object represents a single directory (e.g.
Ultimately, I'd like to create a separate gem to handle dealing with events in a generic way. The idea is: the backend adapter generates an "invalidation" event, saying e.g. "directory foo has changed and everything inside it" or "something was added to directory foo" or more specific, e.g. "file bar's content was changed". And based on the invalidation, the memory "snapshot" of the filesystem would be invalidated and updated (better backends will have more specific events, which means less invalidating, which means faster updates). So ideally, the Record object should always represent what's physically on the disk - and events alone should be sufficient enough for the Record to at least know what to scan and update (so if at any time the internal Record is different from what's on the disk - and there are no pending events - then there's a bug somewhere). The sequence would be:
The idea is for Listen to respond just to added/removed/changed files currently. Whether it detects added/removed/changed directories is more in the area of optimization. Ideally, the end result should reflect what's on the filesystem, so even if the directories aren't "reported" yet, the closer to the filesystem things end up, the better (because it allows optimizing and bugfixing). In short - once Listen finishes handling adapter events, the Record should match what's on the filesystem, or it's a bug. Let me know if you need anything or you have something working for Listen 3 - anything that can be considered a bug will have high priority anyway (I'll want to add tests and release ASAP). Oh, and since Celluloid is no longer a dependency, you can refactor as much as you like (the _fast prefixes in method names no longer make sense, since no locking is necessary and processing is sequential too). And since TCP is gone, you can access the filesystem at any time you want for any reason (the goal is correctness first - then we can use this correctness to properly optimize later). |
Nice. I'll start some work on merging in these changes into the master branch and I'll formulate a list of specs that need creating to prevent regression on the cases I know about, and ultimately document them! For the invalidation method - that sounds perfect - though I would ensure there is a :invalidate_directory as well as :invalidate_tree. Though I believe I would be accurate in saying the only backend that would need :invalidate_tree is the polling backend and all other backends correctly state where in a tree the change is. I'll put something forward when I can. Though holidays are coming so may be a couple weeks! I'll keep a watch on the repository too in case any other changes occur that might conflict or supersede the plan we've outlined here. |
Yes - very much so. In fact, the types of invalidation events depend entirely on whatever the backend provides. E.g. Polling is pretty much "invalidate everything". FSEvent has a "files" mode, which I believe isn't the default because is spams with a mass of events - but if it was enabled, the invalidations would be much more fine-grained than otherwise. Invalidations would abstract away the adapters - and it would be up to the Record to decide whether it's worth micro-optimizing or just scanning the whole tree. They would also help avoiding breaking things, since I wouldn't have to care how OSX adapter works, as long as it provide an invalidation with enough "hints" on what could've changed. And they would help understand other drivers too, e.g. an fsevent on a dir doesn't say much, but saying that a directories contents (children - files or dirs) have changed allows me to simulate the same scenario under Linux - without needing to understand how the OSX adapter works in specific scenarios. So invalidations would be a "common language" among drivers - so adapters would benefit from each other's optimizations (but without the adapter-specific complexity). I don't plan any drastic changes in Listen (3.x was already too much work), so take your time with this - I'll probably only be fixing bugs if they are discovered. |
A lot of work has been put into this. The current code is way too difficult to test, so it would be a waste of your time to improve this. I need to switch to a new architure in Listen first: #381 But the outcome should be: much easier to optimize and maintain things on OSX. The upside is: if you're even still interested in this, the new architecture will allow new features, such as:
|
@e2 please rebase on master if you think this work is worth continuing :) |
@e2 could you revive this PR? |
Fixed tests and Record#build logic for #307