What does server state contain exactly? #3

Open
dimakuv opened this issue Mar 12, 2021 · 6 comments

Comments

dimakuv commented Mar 12, 2021

From the README and diagrams, it's hard to understand what kind of Server state is kept. Also, it's not obvious what the relationship between Dentries and Handles is.

From what I understand:

  • Server keeps all known dentries; they are never removed
    • Each dentry has a canonical path + some file metadata
    • Each dentry has a list of associated handles (? I imagine this is needed to propagate things like "file was removed")
  • Server keeps all known handles
    • Handles are removed when closed by all clients
    • Each handle references a corresponding dentry (which may be in "negative" state if file was removed)

Also:

  • Several clients may use the same handle (depicted on the second diagram)
    • All their accesses to this handle will be synchronized by the server (including the position pointer)
  • Several clients may use two handles backed by the same dentry (not depicted on diagrams, but I guess similar to the first diagram)
    • All their accesses to these handles with the same underlying dentry will be synchronized by the server (but not the position pointer)
  • Clients that use handles backed by different dentries are never synchronized by the server (except for corner cases of rename and sendfile and maybe some more).

Is this understanding correct?

Author

dimakuv commented Mar 12, 2021

Does it also mean that forked children don't need to obtain Dentries from the parent (which currently happens in Graphene)? Because we have a centralized storage of all Dentries in Server.

Owner

pwmarcz commented Mar 12, 2021

> From the README and diagrams, it's hard to understand what kind of Server state is kept.

My idea was that the server doesn't store the full objects, just locks, along with the information that needs to be synchronized. The diagrams actually contain full information being exchanged: these are object IDs, and (in case of file handles) the position.

For Dentries especially, no data actually needs to be kept: the assumption is that a dentry can be invalidated, and the client will reload it once it actually needs it. I'm not sure if that's a good idea, but it seems in line with the fact that dentries are only a cache, not the source of truth.
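The invalidate-then-reload idea can be sketched roughly as follows. This is only an illustration, not code from the project: `DentryCache`, `invalidate` and `lookup` are hypothetical names, and the host filesystem stands in for the source of truth.

```python
import os
import tempfile


class DentryCache:
    """Hypothetical client-side cache: the filesystem stays the source of truth."""

    def __init__(self):
        self._entries = {}  # path -> cached metadata
        self.reloads = 0    # number of times the filesystem was consulted

    def invalidate(self, path):
        # Would be triggered when the server reports that another client
        # changed this dentry.
        self._entries.pop(path, None)

    def lookup(self, path):
        # Reload lazily: only touch the filesystem when the entry is missing.
        if path not in self._entries:
            self.reloads += 1
            try:
                st = os.stat(path)
                self._entries[path] = {"exists": True, "size": st.st_size}
            except FileNotFoundError:
                # Negative dentry: the path is known not to exist.
                self._entries[path] = {"exists": False, "size": None}
        return self._entries[path]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "asdf.txt")
    with open(path, "w") as f:
        f.write("abc")

    cache = DentryCache()
    first = cache.lookup(path)["size"]   # loaded from the filesystem
    with open(path, "a") as f:
        f.write("def")                   # stands in for another client's write
    stale = cache.lookup(path)["size"]   # still the cached value
    cache.invalidate(path)               # server-initiated invalidation
    fresh = cache.lookup(path)["size"]   # reloaded on demand
```

The point of the sketch is that the server never needs the metadata itself, only a way to tell clients that their cached copy is no longer valid.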

To go through your post:

> Server keeps all known dentries; they are never removed

Server keeps only "sync handles" / locks[1] for known dentries, i.e. information about which clients hold them. They are removed once the last client forgets them, or exits. (source code: deleting unused handle)
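A minimal sketch of that bookkeeping, assuming the server only tracks which clients hold each object ID. The class and method names are hypothetical and are not taken from the linked source.

```python
class SyncServer:
    """Hypothetical server state: object ID -> set of clients holding it."""

    def __init__(self):
        self.holders = {}

    def register(self, client_id, object_id):
        # A client announces that it now holds this dentry or handle.
        self.holders.setdefault(object_id, set()).add(client_id)

    def forget(self, client_id, object_id):
        clients = self.holders.get(object_id)
        if clients is None:
            return
        clients.discard(client_id)
        if not clients:
            # Last holder is gone: drop the server-side entry entirely.
            del self.holders[object_id]

    def client_exit(self, client_id):
        # A client that exits implicitly forgets everything it held.
        for object_id in list(self.holders):
            self.forget(client_id, object_id)


server = SyncServer()
server.register("A", "dentry:/asdf.txt")
server.register("B", "dentry:/asdf.txt")
server.forget("A", "dentry:/asdf.txt")
still_tracked = "dentry:/asdf.txt" in server.holders   # B still holds it
server.client_exit("B")
removed = "dentry:/asdf.txt" not in server.holders     # last holder exited
```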

> Each dentry has a canonical path + some file metadata

Yes, but that metadata can be stored on the client only, and reloaded from the filesystem if necessary.

> Each dentry has a list of associated handles (? I imagine this is needed to propagate things like "file was removed")

Good point... This design doesn't handle unlinking files that are open. I guess a dentry could have a handle_count, and it would be decreased on handle close?
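One way the `handle_count` idea might look, as a sketch only. The `unlinked` and `data_released` fields and the rule "release once unlinked and no handles remain" are assumptions made for illustration.

```python
class Dentry:
    """Hypothetical dentry with a count of open handles."""

    def __init__(self, path):
        self.path = path
        self.handle_count = 0
        self.unlinked = False
        self.data_released = False  # True once the file can really go away

    def open(self):
        self.handle_count += 1

    def close(self):
        if self.handle_count == 0:
            raise ValueError("close without matching open")
        self.handle_count -= 1
        self._maybe_release()

    def unlink(self):
        # The name disappears immediately, the data only when unused.
        self.unlinked = True
        self._maybe_release()

    def _maybe_release(self):
        if self.unlinked and self.handle_count == 0:
            self.data_released = True


d = Dentry("/asdf.txt")
d.open()
d.open()
d.unlink()
released_while_open = d.data_released    # two handles are still open
d.close()
d.close()
released_after_close = d.data_released   # last handle closed
```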

> Server keeps all known handles

Same as above: there's no need to keep/synchronize information about all handles, only locking and the attributes that can be changed by a client.

> Handles are removed when closed by all clients

Yes. When the last client closes a handle with a given ID, it's automatically removed by the server.

> Each handle references a corresponding dentry (which may be in "negative" state if file was removed)
> Several clients may use the same handle (depicted on the second diagram)
> All their accesses to this handle will be synchronized by the server (including the position pointer)
> Several clients may use two handles backed by the same dentry

Yes.
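To illustrate the difference between sharing a handle and sharing only a dentry, here is a rough sketch in which the server keeps one position per handle ID. All names are hypothetical, and the per-handle lock is just one possible way to serialize access.

```python
import threading


class HandleState:
    """Hypothetical per-handle server state: position, lock, and users."""

    def __init__(self, dentry_id):
        self.dentry_id = dentry_id
        self.pos = 0
        self.lock = threading.Lock()
        self.clients = set()


class HandleServer:
    def __init__(self):
        self.handles = {}  # handle ID -> HandleState

    def open(self, client_id, handle_id, dentry_id):
        state = self.handles.setdefault(handle_id, HandleState(dentry_id))
        state.clients.add(client_id)

    def advance(self, handle_id, nbytes):
        # Reserve nbytes at the shared position; return where the I/O starts.
        state = self.handles[handle_id]
        with state.lock:
            start = state.pos
            state.pos += nbytes
            return start

    def close(self, client_id, handle_id):
        state = self.handles[handle_id]
        state.clients.discard(client_id)
        if not state.clients:
            del self.handles[handle_id]  # last user closed the handle


hs = HandleServer()
hs.open("A", "h1", "/asdf.txt")
hs.open("B", "h1", "/asdf.txt")   # same handle: position is shared
hs.open("C", "h2", "/asdf.txt")   # same dentry, separate handle
start_a = hs.advance("h1", 10)
start_b = hs.advance("h1", 5)     # continues where A stopped
start_c = hs.advance("h2", 3)     # independent position
hs.close("A", "h1")
h1_alive = "h1" in hs.handles
hs.close("B", "h1")
h1_gone = "h1" not in hs.handles
```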

> Clients that use handles backed by different dentries are never synchronized by the server (except for corner cases of rename and sendfile and maybe some more).

Yes, except for some initial communication with the server, to establish that they are using a given handle and dentry.

> Does it also mean that forked children don't need to obtain Dentries from the parent (which currently happens in Graphene)? Because we have a centralized storage of all Dentries in Server.

Right now, they still need to, because the server actually doesn't store full dentry data. But maybe that's a mistake, and any object (handle/dentry) should actually be fully rebuildable from server data? I thought that's unnecessary complexity, but we already have it in the checkpointing system.

Owner

pwmarcz commented Mar 12, 2021

So yeah... it looks like I need to think about how it all fits with fork / checkpointing system.

I would encourage you to take a look at the source (or try running it), but I realize you probably don't have too much time today. In any case, thank you very much for the questions, they're very helpful!

mkow commented Mar 18, 2021

> Good point... This design doesn't handle unlinking files that are open. I guess a dentry could have a handle_count, and it would be decreased on handle close?

Hmm, if you don't have inodes, how would you handle the following?

  1. A creates and opens asdf.txt.
  2. A unlinks asdf.txt, but still keeps the FD.
  3. B creates and opens asdf.txt.
  4. A and B should work on different files at this point.
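For reference, the scenario above can be reproduced against a plain host filesystem. The sketch below assumes a Linux/POSIX host and uses a single process to play both A and B; it shows the behaviour any design would have to match.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "asdf.txt")

    # 1. A creates and opens asdf.txt.
    fd_a = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    os.write(fd_a, b"from A")

    # 2. A unlinks asdf.txt, but still keeps the FD.
    os.unlink(path)

    # 3. B creates and opens asdf.txt.
    fd_b = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    os.write(fd_b, b"from B")

    # 4. A and B now work on different files: different inodes, different data.
    same_inode = os.fstat(fd_a).st_ino == os.fstat(fd_b).st_ino
    data_a = os.pread(fd_a, 100, 0)
    data_b = os.pread(fd_b, 100, 0)

    os.close(fd_a)
    os.close(fd_b)
```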

Owner

pwmarcz commented Mar 19, 2021

So, first of all: regardless of internal representation, how do we handle this for a file mounted from the host? We cannot delete it immediately.

I know of a solution for a similar problem in FUSE: when deleting a file that is open, rename it to <file>.fuse_hiddenXXXX, and only really delete it when we close all handles to it.
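A rough sketch of that rename-on-unlink trick, applied to files in a temporary directory. The `HostMount` and `FileRecord` classes and the `.hiddenNNNN` suffix are hypothetical stand-ins; this is not how FUSE itself is implemented.

```python
import itertools
import os
import tempfile


class FileRecord:
    """Hypothetical record for one underlying file on the host."""

    def __init__(self, disk_path):
        self.disk_path = disk_path  # current name on the host
        self.handles = 0
        self.unlinked = False


class HostMount:
    def __init__(self):
        self.visible = {}  # visible path -> FileRecord
        self._counter = itertools.count(1)

    def open(self, path):
        rec = self.visible.get(path)
        if rec is None:
            open(path, "a").close()  # create the file if it does not exist
            rec = self.visible[path] = FileRecord(path)
        rec.handles += 1
        return rec

    def unlink(self, path):
        rec = self.visible.pop(path, None)
        if rec is None or rec.handles == 0:
            os.unlink(path)  # nobody has it open: delete right away
            return
        # Still open: hide it under another name instead of deleting it.
        hidden = f"{path}.hidden{next(self._counter):04d}"
        os.rename(path, hidden)
        rec.disk_path = hidden
        rec.unlinked = True

    def close(self, rec):
        rec.handles -= 1
        if rec.handles == 0 and rec.unlinked:
            os.unlink(rec.disk_path)  # last handle closed: really delete


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "asdf.txt")
    mount = HostMount()
    rec_a = mount.open(path)
    mount.unlink(path)
    gone_after_unlink = not os.path.exists(path)
    hidden_kept = os.path.exists(rec_a.disk_path)
    rec_b = mount.open(path)                 # a new file under the old name
    distinct = rec_a.disk_path != rec_b.disk_path
    mount.close(rec_a)
    hidden_removed = not os.path.exists(rec_a.disk_path)
    visible_survives = os.path.exists(path)
    mount.close(rec_b)
```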

In terms of internal representation, I think we can do a similar thing with dentries: for instance, mark the old dentry as "hidden" and superseded by the new one. When opening a new file, you would traverse this link, same as you traverse a symlink.
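One possible reading of the "hidden and superseded" idea, sketched with hypothetical names (`DentryNode`, `Namespace`, `superseded_by`). A fresh lookup follows the chain of hidden dentries, while an existing handle keeps pointing at the old one.

```python
class DentryNode:
    """Hypothetical dentry that can be hidden and superseded by a newer one."""

    def __init__(self, path):
        self.path = path
        self.hidden = False
        self.superseded_by = None


def resolve(dentry):
    # Follow the chain, much like following a symlink. Returns None when the
    # newest dentry for this name has been unlinked and not replaced.
    while dentry.hidden:
        if dentry.superseded_by is None:
            return None
        dentry = dentry.superseded_by
    return dentry


class Namespace:
    def __init__(self):
        self.table = {}  # path -> first dentry ever created for that path

    def lookup(self, path):
        head = self.table.get(path)
        return resolve(head) if head is not None else None

    def create(self, path):
        if self.lookup(path) is not None:
            raise FileExistsError(path)
        new = DentryNode(path)
        head = self.table.get(path)
        if head is None:
            self.table[path] = new
        else:
            tail = head
            while tail.superseded_by is not None:
                tail = tail.superseded_by
            tail.superseded_by = new  # the old dentry now points at the new one
        return new

    def unlink(self, path):
        current = self.lookup(path)
        if current is not None:
            current.hidden = True


ns = Namespace()
dentry_a = ns.create("asdf.txt")      # A creates and opens the file
ns.unlink("asdf.txt")                 # A unlinks it but keeps dentry_a
free_name = ns.lookup("asdf.txt") is None
dentry_b = ns.create("asdf.txt")      # B creates a new file with that name
lookup_sees_b = ns.lookup("asdf.txt") is dentry_b
a_points_to_b = dentry_a.superseded_by is dentry_b
```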

I agree it sounds pretty hacky, and might make a good case for introducing inodes. On the other hand, I'm still not sure if it's justification enough, as it would make the server state more complicated.

mkow commented Mar 20, 2021

I'm fine with this hack, just please handle this scenario correctly :)
