-
-
Notifications
You must be signed in to change notification settings - Fork 174
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add vfs
API and memory_vfs
implementation
#639
Add vfs
API and memory_vfs
implementation
#639
Conversation
Signed-off-by: Klim Tsoutsman <klim@tsoutsman.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall: great work imo 👌
/// Gets the node at the `path`. | ||
/// | ||
/// The path consists of a file system id, followed by a colon, followed by the | ||
/// path. For example, `tmp:/a/b` or `nvme:/foo/bar`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The use of colon here could be debated, I'm not particularly against it though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I said in the PR, I just copied Redox but this is definitely up for debate if anyone has any better ideas.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we used the first path node name as fs name (/root-fs/usr/bin/gcc
or /usb-drive-2/photos/camping-trip-2016
for instance), that would ease filesystem enumeration but could also result in confusement and vulnerabilities (php's path traversal issues come to mind).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'd probably prefer to have a more Unix-style set of mount points, but for simplicity's sake it doesn't have to be arbitrary. We could have all FSes mounted at /
or /mnt/
, for example.
/// A file system path. | ||
#[derive(Clone, Copy, Debug, Eq, Hash, PartialEq)] | ||
pub struct Path<'a> { | ||
inner: &'a str, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could allow filesystems to use slashes in node names if we used an iterable here directly rather than a consolidated path, a vec for instance
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Allowing slashes in node names sounds like asking for trouble. Especially because the std
fs API uses string paths and so we'd somehow have to escape slashes in node names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah we cannot allow slashes in fs node names, that's a fairly typical limitation.
Signed-off-by: Klim Tsoutsman <klim@tsoutsman.com>
Signed-off-by: Klim Tsoutsman <klim@tsoutsman.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good start so far. Agreed that we obviously do need a way to register and support multiple distinct filesystem instances. I also like the idea of forcing interior mutability from the primary VFS interface. There is one major issue I see that requires a redesign: we should not and cannot use Arc
everywhere.
Aside from being inefficient, copious usage of Arc
only really works in the context of an in-memory FS; a real physical FS wouldn't be able to use Arc
s to represent a relationship between nodes when stored on disk, e.g., a directory entry containing multiple file entries. For that, inodes are commonly used, but we don't have to explicitly include the concept of inodes in the VFS layer. Instead, we just need an abstract way to represent the concept of one entry that refers to one or more other entries.
So i'd recommend that we create an actual concrete outer type for File
and Directory
instead of using trait objects like Arc<dyn File>
and Arc<dyn Directory>
, and then the inner fields of that struct are dependent on what the real filesystem needs to represent the inter-node relationships. For a simple heap-based fs (like the currentheapfile
), the inner field of the top-level Directory
could indeed by an Arc
reference to a set of directory entries, which would make sense because it's an efficient way to represent those relationships when you know everything is in heap memory.
Other than that, I left a few small comments on existing discussions.
Since you're focused on other things right now, I'd be happy to take a crack at extending this PR with a new attempt at the design I've described above. Feedback also welcome here or on Discord.
Hm, yea that's definitely an issue. Even if we have concrete outer types, eventually we will have to use trait objects. Generics won't work because the type of filesystem returned by something like I don't have any particularly good ideas at the moment so definitely feel free to extend the PR. |
I've been thinking about this a bit, and I've come up with a potential solution. We can't guarantee that the nodes returned from If we do something like this: pub trait FileSystemImplementation {
type File: File;
type FileRef: FileRef<File = Self::File>;
type Directory: Directory;
fn root(&self) -> Self::Directory;
}
pub trait FileRef {
type File: File;
fn upgrade(self) -> Option<Arc<dyn File>>;
}
pub struct FileSystem {
inner: Box<
dyn FileSystemImplementation<
File = dyn File,
FileRef = dyn FileRef<File = dyn File>,
Directory = dyn Directory,
>,
>,
opened: HashMap<Path, OpenFile>,
}
pub struct OpenFile {
inner: Arc<dyn File>,
}
pub trait File {
// ...
}
pub trait Directory {
// ...
}
pub struct Path {
// ..
} We can enforce that files aren't deleted if they are open. File system operations would be done on the
Although this implementation also extensively uses Hypothetically, we can also just return a |
I've played around with this some more, and I don't think it will be possible till traits with GATs are object safe. |
This PR reworks the file system API. It also includes a port of
memfs
. The new API should also be simple to integrate withfatfs
and other physical file systems.The API uses forest mounting (i.e. Windows flavour), which is explained in
vfs
's crate-level documentation. I chose this mounting method because it's simpler to implement.Key differences from
fs_node
:FileSystem
trait - The critical thing lacking fromfs_node
is a clear distinction between different file systems, which is what this trait and the mount points aim to solve.insert_file
andinsert_directory
(and their entry counterparts) take a path (or name) and return a new empty node. This means consumers can't add nodes from one file system into another, e.g. memory file into a FAT directory. Copying should be implemented by creating a new node and then writing all the bytes from the source file into the destination file.Some issues:
.
,..
,/
get_node
- I chose thefile_system:/path/to/file
because it's simple and is similar to what Redox uses, but I don't have any excellent reasons to use it..
and..
in paths relies on thetrait_upcasting
feature, which is incomplete. This should be resolved soon Stablizetrait_upcasting
feature rust-lang/rust#101718 (aaaaaaand it's closed :))