Remove the device and inode numbers from the API.
As discussed [here], remove the fields which correspond to `st_dev`, `st_ino`,
and `d_ino` in POSIX from the stat and directory entry structs.

 - Device numbers assume the existence of a global device number space,
   which creates implicit relationships between otherwise unrelated
   components.

 - Not all filesystem implementations have these numbers, and some that
   do incur extra implementation cost to retrieve them.

 - These numbers leak potentially sensitive or identifying information from the
   underlying filesystem implementation.

In their place, provide three functions: `is-same-object`, which explicitly
tests whether two handles refer to the same file, and `metadata-hash` and
`metadata-hash-at`, whose results can be compared to test whether two handles
have the same metadata. This doesn't cover all possible use cases for device
and inode numbers, but we can add more functions as the need arises.

[here]: #65 (comment)
sunfishcode committed May 15, 2023
1 parent 3a05fcf commit 6c45b13
Showing 2 changed files with 103 additions and 48 deletions.
82 changes: 56 additions & 26 deletions example-world.md
@@ -429,9 +429,6 @@ filesystem. This does not apply to directories.
<h4><a name="link_count"><code>type link-count</code></a></h4>
<p><code>u64</code></p>
<p>Number of hard links to an inode.
<h4><a name="inode"><code>type inode</code></a></h4>
<p><code>u64</code></p>
<p>Filesystem object serial number that is unique within its file system.
<h4><a name="filesize"><code>type filesize</code></a></h4>
<p><code>u64</code></p>
<p>File size or length of a region within a file.
@@ -595,11 +592,6 @@ merely for alignment with POSIX.</p>
<p><code>u32</code></p>
<p>A stream of directory entries.
<p>This <a href="https://github.com/WebAssembly/WASI/blob/main/docs/WitInWasi.md#Streams">represents a stream of <code>dir-entry</code></a>.</p>
<h4><a name="device"><code>type device</code></a></h4>
<p><code>u64</code></p>
<p>Identifier for a device containing a file system. Can be used in
combination with `inode` to uniquely identify a file or directory in
the filesystem.
<h4><a name="descriptor_type"><code>enum descriptor-type</code></a></h4>
<p>The type of a filesystem object referenced by a descriptor.</p>
<p>Note: This was called <code>filetype</code> in earlier versions of WASI.</p>
@@ -644,14 +636,6 @@ any of the other types specified.
<h5>Record Fields</h5>
<ul>
<li>
<p><a name="directory_entry.inode"><a href="#inode"><code>inode</code></a></a>: option&lt;<a href="#inode"><a href="#inode"><code>inode</code></a></a>&gt;</p>
<p>The serial number of the object referred to by this directory entry.
May be none if the inode value is not known.
<p>When this is none, libc implementations might do an extra <a href="#stat_at"><code>stat-at</code></a>
call to retrieve the inode number to fill their <code>d_ino</code> fields, so
implementations which can set this to a non-none value should do so.</p>
</li>
<li>
<p><a name="directory_entry.type"><code>type</code></a>: <a href="#descriptor_type"><a href="#descriptor_type"><code>descriptor-type</code></a></a></p>
<p>The type of the file referred to by this directory entry.
</li>
@@ -749,14 +733,6 @@ with the filesystem.
<h5>Record Fields</h5>
<ul>
<li>
<p><a name="descriptor_stat.device"><a href="#device"><code>device</code></a></a>: <a href="#device"><a href="#device"><code>device</code></a></a></p>
<p>Device ID of device containing the file.
</li>
<li>
<p><a name="descriptor_stat.inode"><a href="#inode"><code>inode</code></a></a>: <a href="#inode"><a href="#inode"><code>inode</code></a></a></p>
<p>File serial number.
</li>
<li>
<p><a name="descriptor_stat.type"><code>type</code></a>: <a href="#descriptor_type"><a href="#descriptor_type"><code>descriptor-type</code></a></a></p>
<p>File type.
</li>
@@ -1047,7 +1023,11 @@ opened for writing.</p>
</ul>
<h4><a name="stat"><code>stat: func</code></a></h4>
<p>Return the attributes of an open file or directory.</p>
<p>Note: This is similar to <code>fstat</code> in POSIX.</p>
<p>Note: This is similar to <code>fstat</code> in POSIX, except that it does not return
device and inode information. For testing whether two descriptors refer to
the same underlying filesystem object, use <a href="#is_same_object"><code>is-same-object</code></a>. To obtain
additional data that can be used to determine whether a file has been
modified, use <a href="#metadata_hash"><code>metadata-hash</code></a>.</p>
<p>Note: This was called <code>fd_filestat_get</code> in earlier versions of WASI.</p>
<h5>Params</h5>
<ul>
@@ -1059,7 +1039,9 @@ opened for writing.</p>
</ul>
<h4><a name="stat_at"><code>stat-at: func</code></a></h4>
<p>Return the attributes of a file or directory.</p>
<p>Note: This is similar to <code>fstatat</code> in POSIX.</p>
<p>Note: This is similar to <code>fstatat</code> in POSIX, except that it does not
return device and inode information. See the <a href="#stat"><code>stat</code></a> description for a
discussion of alternatives.</p>
<p>Note: This was called <code>path_filestat_get</code> in earlier versions of WASI.</p>
<h5>Params</h5>
<ul>
@@ -1383,3 +1365,51 @@ be used.</p>
<ul>
<li><a name="drop_directory_entry_stream.this"><code>this</code></a>: <a href="#directory_entry_stream"><a href="#directory_entry_stream"><code>directory-entry-stream</code></a></a></li>
</ul>
<h4><a name="is_same_object"><code>is-same-object: func</code></a></h4>
<p>Test whether two descriptors refer to the same filesystem object.</p>
<p>In POSIX, this corresponds to testing whether the two descriptors have the
same device (<code>st_dev</code>) and inode (<code>st_ino</code> or <code>d_ino</code>) numbers.
wasi-filesystem does not expose device and inode numbers, so this function
may be used instead.</p>
<h5>Params</h5>
<ul>
<li><a name="is_same_object.other"><code>other</code></a>: <a href="#descriptor"><a href="#descriptor"><code>descriptor</code></a></a></li>
</ul>
<h5>Return values</h5>
<ul>
<li><a name="is_same_object.0"></a> <code>bool</code></li>
</ul>
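<p>As a rough illustration of the POSIX comparison described above, the following Python sketch compares <code>st_dev</code> and <code>st_ino</code> from <code>fstat</code>. It is not part of wasi-filesystem; the function name <code>is_same_object</code> and the helper <code>demo</code> are hypothetical, chosen only to mirror the interface.</p>

```python
import os
import tempfile


def is_same_object(fd_a: int, fd_b: int) -> bool:
    """POSIX-style check: equal device number and equal inode number."""
    stat_a = os.fstat(fd_a)
    stat_b = os.fstat(fd_b)
    return (stat_a.st_dev, stat_a.st_ino) == (stat_b.st_dev, stat_b.st_ino)


def demo() -> tuple:
    """Open one file twice and a second file once, then compare them."""
    with tempfile.TemporaryDirectory() as tmp:
        first = os.path.join(tmp, "first.txt")
        second = os.path.join(tmp, "second.txt")
        for path in (first, second):
            with open(path, "w") as handle:
                handle.write("data")
        fd_a = os.open(first, os.O_RDONLY)
        fd_b = os.open(first, os.O_RDONLY)
        fd_c = os.open(second, os.O_RDONLY)
        try:
            # (same file opened twice, two different files)
            return is_same_object(fd_a, fd_b), is_same_object(fd_a, fd_c)
        finally:
            for fd in (fd_a, fd_b, fd_c):
                os.close(fd)
```

<p>A host that has these numbers could answer <code>is-same-object</code> with a comparison like this one while never handing the numbers themselves to the guest.</p>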
<h4><a name="metadata_hash"><code>metadata-hash: func</code></a></h4>
<p>Return a hash of the metadata associated with a filesystem object referred
to by a descriptor.</p>
<p>This returns a hash of the last-modification timestamp and file size, and
may also include the inode number, device number, birth timestamp, and
other metadata fields that may change when the file is modified or
replaced.</p>
<p>Implementations are encouraged to provide the following properties:</p>
<ul>
<li>If the file is not modified or replaced, the computed hash value should
usually not change.</li>
<li>If the object is modified or replaced, the computed hash value should
usually change.</li>
<li>The inputs to the hash should not be easily computable from the
computed hash.</li>
</ul>
<p>However, none of these is required.</p>
<h5>Return values</h5>
<ul>
<li><a name="metadata_hash.0"></a> result&lt;(<code>u64</code>, <code>u64</code>), <a href="#error_code"><a href="#error_code"><code>error-code</code></a></a>&gt;</li>
</ul>
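<p>One possible way a host could meet the properties above is sketched below in Python. Everything here is an assumption for illustration rather than the behavior of any real implementation: the choice of keyed BLAKE2b, the per-process secret, the set of hashed fields, and the names <code>metadata_hash</code> and <code>demo</code>.</p>

```python
import hashlib
import os
import struct
import tempfile
from typing import Tuple

# Hypothetical per-process secret. Keying the hash makes it hard to recover
# the inode and device numbers from the returned value.
_SECRET_KEY = os.urandom(16)


def metadata_hash(fd: int) -> Tuple[int, int]:
    """Hash mtime, size, device, and inode into two unsigned 64-bit values."""
    st = os.fstat(fd)
    packed = struct.pack("<qQQQ", st.st_mtime_ns, st.st_size, st.st_dev, st.st_ino)
    digest = hashlib.blake2b(packed, digest_size=16, key=_SECRET_KEY).digest()
    lower, upper = struct.unpack("<QQ", digest)
    return lower, upper


def demo() -> Tuple[bool, bool]:
    """Hash a file twice unchanged, then again after appending to it."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "file.txt")
        with open(path, "w") as handle:
            handle.write("one")
        fd = os.open(path, os.O_RDONLY)
        try:
            before = metadata_hash(fd)
            unchanged = metadata_hash(fd)
            with open(path, "a") as handle:
                handle.write(" two")  # the file size changes, so the hash input changes
            after = metadata_hash(fd)
        finally:
            os.close(fd)
        return before == unchanged, before != after
```

<p>Because the key in this sketch is regenerated for each process, the values are only comparable within one run; a host that wants hashes to stay stable across restarts would need to persist the key or choose a different construction.</p>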
<h4><a name="metadata_hash_at"><code>metadata-hash-at: func</code></a></h4>
<p>Return a hash of the metadata associated with a filesystem object referred
to by a directory descriptor and a relative path.</p>
<p>This performs the same hash computation as <a href="#metadata_hash"><code>metadata-hash</code></a>.</p>
<h5>Params</h5>
<ul>
<li><a name="metadata_hash_at.path_flags"><a href="#path_flags"><code>path-flags</code></a></a>: <a href="#path_flags"><a href="#path_flags"><code>path-flags</code></a></a></li>
<li><a name="metadata_hash_at.path"><code>path</code></a>: <code>string</code></li>
</ul>
<h5>Return values</h5>
<ul>
<li><a name="metadata_hash_at.0"></a> result&lt;(<code>u64</code>, <code>u64</code>), <a href="#error_code"><a href="#error_code"><code>error-code</code></a></a>&gt;</li>
</ul>
69 changes: 47 additions & 22 deletions wit/types.wit
@@ -102,10 +102,6 @@ default interface types {
///
/// Note: This was called `filestat` in earlier versions of WASI.
record descriptor-stat {
/// Device ID of device containing the file.
device: device,
/// File serial number.
inode: inode,
/// File type.
%type: descriptor-type,
/// Number of hard links to the file.
@@ -166,14 +162,6 @@ default interface types {
/// Number of hard links to an inode.
type link-count = u64

/// Identifier for a device containing a file system. Can be used in
/// combination with `inode` to uniquely identify a file or directory in
/// the filesystem.
type device = u64

/// Filesystem object serial number that is unique within its file system.
type inode = u64

/// When setting a timestamp, this gives the value to set it to.
variant new-timestamp {
/// Leave the timestamp set to its previous value.
@@ -187,14 +175,6 @@ default interface types {

/// A directory entry.
record directory-entry {
/// The serial number of the object referred to by this directory entry.
/// May be none if the inode value is not known.
///
/// When this is none, libc implementations might do an extra `stat-at`
/// call to retrieve the inode number to fill their `d_ino` fields, so
/// implementations which can set this to a non-none value should do so.
inode: option<inode>,

/// The type of the file referred to by this directory entry.
%type: descriptor-type,

@@ -479,14 +459,20 @@ default interface types {

/// Return the attributes of an open file or directory.
///
/// Note: This is similar to `fstat` in POSIX.
/// Note: This is similar to `fstat` in POSIX, except that it does not return
/// device and inode information. For testing whether two descriptors refer to
/// the same underlying filesystem object, use `is-same-object`. To obtain
/// additional data that can be used to determine whether a file has been
/// modified, use `metadata-hash`.
///
/// Note: This was called `fd_filestat_get` in earlier versions of WASI.
stat: func(this: descriptor) -> result<descriptor-stat, error-code>

/// Return the attributes of a file or directory.
///
/// Note: This is similar to `fstatat` in POSIX.
/// Note: This is similar to `fstatat` in POSIX, except that it does not
/// return device and inode information. See the `stat` description for a
/// discussion of alternatives.
///
/// Note: This was called `path_filestat_get` in earlier versions of WASI.
stat-at: func(
@@ -794,4 +780,43 @@ default interface types {
/// Dispose of the specified `directory-entry-stream`, after which it may no longer
/// be used.
drop-directory-entry-stream: func(this: directory-entry-stream)

/// Test whether two descriptors refer to the same filesystem object.
///
/// In POSIX, this corresponds to testing whether the two descriptors have the
/// same device (`st_dev`) and inode (`st_ino` or `d_ino`) numbers.
/// wasi-filesystem does not expose device and inode numbers, so this function
/// may be used instead.
is-same-object: func(other: descriptor) -> bool

/// Return a hash of the metadata associated with a filesystem object referred
/// to by a descriptor.
///
/// This returns a hash of the last-modification timestamp and file size, and
/// may also include the inode number, device number, birth timestamp, and
/// other metadata fields that may change when the file is modified or
/// replaced.
///
/// Implementations are encouraged to provide the following properties:
///
/// - If the file is not modified or replaced, the computed hash value should
/// usually not change.
/// - If the object is modified or replaced, the computed hash value should
/// usually change.
/// - The inputs to the hash should not be easily computable from the
/// computed hash.
///
/// However, none of these is required.
metadata-hash: func() -> result<tuple<u64, u64>, error-code>

/// Return a hash of the metadata associated with a filesystem object referred
/// to by a directory descriptor and a relative path.
///
/// This performs the same hash computation as `metadata-hash`.
metadata-hash-at: func(
/// Flags determining the method of how the path is resolved.
path-flags: path-flags,
/// The relative path of the file or directory to inspect.
path: string,
) -> result<tuple<u64, u64>, error-code>
}
