[LibOS] Add flock syscall #1416

Merged: 1 commit, Jun 30, 2023
15 changes: 15 additions & 0 deletions Documentation/manifest-syntax.rst
@@ -592,6 +592,21 @@ all), then the ``struct`` key must be an empty string or not exist at all::
encrypted or integrity-protected with a key pre-shared between Gramine and
the device.

Experimental flock (BSD-style locks) support
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

::

sys.experimental__enable_flock = [true|false]
(Default: false)

This syntax enables the ``flock`` system call in Gramine.

.. warning::
This syscall is still under development and may contain security
vulnerabilities. This is temporary; the syscall will be enabled by default in
the future after thorough validation and this syntax will be removed then.
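
As an illustration of what the option unlocks, here is a minimal sketch of application code that takes a BSD-style lock with ``flock``. Under Gramine it is assumed to work only when the manifest sets ``sys.experimental__enable_flock = true``. The helper names ``open_and_lock_exclusive`` and ``unlock_and_close`` are hypothetical and not part of this patch.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical helper: open `path` (creating it if needed) and try to take an exclusive
 * BSD lock without blocking. Returns the locked FD on success, or -1 with errno set
 * (EWOULDBLOCK if another open file description already holds a conflicting lock). */
int open_and_lock_exclusive(const char* path) {
    assert(path != NULL);
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;
    if (flock(fd, LOCK_EX | LOCK_NB) < 0) {
        int saved_errno = errno; /* close() may overwrite errno */
        close(fd);
        errno = saved_errno;
        return -1;
    }
    return fd;
}

/* Hypothetical helper: release the lock explicitly, then close the FD. */
int unlock_and_close(int fd) {
    int ret = flock(fd, LOCK_UN);
    int saved_errno = errno;
    close(fd);
    errno = saved_errno;
    return ret;
}
```

Two separate ``open`` calls on the same file produce independent open file descriptions, so the second non-blocking attempt is expected to fail until the first lock is released.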

SGX syntax
----------

69 changes: 45 additions & 24 deletions libos/include/libos_fs_lock.h
@@ -4,7 +4,8 @@
*/

/*
* File locks. Currently only POSIX locks are implemented.
* File locks. Both POSIX locks (fcntl syscall) and BSD locks (flock syscall) are implemented via
* a common struct `libos_file_lock`. See `man fcntl` and `man flock` for details.
*/

#pragma once
@@ -22,8 +23,8 @@ struct libos_dentry;
int init_fs_lock(void);

/*
* File locks. Currently describes only POSIX locks (also known as advisory record locks). See `man
* fcntl` for details.
* File locks. Describes both POSIX locks aka advisory record locks (fcntl syscall) and BSD locks
* (flock syscall). See `man fcntl` and `man flock` for details.
*
* The current implementation works over IPC and handles all requests in the main process. It has
* the following caveats:
@@ -32,28 +33,36 @@ int init_fs_lock(void);
* lock is uncontested.
* - The main process has to be able to look up the same file, so locking will not work for files in
* local-process-only filesystems (tmpfs).
* - There is no deadlock detection (EDEADLK).
* - The lock requests cannot be interrupted (EINTR).
* - The locks work only on files that have a dentry (no pipes, sockets etc.)
* - The locks work only on files that have a dentry (no pipes, sockets etc.).
* - Only for POSIX (fcntl) locks: no deadlock detection (EDEADLK).
*/

enum libos_file_lock_family {
FILE_LOCK_UNKNOWN, /* this is only to catch uninitialized-variable errors */
FILE_LOCK_POSIX,
FILE_LOCK_FLOCK,
};

DEFINE_LISTP(libos_file_lock);
DEFINE_LIST(libos_file_lock);
struct libos_file_lock {
/* Lock family: FILE_LOCK_POSIX, FILE_LOCK_FLOCK */
enum libos_file_lock_family family;

/* Lock type: F_RDLCK, F_WRLCK, F_UNLCK */
int type;

/* First byte of range */
uint64_t start;

/* Last byte of range (use FS_LOCK_EOF for a range until end of file) */
uint64_t end;

/* PID of process taking the lock */
IDTYPE pid;

/* List node, used internally */
LIST_TYPE(libos_file_lock) list;

/* FILE_LOCK_POSIX fields */
uint64_t start; /* First byte of range */
uint64_t end; /* Last byte of range (use FS_LOCK_EOF for a range until end of file) */
IDTYPE pid; /* PID of process taking the lock */

/* FILE_LOCK_FLOCK fields */
uint64_t handle_id; /* Unique handle ID using which the lock is taken */
};
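
To make the ownership rules concrete, the following is a simplified sketch of a conflict check over a struct shaped like `libos_file_lock`. It is not the code from this patch: the `sketch_*` names are hypothetical, and it assumes that locks from different families never conflict with each other (as on Linux for local files), which the header does not state.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the types in this header. */
enum sketch_family { SKETCH_POSIX, SKETCH_FLOCK };

struct sketch_lock {
    enum sketch_family family;
    int type;           /* F_RDLCK or F_WRLCK */
    uint64_t start;     /* POSIX only: first byte of range */
    uint64_t end;       /* POSIX only: last byte of range (inclusive) */
    uint32_t pid;       /* POSIX only: owner is the process */
    uint64_t handle_id; /* flock only: owner is the handle */
};

/* Returns true if `a` and `b` cannot be held at the same time. */
bool sketch_locks_conflict(const struct sketch_lock* a, const struct sketch_lock* b) {
    assert(a && b);
    if (a->family != b->family)
        return false; /* assumption: families are independent of each other */
    if (a->type != F_WRLCK && b->type != F_WRLCK)
        return false; /* two read locks always coexist */
    if (a->family == SKETCH_POSIX) {
        if (a->pid == b->pid)
            return false; /* same owner: the new lock replaces the old one */
        return a->start <= b->end && b->start <= a->end; /* ranges overlap */
    }
    /* flock locks cover the whole file; only the owning handle matters */
    return a->handle_id != b->handle_id;
}
```

The design point the sketch highlights is that the owner of a POSIX lock is a PID (plus a byte range), while the owner of a BSD lock is a handle ID with no range at all.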

/*!
@@ -65,13 +74,19 @@ struct libos_file_lock {
*
* This is the equivalent of `fcntl(F_SETLK/F_SETLKW)`.
*
* If `file_lock->type` is `F_UNLCK`, the function will remove any locks held by the given PID for
* the given range. Removing a lock never waits.
* If `file_lock->type` is `F_UNLCK`, the function will remove locks as follows:
* - For POSIX (fcntl) locks, remove all POSIX locks held by the given PID for the given range.
* - For BSD (flock) locks, remove all BSD locks held by the given handle ID.
*
* Removing a lock never waits.
*
* If `file_lock->type` is `F_RDLCK` or `F_WRLCK`, the function will create a new lock for the given
* PID and range, replacing the existing locks held by the given PID for that range. If there are
* conflicting locks, the function either waits (if `wait` is true), or fails with `-EAGAIN` (if
* `wait` is false).
* If `file_lock->type` is `F_RDLCK` or `F_WRLCK`, the function will create a new lock as follows:
* - For POSIX (fcntl) locks, for the given PID and range, replace the existing POSIX locks held by
* the given PID for that range.
* - For BSD (flock) locks, replace the existing BSD locks held by the given handle ID.
*
* If there are conflicting locks, the function either waits (if `wait` is true), or fails with
* `-EAGAIN` (if `wait` is false).
*/
int file_lock_set(struct libos_dentry* dent, struct libos_file_lock* file_lock, bool wait);

@@ -84,14 +99,20 @@ int file_lock_set(struct libos_dentry* dent, struct libos_file_lock* file_lock,
*
* This is the equivalent of `fcntl(F_GETLK)`.
*
* The function checks if there are locks by other PIDs preventing the proposed lock from being
* placed. If the lock could be placed, `out_file_lock->type` is set to `F_UNLCK`. Otherwise,
* `out_file_lock` fields (`type`, `start, `end`, `pid`) are set to details of a conflicting lock.
* The function checks if there are conflicting locks:
* - For POSIX (fcntl) locks, check for other PIDs preventing the proposed lock from being placed.
* - For BSD (flock) locks, check for other handle IDs preventing the proposed lock from being
* placed.
*
* If the lock could be placed, `out_file_lock->type` is set to `F_UNLCK`. Otherwise,
* `out_file_lock` fields (`type`, `start`, `end`, `pid`, `handle_id`) are set to details of a
* conflicting lock.
*/
int file_lock_get(struct libos_dentry* dent, struct libos_file_lock* file_lock,
struct libos_file_lock* out_file_lock);

/* Removes all locks for a given PID. Should be called before process exit. */
/* Removes all locks for a given PID. Applicable only for POSIX locks. Should be called before
* process exit. */
int file_lock_clear_pid(IDTYPE pid);

/*!
16 changes: 16 additions & 0 deletions libos/include/libos_handle.h
@@ -134,6 +134,22 @@ struct libos_handle {
enum libos_handle_type type;
bool is_dir;

/* Unique ID. This field does not change, so reading it does not require holding any locks.
* Currently used only for `flock` system call. */
uint64_t id;
/*
* Specifies whether this handle was created by this process or inherited from the parent
* process. Used to perform an operation on the handle only once per Gramine instance (with
* multiple processes). Currently used to perform LOCK_UN operation in `flock` system call when
* the "last" file descriptor to this handle is closed (by "last" FD we assume here the last FD
* referring to this handle in the creator process).
*
* FIXME: Problematic case: P1 opens a handle, spawns P2 and terminates; in this case the
* operation (e.g. LOCK_UN) would be performed even though the handle is still opened in
* P2. Unfortunately, Gramine lacks system-wide tracking of handle FDs.
*/
bool created_by_process;

refcount_t ref_count;

struct libos_fs* fs;
5 changes: 5 additions & 0 deletions libos/include/libos_ipc.h
@@ -10,6 +10,7 @@

#include "avl_tree.h"
#include "libos_defs.h"
#include "libos_fs_lock.h"
#include "libos_handle.h"
#include "libos_internal.h"
#include "libos_thread.h"
@@ -278,10 +279,12 @@ int ipc_sync_confirm_close_callback(IDTYPE src, void* data, unsigned long seq);

struct libos_ipc_file_lock {
/* see `struct libos_file_lock` in `libos_fs_lock.h` */
enum libos_file_lock_family family;
int type;
uint64_t start;
uint64_t end;
IDTYPE pid;
uint64_t handle_id;

bool wait;
char path[]; /* null-terminated */
@@ -291,10 +294,12 @@ struct libos_ipc_file_lock_resp {
int result;

/* see `struct libos_file_lock` in `libos_fs_lock.h` */
enum libos_file_lock_family family;
int type;
uint64_t start;
uint64_t end;
IDTYPE pid;
uint64_t handle_id;
};
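
Because the request struct ends in a flexible array member (`char path[]`), the message size depends on the path length. The following is a hedged sketch of how such a request could be allocated and filled for a flock lock. The `sketch_*` names and field types are hypothetical stand-ins, and this is not how the patch itself builds its IPC messages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

enum sketch_family { SKETCH_UNKNOWN, SKETCH_POSIX, SKETCH_FLOCK };

/* Hypothetical, simplified stand-in for `struct libos_ipc_file_lock`. */
struct sketch_ipc_file_lock {
    enum sketch_family family;
    int type;
    uint64_t start;
    uint64_t end;
    uint32_t pid;
    uint64_t handle_id;
    bool wait;
    char path[]; /* null-terminated */
};

/* Allocates a zeroed request for a flock lock and copies `path` (with its terminator)
 * into the trailing array. Stores the total message size in `out_size`. The caller
 * frees the result. Returns NULL on allocation failure. */
struct sketch_ipc_file_lock* sketch_new_flock_msg(const char* path, int type,
                                                  uint64_t handle_id, bool wait,
                                                  size_t* out_size) {
    assert(path && out_size);
    size_t path_size = strlen(path) + 1;
    size_t total = sizeof(struct sketch_ipc_file_lock) + path_size;
    struct sketch_ipc_file_lock* msg = calloc(1, total);
    if (!msg)
        return NULL;
    msg->family = SKETCH_FLOCK;
    msg->type = type;
    msg->handle_id = handle_id; /* POSIX-only fields (start, end, pid) stay zero */
    msg->wait = wait;
    memcpy(msg->path, path, path_size);
    *out_size = total;
    return msg;
}
```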

struct libos_file_lock;
36 changes: 35 additions & 1 deletion libos/src/bookkeep/libos_handle.c
@@ -297,6 +297,7 @@ static int clear_posix_locks(struct libos_handle* handle) {
* closed, even if the process holds other handles for that file, or duplicated FDs for the
* same handle. */
struct libos_file_lock file_lock = {
.family = FILE_LOCK_POSIX,
.type = F_UNLCK,
.start = 0,
.end = FS_LOCK_EOF,
@@ -350,6 +351,18 @@ struct libos_handle* get_new_handle(void) {
}
INIT_LISTP(&new_handle->epoll_items);
new_handle->epoll_items_count = 0;

static uint32_t local_id_counter = 0;
uint32_t next_id_counter = __atomic_add_fetch(&local_id_counter, 1, __ATOMIC_RELAXED);
if (!next_id_counter) {
/* overflow of local_id_counter, this may lead to aliasing of different handles and is
* potentially a security vulnerability, so just terminate the whole process */
log_error("overflow when allocating a handle ID, not safe to proceed");
BUG();
}
new_handle->id = ((uint64_t)g_process.pid << 32) | next_id_counter;
new_handle->created_by_process = true;

return new_handle;
}
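
The ID scheme above can be isolated into a small standalone sketch: the PID goes into the upper 32 bits and a per-process counter into the lower 32 bits, with counter wraparound treated as an error. The function name `sketch_alloc_handle_id` is hypothetical, the PID is assumed to fit in 32 bits, and the sketch reports overflow to the caller instead of terminating the process as the patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: atomically bumps `*counter` and combines it with `pid` into a
 * 64-bit ID. Returns false if the counter wrapped around to 0, because IDs could then
 * alias earlier handles. */
bool sketch_alloc_handle_id(uint32_t pid, uint32_t* counter, uint64_t* out_id) {
    assert(counter && out_id);
    uint32_t next = __atomic_add_fetch(counter, 1, __ATOMIC_RELAXED);
    if (next == 0)
        return false; /* overflow: not safe to hand out this ID */
    *out_id = ((uint64_t)pid << 32) | next;
    return true;
}
```

Since each process owns the upper half of the ID space, no cross-process coordination is needed to keep IDs unique, as long as PIDs themselves are unique within the Gramine instance.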

@@ -475,6 +488,24 @@ static void destroy_handle(struct libos_handle* hdl) {
free_mem_obj_to_mgr(handle_mgr, hdl);
}

static int clear_flock_locks(struct libos_handle* hdl) {
/* Clear flock (BSD) locks for a file. We are required to do that when the handle is closed. */
if (hdl && hdl->dentry && hdl->created_by_process) {
assert(hdl->ref_count == 0);
struct libos_file_lock file_lock = {
.family = FILE_LOCK_FLOCK,
.type = F_UNLCK,
.handle_id = hdl->id,
};
int ret = file_lock_set(hdl->dentry, &file_lock, /*wait=*/false);
if (ret < 0) {
log_warning("error releasing locks: %s", unix_strerror(ret));
return ret;
}
}
return 0;
}
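
The behavior being emulated here is the Linux rule that a BSD lock belongs to the open file description and is released only when the last FD referring to it is closed. The following user-level sketch demonstrates that rule on a native Linux system. The function name `sketch_lock_released_on_last_close` is hypothetical, and whether Gramine matches this in every multi-process case is not guaranteed (see the FIXME in `libos_handle.h`).

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/file.h>
#include <unistd.h>

/* Returns 1 if the lock survives closing the original FD while a dup()ed FD remains and
 * is released once the dup is closed too, 0 if not, and -1 on setup errors. */
int sketch_lock_released_on_last_close(const char* path) {
    assert(path != NULL);
    int result = -1;
    bool held_after_first_close = false;
    bool free_after_last_close = false;

    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;
    int dup_fd = dup(fd);           /* same open file description as `fd` */
    int other = open(path, O_RDWR); /* independent open file description */
    if (dup_fd < 0 || other < 0 || flock(fd, LOCK_EX) < 0)
        goto out;

    close(fd);
    fd = -1;
    /* `dup_fd` still refers to the locked description, so `other` must be refused */
    held_after_first_close = flock(other, LOCK_EX | LOCK_NB) < 0 && errno == EWOULDBLOCK;

    close(dup_fd);
    dup_fd = -1;
    /* last FD is gone, so the lock is released implicitly */
    free_after_last_close = flock(other, LOCK_EX | LOCK_NB) == 0;

    result = held_after_first_close && free_after_last_close;
out:
    if (fd >= 0)
        close(fd);
    if (dup_fd >= 0)
        close(dup_fd);
    if (other >= 0)
        close(other);
    return result;
}
```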

void put_handle(struct libos_handle* hdl) {
refcount_t ref_count = refcount_dec(&hdl->ref_count);

@@ -496,8 +527,10 @@ void put_handle(struct libos_handle* hdl) {
hdl->pal_handle = NULL;
}

if (hdl->dentry)
if (hdl->dentry) {
(void)clear_flock_locks(hdl);
put_dentry(hdl->dentry);
}

if (hdl->inode)
put_inode(hdl->inode);
@@ -735,6 +768,7 @@ BEGIN_CP_FUNC(handle) {
lock(&hdl->lock);
*new_hdl = *hdl;

new_hdl->created_by_process = false;
new_hdl->dentry = NULL;
refcount_set(&new_hdl->ref_count, 0);
clear_lock(&new_hdl->lock);