-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use /dev/uesrfaultfd
, when it exists, to create the file descriptor
#56
Conversation
Expose the USERFAULTFD_IOC ioctl number for `/dev/userfaultfd` device. This device is only present on kernels >= 6.1, but we expose it unconditionally so that userfaultfd-sys has the same exports for all kernels. Signed-off-by: Babis Chalios <bchalios@amazon.es>
Linux kernel 6.1 introduces a /dev/userfaultfd pseudo-device which allows creating a UFFD objects by issuing USERFAULTFD_IOC_NEW ioctl() to the device [1]. This way, a process does not need to have the CAP_SYS_PTRACE capability for creating UFFD objects that can handle kernel-triggered page faults. UFFDs created through this interface can always handle kernel-triggered page faults. Access to the device is granted through normal filesystem permissions to the device file. This commit changes the way we create Uffd objects, to first try to create the file descriptor through /dev/userfaultfd. If the file exists we will try to create the descriptor using ioctl in this device. If the device does not exist, we will fall back to the syscall. Signed-off-by: Babis Chalios <bchalios@amazon.es>
This implementation looks great, but I'm not familiar with this new kernel interface. I'll leave this PR open for a little bit so that hopefully someone who knows that interface better can take a look. |
Thanks for the response @pchickey! Just as additional context, kernel documentation about the API can be found here. Also, crosvm implements the same logic but within their project. For what is worth, I consumed the API in a PR I opened for Firecracker and everything seems to be working. |
Hey @pchickey. Did you maybe have time to take a look into this? Please, let me know if there's anything I can do to help with the review. Without this, we are observing issues in Firecracker with kernels 6.1, so it would be nice to have a solution upstream. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're also facing problems with this issue and this solution LGTM 👍
Wow! Thanks I guess :) I see the CI is failing on |
Oh, I didn't see that until now. Yes, lets get the fix in for handling access denied before the release. |
Weird thing is that the test passed in the PR GitHub action run. I'll take a look tomorrow morning. |
Since kernel 6.1, Linux introduced a new interface to create userfault file descriptors, via a new
USERFAULTFD_IOC_NEW
ioctl()
issued to the/dev/userfaultd
newly added device. Creating the file descriptor using this interface does not require the process to have theCAP_SYS_PTRACE
capability, which is the case now if it does not define theUFFD_USER_MODE_ONLY
flag. Access to/dev/userfaultfd
is governed by normal file system permissions. Flags passed to theUSERFAULTFD_IOC_NEW
ioctl()
are the same as the ones passed to the system call.This PR changes the way we create the file descriptor. If
/dev/userfaultfd
is present, we will try to create the file descriptor issuing theioctl()
to it. Otherwise, it will fall back to calling the system call.