Accessing a file's inode can lead to a NULL dereference #10737
Comments
@grwilson Do you think this issue could be related to https://bugs.launchpad.net/ubuntu/+source/linux-signed-hwe/+bug/1885265 ? We're experiencing regular kernel panics, seemingly from NFS, but we suspect they have something to do with ZFS (0.8.3 as of now); we never saw the same with [...]. As a first step I posted this issue on the NFS mailing list, and this reply backs up our hunch: https://marc.info/?l=linux-nfs&m=159775983016484&w=2
@Sea-you This looks pretty different. With the issue I'm seeing, we're running with ZFS root. Are you also using ZFS root, or is ZFS only being used as the backend for NFSv4 filesystems? Are you able to get a crash dump?
We have seen some new panics and soft lockups reported that all appear to be related to this issue. Here are some of the stacks:
Crashes:
Soft lockup:
After some debugging, the problem was tracked down to the commit mentioned above, which can result in corrupted file reference counts.
System information
Describe the problem you're observing
Periodic system crashes (Oops) when running our internal replication test. From our initial analysis we see that accessing a file's inode can lead to a NULL dereference.
Describe how to reproduce the problem
Right now the only way I've been able to reproduce this is by running an internal test suite in a loop.
Include any warning/errors/backtraces from the system logs
Doing a bisect, I've narrowed it down to this commit:
From the crash dumps I see that the file's inode is either still NULL in the dump, or was NULL when the instruction executed and was populated afterwards, by the time the dump was captured.
Here's an example of each case; both crashed at the same instruction:
This is equivalent to this code in the kernel:
So we know that %rdx is our struct file * and the f_inode member is at offset 0x20. Looking at 2 crashes we see both cases, where f_inode is NULL or populated later:
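To make the offset reasoning above concrete, here is a minimal user-space sketch. The mock_* types are hypothetical stand-ins, not the kernel's real definitions: the layout (a two-pointer union followed by a two-pointer path) is an assumption chosen so that f_inode lands at 0x20 on a 64-bit (LP64) build, matching the offset in the report. The NULL check is only there to show what the faulting load reads. It is not a fix for the underlying problem.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel types; NOT the real definitions. */
struct mock_inode {
    unsigned long i_ino;
};

struct mock_path {
    void *mnt;    /* stands in for a struct vfsmount pointer */
    void *dentry; /* stands in for a struct dentry pointer */
};

struct mock_file {
    struct {
        void *next;
        void *func;
    } f_u;                      /* assumed 16-byte leading member */
    struct mock_path f_path;    /* assumed 16 bytes */
    struct mock_inode *f_inode; /* expected at offset 0x20 on LP64 */
};

/* Offset the faulting instruction would use relative to the file pointer. */
static size_t mock_f_inode_offset(void)
{
    return offsetof(struct mock_file, f_inode);
}

/*
 * Mirrors the load of f_inode followed by a dereference.
 * Returns 1 and stores the inode number when f_inode is set,
 * or 0 instead of dereferencing a NULL f_inode.
 */
static int mock_read_ino(const struct mock_file *f, unsigned long *ino)
{
    const struct mock_inode *inode = f->f_inode; /* load from base + 0x20 */

    if (inode == NULL)
        return 0;
    *ino = inode->i_ino;
    return 1;
}
```

Calling mock_read_ino() on a zero-initialized struct mock_file takes the NULL branch, which corresponds to the first crash case. Setting f_inode after the load has already happened corresponds to the second case, where the dump shows a populated pointer even though the instruction saw NULL.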