Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Linux 4.7: fix deadlock during lookup on case-insensitive #5141

Merged
merged 1 commit into from
Sep 23, 2016

Conversation

tuxoko
Copy link
Contributor

@tuxoko tuxoko commented Sep 20, 2016

We must not use d_add_ci if the dentry already has the real name. Otherwise,
d_add_ci()->d_alloc_parallel() will find itself on the lookup hash and wait
on itself causing deadlock.

Signed-off-by: Chunwei Chen david.chen@osnexus.com

We must not use d_add_ci if the dentry already has the real name. Otherwise,
d_add_ci()->d_alloc_parallel() will find itself on the lookup hash and wait
on itself causing deadlock.

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
@behlendorf
Copy link
Contributor

I was going to suggest this would be a good opportunity to enable all 14 casenorm tests in the ZFS Test Suite. However, based on the test suite results it might take a little work.

casenorm.txt

@tuxoko
Copy link
Contributor Author

tuxoko commented Sep 20, 2016

ERROR: lookup_file FïLëNÄmë unexpectedly exited 127 (File not found)

function lookup_file
{
    typeset name=$1

    $ZLOOK -l $TESTDIR $name >/dev/null 2>&1
}

I think it means that there's no such thing as $ZLOOK.

@behlendorf
Copy link
Contributor

That would make sense. It's definitely referring to zlook.c which was never ported to Linux, Nor was its kernel space counterpart zut.c.

We could port this probably fairly easily, it's not that much code. But that might be overkill, it's only used for these casenorm tests and there may be a simpler way to achieve the same thing.

@tuxoko
Copy link
Contributor Author

tuxoko commented Sep 20, 2016

@behlendorf
Yeah, I don't see why we can't just stat the file.

Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but I'd like to hear from @cgaspar that this in fact resolves the issue before merging it.

@satmandu
Copy link
Contributor

Is there a version of this patch that works for the 0.6.5.x branch?

@tuxoko
Copy link
Contributor Author

tuxoko commented Sep 22, 2016

@satmandu
This should work on 0.6.5.8
https://gist.github.com/tuxoko/aca3da99949a603a393818f8945a5010

@satmandu
Copy link
Contributor

@tuxoko That doesn't seem to work.

It compiles fine. But I'm getting the same hang as before trying to mount the pool.

@tuxoko
Copy link
Contributor Author

tuxoko commented Sep 22, 2016

@satmandu
Are you really using the the modules you built? Because you said you used a signed kernel, I thought ubuntu signed kernel will not load unsinged module.

Either way, please see if there's anything in dmesg after a couple minutes of hang. Or do sysrq-l, sysrq-t.

@satmandu
Copy link
Contributor

@tuxoko I stand corrected. This definitely works. Maybe it was a signed version issue. Installing a mainline kernel-ppa package with the patched file used for an install appears to allow proper booting and mounting. It would be great if this gets added to a stable point release so it can get pulled into ubuntu.

@behlendorf behlendorf merged commit d5b897a into openzfs:master Sep 23, 2016
@behlendorf behlendorf added this to the 0.6.5.9 milestone Sep 23, 2016
@behlendorf
Copy link
Contributor

@satmandu thanks for testing the fix. It's been merged to master and I added this PR to the 0.6.5.9 milestone so we remember to apply it to the next point release.

DeHackEd pushed a commit to DeHackEd/zfs that referenced this pull request Oct 19, 2016
We must not use d_add_ci if the dentry already has the real name. Otherwise,
d_add_ci()->d_alloc_parallel() will find itself on the lookup hash and wait
on itself causing deadlock.

Tested-by: satmandu
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes openzfs#5124
Closes openzfs#5141
Closes openzfs#5147
Closes openzfs#5148
stiell pushed a commit to stiell/zfs that referenced this pull request Oct 21, 2016
We must not use d_add_ci if the dentry already has the real name. Otherwise,
d_add_ci()->d_alloc_parallel() will find itself on the lookup hash and wait
on itself causing deadlock.

Tested-by: satmandu
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes openzfs#5124 
Closes openzfs#5141 
Closes openzfs#5147 
Closes openzfs#5148
DeHackEd pushed a commit to DeHackEd/zfs that referenced this pull request Oct 29, 2016
We must not use d_add_ci if the dentry already has the real name. Otherwise,
d_add_ci()->d_alloc_parallel() will find itself on the lookup hash and wait
on itself causing deadlock.

Tested-by: satmandu
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes openzfs#5124
Closes openzfs#5141
Closes openzfs#5147
Closes openzfs#5148
@satmandu
Copy link
Contributor

Any chance 0.6.5.9 might get tagged before 0.7.0? (Ubuntu seems to following 0.6.5.x at the moment, and I'd hate for this to not get into the next stable release.)

@behlendorf
Copy link
Contributor

@satmandu thanks for the reminder, this fix was already slatted for 0.6.5.9. See https://github.com/zfsonlinux/zfs/projects/4.

@satmandu
Copy link
Contributor

I saw that. 👍

@RJVB
Copy link

RJVB commented Dec 21, 2016

Question: I see that the fix adds a strcmp operation which ought to be unnecessary in the majority of cases presuming case-insensitive datasets are relatively rare. Isn't there a cheap test of the case sensitivity property one could do before blindly invoking strcmp?

@loli10K
Copy link
Contributor

loli10K commented Dec 21, 2016

Isn't there a cheap test of the case sensitivity property one could do before blindly invoking strcmp?

@RJVB doesn't that if (ppn) just above the strcmp() takes care of that? If i read the code correctly ppn is assigned only if zsb->z_case == ZFS_CASE_INSENSITIVE.

@RJVB
Copy link

RJVB commented Dec 21, 2016

Sorry, my bad, I should have read the full function instead of only the patch.

There's probably still room for improvement by checking a (precomputed) cheap hash first, but I can live with slower performance in the rare cases I need a case-insensitive FS :)

@satmandu
Copy link
Contributor

Sigh. I'd also be happy with finding a way to convert my filesystem from case-insensitive to case-sensitive without copying everything to another filesystem and back.

@RJVB
Copy link

RJVB commented Dec 21, 2016

Sadly you can only go from sensitive to insensitive with a guarantee of not losing anything due to aliasing. I don't know to what extent zfs refusal to do this conversion is only due to a decision that it shouldn't be supported or on the contrary, because it is not just a matter of how filenames are handled.

The nice thing with ZFS is that you don't need to copy things back and forth. Just create a new, case-sensitive dataset (as a child of the insensitive one?), mount it on a temporary location, move your data to that location, destroy the old dataset and remount the new one on the intended location. No need to fool around with partitions or external devices.

@satmandu
Copy link
Contributor

Thanks! That's on my list of things to try now when I have a moment. :)

behlendorf pushed a commit to behlendorf/zfs that referenced this pull request Jan 20, 2017
We must not use d_add_ci if the dentry already has the real name. Otherwise,
d_add_ci()->d_alloc_parallel() will find itself on the lookup hash and wait
on itself causing deadlock.

Tested-by: satmandu
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes openzfs#5124
Closes openzfs#5141
Closes openzfs#5147
Closes openzfs#5148
behlendorf pushed a commit to behlendorf/zfs that referenced this pull request Feb 2, 2017
We must not use d_add_ci if the dentry already has the real name. Otherwise,
d_add_ci()->d_alloc_parallel() will find itself on the lookup hash and wait
on itself causing deadlock.

Tested-by: satmandu
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes openzfs#5124
Closes openzfs#5141
Closes openzfs#5147
Closes openzfs#5148
behlendorf pushed a commit to behlendorf/zfs that referenced this pull request Feb 2, 2017
We must not use d_add_ci if the dentry already has the real name. Otherwise,
d_add_ci()->d_alloc_parallel() will find itself on the lookup hash and wait
on itself causing deadlock.

Tested-by: satmandu
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes openzfs#5124
Closes openzfs#5141
Closes openzfs#5147
Closes openzfs#5148
behlendorf pushed a commit to behlendorf/zfs that referenced this pull request Feb 2, 2017
We must not use d_add_ci if the dentry already has the real name. Otherwise,
d_add_ci()->d_alloc_parallel() will find itself on the lookup hash and wait
on itself causing deadlock.

Tested-by: satmandu
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes openzfs#5124
Closes openzfs#5141
Closes openzfs#5147
Closes openzfs#5148
behlendorf pushed a commit to behlendorf/zfs that referenced this pull request Feb 2, 2017
We must not use d_add_ci if the dentry already has the real name. Otherwise,
d_add_ci()->d_alloc_parallel() will find itself on the lookup hash and wait
on itself causing deadlock.

Tested-by: satmandu
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes openzfs#5124
Closes openzfs#5141
Closes openzfs#5147
Closes openzfs#5148
behlendorf pushed a commit to behlendorf/zfs that referenced this pull request Feb 2, 2017
We must not use d_add_ci if the dentry already has the real name. Otherwise,
d_add_ci()->d_alloc_parallel() will find itself on the lookup hash and wait
on itself causing deadlock.

Tested-by: satmandu
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes openzfs#5124
Closes openzfs#5141
Closes openzfs#5147
Closes openzfs#5148
behlendorf pushed a commit to behlendorf/zfs that referenced this pull request Feb 2, 2017
We must not use d_add_ci if the dentry already has the real name. Otherwise,
d_add_ci()->d_alloc_parallel() will find itself on the lookup hash and wait
on itself causing deadlock.

Tested-by: satmandu
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes openzfs#5124
Closes openzfs#5141
Closes openzfs#5147
Closes openzfs#5148
Requires-builders: style
behlendorf pushed a commit that referenced this pull request Feb 3, 2017
We must not use d_add_ci if the dentry already has the real name. Otherwise,
d_add_ci()->d_alloc_parallel() will find itself on the lookup hash and wait
on itself causing deadlock.

Tested-by: satmandu
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes #5124
Closes #5141
Closes #5147
Closes #5148
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants