Add runtime fallback for _optimized_copyfile(). #647
base: master
In some situations, as described by this thread on Gentoo forums: https://forums.gentoo.org/viewtopic-t-1088232-start-0.html the build system's kernel interface is not compatible with the portage _optimized_copyfile() implementation.
Check for this specific edge case, where _optimized_copyfile() fails with errno=22 (Invalid argument), invoke the fallback function and set a flag to indicate that following invocations will also call the fallback function.
Also print a warning message at noiselevel=2 when this happens.
In the case where errno=22 is caused by some other reason which the user should know about, the fallback function should fail in the same (or similar) way.
Signed-off-by: Rafael Kitover <rkitover@gmail.com>
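For readers following along, the behaviour described above could be sketched roughly as follows. This is not the code from the patch: `CopyWithFallback`, its `use_fallback` attribute and the `warn` callable are hypothetical names chosen for illustration, `shutil.copyfile` stands in for the real fallback function, and the noiselevel handling is left out.

```python
import errno
import shutil


class CopyWithFallback:
    """Try an optimized copy first; switch permanently to a fallback on EINVAL.

    Hypothetical helper for illustration only, not part of portage.
    """

    def __init__(self, optimized_copy, fallback_copy=shutil.copyfile, warn=print):
        self._optimized_copy = optimized_copy
        self._fallback_copy = fallback_copy
        self._warn = warn
        # Sticky flag: once set, the optimized path is never tried again.
        self.use_fallback = False

    def __call__(self, src, dst):
        if not self.use_fallback:
            try:
                return self._optimized_copy(src, dst)
            except OSError as e:
                # Only errno 22 (EINVAL) triggers the fallback; other errors propagate.
                if e.errno != errno.EINVAL:
                    raise
                self.use_fallback = True
                self._warn(
                    f"optimized copy of {src!r} failed with EINVAL; "
                    "using the fallback copy from now on"
                )
        # If EINVAL had some other cause, the fallback is expected to fail here too.
        return self._fallback_copy(src, dst)
```

A caller would wrap the optimized function once, for example `copy = CopyWithFallback(some_optimized_copy)`, and then use `copy(src, dst)` everywhere. Keeping the flag on the wrapper object rather than in a module global is just a choice made for this sketch.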
I prefer to handle fallbacks like this inside the C code. For example see 58d44d3 and dad9cce.
Please strace the failure like in https://bugs.gentoo.org/641088#c8 so that we can see which syscall is failing. We need to understand how and why it fails so that we can respond appropriately. Sometimes the appropriate response involves reporting a filesystem problem like in bug 705536.
If we ignore errno 22 without a specific reason, then it's likely to hide problems that should really be fixed. I suppose we could add a way to allow users to configure how specific errors are handled.
The syscall that seems to be failing is
I am on the unstable KDE+systemd profile, just updated last night, linus kernel and git zfs. My
Of note is I tried running The strace invocation is:
We've found zfs bugs before, like bug 635002, and this could be another one. Given those lseek results, the copy_file_range call should succeed.
The copy_file_range man page says that these are the reasons for it to fail with EINVAL, and none of them apply here:
Thank you, my next step will be to reproduce the specific error condition and I will report back here.
Looks like openzfs/zfs#11151.
That does look like the exact issue. I was going to test with an older kernel to make sure that fixes it, but I'm having all kinds of serious problems with my Linux install right now. As far as this pull request goes, I agree that the user should know about breakage on their system, but if you look at those forum posts there are other situations where this problem arises. And it could be argued that installing software is a critical system function, so it should be simple to temporarily work around these problems, e.g. to install a new version of whatever is causing the problem. If you would like a different implementation for this, perhaps activated with a command line switch or environment variable, or whatever design would make sense to you, I would be happy to do that. Happy holidays!
True.
Yes, I would like to add a FEATURES setting to enable the fallback, and would like to display a suggestion to enable this feature as a workaround when _optimized_copyfile fails. We can add a lazily initialized attribute to the portage.data module which will be true when the fallback is enabled via FEATURES setting, and _optimized_copyfile will be able to check the value of that attribute in order to trigger the new behavior. I've opened this bug report to track the issue: https://bugs.gentoo.org/760929
Happy holidays!
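To make the proposal above concrete, a lazily evaluated FEATURES check could look something like the sketch below. Everything here is an assumption for discussion: the feature name `copyfile-fallback`, the `LazyFeatureFlag` class and the `features_from_env` helper are hypothetical, and the sketch reads FEATURES from an environment-style mapping rather than going through portage's real configuration loading or the portage.data module. It also ignores `-feature` style negation.

```python
import os


class LazyFeatureFlag:
    """Boolean-like object that looks up a FEATURES entry on first use.

    Hypothetical helper for illustration only, not part of portage.
    """

    def __init__(self, feature, load_features):
        self._feature = feature
        self._load_features = load_features
        self._value = None  # None means "not evaluated yet"

    def __bool__(self):
        if self._value is None:
            # The loader is called once; the result is cached afterwards.
            self._value = self._feature in self._load_features()
        return self._value


def features_from_env(environ=None):
    """Split a whitespace-separated FEATURES string into a frozenset."""
    if environ is None:
        environ = os.environ
    return frozenset(environ.get("FEATURES", "").split())


# Hypothetical feature name; the real setting may be called something else.
copyfile_fallback_enabled = LazyFeatureFlag("copyfile-fallback", features_from_env)
```

The copy code could then test `if copyfile_fallback_enabled:` only after the optimized copy has failed, so the settings lookup never runs on the normal path.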
I fixed my Linux install issues and would like to report that I compiled Linux kernel 5.9 with zfs stable 2.0.0 and the bug does not manifest. I will work on this FEATURES setting as you describe it.
Thanks! @RinCat has also expressed interest in a feature like this.
There's a new zfs issue report involving portage here: