Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Linux: Fix openzfs kernel module loading on 6.11+ with CONFIG_RANDSTRUCT=y #16805

Merged
merged 1 commit into from
Nov 30, 2024

Conversation

IvanVolosyuk
Copy link
Contributor

@IvanVolosyuk IvanVolosyuk commented Nov 25, 2024

Fixes openzfs kernel module loading on 6.11+ with CONFIG_RANDSTRUCT=y

Motivation and Context

This change fixes #16620
It looks like detection logic for register_sysctl_sz() in kernel-register_sysctl_table.m4 doesn't quite work with CONFIG_RANDSTRUCT=y. It causes the same issue as before the original fix in 2633075#diff-cc1bb9d36ff95af7e87e3c4a8fc022ea302ecfdccbc9fa1e9663c9c7b51dc416R722

sysctl table check failed: kernel/spl/(null) No proc_handler

Description

The fix is to replace {0} with {{}} in the m4 script, which better matches the sentinels we use in spl-proc.c

How Has This Been Tested?

  • Verified that register_sysctl_sz is detected correctly on kernels v6.5, v6.10, v6.11.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

@IvanVolosyuk IvanVolosyuk force-pushed the fix-sentinels branch 2 times, most recently from 9d832f5 to 02e8080 Compare November 25, 2024 13:31
@amotin
Copy link
Member

amotin commented Nov 25, 2024

I bisected the kernel commit where the issue starts to occur:

This is the commit when it stopped allow the sentinels. Instead I think you could dig deeper to find when they become optional, since that is where the KPI has changed, and I wonder if it was more documented back then. The later was just a compatibility window.

@amotin amotin added the Status: Code Review Needed Ready for review and testing label Nov 25, 2024
@snajpa
Copy link
Contributor

snajpa commented Nov 25, 2024

Kernel versions aren't reliable, stuff can be backported to older kernels (RHEL kernels are known for that), testing a kernel version within OpenZFS code to make a decision of which code to use is thus not a good idea. It always needs a specific configure test, which affects zfs_config.h, which is what you can then use in the source to disambiguate between the different variants.

(I don't speak m4 either, I always just go looking for something in the config/ dir that is the closest to what I'm trying to do and then just copy and modify it)

@snajpa
Copy link
Contributor

snajpa commented Nov 26, 2024

btw how about, instead of ifdefing everywhere, define a macro that will either fill in the optional guard or nothing? something like

#ifdef HAVE_PROC_GUARDS
#define OPTIONAL_PROC_GUARD {}
#else
#define OPTIONAL_PROC_GUARD
#endif

and then just place OPTIONAL_PROC_GUARD at the right spots

@IvanVolosyuk
Copy link
Contributor Author

IvanVolosyuk commented Nov 26, 2024

Thank you, guys, for the feedback. I dig a bit more into this and made a facepalm.

It turns out @robn already did what I was trying to do in a more elegant way:
2633075#diff-cc1bb9d36ff95af7e87e3c4a8fc022ea302ecfdccbc9fa1e9663c9c7b51dc416R722

The real problem to be solved is: why CONFG_RANDSTRUCT breaks the detection of register_sysctl_sz() in config/kernel-register_sysctl_table.m4
I can confirm that with RANDSTRUCT the HAVE_REGISTER_SYSCTL_SZ is undefined on v6.11.

@IvanVolosyuk
Copy link
Contributor Author

I have updated the commit and pull request description. It ended up being just a one-liner.

@IvanVolosyuk IvanVolosyuk changed the title Linux: Guard proc sentinel values using kernel version. Linux: Fix openzfs kernel module loading on 6.11+ with CONFIG_RANDSTRUCT=y Nov 26, 2024
@amotin amotin requested a review from robn November 26, 2024 15:49
@AllKind AllKind mentioned this pull request Nov 26, 2024
13 tasks
Copy link
Member

@robn robn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it works, then it works for me. Good stuff!

Do you happen to have the output from the failing register_sysctl_sz test? And which compiler you're using? I'm curious about how it failed, that this could fix it. I'm suspecting a version of #16206.

@IvanVolosyuk
Copy link
Contributor Author

Sure, @robn
I'm using gcc (Gentoo 13.3.1_p20241115 p1) 13.3.1 20241115
Here is the output from the test I was getting:

/sources/zfs/build/has_register_sysctl_sz/has_register_sysctl_sz.c: In function ‘main’:
/sources/zfs/build/has_register_sysctl_sz/has_register_sysctl_sz.c:103:74: error: invalid initializer
  103 |                 struct ctl_table test_table[] __attribute__((unused)) = {0};
      |                                                                          ^
/sources/zfs/build/has_register_sysctl_sz/has_register_sysctl_sz.c:103:74: note: (near initialization for ‘test_table[0].<anonymous>’)
make[3]: *** [scripts/Makefile.build:244: /sources/zfs/build/has_register_sysctl_sz/has_register_sysctl_sz.o] Error 1
make[3]: Target '/sources/zfs/build/has_register_sysctl_sz/' not remade because of errors.
make[2]: *** [scripts/Makefile.build:485: /sources/zfs/build/has_register_sysctl_sz] Error 2

@robn
Copy link
Member

robn commented Nov 26, 2024

Oh right, much dumber than I thought. The 0 is applied to the first element. If its a simple scalar, fine, if not, blam. Ok, we all good. Thanks for running it down :)

@IvanVolosyuk
Copy link
Contributor Author

Damn, I accidentally pushed 'Update branch' button and it created a merge commit to master on top of my changes. To get rid of this I just force pushed to go back to a good old state. So, should I just wait now for merge or I have to do something? e.g. manually rebase on top current master? That's my first commit for openzfs.

@amotin
Copy link
Member

amotin commented Nov 27, 2024

@IvanVolosyuk If your master was reasonably recent, that should be fine. Sure the newer the better, but within reason, to not over-stress CI with extra testing. Once all the CI tests complete successfully and there are enough qualified positive reviews, it will be merged.

@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Nov 27, 2024
@IvanVolosyuk
Copy link
Contributor Author

zloop run was successful before I rebased: https://github.com/openzfs/zfs/actions/runs/12028052804
Now it failed 2 times in a row.

@amotin
Copy link
Member

amotin commented Nov 27, 2024

zloop run was successful before I rebased. Now it failed 2 times in a row.

@IvanVolosyuk Unfortunately it is too random and so unstable in its results. It is not necessary that the reports are wrong, just likely not related to this PR. It is not a blocker here.

@amotin
Copy link
Member

amotin commented Nov 28, 2024

@IvanVolosyuk It seems github stuck in "Checking for ability to merge automatically…". I can't merge it. Could you rebase and force-push it?

Adjust the m4 function to mimic sentinel we use in spl-proc.c
This fixes the detection on kernels compiled with CONFIG_RANDSTRUCT=y

Closes: 16620
Signed-off-by: Ivan Volosyuk <Ivan.Volosyuk@gmail.com>
@IvanVolosyuk
Copy link
Contributor Author

Done, should be good now.

@github-actions github-actions bot removed the Status: Accepted Ready to integrate (reviewed, tested) label Nov 29, 2024
@amotin amotin added the Status: Accepted Ready to integrate (reviewed, tested) label Nov 29, 2024
@amotin amotin merged commit f29dcc2 into openzfs:master Nov 30, 2024
23 of 24 checks passed
behlendorf pushed a commit to behlendorf/zfs that referenced this pull request Dec 3, 2024
Adjust the m4 function to mimic sentinel we use in spl-proc.c
This fixes the detection on kernels compiled with CONFIG_RANDSTRUCT=y

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Pavel Snajdr <snajpa@snajpa.net>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ivan Volosyuk <Ivan.Volosyuk@gmail.com>
Closes: openzfs#16620
Closes: openzfs#16805
arter97 pushed a commit to arter97/zfs that referenced this pull request Dec 9, 2024
Adjust the m4 function to mimic sentinel we use in spl-proc.c
This fixes the detection on kernels compiled with CONFIG_RANDSTRUCT=y

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Pavel Snajdr <snajpa@snajpa.net>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ivan Volosyuk <Ivan.Volosyuk@gmail.com>
Closes: openzfs#16620
Closes: openzfs#16805
@cawilliamson
Copy link

On NixOS despite including this patch I still get the same boot error on the hardened 6.11 kernel.

Anyone else been looking into this and found a solution / workaround?

@AllKind
Copy link
Contributor

AllKind commented Dec 9, 2024

@cawilliamson
Please open an new issue in the issue tracker.

  • Including all the information in the issue template.
  • How the patch was applied.
  • And if possible the config.log from the configure process.

@IvanVolosyuk
Copy link
Contributor Author

IvanVolosyuk commented Dec 9, 2024

Did you run ./autogen.sh after applying the patch? The change is only used when run ./autogen.sh before ./configure as it changes the configure script itself.

@AllKind
Copy link
Contributor

AllKind commented Dec 9, 2024

@IvanVolosyuk The issue is now open at: #16849 (closed pull requests don't get much attention...)
I suspected something like you do. But I don't know anything about NixOS and a quick search on their build process lead to nothing. So I didn't comment any further.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[6.11 w/ CONFIG_RANDSTRUCT=y] sysctl table check failed
7 participants