CONFIG_GRKERNSEC_HIDESYM infoleak warnings #8999

Closed
sempervictus opened this issue Jul 7, 2019 · 35 comments
Labels
Status: Stale (No recent activity for issue); Type: Building (Indicates an issue related to building binaries); Type: Defect (Incorrect behavior, e.g. crash, hang)

Comments

@sempervictus
Contributor

sempervictus commented Jul 7, 2019

System information

| Type | Version/Name |
| --- | --- |
| Distribution Name | Arch |
| Distribution Version | Current |
| Linux Kernel | 4.14.132 with ~20MB of additional patching |
| Architecture | x86_64 |
| ZFS Version | 0.8.0 master |

Describe the problem you're observing

When copying the ZFS sources into the kernel tree and configuring the build to directly integrate ZFS into the kernel binary (not as a module), both the zfs and zpool commands return:

# zfs list
The ZFS modules are not loaded.
Try running '/sbin/modprobe zfs' as root to load them.
# modprobe zfs
modprobe: FATAL: Module zfs not found in directory /lib/modules/4.14.132
# modinfo zfs
modinfo: ERROR: Module zfs not found.

This is new behavior (trying to finalize the jump from 0.7), and ztest does work somehow (running in the background presently testing away under zloop)

Describe how to reproduce the problem

Create a builtin ZFS structure within a kernel tree, configure the kernel with CONFIG_ZFS=y, build, install, build tools for the same version, run ztest -VVV, run zfs list while ztest is running.

Include any warning/errors/backtraces from the system logs

Nothing in dmesg.
Strace says:

strace zfs list 2>&1|grep ENO
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
access("/sys/module/zfs", F_OK)         = -1 ENOENT (No such file or directory)
access("/sys/module/zfs", F_OK)         = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/en_US.UTF-8/LC_MESSAGES/zfs-linux-user.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/en_US.utf8/LC_MESSAGES/zfs-linux-user.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/en_US/LC_MESSAGES/zfs-linux-user.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/en.UTF-8/LC_MESSAGES/zfs-linux-user.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/en.utf8/LC_MESSAGES/zfs-linux-user.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/en/LC_MESSAGES/zfs-linux-user.mo", O_RDONLY) = -1 ENOENT (No such file or directory)

Which is odd because the sysfs directory was still there with the ZFS 0.7.13 builtin.
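
The strace output shows the userland tools probing /sys/module/zfs with access(2) before printing the "modules are not loaded" message. A minimal sketch of that kind of probe follows; the helper name sketch_module_visible is hypothetical and this is not the libzfs implementation, only an illustration of the call seen in the trace.

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if /sys/module/<name> exists, 0 otherwise.
 * It mirrors the access(..., F_OK) probe visible in the strace output and
 * is not taken from libzfs.
 */
static int
sketch_module_visible(const char *name)
{
	char path[256];
	int n = snprintf(path, sizeof (path), "/sys/module/%s", name);

	if (n < 0 || (size_t)n >= sizeof (path))
		return (0);	/* name too long to form a path */
	return (access(path, F_OK) == 0);
}
```

Calling sketch_module_visible("zfs") on the affected system would presumably return 0, matching the ENOENT in the trace.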

@sempervictus
Contributor Author

The plot thickens. When I build with CONFIG_ZFS=m, the ZFS module is somehow not being built or packaged. After a rebuild as a module, I can't find the module, there are no sysfs entries, etc. It seems the build might not be compiling the entire thing. Rebuilding the kernel with logs to see what the deuce is going on.

@sempervictus
Contributor Author

So the failure was somehow silently passing and allowing the package/kernel to build, which is an issue in itself, but with build logging enabled it actually freaked out and died @

...-x86_64-build.log-In file included from ./include/zfs/spl/sys/sysmacros.h:31,
...-x86_64-build.log-                 from ./include/zfs/sys/zfs_context.h:36,
...-x86_64-build.log-                 from ./include/zfs/sys/lua/lua.h:13,
...-x86_64-build.log-                 from fs/zfs/zfs/zcp_get.c:20:
...-x86_64-build.log-fs/zfs/zfs/zcp_get.c: In function ‘get_special_prop’:
...-x86_64-build.log:./include/zfs/spl/sys/debug.h:89:28: error: invalid use of void expression
...-x86_64-build.log-   89 |   uint64_t _verify3_left = (uint64_t)(LEFT);  \
...-x86_64-build.log-      |                            ^
...-x86_64-build.log-fs/zfs/zfs/zcp_get.c:426:3: note: in expansion of macro ‘VERIFY3U’
...-x86_64-build.log-  426 |   VERIFY3U(strlcpy(strval, token, ZAP_MAXVALUELEN),
...-x86_64-build.log-      |   ^~~~~~~~
...-x86_64-build.log:./include/zfs/spl/sys/debug.h:89:28: error: invalid use of void expression

So in

88 #define VERIFY3U(LEFT, OP, RIGHT)       do {                            \
 89                 uint64_t _verify3_left = (uint64_t)(LEFT);              \
 90                 uint64_t _verify3_right = (uint64_t)(RIGHT);            \
 91                 if (!(_verify3_left OP _verify3_right))                 \
 92                     spl_panic(__FILE__, __FUNCTION__, __LINE__,         \
 93                     "VERIFY3(" #LEFT " "  #OP " "  #RIGHT ") "          \
 94                     "failed (%llu " #OP " %llu)\n",                     \
 95                     (unsigned long long) (_verify3_left),               \
 96                     (unsigned long long) (_verify3_right));             \
 97         } while (0)
 98 

the cast of the LEFT argument to uint64_t is not valid.
The get_special_prop function, per git blame:

$ find .|grep zcp_get.c|xargs git blame|grep get_special_pro
d99a015343 (Chris Williamson 2018-02-08 09:16:23 -0700 301) get_special_prop(lua_State *state, dsl_dataset_t *ds, const char *dsname,
d99a015343 (Chris Williamson 2018-02-08 09:16:23 -0700 624) 	error = get_special_prop(state, ds, dataset_name, zfs_prop);
d99a015343 (Chris Williamson 2018-02-08 09:16:23 -0700 627) 		/* The value and source have been pushed by get_special_prop */

I need to read up on ZCP and its internals, and figure out if this is something in my build system or in ZFS. Anyone else seeing this?
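
To make the failure mode concrete, here is a userspace sketch of a VERIFY3U-style macro. Everything prefixed sketch_ or SKETCH_ is hypothetical and only modelled on the macro quoted above; spl_panic is replaced by fprintf plus abort. The cast `(uint64_t)(LEFT)` only compiles when LEFT yields a value, so if the copy routine were replaced by something that expands to a void expression, the compiler would reject it with the "invalid use of void expression" error from the build log.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Userspace stand-in for VERIFY3U: evaluate both sides once, abort on failure. */
#define	SKETCH_VERIFY3U(LEFT, OP, RIGHT) do {				\
	uint64_t left_ = (uint64_t)(LEFT);				\
	uint64_t right_ = (uint64_t)(RIGHT);				\
	if (!(left_ OP right_)) {					\
		fprintf(stderr, "VERIFY3(" #LEFT " " #OP " " #RIGHT	\
		    ") failed (%llu " #OP " %llu)\n",			\
		    (unsigned long long)left_,				\
		    (unsigned long long)right_);			\
		abort();						\
	}								\
} while (0)

/* Hypothetical copy helper that returns the source length, like strlcpy(). */
static size_t
sketch_copy(char *dst, const char *src, size_t dstlen)
{
	size_t srclen = strlen(src);

	if (dstlen != 0) {
		size_t n = (srclen < dstlen - 1) ? srclen : dstlen - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return (srclen);
}

/* Copies src into dst and aborts if it would not fit; returns the length. */
static size_t
sketch_checked_copy(char *dst, const char *src, size_t dstlen)
{
	SKETCH_VERIFY3U(sketch_copy(dst, src, dstlen), <, dstlen);
	return (strlen(dst));
}
```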

@sempervictus
Contributor Author

@kpande, i imagine gentoo is GCC 9.1 @ this point?

@sempervictus
Contributor Author

Thank you. Maybe an older toolchain will work...

@sempervictus
Contributor Author

oh it doesnt, zfs failed to build, but as i mentioned the fact that the final image was built without linking in ZFS and friends is a whole other issue :)

@sempervictus
Contributor Author

So i've got it building, and running zloop without blowing up, yet anyway.
Ended up hacking together this awful nugget of comments:

diff --git a/fs/zfs/zfs/zcp_get.c b/fs/zfs/zfs/zcp_get.c
index ed98f0d1025b..149f39967ae5 100644
--- a/fs/zfs/zfs/zcp_get.c
+++ b/fs/zfs/zfs/zcp_get.c
@@ -423,13 +423,13 @@ get_special_prop(lua_State *state, dsl_dataset_t *ds, const char *dsname,
        case ZFS_PROP_RECEIVE_RESUME_TOKEN: {
                char *token = get_receive_resume_stats_impl(ds);
 
-               VERIFY3U(strlcpy(strval, token, ZAP_MAXVALUELEN),
-                   <, ZAP_MAXVALUELEN);
+               //VERIFY3U(strlcpy(strval, token, ZAP_MAXVALUELEN),
+               //    <, ZAP_MAXVALUELEN);
                if (strcmp(strval, "") == 0) {
                        char *childval = get_child_receive_stats(ds);
 
-                       VERIFY3U(strlcpy(strval, childval, ZAP_MAXVALUELEN),
-                           <, ZAP_MAXVALUELEN);
+                       //VERIFY3U(strlcpy(strval, childval, ZAP_MAXVALUELEN),
+                       //    <, ZAP_MAXVALUELEN);
                        if (strcmp(strval, "") == 0)
                                error = ENOENT;
 

@c0d3z3r0
Contributor

c0d3z3r0 commented Jul 9, 2019

ehm. wtf. linux 5.2.0-rc7, gcc 8.3.0-6, zfs-0.8.0-ga3c1a8a03 from my fork... works perfectly...

# zfs
make LOCALVERSION= prepare
(
  cd /usr/src/zfs/zfs.git
  ./configure \
  --with-config=kernel \
  --with-linux=/usr/src/kernel/linux.git \
  --enable-linux-builtin
  ./copy-builtin /usr/src/kernel/linux.git
)
sed -i '/^Provides: linux-image-.*\$debarch$/ s/$/, zfs-modules\nConflicts:     zfs-dkms/' scripts/package/mkdebian
./scripts/config -e CONFIG_ZFS

make LOCALVERSION= -j10 bindeb-pkg

@sempervictus
Contributor Author

I think this is an issue with my build-stack, so closing out as "not upstream's problem."

@gcs-github

I just ran into the exact same build error while building ZFS built-in (not as a module) as described here: #8999 (comment)

Gentoo, GCC 9.1.0-r1 (default compiler on ~amd64 / Gentoo testing), kernel 4.14.132

The workaround mentioned in #8999 (comment) allows the compilation to proceed.

@behlendorf
Contributor

In fact, this does appear to be our problem. I suspect your build environment has FORTIFY_SOURCE=2 set, which is why you're seeing the error and others have not. If you're still able to reproduce the issue, would you mind applying this change to verify it resolves the issue?

The issue is the cast in the VERIFY3U which isn't correct since the strlcpy return type is signed (even if it can only return positive values). The token will always fit in strval so we can drop the VERIFY which is checking for a truncated string.

diff --git a/module/zfs/zcp_get.c b/module/zfs/zcp_get.c
index ed98f0d..0a5f0b8 100644
--- a/module/zfs/zcp_get.c
+++ b/module/zfs/zcp_get.c
@@ -423,13 +423,11 @@ get_special_prop(lua_State *state, dsl_dataset_t *ds, cons
        case ZFS_PROP_RECEIVE_RESUME_TOKEN: {
                char *token = get_receive_resume_stats_impl(ds);
 
-               VERIFY3U(strlcpy(strval, token, ZAP_MAXVALUELEN),
-                   <, ZAP_MAXVALUELEN);
+               (void) strlcpy(strval, token, ZAP_MAXVALUELEN);
                if (strcmp(strval, "") == 0) {
                        char *childval = get_child_receive_stats(ds);
 
-                       VERIFY3U(strlcpy(strval, childval, ZAP_MAXVALUELEN),
-                           <, ZAP_MAXVALUELEN);
+                       (void) strlcpy(strval, childval, ZAP_MAXVALUELEN);
                        if (strcmp(strval, "") == 0)
                                error = ENOENT;
 

@behlendorf behlendorf reopened this Jul 10, 2019
@behlendorf behlendorf added the Type: Defect Incorrect behavior (e.g. crash, hang) label Jul 10, 2019
@gcs-github

gcs-github commented Jul 11, 2019

Yes! My build environment does have _FORTIFY_SOURCE set to 2 and applying your patch resolves the problem for me.
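
The patch above drops the assertion and relies on the copy truncating safely. For code that still wants to know whether truncation happened, one option is to report it to the caller rather than assert on it. The following is a userspace sketch under that assumption; sketch_copy_truncating is a hypothetical name, and snprintf is used only because strlcpy is not available in every userspace libc.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical userspace helper: copy src into dst, always NUL-terminating
 * when dstlen is non-zero. Returns 1 if the string was truncated and 0 if
 * it fit, so the caller can decide how to react instead of asserting.
 */
static int
sketch_copy_truncating(char *dst, size_t dstlen, const char *src)
{
	int n;

	if (dstlen == 0)
		return (1);	/* no room even for the terminator */
	n = snprintf(dst, dstlen, "%s", src);
	return (n < 0 || (size_t)n >= dstlen);
}
```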

@behlendorf
Contributor

@gcs-github thanks! Could you also check if your kernel is built with CONFIG_FORTIFY_SOURCE.

behlendorf added a commit to behlendorf/zfs that referenced this issue Jul 11, 2019
The cast of the size_t returned by strlcpy() to a uint64_t by the
VERIFY3U can result in a build failure when CONFIG_FORTIFY_SOURCE
is set.  This is due to the additional hardening.  Since the token
will always fit in strval the VERIFY3U has been removed.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue openzfs#8999
@gcs-github

> @gcs-github thanks! Could you also check if your kernel is built with CONFIG_FORTIFY_SOURCE.

It is!

@c0d3z3r0
Contributor

@behlendorf my kernel config has FORTIFY_SOURCE=y, zfs in-tree has CPPFLAGS = -Wdate-time -D_FORTIFY_SOURCE=2 but I do not get any errors... o.O

@gcs-github

gcs-github commented Jul 11, 2019

In light of @c0d3z3r0 's comment and since I hadn't realized that @sempervictus also had 20MB of additional patching (I read too quickly and wrongly assumed a vanilla setup): some extra patching also exists on my end:

Since you identified the problem as an erroneous cast, I suppose it was a good idea to fix it anyway, but still: I'm sorry. I'll make sure to try and reproduce any issue on vanilla kernel sources before reporting or confirming a problem in the future, unless the issue or the discussion is itself about the support of a specific kernel modification.

@c0d3z3r0
Contributor

c0d3z3r0 commented Jul 11, 2019

@gcs-github the kernel gcc patch is ok, I have that, too

I am pretty sure this is caused / revealed by the grsecurity patch, which both of you have

@behlendorf
Contributor

No problem, regardless I think we're going to want to remove the VERIFY3U. Alternately, if we really want to keep the sanity check I suspect that changing it to a VERIFY3S would also resolve the build issue. Though, someone who can reproduce the issue would need to perform a test build.

@gcs-github

Turns out, VERIFY3S causes the same invalid use of void expression error. Please let me know if there's anything further you'd like to have tested.

@behlendorf
Contributor

@gcs-github I think that's it pending review feedback on the PR. Thanks!

@sempervictus
Contributor Author

@gcs-github: same here, though the way that ZFS patch is generated is a bit quirky (configure is patched too, but that can only apply cleanly in the same build conditions as the ones under which autogen was run). So far, just to get it building/running, the hack above seems to do the trick.

@sempervictus
Contributor Author

sempervictus commented Jul 14, 2019

Seems there might be more of these things lurking around, i'm still seeing this at runtime:

Jul 14 03:51:54 vm kernel:  [<ffffffff819b05d4>] ? dump_stack+0x71/0xad
Jul 14 03:51:54 vm kernel:  [<ffffffff819c424a>] ? pointer+0x47a/0x650
Jul 14 03:51:54 vm kernel:  [<ffffffff819c466c>] ? vsnprintf+0x24c/0x5d0
Jul 14 03:51:54 vm kernel:  [<ffffffff8138101e>] ? __dprintf+0xfe/0x160
Jul 14 03:51:54 vm kernel:  [<ffffffff8131e291>] ? metaslab_sync+0x991/0xba0
Jul 14 03:51:54 vm kernel:  [<ffffffff8134dd7a>] ? vdev_sync+0x6a/0x1d0
Jul 14 03:51:54 vm kernel:  [<ffffffff81330765>] ? spa_sync+0x5f5/0xf20
Jul 14 03:51:54 vm kernel:  [<ffffffff81344d4e>] ? spa_txg_history_init_io+0xfe/0x110
Jul 14 03:51:54 vm kernel:  [<ffffffff813485a5>] ? txg_sync_thread+0x2b5/0x480
Jul 14 03:51:54 vm kernel:  [<ffffffff813482f0>] ? txg_thread_exit.isra.0+0xa0/0xa0
Jul 14 03:51:54 vm kernel:  [<ffffffff8125ae5d>] ? thread_generic_wrapper+0x6d/0x80
Jul 14 03:51:54 vm kernel:  [<ffffffff8125adf0>] ? __thread_exit+0x10/0x10
Jul 14 03:51:54 vm kernel:  [<ffffffff81088f79>] ? kthread+0x119/0x130
Jul 14 03:51:54 vm kernel:  [<ffffffff81088e60>] ? kthread_park+0x90/0x90
Jul 14 03:51:54 vm kernel:  [<ffffffff8100136c>] ? ret_from_fork+0x2c/0x40

Same thing for metaslab_alloc as for metaslab_sync; apparently we're leaking raw pointers in the process of __dprintf-ing things.
Without unrolling all of those macros, it's a bit hard to tell if I'm hitting the __dprintf in debug or libzpool/kernel. In-tree, the latter doesn't seem to exist, so I assume this is happening in zfs_debug.c. Where the passed-in format specifier is defined is a bit opaque as a result, but that's probably what needs cleaning up to not leak sensitive kernel addresses.

@sempervictus
Contributor Author

I think a887d65 might be the culprit here since it switches from %p to %px. I sed -i'd all instances of %px with %p in the in-kernel tree, running a build/test cycle presently. Hopefully will know more by tonight.

@sempervictus
Contributor Author

sempervictus commented Jul 14, 2019

Nope, starts up with

Jul 14 19:10:01 vm kernel: Call Trace:
Jul 14 19:10:01 vm kernel:  [<ffffffff819af044>] ? dump_stack+0x71/0xad
Jul 14 19:10:01 vm kernel:  [<ffffffff819c2cd6>] ? pointer+0x476/0x650
Jul 14 19:10:01 vm kernel:  [<ffffffff819c30fc>] ? vsnprintf+0x24c/0x5d0
Jul 14 19:10:01 vm kernel:  [<ffffffff8138062e>] ? __dprintf+0xfe/0x160
Jul 14 19:10:01 vm kernel:  [<ffffffff813204fb>] ? metaslab_alloc+0x20b/0x260
Jul 14 19:10:01 vm kernel:  [<ffffffff813b8f0d>] ? zio_dva_allocate+0x1fd/0x750
Jul 14 19:10:01 vm kernel:  [<ffffffff819dd409>] ? mutex_lock+0x9/0x30
Jul 14 19:10:01 vm kernel:  [<ffffffff81320154>] ? metaslab_class_throttle_reserve+0xd4/0x110
Jul 14 19:10:01 vm kernel:  [<ffffffff8125a758>] ? tsd_hash_search.isra.0+0x58/0xc0
Jul 14 19:10:01 vm kernel:  [<ffffffff8125a841>] ? tsd_get_by_thread+0x31/0x50
Jul 14 19:10:01 vm kernel:  [<ffffffff812583f6>] ? taskq_member+0x16/0x30
Jul 14 19:10:01 vm kernel:  [<ffffffff813b8b44>] ? zio_nowait+0xa4/0x140
Jul 14 19:10:01 vm kernel:  [<ffffffff813b998a>] ? zio_ddt_write+0x22a/0x2e0
Jul 14 19:10:01 vm kernel:  [<ffffffff813b4aa7>] ? zio_execute+0x87/0xf0
Jul 14 19:10:01 vm kernel:  [<ffffffff81259ad2>] ? taskq_thread+0x312/0x590
Jul 14 19:10:01 vm kernel:  [<ffffffff81091cf0>] ? wake_up_q+0x60/0x60
Jul 14 19:10:01 vm kernel:  [<ffffffff813b4a20>] ? zio_reexecute+0x3e0/0x3e0
Jul 14 19:10:01 vm kernel:  [<ffffffff812597c0>] ? taskq_thread_spawn+0x60/0x60
Jul 14 19:10:01 vm kernel:  [<ffffffff81089059>] ? kthread+0x119/0x130
Jul 14 19:10:01 vm kernel:  [<ffffffff81088f40>] ? kthread_park+0x90/0x90
Jul 14 19:10:01 vm kernel:  [<ffffffff8100136c>] ? ret_from_fork+0x2c/0x40

and then spams this on every metaslab_alloc, apparently:

Jul 14 19:10:03 vm kernel: Call Trace:
Jul 14 19:10:03 vm kernel:  [<ffffffff819af044>] ? dump_stack+0x71/0xad
Jul 14 19:10:03 vm kernel:  [<ffffffff819c2cd6>] ? pointer+0x476/0x650
Jul 14 19:10:03 vm kernel:  [<ffffffff819c30fc>] ? vsnprintf+0x24c/0x5d0
Jul 14 19:10:03 vm kernel:  [<ffffffff8138062e>] ? __dprintf+0xfe/0x160
Jul 14 19:10:03 vm kernel:  [<ffffffff813204fb>] ? metaslab_alloc+0x20b/0x260
Jul 14 19:10:03 vm kernel:  [<ffffffff813b8f0d>] ? zio_dva_allocate+0x1fd/0x750
Jul 14 19:10:03 vm kernel:  [<ffffffff819dd409>] ? mutex_lock+0x9/0x30
Jul 14 19:10:03 vm kernel:  [<ffffffff81320154>] ? metaslab_class_throttle_reserve+0xd4/0x110
Jul 14 19:10:03 vm kernel:  [<ffffffff8125a758>] ? tsd_hash_search.isra.0+0x58/0xc0
Jul 14 19:10:03 vm kernel:  [<ffffffff8125a841>] ? tsd_get_by_thread+0x31/0x50
Jul 14 19:10:03 vm kernel:  [<ffffffff812583f6>] ? taskq_member+0x16/0x30
Jul 14 19:10:03 vm kernel:  [<ffffffff813b8b44>] ? zio_nowait+0xa4/0x140
Jul 14 19:10:03 vm kernel:  [<ffffffff813b998a>] ? zio_ddt_write+0x22a/0x2e0
Jul 14 19:10:03 vm kernel:  [<ffffffff813b4aa7>] ? zio_execute+0x87/0xf0
Jul 14 19:10:03 vm kernel:  [<ffffffff81259ad2>] ? taskq_thread+0x312/0x590
Jul 14 19:10:03 vm kernel:  [<ffffffff81091cf0>] ? wake_up_q+0x60/0x60
Jul 14 19:10:03 vm kernel:  [<ffffffff813b4a20>] ? zio_reexecute+0x3e0/0x3e0
Jul 14 19:10:03 vm kernel:  [<ffffffff812597c0>] ? taskq_thread_spawn+0x60/0x60
Jul 14 19:10:03 vm kernel:  [<ffffffff81089059>] ? kthread+0x119/0x130
Jul 14 19:10:03 vm kernel:  [<ffffffff81088f40>] ? kthread_park+0x90/0x90
Jul 14 19:10:03 vm kernel:  [<ffffffff8100136c>] ? ret_from_fork+0x2c/0x40

So, let's see if

diff --git i/fs/zfs/zfs/metaslab.c w/fs/zfs/zfs/metaslab.c
index f6ce9c5b8bda..41ad2f819cd4 100644
--- i/fs/zfs/zfs/metaslab.c
+++ w/fs/zfs/zfs/metaslab.c
@@ -4659,7 +4659,7 @@ metaslab_alloc(spa_t *spa, metaslab_class_t *mc, uint64_t psize, blkptr_t *bp,
        ASSERT(ndvas > 0 && ndvas <= spa_max_replication(spa));
        ASSERT(BP_GET_NDVAS(bp) == 0);
        ASSERT(hintbp == NULL || ndvas <= BP_GET_NDVAS(hintbp));
-       ASSERT3P(zal, !=, NULL);
+       // ASSERT3P(zal, !=, NULL);
 
        for (int d = 0; d < ndvas; d++) {
                error = metaslab_alloc_dva(spa, mc, psize, dva, d, hintdva,

can address that.
Also seems my build is ignoring --disable-debug and --disable-debuginfo ... neat.

@sempervictus
Contributor Author

@behlendorf the pointer printing thing should probably take into account whether upstream's kptr_restrict is enabled, and determine based on that whether to use %p or %px. The VERIFY3P thing I'm a bit confused about (ASSERT3P being VERIFY3P with debug enabled or nothing without it), since I already converted it from %px to %p in the last patch, but that's the only place in metaslab_alloc I could find which would end up printing a formatted pointer.
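
The relationship described here (ASSERT3P being VERIFY3P in debug builds and nothing otherwise) can be sketched in userspace as follows. The SKETCH_ names and the SKETCH_DEBUG switch are hypothetical stand-ins, not the actual SPL definitions, which may differ in detail.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Always-on pointer check, in the spirit of VERIFY3P. */
#define	SKETCH_VERIFY3P(LEFT, OP, RIGHT) do {				\
	uintptr_t left_ = (uintptr_t)(LEFT);				\
	uintptr_t right_ = (uintptr_t)(RIGHT);				\
	if (!(left_ OP right_)) {					\
		fprintf(stderr, "VERIFY3(" #LEFT " " #OP " " #RIGHT	\
		    ") failed (0x%" PRIxPTR " " #OP " 0x%" PRIxPTR ")\n", \
		    left_, right_);					\
		abort();						\
	}								\
} while (0)

/* Debug-only variant: compiled out entirely unless SKETCH_DEBUG is defined. */
#ifdef SKETCH_DEBUG
#define	SKETCH_ASSERT3P(LEFT, OP, RIGHT) SKETCH_VERIFY3P(LEFT, OP, RIGHT)
#else
#define	SKETCH_ASSERT3P(LEFT, OP, RIGHT) ((void)0)
#endif

/*
 * Counts how many of the two checks actually evaluate their arguments:
 * 1 in a non-debug build (only the VERIFY), 2 when SKETCH_DEBUG is defined.
 */
static int
sketch_count_evaluations(void)
{
	int evaluations = 0;
	int object = 0;

	SKETCH_ASSERT3P((evaluations++, &object), !=, NULL);
	SKETCH_VERIFY3P((evaluations++, &object), !=, NULL);
	return (evaluations);
}
```

If the assertion were really compiled out, it could not be the source of a formatted pointer, which is one way to cross-check whether the debug switches are being honored by the in-tree build.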

@sempervictus
Contributor Author

No, metaslab_alloc and metaslab_sync still spamming.
Re-running the configure and copy-builtin stuff with the --disable-debug* flags again.

@behlendorf
Contributor

> should probably take into account whether upstream's kptr_restrict is enabled

According to the documentation, %pK honors how kptr_restrict is set. We should probably consider switching to that from the unconditional %px. @shartse may have some thoughts about this.

> No, metaslab_alloc and metaslab_sync still spamming.

Looking at the code, I don't immediately see where this is coming from. You could try ruling out all of the zfs_dbgmsg calls by setting zfs_dbgmsg_enable=0. The dprintf calls are already disabled by default.
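
As a rough illustration of what "%pK honors kptr_restrict" means, here is a userspace model of the documented policy. It is not kernel code: restrict_level stands in for the kptr_restrict sysctl, has_cap_syslog for the reader's CAP_SYSLOG capability, and the function name is hypothetical. Level 0 is modelled as printing the value unchanged, which is an assumption; newer kernels print a hashed value in that case.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Userspace model of the %pK policy: the address is replaced by zero when
 * kptr_restrict is 2, or when it is 1 and the reader lacks CAP_SYSLOG.
 * Returns the snprintf() result.
 */
static int
sketch_format_pK(char *buf, size_t len, uintptr_t ptr, int restrict_level,
    int has_cap_syslog)
{
	if (restrict_level >= 2 || (restrict_level == 1 && !has_cap_syslog))
		ptr = 0;	/* hide the address */
	return (snprintf(buf, len, "%016" PRIxPTR, ptr));
}
```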

@sempervictus
Contributor Author

Thank you, %pK is the correct version of that. Strangely, the zfs_dbgmsg_enable=0 boot parameter is ignored and the log is still spammed. The sed replacement of %px with %pK seemed to help, but I also ended up forcibly defining NDEBUG for the in-tree build, as that somehow gets ignored.

@gcs-github

gcs-github commented Jul 16, 2019

@sempervictus I'm reproducing your #8999 (comment) on ZFS 0.8.1 + grsec but it seems triggered by grsecurity's infoleak detection mechanism, not by ZFS debugging routines.

edit: Unless you meant that the debug code is itself what's causing the leaks.

@gcs-github

@sempervictus a note re: #8999 (comment) ; one way you can work around the inflexibility (which is also what I did to start testing with zfs 0.8.1) is to simply remove the patch lines related to the configure script (and in the case of zfs 0.8.1, the ones related to the toplevel Makefile.in as well).

The relevant changes in the configure script seem to be generated through config/kernel.m4, which is already being patched appropriately, allowing you to run autoreconf (or eautoreconf if you're working from inside an ebuild) yourself in order to mint a fresh, updated configure script.

@behlendorf
Contributor

@gcs-github I'm not familiar with the grsecurity's infoleak detection mechanism, do you know what changes would need to be made to ZFS code to make it happy? Could you refer me to any documentation? Patches welcome!

behlendorf added a commit that referenced this issue Jul 16, 2019
The cast of the size_t returned by strlcpy() to a uint64_t by the
VERIFY3U can result in a build failure when CONFIG_FORTIFY_SOURCE
is set.  This is due to the additional hardening.  Since the token
is expected to always fit in strval the VERIFY3U has been removed.
If somehow it doesn't, it will still be safely truncated.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Don Brady <don.brady@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #8999
Closes #9020
@gcs-github

gcs-github commented Jul 16, 2019

@behlendorf Looking inside the code, I tracked down the dmesg infoleak warning to this code block: https://github.com/minipli/linux-unofficial_grsec/blob/linux-4.9.x-unofficial_grsec/lib/vsprintf.c#L1753-L1757 (this comes from minipli's 4.9 port of the last publicly-available grsecurity patch)

GRKERNSEC_HIDESYM is a grsecurity feature aiming to hide kernel symbols from unprivileged users. Its full Kconfig help text is available @ https://github.com/minipli/linux-unofficial_grsec/blob/linux-4.9.x-unofficial_grsec/grsecurity/Kconfig#L201-L224

I suppose the infoleak is therefore a symbol leak somehow.

I'm unaware of any developer-oriented documentation for grsecurity. I've only come across user-facing documentation so far.

For the sake of completeness, here's one of the dmesg traces I reproduced:

[32362.574147] grsec: kernel infoleak detected!  Please report this log to spender@grsecurity.net.
[32362.574151] CPU: 29 PID: 2602 Comm: txg_sync Tainted: G          I     4.14.133-gcsventures #1
[32362.574153] Hardware name: Intel Corporation S2600CW/S2600CW, BIOS SE5C610.86B.01.01.0009.060120151350 06/01/2015
[32362.574155]  0000000000000000 ffffffff832d5397 ffffffff832e9786 ffffc900451efbe0
[32362.574158]  ffff889dd3410848 ffffffff832e979a ffff0a00ffffff05 ffff0a00ffffff05
[32362.574160]  0000000000000004 3233313530313039 ffffffff83852420 ffffc900451efbb0
[32362.574163] Call Trace:
[32362.574189]  [<ffffffff832d5397>] dump_stack+0xa2/0x10b
[32362.574193]  [<ffffffff832e9786>] ? pointer+0x526/0x7e0
[32362.574195]  [<ffffffff832e979a>] pointer+0x53a/0x7e0
[32362.574197]  [<ffffffff832e8ca7>] vsnprintf+0x247/0x800
[32362.574218]  [<ffffffff815b2809>] __dprintf+0x129/0x1d0
[32362.574223]  [<ffffffff815263f7>] metaslab_sync+0xe37/0x10f0
[32362.574228]  [<ffffffff8156b8e3>] vdev_sync+0x93/0x320
[32362.574244]  [<ffffffff815449dc>] spa_sync+0x86c/0x1540
[32362.574248]  [<ffffffff8155ef61>] ? spa_txg_history_init_io+0x161/0x180
[32362.574249]  [<ffffffff81562e6e>] txg_sync_thread+0x34e/0x560
[32362.574252]  [<ffffffff81562b20>] ? txg_thread_exit.isra.0+0xd0/0xd0
[32362.574257]  [<ffffffff81423f08>] thread_generic_wrapper+0xb8/0xf0
[32362.574259]  [<ffffffff81423e50>] ? spl_taskq_fini+0xc0/0xc0
[32362.574262]  [<ffffffff811e388c>] kthread+0x19c/0x1e0
[32362.574264]  [<ffffffff811e36f0>] ? __kthread_create_on_node+0x260/0x260
[32362.574267]  [<ffffffff81001541>] ret_from_fork+0x4f/0x5e

@behlendorf behlendorf changed the title from "ZFS 0.8.x built directly into kernel (not as module) is not detected by some of the userspace bins (worked on 0.7.x)" to "CONFIG_GRKERNSEC_HIDESYM infoleak warnings" Jul 16, 2019
@behlendorf
Contributor

Thanks for the link. It looks like switching to %pK should be sufficient, which I believe agrees with @sempervictus's #8999 (comment) comment above. I've gone ahead and repurposed this issue so we can track those warnings. The original build failure was resolved in master by 3b03ff2.

TulsiJain pushed a commit to TulsiJain/zfs that referenced this issue Jul 20, 2019
@gcs-github

I went into my local copy of the ZFS module code and replaced most of the %p and %px occurrences with %pK, for testing purposes. This successfully suppressed the dmesg traces I mentioned above.

What I'm not clear on is how those traces got triggered in the first place. %p and %px occurrences in ZFS seem connected to debug messaging and problem reporting, so I would still have expected to see something related to ZFS in the dmesg output at runtime, but I haven't so far.
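
For anyone repeating the replacement, a blind substitution of every "%p" can also hit other %p extensions (for example "%pS") or a literal "%%p". The sketch below shows a slightly more careful rewrite of a single format string. The helper name sketch_use_pK is hypothetical, and it only recognizes the literal sequences "%p" and "%px" (no flags or field widths), so treat it as an illustration rather than a conversion tool.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy fmt to out, turning bare "%p" and "%px" into
 * "%pK". Other %p extensions (e.g. "%pS", "%pK") and "%%" are copied
 * unchanged. Returns 0 on success, -1 if out is too small.
 */
static int
sketch_use_pK(const char *fmt, char *out, size_t outlen)
{
	size_t o = 0;

	if (outlen == 0)
		return (-1);
	while (*fmt != '\0') {
		const char *src = fmt;	/* bytes to emit */
		size_t emit = 1;	/* number of bytes to emit */
		size_t take = 1;	/* number of input bytes consumed */

		if (fmt[0] == '%' && fmt[1] == '%') {
			emit = take = 2;	/* literal percent sign */
		} else if (fmt[0] == '%' && fmt[1] == 'p' &&
		    (fmt[2] == 'x' || !isalnum((unsigned char)fmt[2]))) {
			src = "%pK";
			emit = 3;
			take = (fmt[2] == 'x') ? 3 : 2;
		}
		if (o + emit >= outlen)
			return (-1);	/* keep room for the terminator */
		memcpy(out + o, src, emit);
		o += emit;
		fmt += take;
	}
	out[o] = '\0';
	return (0);
}
```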

@sempervictus
Contributor Author

So the %p/px replacements are one thing, but it seems there are some really weird hits to dprintf which aren't in the C code:

Jul 24 03:26:11 vm kernel:  [<ffffffff819b3144>] ? dump_stack+0x71/0xad
Jul 24 03:26:11 vm kernel:  [<ffffffff819c6dbb>] ? pointer+0x47b/0x660
Jul 24 03:26:11 vm kernel:  [<ffffffff819c71ec>] ? vsnprintf+0x24c/0x5d0
Jul 24 03:26:11 vm kernel:  [<ffffffff81383eba>] ? __dprintf+0x1ca/0x1e0
Jul 24 03:26:11 vm kernel:  [<ffffffff812cc4af>] ? dmu_buf_rele_array.part.0+0x2f/0x50
Jul 24 03:26:11 vm kernel:  [<ffffffff8132176f>] ? metaslab_condense+0xcf/0x2f0
Jul 24 03:26:11 vm kernel:  [<ffffffff81322129>] ? metaslab_sync+0x449/0x8b0
Jul 24 03:26:11 vm kernel:  [<ffffffff819e14c9>] ? mutex_lock+0x9/0x30
Jul 24 03:26:11 vm kernel:  [<ffffffff819e14c9>] ? mutex_lock+0x9/0x30
Jul 24 03:26:11 vm kernel:  [<ffffffff8135088a>] ? vdev_sync+0x6a/0x1d0
Jul 24 03:26:11 vm kernel:  [<ffffffff81331202>] ? spa_sync+0x612/0xf50
Jul 24 03:26:11 vm kernel:  [<ffffffff8134770e>] ? spa_txg_history_init_io+0xfe/0x110
Jul 24 03:26:11 vm kernel:  [<ffffffff8134af95>] ? txg_sync_thread+0x2b5/0x480
Jul 24 03:26:11 vm kernel:  [<ffffffff8134ace0>] ? txg_thread_exit.isra.0+0xa0/0xa0
Jul 24 03:26:11 vm kernel:  [<ffffffff8125a59d>] ? thread_generic_wrapper+0x6d/0x80
Jul 24 03:26:11 vm kernel:  [<ffffffff8125a530>] ? __thread_exit+0x10/0x10
Jul 24 03:26:11 vm kernel:  [<ffffffff81088d89>] ? kthread+0x119/0x130
Jul 24 03:26:11 vm kernel:  [<ffffffff81088c70>] ? kthread_park+0x90/0x90
Jul 24 03:26:11 vm kernel:  [<ffffffff8100136c>] ? ret_from_fork+0x2c/0x40

which I don't see anywhere in dmu_buf_rele_array.
The actual code which checks for the leak is public in the %pK discussion on the hardening ML:

if ((unsigned long)ptr > TASK_SIZE && *fmt != 'P' && *fmt != 'X' && *fmt != 'K' && is_usercopy_object(buf)) {
        printk(KERN_ALERT "grsec: kernel infoleak detected! Please report this log to spender@grsecurity.net.\n");
        dump_stack();
        ptr = NULL;
}

Maybe some values interpreted as pointers are making it into vsnprintf somehow... I've expanded the check above to print the fmt passed into it to try and determine what's going on there, but for the meantime, on my actual work system, I've just commented out the guts of __dprintf, since it can't be disabled when built into the kernel binary (not just in-tree as a module, but directly built-in).

For any other grsec users reading this, the reason to use built-in as opposed to module-based is that RAP and KERNEXEC require the BTS instruction for module instrumentation when SMAP isn't available. VMs don't have the relevant subsystem, much less the instruction, so this permits use of those plugins without SMAP in a VM.
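
The condition quoted above can be modelled in userspace to see which specifiers would trip it. This is a sketch under assumptions: task_size stands in for TASK_SIZE, dst_is_usercopy for the result of the usercopy-object test on the output buffer, and next_fmt for the character following the p in the format string (which is what *fmt appears to refer to in that code). The function name is hypothetical.

```c
#include <stdint.h>

/*
 * Userspace model of the quoted check: a kernel-range pointer is flagged
 * unless the specifier is followed by 'P', 'X' or 'K', and only when the
 * destination buffer is a usercopy object. Returns 1 if flagged, else 0.
 */
static int
sketch_infoleak_flagged(uintptr_t ptr, char next_fmt, uintptr_t task_size,
    int dst_is_usercopy)
{
	return (ptr > task_size &&
	    next_fmt != 'P' && next_fmt != 'X' && next_fmt != 'K' &&
	    dst_is_usercopy);
}
```

Under this model a lowercase 'x' (as in %px) is not exempt while 'K' is, which would be consistent with the %pK replacement quieting the warnings.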

@behlendorf behlendorf added the Type: Building Indicates an issue related to building binaries label Jul 25, 2019
tonyhutter pushed a commit that referenced this issue Sep 26, 2019
@stale

stale bot commented Aug 24, 2020

This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the Status: Stale No recent activity for issue label Aug 24, 2020
@stale stale bot closed this as completed Nov 26, 2020