Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: enabled binary tree optimization with no rules results in default action being ignored #370

Closed
kolyshkin opened this issue Mar 14, 2022 · 11 comments
Assignees
Milestone

Comments

@kolyshkin
Copy link
Contributor

Surely this is a corner case, and enabling binary tree optimization is obviously useless then there are no rules, but it still feels like a bug.

/*

This code demonstrates the following issue with libseccomp <= 2.5.3.

When the binary tree optimization is enabled (i.e. SCMP_FLTATR_CTL_OPTIMIZE
attribute value is set to 2), and no rules are added to the filter, the default
action seems to be ignored:

	$ gcc -lseccomp opt.c && ./a.out
	optimize 2
	load
	Bad system call (core dumped)

Disabling the binary tree optimization and/or adding a single rule
eliminates the issue:

	$ gcc -lseccomp -DOPT_LEVEL=1 opt.c && ./a.out
	optimize 1
	load
	release

	$ gcc -lseccomp -DADD_RULE opt.c && ./a.out
	optimize 2
	rule_add
	load
	release

*/

#include <stdio.h>
#include <errno.h>
#include <string.h>
#include <seccomp.h>

#ifndef OPT_LEVEL
#define OPT_LEVEL 2
#endif

int main(int argc, char *argv[])
{
	int rc = -1;
	scmp_filter_ctx ctx;

	ctx = seccomp_init(SCMP_ACT_ALLOW);
	if (ctx == NULL) {
		fprintf(stderr, "seccomp_init failed\n");
		goto out;
	}

	printf("optimize %d\n", OPT_LEVEL);
	rc = seccomp_attr_set(ctx, SCMP_FLTATR_CTL_OPTIMIZE, OPT_LEVEL);
	if (rc < 0) {
		fprintf(stderr, "seccomp_attr_set: %s\n", strerror(-rc));
		goto out;
	}

#ifdef ADD_RULE
	printf("rule_add\n");
	rc = seccomp_rule_add(ctx, SCMP_ACT_ERRNO(ENOSYS), SCMP_SYS(sync), 0);
	if (rc < 0) {
		fprintf(stderr, "seccomp_rule_add: %s\n", strerror(-rc));
		goto out;
	}
#endif

	printf("load\n");
	rc = seccomp_load(ctx);
	if (rc < 0) {
		fprintf(stderr, "seccomp_load: %s\n", strerror(-rc));
		goto out;
	}

out:
	printf("release\n");
	seccomp_release(ctx);
	return -rc;
}
@pcmoore
Copy link
Member

pcmoore commented Mar 14, 2022

Oh, that's fun :/

Although like you said the practical impact of this should be very close to nil. Still, this could be potentially bad so let's go ahead and add it to the v2.5.4 milestone. Any objections?

@drakenclimber
Copy link
Member

Although like you said the practical impact of this should be very close to nil. Still, this could be potentially bad so let's go ahead and add it to the v2.5.4 milestone. Any objections?

Agreed.

@drakenclimber drakenclimber self-assigned this Mar 14, 2022
@pcmoore
Copy link
Member

pcmoore commented Mar 14, 2022

A bit more info which may be useful in chasing this down ...

I modified a copy of "01-sim-allow.c" to this:

#include <errno.h>
#include <unistd.h>

#include <seccomp.h>

#include "util.h"

int main(int argc, char *argv[])
{
	int rc;
	struct util_options opts;
	scmp_filter_ctx ctx = NULL;

	rc = util_getopt(argc, argv, &opts);
	if (rc < 0)
		goto out;

	ctx = seccomp_init(SCMP_ACT_ALLOW);
	if (ctx == NULL)
		return ENOMEM;

#if 1
	rc = seccomp_attr_set(ctx, SCMP_FLTATR_CTL_OPTIMIZE, 2);
	if (rc < 0)
		goto out;
#endif

	rc = util_filter_output(&opts, ctx);
	if (rc)
		goto out;

out:
	seccomp_release(ctx);
	return (rc < 0 ? -rc : rc);
}

When run without the binary tree optimization I get this:

% ./00-test -b | ../tools/scmp_bpf_disasm 
 line  OP   JT   JF   K
=================================
 0000: 0x20 0x00 0x00 0x00000004   ld  $data[4]
 0001: 0x15 0x00 0x04 0xc000003e   jeq 3221225534 true:0002 false:0006
 0002: 0x20 0x00 0x00 0x00000000   ld  $data[0]
 0003: 0x35 0x00 0x01 0x40000000   jge 1073741824 true:0004 false:0005
 0004: 0x15 0x00 0x01 0xffffffff   jeq 4294967295 true:0005 false:0006
 0005: 0x06 0x00 0x00 0x7fff0000   ret ALLOW
 0006: 0x06 0x00 0x00 0x00000000   ret KILL

When run with the binary tree optimization I get this:

% ./00-test -b | ../tools/scmp_bpf_disasm 
 line  OP   JT   JF   K
=================================
 0000: 0x20 0x00 0x00 0x00000004   ld  $data[4]
 0001: 0x15 0x00 0x04 0xc000003e   jeq 3221225534 true:0002 false:0006
 0002: 0x20 0x00 0x00 0x00000000   ld  $data[0]
 0003: 0x35 0x00 0x01 0x40000000   jge 1073741824 true:0004 false:0005
 0004: 0x15 0x00 0x01 0xffffffff   jeq 4294967295 true:0005 false:0006
 0005: 0x20 0x00 0x00 0x00000000   ld  $data[0]
 0006: 0x06 0x00 0x00 0x00000000   ret KILL

The x86_64/x32 check is correct on both, but in the binary tree case the syscall number is reloaded (line 0005) and the only return option is "KILL".

@pcmoore
Copy link
Member

pcmoore commented Mar 14, 2022

Well, as a quick-hack this "works":

diff --git a/src/gen_bpf.c b/src/gen_bpf.c
index c878f443..54c28c5e 100644
--- a/src/gen_bpf.c
+++ b/src/gen_bpf.c
@@ -1692,6 +1692,7 @@ static struct bpf_blk *_gen_bpf_arch(struct bpf_state *state,
                goto arch_failure;
        blk_cnt += blks_added;
 
+#if 0
        if (bintree_levels > 0) {
                _BPF_INSTR(instr, _BPF_OP(state->arch, BPF_LD + BPF_ABS),
                           _BPF_JMP_NO, _BPF_JMP_NO,
@@ -1705,6 +1706,7 @@ static struct bpf_blk *_gen_bpf_arch(struct bpf_state *state,
                b_bintree->acc_start = _ACC_STATE_UNDEF;
                b_bintree->acc_end = _ACC_STATE_OFFSET(_BPF_OFFSET_SYSCALL);
        }
+#endif
 
        /* additional ABI filtering */
        if ((state->arch->token == SCMP_ARCH_X86_64 ||

... no idea yet if it still works for the other cases.

@pcmoore
Copy link
Member

pcmoore commented Mar 14, 2022

Random observation, it looks like our binary trees may not always be properly balanced.

@pcmoore
Copy link
Member

pcmoore commented Mar 14, 2022

Ran out of time today, if no one else has time to look at it I'll try to take another go at it later this week (or next).

@drakenclimber
Copy link
Member

Random observation, it looks like our binary trees may not always be properly balanced.

Out of simplicity I designed it to fill up the left node before filling the right node. (Each node is 4 syscalls.) My rationale was that this optimization really only makes sense on really, really large filters. An imbalance of (up to) 4 syscalls would be small compared to walking the entire filter of 200+ syscalls.

@drakenclimber
Copy link
Member

Ran out of time today, if no one else has time to look at it I'll try to take another go at it later this week (or next).

No pressure either way. I technically feel like I own it - since it was my crazy idea :). But having another person get somewhat familiar with the code wouldn't be a bad thing either.

I should have time this week to check it out.

@drakenclimber
Copy link
Member

Haven't had a chance to test it, but I believe this is the fix:

$ git diff
diff --git a/src/gen_bpf.c b/src/gen_bpf.c
index c878f443a792..71317612103a 100644
--- a/src/gen_bpf.c
+++ b/src/gen_bpf.c
@@ -1348,6 +1348,9 @@ static int _get_bintree_levels(unsigned int syscall_cnt)
 {
        unsigned int i = 2, max_level = SYSCALLS_PER_NODE * 2;
 
+       if (syscall_cnt == 0)
+               return 0;
+
        while (max_level < syscall_cnt) {
                max_level <<= 1;
                i++;

@drakenclimber
Copy link
Member

drakenclimber commented Mar 15, 2022

$ git diff
diff --git a/src/gen_bpf.c b/src/gen_bpf.c
index c878f443a792..71317612103a 100644
--- a/src/gen_bpf.c
+++ b/src/gen_bpf.c
@@ -1348,6 +1348,9 @@ static int _get_bintree_levels(unsigned int syscall_cnt)
 {
        unsigned int i = 2, max_level = SYSCALLS_PER_NODE * 2;
 
+       if (syscall_cnt == 0)
+               return 0;
+
        while (max_level < syscall_cnt) {
                max_level <<= 1;
                i++;

This made both of the above reproducers work properly for me.

drakenclimber added a commit to drakenclimber/libseccomp that referenced this issue Mar 16, 2022
Handle the unlikely case where a user has chosen the
binary tree optimization but has zero syscalls in their
filter.

Fixes: seccomp#370
Fixes: a3732b3 ("bpf:pfc: Add optimization option to use a binary tree")
Signed-off-by: Tom Hromatka <tom.hromatka@oracle.com>
drakenclimber added a commit to drakenclimber/libseccomp that referenced this issue Mar 16, 2022
Add a test that exercises the binary tree optimization but
the seccomp filter has zero syscalls in it.

Related-bug: seccomp#370
Signed-off-by: Tom Hromatka <tom.hromatka@oracle.com>
drakenclimber added a commit to drakenclimber/libseccomp that referenced this issue Mar 16, 2022
Add a test that exercises the binary tree optimization but
the seccomp filter has zero syscalls in it.

Related-bug: seccomp#370
Signed-off-by: Tom Hromatka <tom.hromatka@oracle.com>
@pcmoore
Copy link
Member

pcmoore commented Mar 17, 2022

$ git diff
diff --git a/src/gen_bpf.c b/src/gen_bpf.c
index c878f443a792..71317612103a 100644
--- a/src/gen_bpf.c
+++ b/src/gen_bpf.c
@@ -1348,6 +1348,9 @@ static int _get_bintree_levels(unsigned int syscall_cnt)
 {
        unsigned int i = 2, max_level = SYSCALLS_PER_NODE * 2;
 
+       if (syscall_cnt == 0)
+               return 0;
+
        while (max_level < syscall_cnt) {
                max_level <<= 1;
                i++;

This made both of the above reproducers work properly for me.

Sorry for the delay, but this looks good to me. Feel free to patch and merge. Thanks @drakenclimber.

Acked-by: Paul Moore <paul@paul-moore.com>

drakenclimber added a commit that referenced this issue Mar 18, 2022
Add a test that exercises the binary tree optimization but
the seccomp filter has zero syscalls in it.

Related-bug: #370
Signed-off-by: Tom Hromatka <tom.hromatka@oracle.com>
Acked-by: Paul Moore <paul@paul-moore.com>
drakenclimber added a commit that referenced this issue Mar 18, 2022
Handle the unlikely case where a user has chosen the
binary tree optimization but has zero syscalls in their
filter.

Fixes: #370
Fixes: a3732b3 ("bpf:pfc: Add optimization option to use a binary tree")
Signed-off-by: Tom Hromatka <tom.hromatka@oracle.com>
Acked-by: Paul Moore <paul@paul-moore.com>
(cherry picked from commit 2de3b87)
drakenclimber added a commit that referenced this issue Mar 18, 2022
Add a test that exercises the binary tree optimization but
the seccomp filter has zero syscalls in it.

Related-bug: #370
Signed-off-by: Tom Hromatka <tom.hromatka@oracle.com>
Acked-by: Paul Moore <paul@paul-moore.com>
(cherry picked from commit 5731dd9)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants