BUG: clock_gettime and gettimeofday VDSOs on x32 may use x86-64 syscall #107
Comments
This doesn't look too difficult to do (famous last words). I'll see if I can put together a PR.
Hi @cjwatson, thanks for the report. It's disappointing that the kernel isn't doing the Right Thing here with regards to VDSOs, but given the nature, and support, of the x32 ABI I can't say I'm very surprised.
My initial thought on how to resolve this would be to create an x32-specific "rule_add(...)" function; see "src/arch-x86.c:x86_rule_add(...)" as an example. The x32 version would need to handle this in much the same way we handle the multiplexed/direct-call socket calls on x86: adding one syscall really ends up adding multiple syscall rules, one for the VDSO and one for the actual x86_64 syscall. Don't forget to check the "strict" setting on the rule; if the strict flag is set, you shouldn't add the x86_64 syscall rule.
We may also need to disable, or work around, the x86_64 ABI/syscall check (BADARCH) at the top of the generated filter. Ungh. Hopefully not, but it's something to be aware of when you start testing your code.
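To make the "one rule becomes several" idea above concrete, here is a minimal, self-contained sketch in plain C. It is not libseccomp code: `enum abi_tag`, `struct expanded_rule`, `vdso_may_fall_back()` and `x32_expand_rule()` are hypothetical names invented for illustration, and the list of affected syscalls is just the two named in this report. It only models the decision of which rules an x32-specific rule_add() might emit, including the "strict" check.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical ABI tags; real libseccomp identifies ABIs with SCMP_ARCH_* tokens. */
enum abi_tag { ABI_X32, ABI_X86_64 };

/* Hypothetical record of one rule to add: which ABI, which syscall. */
struct expanded_rule {
    enum abi_tag abi;
    const char *syscall;
};

/* The two syscalls this report names as falling back to the x86-64 variant. */
bool vdso_may_fall_back(const char *syscall)
{
    return strcmp(syscall, "clock_gettime") == 0 ||
           strcmp(syscall, "gettimeofday") == 0;
}

/*
 * Expand one rule requested for x32 into the rules that would actually be
 * added. Writes at most two entries to out and returns how many it wrote.
 * With strict set, only the rule the caller asked for is emitted.
 */
size_t x32_expand_rule(const char *syscall, bool strict,
                       struct expanded_rule out[2])
{
    size_t n = 0;

    out[n++] = (struct expanded_rule){ ABI_X32, syscall };
    if (!strict && vdso_may_fall_back(syscall))
        out[n++] = (struct expanded_rule){ ABI_X86_64, syscall };
    return n;
}
```

In this model, a non-strict request for clock_gettime yields two rules (x32 plus x86_64), while a strict request, or a request for an unaffected syscall, yields only the x32 rule. How the extra x86_64 rule would get past the architecture check in the generated filter is a separate question that this sketch does not address.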
Thanks for the initial thoughts; this is indeed basically the approach I've been pursuing. Of course my initial assessment was a bit optimistic. The basic problem is that having an architecture-specific I experimented with adding the concept of "early rules" that are inserted into the compiled filter before the architecture check, but realised that that doesn't actually help. We do still want the compiled filter to have strict architecture checking; it's just that we need the implicitly-created x86_64 rules to be under a check for the x86_64 architecture. So I think we need proper I considered reintroducing a So I think that we either need:
Does either of these options sound good, or am I on the wrong track entirely? I do at least have a regression test case for the desired behaviour now, so I can try out various options fairly easily.
I may possibly mean
No worries, a little optimism can be a good thing :)
Yes, I was afraid that would be a problem, but I hadn't looked at it too closely so I wasn't sure how problematic it would be in practice.
Yes, I want to limit the number of special cases we need to add to a filter. There are things like the TSKIP processing/checks, but I consider those to be "hacks" and not something we should strive for in the code base.
I will admit, this was my first thought when I started reading your reply, but as you point out, this would not be the correct behavior. However, I think you are on the right path; we just need to tweak the idea a bit. More on this below.
I don't believe we do anything to prevent users from loading the same filter multiple times, or loading it, manipulating it further, and reloading it. Of course, doing something like this would almost always be a Bad Idea, but we do allow it, so I'd like to preserve it if at all possible. We might be able to leverage some of the transaction code to manipulate the filter, load it, and then roll back the changes. I can't say I would be very supportive of such code, but I guess it would be a possibility.
You're getting close to what I'm thinking ... What if we were to augment the db_filter struct so that we could add a "hidden" arch/ABI to the filter that would only allow the addition of rules from libseccomp internal functions? In other words, we would create a new ABI filter that would not be accessible via the libseccomp API (it wouldn't show up in the PFC output, only the BPF output), but we could add rules to it using the internal functions, e.g. an x32-specific rule_add() function. Of course, if the user added the "hidden" arch/ABI to the filter themselves, then everything would be made visible, but until then it would remain hidden. What do you think about this approach?
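A rough model of the "hidden" arch/ABI idea, again in standalone C rather than actual libseccomp internals. `struct arch_entry`, `visible_arch_count()` and `user_add_arch()` are hypothetical names, and the real db_filter struct is not shown here. The sketch assumes a per-arch "hidden" flag: human-readable output would skip hidden entries, and an explicit user request for the same arch would clear the flag.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-arch entry; the hidden flag marks arches added internally. */
struct arch_entry {
    const char *name;
    bool hidden;
};

/* Count the arches a human-readable (PFC-style) dump would list. */
size_t visible_arch_count(const struct arch_entry *archs, size_t n)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++)
        if (!archs[i].hidden)
            count++;
    return count;
}

/*
 * Model of a user explicitly adding an arch that is already present.
 * If the arch was hidden it becomes visible. Returns true if the arch was
 * found, false if it is not in the filter at all (a real implementation
 * would then append it).
 */
bool user_add_arch(struct arch_entry *archs, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(archs[i].name, name) == 0) {
            archs[i].hidden = false;
            return true;
        }
    }
    return false;
}
```

With an x32 filter that carries a hidden x86_64 entry, only x32 counts as visible until the user adds x86_64 themselves, at which point both are listed. A generator for the BPF output would simply iterate over every entry regardless of the flag.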
In an effort to get v2.6.0 out sooner rather than later, I'm going to suggest we push this out to v2.7.0; if you have any concerns or objections, please drop a comment.
On x32, the kernel VDSO that provides clock_gettime and gettimeofday sometimes falls back to the underlying syscall. Unfortunately, it falls back to the x86-64 variant of that syscall (https://bugs.debian.org/850047 is an example from a non-libseccomp context).
It would be possible for every libseccomp user that needs these syscalls to work around this with something like the following (omitting error handling):
This seems cumbersome and easy to get wrong, though, and it seems like the kind of architecture-specific quirk that libseccomp is supposed to deal with for us. Would it be possible for libseccomp to handle this?