-
Notifications
You must be signed in to change notification settings - Fork 271
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
maps,programs: avoid path UTF-8 assumptions #742
Conversation
✅ Deploy Preview for aya-rs-docs ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
@addisoncrump care to review this as well? |
d343ff6
to
f061a97
Compare
Sure, I'll take a peek in a bit. |
f061a97
to
e0bb296
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems reasonable to me.x
Reviewed 4 of 4 files at r1, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @dave-tucker and @tamird)
aya/src/programs/uprobe.rs
line 295 at r1 (raw file):
.filter_map(|mut line| { loop { if let [stripped @ .., c] = line {
Could be the following, but as you noted, it'll have a bounds check.
let line = line
.iter()
.rev()
.position(|&b| !b.is_ascii_whitespace())
.and_then(|pos| line.get(..line.len() - pos))
.unwrap_or(line);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 3 of 9 files reviewed, all discussions resolved (waiting on @ajwerner and @dave-tucker)
aya/src/programs/uprobe.rs
line 295 at r1 (raw file):
Previously, ajwerner wrote…
Could be the following, but as you noted, it'll have a bounds check.
let line = line .iter() .rev() .position(|&b| !b.is_ascii_whitespace()) .and_then(|pos| line.get(..line.len() - pos)) .unwrap_or(line);
I think you'd want to protect yourself from bugs by writing
let line = line
.iter()
.rev()
.position(|&b| !b.is_ascii_whitespace())
.map(|pos| line.get(..line.len() - pos).unwrap())
.unwrap_or(line);
at which point this is looking rather ugly.
e0bb296
to
09bee10
Compare
@@ -68,7 +68,7 @@ impl KProbe { | |||
/// | |||
/// The returned value can be used to detach from the given function, see [KProbe::detach]. | |||
pub fn attach(&mut self, fn_name: &str, offset: u64) -> Result<KProbeLinkId, ProgramError> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Slightly annoying use case, but it is possible to store non-UTF-8 names in symbol string tables, which occasionally shows up in e.g. source-protected code (and may be present in loaded kernel modules). If we want to support this we should probably use the <P: AsRef<Path>>
idiom, and I don't think it would break things for downstream. This holds for UProbe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As an example of this:
$ cat evil.c
#include <stdio.h>
void thing() {
puts("hello!");
}
int main(int argc, char **argv) {
thing();
return 0;
}
$ gcc -O0 -c evil.c
$ objcopy --redefine-sym thing="$(printf '\x80')" evil.o
$ objdump -t evil.o
evil.o: file format elf64-x86-64
SYMBOL TABLE:
0000000000000000 l df *ABS* 0000000000000000 evil.c
0000000000000000 l d .text 0000000000000000 .text
0000000000000000 l d .rodata 0000000000000000 .rodata
0000000000000000 g F .text 0000000000000011 �
0000000000000000 *UND* 0000000000000000 puts
0000000000000011 g F .text 0000000000000020 main
$ gcc evil.o -o evil
$ ./evil
hello!
aya/src/programs/uprobe.rs
Outdated
@@ -49,6 +49,39 @@ pub struct UProbe { | |||
pub(crate) kind: ProbeKind, | |||
} | |||
|
|||
trait OsStringExt { | |||
fn starts_with(&self, needle: &OsStr) -> bool; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's a needle? I would have expected pattern
@@ -172,6 +204,7 @@ fn test_resolve_attach_path() { | |||
// Now let's resolve the path to libc. It should exist in the current process's memory map and | |||
// then in the ld.so.cache. | |||
let libc_path = resolve_attach_path(&"libc", Some(pid)).unwrap(); | |||
let libc_path = libc_path.to_str().unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could panic, because you're making another assumption about UTF-8 paths here.
The assertion below should probably be either that the libc_path.file_name()
contains libc
.
} | ||
|
||
fn find_lib_in_proc_maps(pid: pid_t, lib: &str) -> Result<Option<String>, io::Error> { | ||
fn find_lib_in_proc_maps(pid: pid_t, lib: &Path) -> Result<Option<PathBuf>, io::Error> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lib should be an &OsStr
- it's never used as a &Path
here.
@@ -68,7 +68,7 @@ impl KProbe { | |||
/// | |||
/// The returned value can be used to detach from the given function, see [KProbe::detach]. | |||
pub fn attach(&mut self, fn_name: &str, offset: u64) -> Result<KProbeLinkId, ProgramError> { | |||
attach(&mut self.data, self.kind, fn_name, offset, None) | |||
attach(&mut self.data, self.kind, Path::new(fn_name), offset, None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pub fn attach<T: AsRef<Path>(&mut self, fn_name: T, offset: u64) -> Result<KProbeLinkId, ProgramError> {
This won't break the API since both &str and &OsStr implement AsRef<Path>
.
Not sure there's a function we'd kprobe that isn't UTF-8, but 🤷 we may as well be thorough.
aya/src/programs/probe.rs
Outdated
@@ -167,6 +167,7 @@ fn create_probe_event( | |||
KRetProbe | URetProbe => 'r', | |||
}; | |||
|
|||
let fn_name = fn_name.to_string_lossy(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it OK to do lossy conversion here but not in other places?
Ultimately it looks like this name is being written to a file as bytes so a lossy conversion is unnecessary.
aya/src/programs/uprobe.rs
Outdated
} else { | ||
lib.to_string() | ||
}; | ||
pub fn resolve(&self, lib: &Path) -> Option<&Path> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, I think lib
should be &OsStr
here too.
aya/src/programs/uprobe.rs
Outdated
@@ -127,31 +160,30 @@ impl UProbe { | |||
fn resolve_attach_path<T: AsRef<Path>>( | |||
target: &T, | |||
pid: Option<pid_t>, | |||
) -> Result<Cow<'_, str>, UProbeError> { | |||
) -> Result<Cow<'_, Path>, UProbeError> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we still need Cow here? With the ldsocache changes to store PathBuf, 2/3 successful code paths yield a pathbuf, where the final one returns the input if the path is absolute.
key: String, | ||
value: String, | ||
key: OsString, | ||
value: OsString, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Value should be PathBuf
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 3 of 9 files reviewed, 11 unresolved discussions (waiting on @addisoncrump, @ajwerner, and @dave-tucker)
aya/src/programs/kprobe.rs
line 70 at r2 (raw file):
Previously, addisoncrump (Addison Crump) wrote…
As an example of this:
$ cat evil.c #include <stdio.h> void thing() { puts("hello!"); } int main(int argc, char **argv) { thing(); return 0; } $ gcc -O0 -c evil.c $ objcopy --redefine-sym thing="$(printf '\x80')" evil.o $ objdump -t evil.o evil.o: file format elf64-x86-64 SYMBOL TABLE: 0000000000000000 l df *ABS* 0000000000000000 evil.c 0000000000000000 l d .text 0000000000000000 .text 0000000000000000 l d .rodata 0000000000000000 .rodata 0000000000000000 g F .text 0000000000000011 � 0000000000000000 *UND* 0000000000000000 puts 0000000000000011 g F .text 0000000000000020 main $ gcc evil.o -o evil $ ./evil hello!
Sounds reasonable, but I'll leave that for a future change to avoid mixing API and non-API changes.
aya/src/programs/kprobe.rs
line 71 at r2 (raw file):
Previously, dave-tucker (Dave Tucker) wrote…
pub fn attach<T: AsRef<Path>(&mut self, fn_name: T, offset: u64) -> Result<KProbeLinkId, ProgramError> {This won't break the API since both &str and &OsStr implement
AsRef<Path>
.
Not sure there's a function we'd kprobe that isn't UTF-8, but 🤷 we may as well be thorough.
It might break type inference if someone's .into()
ing the operand.
I think that's better, but I'd prefer to avoid API review here.
aya/src/programs/probe.rs
line 170 at r2 (raw file):
Previously, dave-tucker (Dave Tucker) wrote…
Why is it OK to do lossy conversion here but not in other places?
Ultimately it looks like this name is being written to a file as bytes so a lossy conversion is unnecessary.
The very next line calls str::replace
, so my feeling is that this need not match exactly. We can probably avoid the conversion but the code will become more complex. Do you know if we need to?
aya/src/programs/uprobe.rs
line 53 at r2 (raw file):
Previously, dave-tucker (Dave Tucker) wrote…
What's a needle? I would have expected
pattern
It's not a pattern though. See the argument name here: https://doc.rust-lang.org/stable/std/primitive.slice.html#method.starts_with
aya/src/programs/uprobe.rs
line 163 at r2 (raw file):
Previously, dave-tucker (Dave Tucker) wrote…
Do we still need Cow here? With the ldsocache changes to store PathBuf, 2/3 successful code paths yield a pathbuf, where the final one returns the input if the path is absolute.
This is a fair point, but what would you have this return? Result<Option<PathBuf>, UProbeError>
? Is that better?
aya/src/programs/uprobe.rs
line 207 at r2 (raw file):
Previously, dave-tucker (Dave Tucker) wrote…
This could panic, because you're making another assumption about UTF-8 paths here.
The assertion below should probably be either that thelibc_path.file_name()
containslibc
.
This is a test.
aya/src/programs/uprobe.rs
line 316 at r2 (raw file):
Previously, dave-tucker (Dave Tucker) wrote…
lib should be an
&OsStr
- it's never used as a&Path
here.
I tried to keep OsStr out of the various function signatures. The caller has a Path, it doesn't care that we covert to OsStr for "reasons".
aya/src/programs/uprobe.rs
line 333 at r2 (raw file):
Previously, dave-tucker (Dave Tucker) wrote…
Value should be
PathBuf
Why?
aya/src/programs/uprobe.rs
line 440 at r2 (raw file):
Previously, dave-tucker (Dave Tucker) wrote…
Again, I think
lib
should be&OsStr
here too.
See previous reply.
aya/src/sys/netlink.rs
line 745 at r2 (raw file):
Previously, dave-tucker (Dave Tucker) wrote…
Again, struggling to see how this needs to be part of this commit.
It can be a separate commit. I was just hunting for unnecessary lossy string conversions.
aya-obj/src/obj.rs
line 1650 at r2 (raw file):
Previously, dave-tucker (Dave Tucker) wrote…
I'm not sure how these changes are relevant to the commit?
It can be a separate commit. I was just hunting for unnecessary lossy string conversions.
09bee10
to
a88bd5d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 3 of 9 files reviewed, 11 unresolved discussions (waiting on @addisoncrump, @ajwerner, and @dave-tucker)
aya/src/sys/netlink.rs
line 745 at r2 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
It can be a separate commit. I was just hunting for unnecessary lossy string conversions.
Extracted into a parent commit.
aya-obj/src/obj.rs
line 1650 at r2 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
It can be a separate commit. I was just hunting for unnecessary lossy string conversions.
Extracted into a parent commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 9 files reviewed, 7 unresolved discussions (waiting on @addisoncrump, @ajwerner, and @tamird)
aya/src/programs/probe.rs
line 170 at r2 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
The very next line calls
str::replace
, so my feeling is that this need not match exactly. We can probably avoid the conversion but the code will become more complex. Do you know if we need to?
See:
https://docs.kernel.org/trace/kprobetrace.html
https://docs.kernel.org/trace/uprobetracer.html
Which describes the format of this file for kprobes/uprobes.
Atlet probe
... later in this function .. we should be using the path verbatim - not after a lossy conversion. Assuming it follows standard kernel convention. Reference: https://lwn.net/Articles/71472/
The event alias component, where fixed_fn_name is used transforms /bin/bash
into bin_bash
where it's used to build a unique id - aya_1234_bin_bash
. I think we also need to do further due diligence on whether the presence of https://doc.rust-lang.org/std/char/constant.REPLACEMENT_CHARACTER.html in this identifier would cause issues.
aya/src/programs/uprobe.rs
line 163 at r2 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
This is a fair point, but what would you have this return?
Result<Option<PathBuf>, UProbeError>
? Is that better?
I feel Result<Option<PathBuf>, UProbeError>
is better, but that's personal preference.
aya/src/programs/uprobe.rs
line 333 at r2 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
Why?
You can check the output of ldconfig -p
the keys are OsString, the values are PathBuf.
a88bd5d
to
c087d03
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 9 files reviewed, 7 unresolved discussions (waiting on @addisoncrump, @ajwerner, and @dave-tucker)
aya/src/programs/probe.rs
line 170 at r2 (raw file):
Previously, dave-tucker (Dave Tucker) wrote…
See:
https://docs.kernel.org/trace/kprobetrace.html
https://docs.kernel.org/trace/uprobetracer.htmlWhich describes the format of this file for kprobes/uprobes.
At
let probe
... later in this function .. we should be using the path verbatim - not after a lossy conversion. Assuming it follows standard kernel convention. Reference: https://lwn.net/Articles/71472/The event alias component, where fixed_fn_name is used transforms
/bin/bash
intobin_bash
where it's used to build a unique id -aya_1234_bin_bash
. I think we also need to do further due diligence on whether the presence of https://doc.rust-lang.org/std/char/constant.REPLACEMENT_CHARACTER.html in this identifier would cause issues.
OK this was pretty involved, but done.
aya/src/programs/uprobe.rs
line 163 at r2 (raw file):
Previously, dave-tucker (Dave Tucker) wrote…
I feel
Result<Option<PathBuf>, UProbeError>
is better, but that's personal preference.
Looking at this again: it's not correct. 1/3 code paths returns an owned PathBuf, the other 2 return Path. Even if ldsocache held PathBuf, it would return borrowed Path. Left as-is.
aya/src/programs/uprobe.rs
line 333 at r2 (raw file):
Previously, dave-tucker (Dave Tucker) wrote…
You can check the output of
ldconfig -p
the keys are OsString, the values are PathBuf.
OsString and PathBuf are the same type, we should choose the one that's easier to work with. In this case it's easier to work with OsString because all the usages (e.g. the various calls to strip_suffix and friends) want the OsString semantics, not the Path semantics.
c087d03
to
4bfe288
Compare
Hey @alessandrod, this pull request changes the Aya Public API and requires your review. |
e2a91d2
to
8d11edc
Compare
@tamird, this pull request is now in conflict and requires a rebase. |
8d11edc
to
2f3b8b0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 6 files at r2, 7 of 7 files at r5, 6 of 9 files at r6.
Reviewable status: 8 of 11 files reviewed, 7 unresolved discussions (waiting on @addisoncrump, @alessandrod, and @dave-tucker)
aya/src/programs/probe.rs
line 51 at r6 (raw file):
bytes.as_ref().split(|b| b == &b'\n').map(|mut line| { loop {
nit: while let
is a thing
while let [stripped @ .., c] = line {
if c.is_ascii_whitespace() {
line = stripped;
continue
}
break;
}
aya/src/programs/uprobe.rs
line 258 at r6 (raw file):
.split(|b| b == &b'\n') .filter_map(|mut line| { loop {
same nit from above: while let
exists.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 9 files at r6, all commit messages.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @addisoncrump, @alessandrod, and @dave-tucker)
2f3b8b0
to
3c1359e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @addisoncrump, @alessandrod, and @dave-tucker)
aya/src/programs/probe.rs
line 51 at r6 (raw file):
Previously, ajwerner wrote…
nit:
while let
is a thingwhile let [stripped @ .., c] = line { if c.is_ascii_whitespace() { line = stripped; continue } break; }
Done.
aya/src/programs/uprobe.rs
line 258 at r6 (raw file):
Previously, ajwerner wrote…
same nit from above:
while let
exists.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 8 of 9 files at r7, 1 of 9 files at r8.
Reviewable status: 3 of 11 files reviewed, 7 unresolved discussions (waiting on @addisoncrump, @alessandrod, and @dave-tucker)
@tamird, this pull request is now in conflict and requires a rebase. |
3c1359e
to
693c76d
Compare
693c76d
to
a8bdaba
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but we need to also apply similar changes for function names in the future.
f2f58b1
to
8ad04c7
Compare
We can be strict in tests.
@tamird, this pull request is now in conflict and requires a rebase. |
8ad04c7
to
0bba9b1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dismissed @addisoncrump and @dave-tucker from 7 discussions.
Reviewable status: 1 of 15 files reviewed, all discussions resolved
aya/src/programs/kprobe.rs
line 70 at r2 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
Sounds reasonable, but I'll leave that for a future change to avoid mixing API and non-API changes.
aya/src/programs/kprobe.rs
line 71 at r2 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
It might break type inference if someone's
.into()
ing the operand.I think that's better, but I'd prefer to avoid API review here.
This change is