aya,aya-obj,integration-test: add better support in `ProgramInfo` & `MapInfo` for old kernels #1007

tyrone-wu · 2024-08-04T23:50:10Z

This adds support for the field availability in ProgramInfo and MapInfo.

Srry this is long, let me know and I can try to TL;DR it.

For ProgramInfo:

Have program_type() to return bpf_prog_type instead of the integer representation.
For fields that can't be 0, return Option<NonZero*> type to indicate None as field being unavailable.
Have name_as_str() use the bpf_name() feature probe as an additional check for whether field is available. Besides invalid unicode, None is returned if probe does not detect feature.
For mad_ids():
- Now returns a result of Option<Vec<NonZeroU32>>, with None indicating field is unavailable.
- On kernels below v4.15, the init closure that fills in the map info causes an E2BIG error since check_uarg_tail_zero() expects the entire struct to be zero bytes. I initially worked around this by having bpf_prog_get_info_by_fd() do an additional syscall if the first failed with E2BIG.
- Now, map_ids() and bpf_prog_get_info_by_fd() both uses a new feature probe prog_info_map_ids() to detect if field available.
For gpl_compatible(), it now returns Option<bool>, with None indicating field is unavailable. It uses a new feature probe prog_info_gpl_compatible() to detect for field availability.

There was a redundant export of loaded_programs() which gave it 2 method of access: aya::loaded_programs and aya::programs::loaded_programs. To avoid confusion, it should now only be aya::programs::loaded_programs.

For MapInfo:

Have map_type() to return bpf_map_type instead of the integer representation.
For fields that can't be 0, return Option<NonZero*> type to indicate None as field being unavailable.
For name_as_str(), it uses the bpf_name() feature probe to detect for field availability. Although the probe performs the check on program name, map name was introduced in the same commit as program name so it should suffice.

Integration tests for ProgramInfo and MapInfo are updated to use kernel_assert/kernel_assert_eq macro.
The macro first performs the assertion check. If assertion passes, continue with test. If failed, then check host's kernel version.
If host is below the feature version, then continue with test and log it in stderr (can be observed with -- --nocapture). Otherwise, panic 😰.

I verified that this passes on versions (i use ubuntu mainline images for my integration tests):

4.13.0
4.15.0
5.15.0
6.1.0

netlify · 2024-08-04T23:50:24Z

✅ Deploy Preview for aya-rs-docs ready!

Built without sensitive environment variables

Name	Link
🔨 Latest commit	`fbb0930`
🔍 Latest deploy log	https://app.netlify.com/sites/aya-rs-docs/deploys/66d5def807650600083b06fb
😎 Deploy Preview	https://deploy-preview-1007--aya-rs-docs.netlify.app
📱 Preview on mobile	Toggle QR Code... Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

mergify · 2024-08-04T23:50:51Z

Hey @alessandrod, this pull request changes the Aya Public API and requires your review.

aya-obj/src/links.rs

aya/src/maps/info.rs

aya/src/programs/info.rs

alessandrod

Thanks for doing this work! Overall looks great, left some comments

mergify · 2024-09-01T13:07:56Z

@tyrone-wu, this pull request is now in conflict and requires a rebase.

aya/src/maps/info.rs

aya/src/programs/info.rs

aya/src/maps/info.rs

alessandrod

Super solid work as always, just a couple of comments then we're ready to go :)

Improves the existing integraiton tests for `loaded_programs()` and `loaded_maps()` in consideration for older kernels: - Opt for `SocketFilter` program in tests since XDP requires v4.8 and fragments requires v5.18. - For assertion tests, first perform the assertion, if the assertion fails, then it checks the host kernel version to see if it is above the minimum version requirement. If not, then continue with test, otherwise fail. For assertions that are skipped, they're logged in stderr which can be observed with `-- --nocapture`. This also fixes the `bpf_prog_get_info_by_fd()` call for kernels below v4.15. If calling syscall on kernels below v4.15, it can produce an `E2BIG` error because `check_uarg_tail_zero()` expects the entire struct to all-zero bytes (which is caused from the map info). Instead, we first attempt the syscall with the map info filled, if it returns `E2BIG`, then perform syscall again with empty closure. Also adds doc for which version a kernel feature was introduced for better awareness. The tests have been verified kernel versions: - 4.13.0 - 4.15.0 - 6.1.0

aya/src/maps/info.rs

aya-obj/src/obj.rs

Add conversion from u32 to program type, link type, and attach type. Additionally, remove duplicate match statement for u32 conversion to `BPF_MAP_TYPE_BLOOM_FILTER` & `BPF_MAP_TYPE_CGRP_STORAGE`. New error `InvalidTypeBinding<T>` is created to represent when a parsed/received value binding to a type is invalid. This is used in the new conversions added here, and also replaces `InvalidMapTypeError` in `TryFrom` for `bpf_map_type`.

Purpose of this commit is to add detections for whether a field is available in `ProgramInfo`. - For `program_type()`, we return the new enum `ProgramType` instead of the integer representation. - For fields that we know cannot be zero, we return `Option<NonZero*>` type. - For `name_as_str()`, it now also uses the feature probe `bpf_name()` to detect if field is available or not. - Two additional feature probes are added for the fields: - `prog_info_map_ids()` probe -> `map_ids()` field - `prog_info_gpl_compatible()` probe -> `gpl_compatible()` field With the `prog_info_map_ids()` probe, the previous implementation that I had for `bpf_prog_get_info_by_fd()` is shortened to use the probe instead of having to make 2 potential syscalls. The `test_loaded_at()` test is also moved into info tests since it is better related to the info tests. `aya::programs::Programs::prog_type(&self)` now returns `ProgramType` instead of the generated FFI from aya-obj. Also previously, `loaded_programs()` could be accessed either through `aya` or `aya::programs`. To avoid confusion and duplicate export of the item, the function should now only be exposed through `aya::programs`.

Adds detection for whether a field is available in `MapInfo`: - For `map_type()`, we treturn new enum `MapType` instead of the integer representation. - For fields that can't be zero, we return `Option<NonZero*>` type. - For `name_as_str()`, it now uses the feature probe `bpf_name()` to detect if field is available. Although the feature probe checks for program name, it can also be used for map name since they were both introduced in the same commit.

tamird · 2024-09-05T15:03:16Z

aya-log/src/lib.rs

        let map = program_info
            .map_ids()
            .map_err(Error::ProgramError)?
+            .expect("`map_ids` field in `bpf_prog_info` not available")


why is it ok to panic here?

Sorry, I definitely shouldn't be panic in the library. I'll have that resolved soon.

tamird · 2024-09-05T15:03:19Z

aya-log/src/lib.rs

            .find(|map_info| match map_info.name_as_str() {
                Some(name) => name == MAP_NAME,
                None => false,
            })
            .ok_or(Error::MapNotFound)?;
-        let map = MapData::from_id(map.id()).map_err(Error::MapError)?;
+        let map = MapData::from_id(map.id().unwrap().get()).map_err(Error::MapError)?;


why is it ok to unwrap here?

Sorry, I should have mapped that into an error. I'll have that resolved soon.

tamird · 2024-09-05T15:03:47Z

aya-log/src/lib.rs

@@ -137,19 +136,21 @@ impl EbpfLogger {
    ) -> Result<EbpfLogger, Error> {
        let program_info = loaded_programs()
            .filter_map(|info| info.ok())
-            .find(|info| info.id() == program_id)
+            .find(|info| info.id().is_some_and(|id| id.get() == program_id))


this is a more complicated (and less backwards-compatible) version of info.id() == Some(program_id)

Ahhhhh you're right. 😓

tamird · 2024-09-05T15:07:55Z

test/integration-test/src/utils.rs

+            } else {
+                // Continue with tests since host is not expected to have feat
+                eprintln!(
+                    r#"ignoring assertion at {}:{}


we shouldn't ignore the assertion - we should instead assert that the feature is indeed missing.

So I did some experimenting with bpf_obj_get_info_by_fd() after examining the code some more. I tested that I can use bpf_check_uarg_tail_zero() to check if a field gives E2BIG if I pass the struct with the field filled with some arbitrary value, however, this would require exposing bpf_obj_get_info_by_fd() for the test crate to access.

tamird · 2024-09-05T15:09:36Z

aya/src/programs/info.rs

+            Ok(Some(
+                map_ids
+                    .into_iter()
+                    .map(|id| NonZeroU32::new(id).unwrap())


this should be .expect with a string that mentions the version detection above.

Now that I think about it, I think I might of turned the API into an unwrapping nightmare with all the Option<NonZero*>. 😵

I'm currently thinking of having fields that were introduced in the same kernel version as the struct not be options (just the direct value). What do you think?

For map_ids(), it looks like it's reasonable to assume the map id would never be zero based on what I traced:

https://elixir.bootlin.com/linux/v4.15/source/kernel/bpf/syscall.c#L1590: how map IDs are getting filled in bpf_prog_get_info_by_fd()

https://elixir.bootlin.com/linux/v4.15/source/kernel/bpf/verifier.c#L4735: how map info is filled from bpf_verifier_env into bpf_prog_aux

https://elixir.bootlin.com/linux/v4.15/source/kernel/bpf/verifier.c#L4249: how map is set in bpf_verifier_env before checking if map is valid

Some of the code is shifted in kernel v6.10, but the logic should still be the same. https://elixir.bootlin.com/linux/v6.10/source/kernel/bpf/verifier.c#L18565

cc'ing @alessandrod for thoughts as well

I'm currently thinking of having fields that were introduced in the same kernel version as the struct not be options (just the direct value). What do you think?

Yep this sounds good.

For map_ids(), it looks like it's reasonable to assume the map id would never be zero based on what I traced:

I don't think that any bpf ids can be 0 regardless of type (map, program, etc)?

In fact I'd probably change all these to return Option. I think NonZero* is nice to accept as input to enforce safety, but as a return value in these utils seems more annoying than anything else.

I don't think that any bpf ids can be 0 regardless of type (map, program, etc)?

It should not be possible yes

In fact I'd probably change all these to return Option. I think NonZero* is nice to accept as input to enforce safety, but as a return value in these utils seems more annoying than anything else.

Yeah I do agree with you, I would prefer Option over NonZero* on returns too.

tamird · 2024-09-05T15:13:58Z

aya/src/sys/bpf.rs

@@ -694,6 +703,62 @@ pub(crate) fn is_prog_name_supported() -> bool {
    bpf_prog_load(&mut attr).is_ok()
 }

+/// Tests whether `nr_map_ids` & `map_ids` fields in `bpf_prog_info` is available.
+pub(crate) fn is_info_map_ids_supported() -> bool {


whole lot of repetition in these functions :/

marysaka

Overall, I agree with @alessandrod, we shouldn't use Option<NonZero*> as return values but Option<IntegerType> instead (0 mapping to None)

Some code pattern could also be simplified I feel here and there but the logic related to those are usually small so that's fine in any cases.

Nice to see all the new comments around kernel version here, I didn't dig too much if the version were appropriate everywhere but if you want I can look more closely at it later 👍

marysaka · 2024-09-06T08:41:48Z

aya-obj/src/links.rs

+    fn try_from(link_type: u32) -> Result<Self, Self::Error> {
+        use bpf_link_type::*;
+        Ok(match link_type {
+            x if x == BPF_LINK_TYPE_UNSPEC as u32 => BPF_LINK_TYPE_UNSPEC,


Kind of a bit nit here but I would instead transform link_type to bpf_link_type::Type that would remove all the as u32 boilerplate around 😄

Could you elaborate on ::Type? I'm not familiar with the usage of that syntax. 🤔

Those BPF_LINK_TYPE_ values have a type of bpf_link_type::Type (that you can find defined here for x86_64)

So the idea would be to cast link_type to bpf_link_type::Type instead of casting each BPF_LINK_TYPE_ values to u32.

marysaka · 2024-09-06T08:42:18Z

aya-obj/src/links.rs

+
+    fn try_from(attach_type: u32) -> Result<Self, Self::Error> {
+        use bpf_attach_type::*;
+        Ok(match attach_type {


Same comment as earlier but this time with bpf_attach_type::Type

aya-obj/src/maps.rs

marysaka · 2024-09-06T08:44:36Z

aya-obj/src/programs/types.rs

+    type Error = InvalidTypeBinding<u32>;
+
+    fn try_from(prog_type: u32) -> Result<Self, Self::Error> {
+        Ok(match prog_type {


Same comment as earlier but this time with bpf_prog_type::Type

marysaka · 2024-09-06T08:52:27Z

aya/src/programs/info.rs

+    pub fn name_as_str(&self) -> Option<&str> {
+        let name = std::str::from_utf8(self.name()).ok();
+        if let Some(name_str) = name {
+            if FEATURES.bpf_name() || !name_str.is_empty() {
+                return name;
+            }
+        }
+        None
+    }


Suggested change

pub fn name_as_str(&self) -> Option<&str> {

let name = std::str::from_utf8(self.name()).ok();

if let Some(name_str) = name {

if FEATURES.bpf_name() || !name_str.is_empty() {

return name;

}

}

None

}

pub fn name_as_str(&self) -> Option<&str> {

let name = std::str::from_utf8(self.name()).ok()?;

if FEATURES.bpf_name() || !name.is_empty() {

return Some(name);

}

None

}

marysaka · 2024-09-06T08:54:35Z

aya/src/util.rs

+impl Display for KernelVersion {
+    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
+        write!(f, "{}.{}.{}", self.major, self.minor, self.patch)
+    }
+}
+


Nice thing to see added 👍

tamird · 2024-09-06T10:00:41Z

Overall, I agree with @alessandrod, we shouldn't use Option<NonZero*> as return values but Option<IntegerType> instead (0 mapping to None)

Why is that better? If we know they can't be zero it's better to communicate that in the type.

alessandrod · 2024-09-06T10:16:44Z

Why is that better? If we know they can't be zero it's better to communicate that in the type.

Because API ergonomics matter.

NonZero<T> doesn't Deref to T, so I have to call .get(). Having to call get seems worse than having extra information there's no use for, especially since NonZero<T>::new(v) return Option<NonZero<T>>, so I have to match and get.

// This _does_ provide useful information - I can't call f(0)
fn f(x: NonZeroU32) {
}

// Option tells me whether the map_id is there or not. Then what use is it to me to know that the map id won't be 0? 
// This would be useful if I were to pass the return value to something else, but we have stronger wrappers for things like ProgramIds etc.
fn map_id(&self) -> Option<NonZeroU32> {
}

tamird · 2024-09-06T11:03:23Z

Why is that better? If we know they can't be zero it's better to communicate that in the type.

Because API ergonomics matter.

NonZero<T> doesn't Deref to T, so I have to call .get(). Having to call get seems worse than having extra information there's no use for, especially since NonZero<T>::new(v) return Option<NonZero<T>>, so I have to match and get.
// This _does_ provide useful information - I can't call f(0)
fn f(x: NonZeroU32) {
}

// Option tells me whether the map_id is there or not. Then what use is it to me to know that the map id won't be 0? 
// This would be useful if I were to pass the return value to something else, but we have stronger wrappers for things like ProgramIds etc.
fn map_id(&self) -> Option<NonZeroU32> {
}

Then why expose the scalar at all? If the value doesn't matter, make it opaque.

alessandrod · 2024-09-06T11:14:09Z

Then why expose the scalar at all? If the value doesn't matter, make it opaque.

We should definitely have types for all these IDs, but babysteps. Even without stronger ids this PR was an improvement over what was there before, and removing NonZeroU32 is an improvement over what's there, imo.

Addresses the feedback from aya-rs#1007: - remove panic from `unwrap` and `expect` - Option<NonZero*> => Option<int> with `0` mapping to `None` - use `aya_ebpf::binding` in `u32` conversion in prog, link, and attach type for more straightforward matching Refs: aya-rs#1007

Addresses the feedback from aya-rs#1007: - remove panic from `unwrap` and `expect` - Option<NonZero*> => Option<int> with `0` mapping to `None` Refs: aya-rs#1007

Addresses the feedback from #1007: - remove panic from `unwrap` and `expect` - Option<NonZero*> => Option<int> with `0` mapping to `None` Refs: #1007

mergify bot added the api/needs-review Makes an API change that needs review label Aug 4, 2024

mergify bot requested a review from alessandrod August 4, 2024 23:50

mergify bot added aya This is about aya (userspace) aya-log Relating to aya-log aya-obj Relating to the aya-obj crate test A PR that improves test cases or CI labels Aug 4, 2024

tyrone-wu force-pushed the aya/info-api branch 6 times, most recently from f6ff1c8 to e76da67 Compare August 10, 2024 13:34

tyrone-wu marked this pull request as ready for review August 10, 2024 15:23

tyrone-wu force-pushed the aya/info-api branch 3 times, most recently from eaa9670 to be61084 Compare August 12, 2024 20:04