new(libscap, libsinsp): new `scap_stats_v2` API, featuring new `libbpf` bpftool prog show like stats for bpf (new metrics 3/n) #1021

incertum · 2023-04-02T23:54:28Z

What type of PR is this?

Uncomment one (or more) /kind <> lines:

/kind bug

/kind cleanup

/kind design

/kind documentation

/kind failing-test

/kind feature

Any specific area of the project related to this PR?

Uncomment one (or more) /area <> lines:

/area API-version

/area build

/area CI

/area driver-kmod

/area driver-bpf

/area driver-modern-bpf

/area libscap-engine-bpf

/area libscap-engine-gvisor

/area libscap-engine-kmod

/area libscap-engine-modern-bpf

/area libscap-engine-nodriver

/area libscap-engine-noop

/area libscap-engine-source-plugin

/area libscap-engine-savefile

/area libscap-engine-udig

/area libscap

/area libpman

/area libsinsp

/area tests

/area proposals

Does this PR require a change in the driver versions?

/version driver-API-version-major

/version driver-API-version-minor

/version driver-API-version-patch

/version driver-SCHEMA-version-major

/version driver-SCHEMA-version-minor

/version driver-SCHEMA-version-patch

What this PR does / why we need it:

Introduce native support for libbpf bpftool prog show statistics as discussed here falcosecurity/falco#2222 (comment).

Approach:

libbpf is already bundled in libs. As @gnosek anticipated it was an easy integration only requiring adding bpf_obj_get_info_by_fd
Getting the fds directly from the engine specific handle (m_attached_progs), therefore I don't anticipate any issues ...
The way we look the libbpf stats properties up should naturally extend to any future programs we support without requiring a refactor, plus extracting more fields from bpf_prog_info is straight forward.
Please note these libbpf stats are currently only available on a per program granularity AFAIK and we won't get filler / tail call resolution, but maybe one day :)
These stats are going to be very valuable when inspecting the trade-offs between sys enter and exit tracepoints and LSM hooks. LSM hook are planned for libs 0.12.0, see Support Linux Security Modules hook for more accurate security tracing #252
When testing please ensure echo 1 > /proc/sys/kernel/bpf_stats_enabled is enabled on your machine, else no stats will be available. Plus test with old bpf for now as modern_bpf support is pending some additional extensions in order to reach parity with the old bpf.

CC @falcosecurity/libs-maintainers for awareness.

Which issue(s) this PR fixes:

falcosecurity/falco#2222

Fixes #

Special notes for your reviewer:

@Andreagit97 still need to support m_attached_progs in struct modern_bpf_engine, was wondering if we should do it in this PR or a follow up PR? Started all the scaffolding so that at least we don't break the scap-open test binary in the meantime.

In addition, since we just recently separated out the bpf programs for each driver and you and @FedeDP worked on this, are there other implications I may be missing? Thanks in advance for checking!

Does this PR introduce a user-facing change?:

new(libscap, libsinsp): add libbpf `bpftool prog show` like scap_libbpf_stats

incertum · 2023-04-03T00:02:07Z

New scap-open test binary stats output now looks like this

sudo libscap/examples/01-open/scap-open --bpf driver/bpf/probe.o

[SCAP-OPEN]: Hello!

---------------------- SCAP SOURCE ----------------------
* BPF probe: 'driver/bpf/probe.o'
-----------------------------------------------------------

--------------------- CONFIGURATIONS ----------------------
* Print single event type: -1 (`-1` means no event to print).
* Run until '18446744073709551615' events are catched.
-----------------------------------------------------------

---------------------- INTERESTING SYSCALLS ----------------------
* All sc codes are enabled!
------------------------------------------------------------------

* OK! BPF probe correctly loaded: NO VERIFIER ISSUES :)
* Live capture in progress...
* Press CTRL+C to stop the capture
^C
---------------------- STATS -----------------------

[SCAP-OPEN]: General statistics

Events correctly captured (SCAP_SUCCESS): 56201039
Seen by driver (kernel side events): 56201398
Time elapsed: 1994 s
Rate of userspace events (events/second): 28185
Rate of kernel side events (events/second): 28185
Number of dropped events: 0
Number of timeouts: 65845
Number of 'next' calls: 56266884
Number of preemptions: 0
Number of events skipped due to the tid being in a set of suppressed tids: 0
Number of threads currently being suppressed: 0

[SCAP-OPEN]: libbpf statistics

Number of total bpf program invocations (sum run_cnt): 56206994
Nanoseconds spent in total across bpf programs (sum run_time_ns): 107288950041
Average time spent across bpf programs (avg_time_ns): 1908
Prog [sys_enter] run_cnt: 20741581
Prog [sys_enter] run_time_ns: 32088344143
Prog [sys_enter] avg_time_ns: 1547
Prog [sys_exit] run_cnt: 20743009
Prog [sys_exit] run_time_ns: 40765096995
Prog [sys_exit] avg_time_ns: 1965
Prog [sched_process_e] run_cnt: 1850
Prog [sched_process_e] run_time_ns: 6870257
Prog [sched_process_e] avg_time_ns: 3713
Prog [sched_switch] run_cnt: 6650672
Prog [sched_switch] run_time_ns: 30832591912
Prog [sched_switch] avg_time_ns: 4636
Prog [page_fault_user] run_cnt: 8031096
Prog [page_fault_user] run_time_ns: 3572066118
Prog [page_fault_user] avg_time_ns: 444
Prog [page_fault_kern] run_cnt: 36638
Prog [page_fault_kern] run_time_ns: 18727841
Prog [page_fault_kern] avg_time_ns: 511
Prog [signal_deliver] run_cnt: 2148
Prog [signal_deliver] run_time_ns: 5252775
Prog [signal_deliver] avg_time_ns: 2445

[SCAP-OPEN]: Kernel side buffer drop statistics

Number of dropped events caused by full buffer (total / all buffer drops - includes all categories below, likely higher than sum of syscall categories): 0
Number of dropped events caused by full buffer (n_drops_buffer_clone_fork_enter syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_clone_fork_exit syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_execve_enter syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_execve_exit syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_connect_enter syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_connect_exit syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_open_enter syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_open_exit syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_dir_file_enter syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_dir_file_exit syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_other_interest_enter syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_other_interest_exit syscall category): 0
Number of dropped events caused by full scratch map: 0
Number of dropped events caused by invalid memory access (page faults): 0
Number of dropped events caused by an invalid condition in the kernel instrumentation (bug): 0
-----------------------------------------------------

[SCAP-OPEN]: Bye!

Andreagit97 · 2023-04-03T11:15:20Z

Uhm I have some doubts about this PR... Let's say in this release we have removed the tp logic from sinsp and we have pushed it into the engines, this is because the tracepoint concept is something that the user should not know about. Now in this PR, we are exposing stats about tracepoints to userspace... this is in some way against what we did some weeks ago :/ do we need again a shared enum between engines to loop over tracepoints (called now "attached progs")? This is precisely what we removed some changes ago, now every engine has its enum/logic because every engine could have different progs attached...
WDYT? @jasondellaluce @FedeDP

If we want to expose again some tracepoints information, probably I would enrich the actual scap_stats with new info. I would avoid another v-table method since the new logic is just related to bpf engines :/ Maybe we could have something like this

typedef struct bpf_prog_stats
{
  uint64_t run_cnt;
  uint64_t run_time_ns;
  // ...
}bpf_prog_stats;

/*!
  \brief Statistics about an in progress capture
*/
typedef struct scap_stats
{
	uint64_t n_evts; ///< Total number of events that were received by the driver.
	uint64_t n_drops; ///< Number of dropped events.
	uint64_t n_drops_buffer; ///< Number of dropped events caused by full buffer.
	uint64_t n_drops_buffer_clone_fork_enter;
	uint64_t n_drops_buffer_clone_fork_exit;
	uint64_t n_drops_buffer_execve_enter;
	uint64_t n_drops_buffer_execve_exit;
	uint64_t n_drops_buffer_connect_enter;
	uint64_t n_drops_buffer_connect_exit;
	uint64_t n_drops_buffer_open_enter;
	uint64_t n_drops_buffer_open_exit;
	uint64_t n_drops_buffer_dir_file_enter;
	uint64_t n_drops_buffer_dir_file_exit;
	uint64_t n_drops_buffer_other_interest_enter;
	uint64_t n_drops_buffer_other_interest_exit;
	uint64_t n_drops_scratch_map; ///< Number of dropped events caused by full frame scratch map.
	uint64_t n_drops_pf; ///< Number of dropped events caused by invalid memory access.
	uint64_t n_drops_bug; ///< Number of dropped events caused by an invalid condition in the kernel instrumentation.
	uint64_t n_preemptions; ///< Number of preemptions.
	uint64_t n_suppressed; ///< Number of events skipped due to the tid being in a set of suppressed tids.
	uint64_t n_tids_suppressed; ///< Number of threads currently being suppressed.
        bpf_prog_stats sys_enter;
        bpf_prog_stats sys_exit;
        //...
}scap_stats;

in the old probe, you can just cut and paste the actual implementation, while in the modern bpf we should do something like this

	for(int index = 0; index < g_state.n_possible_cpus; index++)
	{
		if(bpf_map_lookup_elem(counter_maps_fd, &index, &cnt_map) < 0)
		{
			snprintf(error_message, MAX_ERROR_MESSAGE_LEN, "unbale to get the counter map for CPU %d", index);
			pman_print_error((const char *)error_message);
			goto clean_print_stats;
		}
		stats->n_evts += cnt_map.n_evts;
		stats->n_drops_buffer += cnt_map.n_drops_buffer;
		stats->n_drops_scratch_map += cnt_map.n_drops_max_event_size;
		stats->n_drops += (cnt_map.n_drops_buffer + cnt_map.n_drops_max_event_size);
	}

	/* Enhanced logic... */
	int ret = 0;
	struct bpf_prog_info info = {0};
	u32 info_len = sizeof(info);
	ret = bpf_obj_get_info_by_fd(bpf_program__fd(g_state.skel->progs.sys_enter), &info, &info_len);
	if(ret!=0)
	{
		stats->sys_enter.run_cnt = info.run_cnt;
		stats->sys_enter.run_time_ns = info.run_time_ns;
		// ...
	}

I don't like too much this approach but I don't see many other ways, at least right now 🤔

Some random thoughts:

doing all this stuff into scap_get_stats could be time-consuming if we do it frequently, so maybe we need to add a flag into the scap_stats struct, something like bool enhanced_stats, to collect these new stats only on demand. WDYT?
I would try to converge all the new stats into the scap_get_stats method. We are adding different methods to get different metrics from different places, maybe it would be better to find a unique place and enable these new logic on demand (?) ☝️

incertum · 2023-04-03T17:31:48Z

Uhm I have some doubts about this PR... Let's say in this release we have removed the tp logic from sinsp and we have pushed it into the engines, this is because the tracepoint concept is something that the user should not know about. Now in this PR, we are exposing stats about tracepoints to userspace... this is in some way against what we did some weeks ago :/ do we need again a shared enum between engines to loop over tracepoints (called now "attached progs")? This is precisely what we removed some changes ago, now every engine has its enum/logic because every engine could have different progs attached... WDYT? @jasondellaluce @FedeDP

Thank you @Andreagit97 for your input. Offering some additional context:

Agree that for the instrumentation over Falco rules the end user does not need to know about tracepoints, however when it comes to resource utilization and additional performance metrics such as the ones introduced here yes the end user absolutely needs to know about the tracepoints. This is because you need to ensure large scale production stability.
The good news is that the shared enum here appears fine as it's just for passive stats collections, meaning the indices don't mean anything to any of the engines and don't map to an exact tracepoint as a consequence, it's just a random index. We would however need to know the absolute MAX tracepoints to ensure we allocate it appropriately for both bpf drivers. Then a different index means a different tracepoints between bpf and modern_bpf and this is fine as for the stats reporting we just loop over them and extract the tracepoint name and respective metrics dynamically from the struct, no hard-coding here. Also no hard-coding for the Falco logs we will use the name as key for the logs dynamically (of course with appropriate checks in place), such a more auto-discovery approach is very common in data pipelines in ETLs.

If we want to expose again some tracepoints information, probably I would enrich the actual scap_stats with new info. I would avoid another v-table method since the new logic is just related to bpf engines :/ Maybe we could have something like this

typedef struct bpf_prog_stats
{
  uint64_t run_cnt;
  uint64_t run_time_ns;
  // ...
}bpf_prog_stats;

/*!
  \brief Statistics about an in progress capture
*/
typedef struct scap_stats
{
	uint64_t n_evts; ///< Total number of events that were received by the driver.
	uint64_t n_drops; ///< Number of dropped events.
	uint64_t n_drops_buffer; ///< Number of dropped events caused by full buffer.
	uint64_t n_drops_buffer_clone_fork_enter;
	uint64_t n_drops_buffer_clone_fork_exit;
	uint64_t n_drops_buffer_execve_enter;
	uint64_t n_drops_buffer_execve_exit;
	uint64_t n_drops_buffer_connect_enter;
	uint64_t n_drops_buffer_connect_exit;
	uint64_t n_drops_buffer_open_enter;
	uint64_t n_drops_buffer_open_exit;
	uint64_t n_drops_buffer_dir_file_enter;
	uint64_t n_drops_buffer_dir_file_exit;
	uint64_t n_drops_buffer_other_interest_enter;
	uint64_t n_drops_buffer_other_interest_exit;
	uint64_t n_drops_scratch_map; ///< Number of dropped events caused by full frame scratch map.
	uint64_t n_drops_pf; ///< Number of dropped events caused by invalid memory access.
	uint64_t n_drops_bug; ///< Number of dropped events caused by an invalid condition in the kernel instrumentation.
	uint64_t n_preemptions; ///< Number of preemptions.
	uint64_t n_suppressed; ///< Number of events skipped due to the tid being in a set of suppressed tids.
	uint64_t n_tids_suppressed; ///< Number of threads currently being suppressed.
        bpf_prog_stats sys_enter;
        bpf_prog_stats sys_exit;
        //...
}scap_stats;

in the old probe, you can just cut and paste the actual implementation, while in the modern bpf we should do something like this

	for(int index = 0; index < g_state.n_possible_cpus; index++)
	{
		if(bpf_map_lookup_elem(counter_maps_fd, &index, &cnt_map) < 0)
		{
			snprintf(error_message, MAX_ERROR_MESSAGE_LEN, "unbale to get the counter map for CPU %d", index);
			pman_print_error((const char *)error_message);
			goto clean_print_stats;
		}
		stats->n_evts += cnt_map.n_evts;
		stats->n_drops_buffer += cnt_map.n_drops_buffer;
		stats->n_drops_scratch_map += cnt_map.n_drops_max_event_size;
		stats->n_drops += (cnt_map.n_drops_buffer + cnt_map.n_drops_max_event_size);
	}

	/* Enhanced logic... */
	int ret = 0;
	struct bpf_prog_info info = {0};
	u32 info_len = sizeof(info);
	ret = bpf_obj_get_info_by_fd(bpf_program__fd(g_state.skel->progs.sys_enter), &info, &info_len);
	if(ret!=0)
	{
		stats->sys_enter.run_cnt = info.run_cnt;
		stats->sys_enter.run_time_ns = info.run_time_ns;
		// ...
	}

I don't like too much this approach but I don't see many other ways, at least right now 🤔

Curious to hear more feedback. Thinking behind introducing a new libbpf stats struct is that we want to call the libbpf stats less frequently and opt-in. We could accomplish some gating within the existing get_stats, but would we want to or would keeping them separate be cleaner?

Re the implementation suggestion above would be hesitant to hard-code out an exact struct by tracepoint name as it's going to generate a lot more maintenance overhead. The simple array that we populate with the new bpf struct is more dynamic and will automatically extend to new tracepoints as the name is merely part of the struct.

Some random thoughts:

doing all this stuff into scap_get_stats could be time-consuming if we do it frequently, so maybe we need to add a flag into the scap_stats struct, something like bool enhanced_stats, to collect these new stats only on demand. WDYT?

I would try to converge all the new stats into the scap_get_stats method. We are adding different methods to get different metrics from different places, maybe it would be better to find a unique place and enable these new logic on demand (?) ☝️

Heard, see comment above. At the end either way will be fine, let's collect more feedback and collectively make a decision.

gnosek · 2023-04-03T17:51:17Z

Yay, another tracepoint discussion :D

As much as I'm against exposing tracepoints in the scap/sinsp API, I'm all for returning info about them in resource utilization stats, as long as we explicitly document them as unstable (not part of any public contract).

On an implementation level, personally I would add yet another vtable method (eventually maybe drop the current get_stats one) with an API like:

struct scap_stat
{
  const char* name;
  // maybe some metadata here, like the unit as an enum
  int64_t value;
};

int32_t get_stats(struct engine_handle engine, void** cursor, struct scap_stat* buf, size_t buf_size);

where the engine has full control over the names of the stats fields and they need not have any specific structure (unless we choose to do so)

It would be used like this:

#define BUF_SIZE 42
void* cursor = NULL;
struct scap_stat stats[BUF_SIZE];

while(1) {
  int32_t ret = get_stats(engine, &cursor, stats, sizeof(stats));
  if(ret < 0) {
    return SCAP_FAILURE;
  }

  if(ret == 0)
  {
    // all done
    break;
  }

  ASSERT(ret <= sizeof(stats));
  for(size_t i = 0; i < ret; ++i)
  {
    printf("%s = %lld\n", stats[i].name, stats[i].value);
  }
}

i.e. the caller provides a buffer of arbitrary size and the engine fills it with as many stats as it likes, using the cursor variable to keep track across calls (if it has more stats than would fit in the buffer).

Of course there are many decisions here, like const char* vs char[SOME_SIZE] (or maybe an enum for the well known currently built in stats and a string for custom stuff) etc., but overall it feels:

simple enough to be used from C
flexible enough to provide stats at different layers with low overhead
composable (one component can delegate a part of its stats to another one with minimal ceremony)

jasondellaluce · 2023-04-03T17:56:24Z

Will take some time to review this deeper, but I'm 100% onboard with @gnosek's proposal. This will also be the first step for a dynamic and implementation agnostic stats collection, which we would still want to achieve in libsinsp somewhere in the future. I think also satisfies both the information hiding and flexibility need that we brought to the table.

incertum · 2023-04-03T18:00:20Z

Thanks @gnosek and @jasondellaluce for a preview, meaning this is now 3 times feedback for consolidating to one stats, hence would call this a decision wrt to this aspect. As far as the implementation details let's gather more feedback and something dynamic like @gnosek suggested would make me happy :) Plus we need to have gates in place to tell which stats category to look up so that it can be called frequently without unnecessary lookups if for example all you want is the counters not the libbpf stats ...

gnosek · 2023-04-03T18:01:45Z

Plus we need to have gates in place to tell which stats category to look up

Just a thought but the engines have a ->configure method, we can add new settings there easily

incertum · 2023-04-03T18:04:05Z

Great, let me move this PR to WIP since we need to refactor it quite a bit before it's ready! The more feedback the better and I'll start a refactor trying to take everyones input into consideration.

incertum · 2023-04-04T06:06:29Z

Pushed a staging commit for checkin. Not yet as elegant as @gnosek suggestion, because Grzegorz is like light years somewhere else in C coding skills 🤣

Furthermore, could we ease in the transition with an interim scap_stats_new approach which would include the new libbpf metrics and existing counters, because otherwise I will break everything including all clients besides Falco.

Flattening out the schema kind of works pretty well, more ideas around the schema? The chosen metadata field names are generic enough to cover many metrics use cases.

typedef struct scap_stats_new
{
	/* Metadata */
	char name[NAME_MAX];
	uint32_t type;
	uint32_t id;
	int valid;
	/* Stats (key, value) */
	char key[NAME_MAX];
	uint64_t value;
	// todo: add unit enum
}scap_stats_new;

current new scap open test output:

Stats name [sys_enter] key [run_cnt] and value [4382]
Stats name [sys_enter] key [run_time_ns] and value [13841362]
Stats name [sys_enter] key [avg_time_ns] and value [3158]
Stats name [sys_exit] key [run_cnt] and value [4383]
Stats name [sys_exit] key [run_time_ns] and value [16531416]
Stats name [sys_exit] key [avg_time_ns] and value [3771]
Stats name [sched_process_e] key [run_cnt] and value [1]
Stats name [sched_process_e] key [run_time_ns] and value [6843]
Stats name [sched_process_e] key [avg_time_ns] and value [6843]
Stats name [sched_switch] key [run_cnt] and value [1984]
Stats name [sched_switch] key [run_time_ns] and value [13562477]
Stats name [sched_switch] key [avg_time_ns] and value [6835]
Stats name [page_fault_user] key [run_cnt] and value [275]
Stats name [page_fault_user] key [run_time_ns] and value [515259]
Stats name [page_fault_user] key [avg_time_ns] and value [1873]
Stats name [page_fault_kern] key [run_cnt] and value [11]
Stats name [page_fault_kern] key [run_time_ns] and value [3279]
Stats name [page_fault_kern] key [avg_time_ns] and value [298]
Stats name [signal_deliver] key [run_cnt] and value [2]
Stats name [signal_deliver] key [run_time_ns] and value [6953]
Stats name [signal_deliver] key [avg_time_ns] and value [3476]

If we agree that this is the right first iteration, would go ahead and remove all the libbpf_stats references and cleanup more as I also wasn't sure where to init the stats buffer including defining the buffer size, feedback appreciated :)

…tering Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>

Co-authored-by: Jason Dellaluce <jasondellaluce@gmail.com> Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>

Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>

Co-authored-by: Grzegorz Nosek <grzegorz.nosek@sysdig.com> Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>

* apply reviewers feedback * stats static const sized allocated in each engine for now * general cleanup, such as re-audit returns on possible failures * more comments Co-authored-by: Jason Dellaluce <jasondellaluce@gmail.com> Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>

Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>

Co-authored-by: Andrea Terzolo <andrea.terzolo@polito.it> Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>

Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>

Co-authored-by: Grzegorz Nosek <grzegorz.nosek@sysdig.com> Co-authored-by: Jason Dellaluce <jasondellaluce@gmail.com> Co-authored-by: Andrea Terzolo <andrea.terzolo@polito.it> Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>

Co-authored-by: Grzegorz Nosek <grzegorz.nosek@sysdig.com> Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>

userspace/libpman/src/capture.c

userspace/libscap/engine/bpf/scap_bpf.c

gnosek · 2023-04-14T06:03:46Z

userspace/libscap/engine/bpf/scap_bpf.c

+				if (offset > nstats_allocated - 1)
+				{
+					break;
+				}


It seems it's a bug if we ended up in here, right?

Yes would say so, want some error message

yes probably here we need to set an error message and return immediately with SCAP_FAILURE since we should never reach this point

Co-authored-by: Grzegorz Nosek <grzegorz.nosek@sysdig.com> Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>

poiana · 2023-04-17T09:06:31Z

LGTM label has been added.

Git tree hash: 5e266b79eed23903660fde26efa7a8ce017f3e86

Andreagit97

I will approve it so we can go on with the next steps, but I would like to see these comments addressed before releasing the final version :)
/approve

Andreagit97 · 2023-04-14T17:06:43Z

userspace/libpman/src/capture.c

+				snprintf(error_message, MAX_ERROR_MESSAGE_LEN, "unable to get the counter map for CPU %d", index);
+				pman_print_error((const char *)error_message);
+				close(counter_maps_fd);
+				return -ret;


Suggested change

return -ret;

return SCAP_FAILURE;

Andreagit97 · 2023-04-14T18:30:44Z

test/libscap/test_suites/engines/kmod/kmod.cpp

+	uint32_t nstats;
+	int32_t rc;
+	const scap_stats_v2* stats_v2;
+	stats_v2 = scap_get_stats_v2(h, flags, &nstats, &rc);


for other PRs: we could also test failure cases like a NULL stats vector or stuff like that

Andreagit97 · 2023-04-17T09:56:15Z

userspace/libscap/engine/modern_bpf/scap_modern_bpf.c

+const struct scap_stats_v2* scap_modern_bpf__get_stats_v2(struct scap_engine_handle engine, uint32_t flags, OUT uint32_t* nstats, OUT int32_t* rc)
+{
+	*rc = SCAP_SUCCESS;
+	if(pman_get_scap_stats_v2((void*)engine.m_handle->m_stats, flags, (void*)nstats))


Suggested change

if(pman_get_scap_stats_v2((void*)engine.m_handle->m_stats, flags, (void*)nstats))

if(pman_get_scap_stats_v2((void*)engine.m_handle->m_stats, flags, nstats))

Andreagit97 · 2023-04-17T10:01:00Z

userspace/libpman/src/capture.c

+{
+	struct scap_stats_v2 *stats = (struct scap_stats_v2 *)scap_stats_v2_struct;
+	*nstats = 0;
+	int ret;


if we use SCAP_FAILURE this should be no more necessary

Suggested change

int ret;

Andreagit97 · 2023-04-17T10:09:13Z

userspace/libscap/engine/bpf/scap_bpf_stats.h

+	RUN_TIME_NS,
+	AVG_TIME_NS,
+	BPF_MAX_LIBBPF_STATS,
+};


we can add the typedef also here like with bpf_kernel_counters_stats

Andreagit97 · 2023-04-17T10:27:52Z

userspace/libscap/engine/gvisor/scap_gvisor.cpp

+const struct scap_stats_v2* engine::get_stats_v2(uint32_t flags, uint32_t* nstats, int32_t* rc)
+{
+	*nstats = scap_gvisor::stats::MAX_GVISOR_COUNTERS_STATS;
+	scap_stats_v2* stats = engine::m_stats;


Suggested change

scap_stats_v2* stats = engine::m_stats;

scap_stats_v2* stats = this->m_stats;

Andreagit97 · 2023-04-17T10:34:53Z

userspace/libscap/engine/modern_bpf/scap_modern_bpf.c

@@ -29,6 +29,13 @@ limitations under the License.
 #include <sys/utsname.h>
 #include "ringbuffer/ringbuffer.h"

+const char * const modern_bpf_kernel_counters_stats_names[] = {


capture.c in libpman seems the right place for this... we don't use it here

Andreagit97 · 2023-04-17T10:36:09Z

userspace/libscap/engine/modern_bpf/scap_modern_bpf_stats.h

+	MODERN_BPF_MAX_KERNEL_COUNTERS_STATS
+}modern_bpf_kernel_counters_stats;
+
+extern const char * const modern_bpf_kernel_counters_stats_names[];


if we move the declaration in capture.c we don't need this

Andreagit97 · 2023-04-17T10:40:04Z

userspace/libscap/scap_stats_v2.h

+//
+// machine_info flags
+//
+#define PPM_BPF_STATS_ENABLED (1 << 0)


we could remove the PPM_ prefix and use just SCAP_ WDYT?

userspace/libsinsp/sinsp.cpp

poiana · 2023-04-17T10:45:43Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Andreagit97, incertum, jasondellaluce

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [Andreagit97,incertum,jasondellaluce]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

poiana added release-note kind/cleanup dco-signoff: yes kind/feature New feature or request area/driver-bpf approved area/driver-modern-bpf area/libscap-engine-bpf area/libscap-engine-modern-bpf area/libscap labels Apr 2, 2023

poiana requested review from hbrueckner and mstemm April 2, 2023 23:54

poiana added area/libpman area/libsinsp size/L labels Apr 2, 2023

incertum added this to the 0.11.0 milestone Apr 2, 2023

incertum changed the title ~~new(libscap, libsinsp): add libbpf bpftool prog show like scap_libbpf_stats~~ new(libscap, libsinsp): add libbpf bpftool prog show like scap_libbpf_stats (new metrics 3/n) Apr 3, 2023

incertum changed the title ~~new(libscap, libsinsp): add libbpf bpftool prog show like scap_libbpf_stats (new metrics 3/n)~~ wip: new(libscap, libsinsp): add libbpf bpftool prog show like scap_libbpf_stats (new metrics 3/n) Apr 3, 2023

poiana added do-not-merge/work-in-progress size/XL and removed size/L labels Apr 3, 2023

incertum and others added 11 commits April 13, 2023 23:29

new(libscap): add flags to scap_stats_v2 schema for consumer side fil…

a703512

…tering Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>

cleanup(libscap): refactor scap_stats_v2 API boundaries

6fa2eb3

Co-authored-by: Jason Dellaluce <jasondellaluce@gmail.com> Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>

cleanup(libscap): use union in scap_stats_v2 metric values

03e6120

Co-authored-by: Jason Dellaluce <jasondellaluce@gmail.com> Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>

new(libsinsp): new scap_stats_v2 API

3da1667

Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>

fix(libscap): fix strncat

cf41399

Co-authored-by: Grzegorz Nosek <grzegorz.nosek@sysdig.com> Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>

new(test): add unit tests for scap_stats_v2

cac453e

Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>

cleanup(libscap): cleanup scap_stats_v2

582fb77

Co-authored-by: Andrea Terzolo <andrea.terzolo@polito.it> Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>

cleanup(libscap): move some flags to scap.h

838d17d

Co-authored-by: Andrea Terzolo <andrea.terzolo@polito.it> Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>

cleanup(libscap): allocate bpf m_stats based on m_attached_progs

407ffe1

Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>

cleanup(libscap): new scap_stats_v2.h

8190247

Co-authored-by: Grzegorz Nosek <grzegorz.nosek@sysdig.com> Co-authored-by: Jason Dellaluce <jasondellaluce@gmail.com> Co-authored-by: Andrea Terzolo <andrea.terzolo@polito.it> Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>

incertum dismissed Andreagit97’s stale review via fe59888 April 13, 2023 23:48

incertum force-pushed the libbpf-stats branch from bcebe02 to fe59888 Compare April 13, 2023 23:48

poiana removed the lgtm label Apr 13, 2023

poiana requested a review from Andreagit97 April 13, 2023 23:48

cleanup(libscap): refine scap_stats_v2

d94b12a

Co-authored-by: Grzegorz Nosek <grzegorz.nosek@sysdig.com> Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>

incertum force-pushed the libbpf-stats branch from fe59888 to d94b12a Compare April 14, 2023 02:35

gnosek reviewed Apr 14, 2023

View reviewed changes

userspace/libpman/src/capture.c Outdated Show resolved Hide resolved

gnosek reviewed Apr 14, 2023

View reviewed changes

userspace/libscap/engine/bpf/scap_bpf.c Outdated Show resolved Hide resolved

gnosek reviewed Apr 14, 2023

View reviewed changes

cleanup(libscap): minor tweaks scap_stats_v2

9472f86

Co-authored-by: Grzegorz Nosek <grzegorz.nosek@sysdig.com> Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>

jasondellaluce approved these changes Apr 17, 2023

View reviewed changes

poiana assigned jasondellaluce Apr 17, 2023

poiana added the lgtm label Apr 17, 2023

Andreagit97 approved these changes Apr 17, 2023

View reviewed changes

poiana merged commit 16580de into falcosecurity:master Apr 17, 2023

incertum mentioned this pull request Apr 18, 2023

cleanup(libsinsp): refactor resource_utilization, adopt scap_stats_v2 schema #1054

Merged

incertum deleted the libbpf-stats branch December 8, 2023 20:40

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

new(libscap, libsinsp): new `scap_stats_v2` API, featuring new `libbpf` bpftool prog show like stats for bpf (new metrics 3/n) #1021

new(libscap, libsinsp): new `scap_stats_v2` API, featuring new `libbpf` bpftool prog show like stats for bpf (new metrics 3/n) #1021

incertum commented Apr 2, 2023 •

edited

Loading

incertum commented Apr 3, 2023

Andreagit97 commented Apr 3, 2023

incertum commented Apr 3, 2023

gnosek commented Apr 3, 2023

jasondellaluce commented Apr 3, 2023

incertum commented Apr 3, 2023

gnosek commented Apr 3, 2023

incertum commented Apr 3, 2023

incertum commented Apr 4, 2023

gnosek Apr 14, 2023

incertum Apr 14, 2023

Andreagit97 Apr 17, 2023

poiana commented Apr 17, 2023

Andreagit97 left a comment

Andreagit97 Apr 14, 2023

Andreagit97 Apr 14, 2023

Andreagit97 Apr 17, 2023

Andreagit97 Apr 17, 2023

Andreagit97 Apr 17, 2023

Andreagit97 Apr 17, 2023

Andreagit97 Apr 17, 2023

Andreagit97 Apr 17, 2023

Andreagit97 Apr 17, 2023

poiana commented Apr 17, 2023

	if(pman_get_scap_stats_v2((void)engine.m_handle->m_stats, flags, (void)nstats))
	if(pman_get_scap_stats_v2((void*)engine.m_handle->m_stats, flags, nstats))

	scap_stats_v2* stats = engine::m_stats;
	scap_stats_v2* stats = this->m_stats;

new(libscap, libsinsp): new scap_stats_v2 API, featuring new libbpf bpftool prog show like stats for bpf (new metrics 3/n) #1021

new(libscap, libsinsp): new scap_stats_v2 API, featuring new libbpf bpftool prog show like stats for bpf (new metrics 3/n) #1021

Conversation

incertum commented Apr 2, 2023 • edited Loading

incertum commented Apr 3, 2023

Andreagit97 commented Apr 3, 2023

incertum commented Apr 3, 2023

gnosek commented Apr 3, 2023

jasondellaluce commented Apr 3, 2023

incertum commented Apr 3, 2023

gnosek commented Apr 3, 2023

incertum commented Apr 3, 2023

incertum commented Apr 4, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

poiana commented Apr 17, 2023

Andreagit97 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

poiana commented Apr 17, 2023

new(libscap, libsinsp): new `scap_stats_v2` API, featuring new `libbpf` bpftool prog show like stats for bpf (new metrics 3/n) #1021

new(libscap, libsinsp): new `scap_stats_v2` API, featuring new `libbpf` bpftool prog show like stats for bpf (new metrics 3/n) #1021

incertum commented Apr 2, 2023 •

edited

Loading