Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tetragon: sensor cleanup fixes #2968

Merged
merged 5 commits into from
Oct 5, 2024
Merged

tetragon: sensor cleanup fixes #2968

merged 5 commits into from
Oct 5, 2024

Conversation

olsajiri
Copy link
Contributor

@olsajiri olsajiri commented Oct 1, 2024

do better job on cleaning up after sensors is unloaded or fails to load

@olsajiri olsajiri added the release-note/minor This PR introduces a minor user-visible change label Oct 1, 2024
Copy link

netlify bot commented Oct 1, 2024

Deploy Preview for tetragon ready!

Name Link
🔨 Latest commit 5ea987f
🔍 Latest deploy log https://app.netlify.com/sites/tetragon/deploys/66fe98280434c6000861b287
😎 Deploy Preview https://deploy-preview-2968--tetragon.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@olsajiri olsajiri force-pushed the pr/olsajiri/fixes branch 3 times, most recently from 0391ee3 to c0d2d7d Compare October 2, 2024 14:26
@olsajiri olsajiri changed the title tetragon: Cleanup policy/sensor/progs directories tetragon: sensor cleanup fixes Oct 2, 2024
We did not properly cleanup the directory hierarchy we create
for policy/sensor/program.

Adding the missing cleanup and adding error check when creating
the directories.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Currently we do not remove map/progs bpffs hierarchy directories
if we fail to load the sensor for some reason, fixing that.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Moving loop out of loadMaps function, it will ease up following
patch, there's no functional change.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Currently we do not unload progs/maps that were loaded prior
the fail to load the sensor for some reason, fixing that.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Adding tests for proper cleanup after sensor unload.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
@olsajiri olsajiri marked this pull request as ready for review October 3, 2024 21:11
@olsajiri olsajiri requested a review from a team as a code owner October 3, 2024 21:11
@olsajiri olsajiri requested review from tixxdz and kkourt October 3, 2024 21:11
Copy link
Contributor

@kkourt kkourt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this looks good and I'm happy to merge it as is.

One thought I had was that maybe we would want to add a waning if the last sensor did not manage to remove the policy directory. So maybe something like:

diff --git a/pkg/sensors/collection.go b/pkg/sensors/collection.go
index 979517b61..6e4f8b2ba 100644
--- a/pkg/sensors/collection.go
+++ b/pkg/sensors/collection.go
@@ -5,9 +5,12 @@ package sensors
 
 import (
        "fmt"
+       "os"
+       "path/filepath"
        "sync"
 
        "github.com/cilium/tetragon/api/v1/tetragon"
+       "github.com/cilium/tetragon/pkg/option"
        "github.com/cilium/tetragon/pkg/tracingpolicy"
        "go.uber.org/multierr"
 )
@@ -134,6 +137,15 @@ func (c *collection) unload() error {
        if err != nil {
                return fmt.Errorf("failed to unload all sensors from collection %s: %w", c.name, err)
        }
+
+       if tp := c.tracingpolicy; tp != nil {
+               tpPath := filepath.Join(option.Config.BpfDir, sanitize(tp.TpName()))
+               _, err := os.Stat(path)
+               if os.IsNotExist(err) {
+                       return fmt.Erorrf("failed to remove policy direcotry: %s", tpPath)
+               }
+       }
+
        return nil
 }

As I was writing the above, however, I realized that we might have a more significant problem. Two policies might have the same name if they belong to different k8s namespaces. See for example #2337.

So I think we need something like:

diff --git a/pkg/sensors/tracing/policyhandler.go b/pkg/sensors/tracing/policyhandler.go
index 16603d3ab..a799a975b 100644
--- a/pkg/sensors/tracing/policyhandler.go
+++ b/pkg/sensors/tracing/policyhandler.go
@@ -25,6 +25,10 @@ func (h policyHandler) PolicyHandler(
 ) (sensors.SensorIface, error) {
 
        policyName := policy.TpName()
+       if tpNs, ok := tp.(tracingpolicy.TracingPolicyNamespaced); ok {
+               policyName = fmt.Printf("%s/%s", tpNs, policyName)
+       }
+
        spec := policy.TpSpec()

@olsajiri
Copy link
Contributor Author

olsajiri commented Oct 4, 2024

yea, we need to fix that.. let's address that in separate PR, thanks

@olsajiri olsajiri merged commit c55dbdd into main Oct 5, 2024
41 checks passed
@olsajiri olsajiri deleted the pr/olsajiri/fixes branch October 5, 2024 13:02
@olsajiri olsajiri restored the pr/olsajiri/fixes branch October 6, 2024 20:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-note/minor This PR introduces a minor user-visible change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants