Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add auditd service #9620

Merged
merged 1 commit into from
Nov 2, 2024
Merged

Conversation

frezbo
Copy link
Member

@frezbo frezbo commented Nov 1, 2024

Adds a auditd service that gathers all audit logs from kernel.

@frezbo
Copy link
Member Author

frezbo commented Nov 1, 2024

@dsseng you could try this PR

@dsseng dsseng self-assigned this Nov 1, 2024
@dsseng
Copy link
Member

dsseng commented Nov 1, 2024

Will rebase and take a look now

@dsseng
Copy link
Member

dsseng commented Nov 1, 2024

working a bit on it to actually avoid printk output

@smira
Copy link
Member

smira commented Nov 1, 2024

working a bit on it to actually avoid printk output

we can probably parse events if that makes it easier to consume for human beings ;)

@dsseng
Copy link
Member

dsseng commented Nov 1, 2024

We can, yes

working a bit on it to actually avoid printk output

we can probably parse events if that makes it easier to consume for human beings ;)

@frezbo
Copy link
Member Author

frezbo commented Nov 1, 2024

working a bit on it to actually avoid printk output

we can probably parse events if that makes it easier to consume for human beings ;)

yeh I guess the library has auparse, also I tried using the github.com/mdlayher/netlink netlink library, though the elastic project expects a different interface, unless we want to wrap it

Copy link
Member

@dsseng dsseng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM otherwise. Unsure if a test is possible. Did you test without SELinux? Is anything in the audit log in that case? If so, we can add an integration for len > 1

internal/app/auditd/auditd.go Outdated Show resolved Hide resolved
@frezbo
Copy link
Member Author

frezbo commented Nov 1, 2024

LGTM otherwise. Unsure if a test is possible. Did you test without SELinux? Is anything in the audit log in that case? If so, we can add an integration for len > 1

without selinux there was nothing, so I was not sure if it was working at all 😅

@dsseng
Copy link
Member

dsseng commented Nov 1, 2024

we can probably parse events if that makes it easier to consume for human beings ;)

The current format fits well for audit2allow (it's the same as auditd or kauditd give). If we decompose those messages, let's be sure we keep the original so audit2allow could still be used.

@dsseng
Copy link
Member

dsseng commented Nov 1, 2024

LGTM otherwise. Unsure if a test is possible. Did you test without SELinux? Is anything in the audit log in that case? If so, we can add an integration for len > 1

without selinux there was nothing, so I was not sure if it was working at all 😅

Pretty weird, after my patch on top of your branch logs have some tens of messages, so an integration test should work

@dsseng
Copy link
Member

dsseng commented Nov 1, 2024

First ones are these, so we surely can know something is in audit once it's initialized:

172.20.0.2: type=CONFIG_CHANGE msg=audit(1730463345.419:39): op=set audit_pid=1 old=0 auid=4294967295 ses=4294967295 subj=kernel res=1
172.20.0.2: type=CONFIG_CHANGE msg=audit(1730463345.419:40): op=set audit_backlog_limit=4096 old=64 auid=4294967295 ses=4294967295 subj=kernel res=1

@dsseng
Copy link
Member

dsseng commented Nov 1, 2024

Ah, might be wrong, I've got SELinux cmdline as I didn't rebuild talosctl

@dsseng
Copy link
Member

dsseng commented Nov 1, 2024

No, even without SE it has a bulk of audit logs, try that again with my suggested change

@frezbo frezbo force-pushed the feat/auditd-service branch from 9dd38b8 to 5a1b498 Compare November 1, 2024 13:11
@frezbo frezbo marked this pull request as ready for review November 1, 2024 13:11
internal/app/auditd/auditd.go Outdated Show resolved Hide resolved
internal/app/auditd/auditd.go Outdated Show resolved Hide resolved
@frezbo frezbo force-pushed the feat/auditd-service branch 2 times, most recently from 9cacf59 to f009031 Compare November 1, 2024 13:41
internal/app/auditd/auditd.go Outdated Show resolved Hide resolved
@frezbo frezbo force-pushed the feat/auditd-service branch from f009031 to e93c44c Compare November 1, 2024 14:16
@smira smira force-pushed the feat/auditd-service branch from e93c44c to 1b09f63 Compare November 1, 2024 16:50
@smira
Copy link
Member

smira commented Nov 1, 2024

The library creates a socket without syscall.SOCK_CLOEXEC, we need to send an upstream fix and use a fork

Comment on lines 49 to 64
switch format {
case "summary":
sm := event.Summary

fmt.Fprintf(
s.writer,
`%s: sequence=%v category=%v type=%v actor=%v/%v action=%v thing=%v/%v how=%v tags=%v`+"\n",
event.Timestamp.Format("2006/01/02 15:04:05.000000"), event.Sequence, event.Category, event.Type, sm.Actor.Primary, sm.Actor.Secondary,
sm.Action, sm.Object.Primary, sm.Object.Secondary, sm.How, event.Tags,
)
case "yaml":
s.writer.Write([]byte("---\n")) //nolint:errcheck
yaml.NewEncoder(s.writer).Encode(event) //nolint:errcheck
case "json":
json.NewEncoder(s.writer).Encode(event) //nolint:errcheck,errchkjson
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a format SELinux tools, or maybe some generic Linux security tools people might want to pipe audit data into would accept. Honestly it looks way less human-readable than Linux default format :(

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe I'm doing something wrong that I cannot get raw logs using talosctl

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think raw should be closer, and we can adjust it

@frezbo
Copy link
Member Author

frezbo commented Nov 2, 2024

note to self: disable when platform mode is container

@frezbo
Copy link
Member Author

frezbo commented Nov 2, 2024

note to self: disable when platform mode is container

this was already handled

@frezbo
Copy link
Member Author

frezbo commented Nov 2, 2024

some thoughts, I think we should only enable if audit=1 kernel commandline is set which means a user explicitly wants this and we could also have a talos.auditd.format arg to control the format

@smira
Copy link
Member

smira commented Nov 2, 2024

some thoughts, I think we should only enable if audit=1 kernel commandline is set which means a user explicitly wants this and we could also have a talos.auditd.format arg to control the format

I think we should run auditd always, it doesn't hurt, and provides valuable logs (including troubleshooting).

As for the format, let's stick with something for now (e.g. raw), and create a ticket to revisit later before 1.9 release - as @dsseng is working on SELinux, we'll see what we need more. We can even have a parser outside of Talos (feed raw logs, convert to JSON).

We can probably make raw default for now

@dsseng
Copy link
Member

dsseng commented Nov 2, 2024

Yes, let's do raw for now, as it's likely to be more useful to people who use osquery or other common Linux tools for analyzing audit logs. And perhaps audit logs will be useful for later (e.g. if something doesn't work under SELinux it provides debug info), and also might be wanted by some customers (well, some users read audit)

@dsseng
Copy link
Member

dsseng commented Nov 2, 2024

@frezbo
Copy link
Member Author

frezbo commented Nov 2, 2024

some thoughts, I think we should only enable if audit=1 kernel commandline is set which means a user explicitly wants this and we could also have a talos.auditd.format arg to control the format

I think we should run auditd always, it doesn't hurt, and provides valuable logs (including troubleshooting).

Should we then skip dropping some audit events?

As for the format, let's stick with something for now (e.g. raw), and create a ticket to revisit later before 1.9 release - as @dsseng is working on SELinux, we'll see what we need more. We can even have a parser outside of Talos (feed raw logs, convert to JSON).

We can probably make raw default for now

actually we can add a parser in talosctl client side or just point to auparse which in itself is a cmd that can parse and print.

@dsseng
Copy link
Member

dsseng commented Nov 2, 2024

actually we can add a parser in talosctl client side or just point to auparse which in itself is a cmd that can parse and print

yes, we better do this. Just give out events as given by the kernel from Talos, and then process on the client side (also make it configurable via cli opts)

Should we then skip dropping some audit events?

Do we skip any? I believe 1300-2999 is quite complete

@frezbo
Copy link
Member Author

frezbo commented Nov 2, 2024

Should we then skip dropping some audit events?

Do we skip any? I believe 1300-2999 is quite complete

That's where i'm confused the comment says so, but AUDIT_USER_AUTH is 1100,
so existing code would skip over messages below 1100 and greater than 2999, the latter condition is correct, but not the former

// Messages from 1300-2999 are valid audit messages.
		if rawEvent.Type < auparse.AUDIT_USER_AUTH ||
			rawEvent.Type > auparse.AUDIT_LAST_USER_MSG2 {
			continue
		}

hmm I think it's the comment that's wrong since types b/w 1000-1100 are actually related to audit subsystem itself

t (
	AUDIT_GET                       AuditMessageType = 1000
	AUDIT_SET                       AuditMessageType = 1001
	AUDIT_LIST                      AuditMessageType = 1002
	AUDIT_ADD                       AuditMessageType = 1003
	AUDIT_DEL                       AuditMessageType = 1004
	AUDIT_USER                      AuditMessageType = 1005
	AUDIT_LOGIN                     AuditMessageType = 1006
	AUDIT_WATCH_INS                 AuditMessageType = 1007
	AUDIT_WATCH_REM                 AuditMessageType = 1008
	AUDIT_WATCH_LIST                AuditMessageType = 1009
	AUDIT_SIGNAL_INFO               AuditMessageType = 1010
	AUDIT_ADD_RULE                  AuditMessageType = 1011
	AUDIT_DEL_RULE                  AuditMessageType = 1012
	AUDIT_LIST_RULES                AuditMessageType = 1013
	AUDIT_TRIM                      AuditMessageType = 1014
	AUDIT_MAKE_EQUIV                AuditMessageType = 1015
	AUDIT_TTY_GET                   AuditMessageType = 1016
	AUDIT_TTY_SET                   AuditMessageType = 1017
	AUDIT_SET_FEATURE               AuditMessageType = 1018
	AUDIT_GET_FEATURE               AuditMessageType = 1019
	AUDIT_USER_AUTH                 AuditMessageType = 1100

Adds a auditd service that gathers all audit logs from kernel.

Signed-off-by: Noel Georgi <git@frezbo.dev>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Signed-off-by: Noel Georgi <git@frezbo.dev>
@frezbo frezbo force-pushed the feat/auditd-service branch from 1b09f63 to 9abf161 Compare November 2, 2024 16:55
@frezbo
Copy link
Member Author

frezbo commented Nov 2, 2024

I've dropped the log format and will implement in client side as a different PR, now logs as standard auditd format

@frezbo
Copy link
Member Author

frezbo commented Nov 2, 2024

/m

@talos-bot talos-bot merged commit 9abf161 into siderolabs:main Nov 2, 2024
50 checks passed
@frezbo frezbo deleted the feat/auditd-service branch November 2, 2024 17:21
@stereobutter
Copy link
Contributor

https://github.com/threathunters-io/laurel https://github.com/bfuzzy1/auditd-attack

some examples of software reading audit

We are using laurel at my day job to parse auditd logs. I run both auditd and laurel as daemonsets on my talos lab setup currently.

Are there any plans for machine config/ a controller for auditd? I'd expect at least the auditd rules are something that would make sense as machine config/controller. For other config, such as auditd plugins I think just dropping a config file into the respective auditd config dir would be fine since not everybody will use plugins (we use the audisp-af_unix plugin for laurel) if used plugin config doesn't change often.

@smira
Copy link
Member

smira commented Dec 3, 2024

No, no plans to so far besides what is in this PR.

@dsseng
Copy link
Member

dsseng commented Dec 3, 2024

Is your config still working? If not, we might want to have a disable flag to avoid Talos capturing the audit logs on its own

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants