-
Notifications
You must be signed in to change notification settings - Fork 387
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add EnableLogging
and LogLabel
supports for Node NetworkPolicy
#6626
Add EnableLogging
and LogLabel
supports for Node NetworkPolicy
#6626
Conversation
16431bb
to
cb09cbc
Compare
82bc18c
to
1589c2a
Compare
a829204
to
43efbe4
Compare
// used as iptables log prefix. According to https://ipset.netfilter.org/iptables-extensions.man.html, the prefix is | ||
// up to 29 letters long. | ||
if len(logLabel) > 28 { | ||
klog.InfoS("The log label exceeds 28 characters and will be truncated", "logLabel", logLabel) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we log this in the validation webhook (instead or in addition)? cc @tnqn for his opinion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with validating with a webhook.
One more thing, we need to add some basic information about the rule to the logLabel which will be used as iptables log prefix. The worst case is like the following:
| Antrea |:|Out|:|Reject|:| [user-provided label] |:|
| 6 |1|3 |1|6 |1| n |1|
As a result, the user-provided log label (n
) is up to 10 characters long. Will that be sufficient?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that having a fixed-limit, regardless of the rule direction or the rule action, may be the right approach.
Maybe we could replace "Out" with "O" and "In" with "I". I would give us 2 extra characters (making the limit 12 for the user-provided label), without really compromising readability too much. I was thinking of doing the same for the action, but I think it would hurt readability too much?
12 characters is not much, but we are heavily constrained by the implementation.
I was initially thinking of failing in the validation webhook if the limit is exceeded, but I now feel like it is better to just log a message warning the user. I'd log it once in the webhook (controller) and once in the agent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd log it once in the webhook (controller) and once in the agent.
Do you mean that something should be shown when a Node NetworkPolicy with a log label over 12 characters?
I made some changes to the validator. If I understood correctly, what I expected is that: when I use kubectl
to apply a Node NetworkPolicy with a log label over 12 characters, a warning message will be shown in the terminal. It seems that there is something wrong, after applying such Node NetworkPolicy, nothing is shown in the terminal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you will need to use the Warnings
field of the AdmissionResponse
: https://pkg.go.dev/k8s.io/api/admission/v1#AdmissionResponse
I am not sure we have done it in the past. Is this what you tried?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! That's what I need!
43efbe4
to
0244a6d
Compare
// used as iptables log prefix. According to https://ipset.netfilter.org/iptables-extensions.man.html, the prefix is | ||
// up to 29 letters long. | ||
if len(logLabel) > 28 { | ||
klog.InfoS("The log label exceeds 28 characters and will be truncated", "logLabel", logLabel) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This log is a little misleading, probably better to include the logLabel with prefix and the original one in Node NP.
klog.InfoS("The log label exceeds 28 characters and will be truncated", "logLabel", logLabel) | |
klog.InfoS("The log label exceeds 28 characters and will be truncated", "logLabel", logLabel, "rule.LogLabel", rule.LogLabel) |
I am wondering will this kind of truncation has any side-affect? this label seems likely to be duplicated in different Node NPs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this label seems likely to be duplicated in different Node NPs.
Do you mean that different Node NPs that have that same logLabels?
3dfb4ee
to
5df6efe
Compare
503637b
to
fa40171
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good on my side, only minor comments
EnableLogging: true, | ||
LogLabel: "ingress1", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just want to make sure that we still have a policy rule for which EnableLogging
is false, for the sake of testing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You just reminded me. I left one policy with EnableLogging
set to false
and another policy with EnableLogging
set to true
but with LogLabe
l left empty, for the sake of testing.
446c1da
to
4fef712
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good on my side
@luolanzone do you want to do another review?
Probably no time this week, I will check with hongliang offline for the question I posted. @antoninbas you can go ahead. Thanks. |
/test-all |
@hongliangl I just realized, shouldn't we have an e2e test for this? Without it we only have unit tests and no guarantee that it actually works |
Make sense, I'll add. |
test/e2e/nodenetworkpolicy_test.go
Outdated
assert.EventuallyWithT(t, func(tc *assert.CollectT) { | ||
// After generating some traffic, there should be corresponding kernel logs on hostNetwork Pod x/a, and the kernel | ||
// logs should contain the generated random logLabel and not contain the part that should be truncated. | ||
stdout, stderr, err = data.RunCommandFromPod(antreaNamespace, antreaPodName, agentContainerName, []string{"dmesg"}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better to limit the output of dmesg, the default output might be large. Maybe dmesg | tail -n 200
?
otherwise, the following codes will continuously checking the logs and fail at the last.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sense
ca3a41c
to
0bcf2ea
Compare
9a163b2
to
dc77645
Compare
test/e2e/nodenetworkpolicy_test.go
Outdated
re4 := regexp.MustCompile(fmt.Sprintf(`(?s)%s.*SRC=%s DST=%s`, expectedLogPrefix, srcIP4, dstIP4)) | ||
re6 := regexp.MustCompile(fmt.Sprintf(`(?s)%s.*SRC=%s DST=%s`, expectedLogPrefix, srcIP6, dstIP6)) | ||
|
||
assert.EventuallyWithT(t, func(tc *assert.CollectT) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: we usuall just name the *assert.CollectT
variable t
test/e2e/nodenetworkpolicy_test.go
Outdated
// logs should contain the generated random logLabel and not contain the part that should be truncated. | ||
stdout, stderr, err = data.RunCommandFromPod(antreaNamespace, antreaPodName, agentContainerName, []string{"sh", "-c", "dmesg | tail -n 200"}) | ||
if err != nil || stderr != "" { | ||
t.Fatalf("Got error: %v, %v", err, stderr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not technically correct. You cannot call Fatal
/ FailNow
from here, as it does not run in the main goroutine. This is why for such cases we tend to stick to wait.Poll
, until the fix to stretchr/testify#1396 is part of a testify release.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have reminded me. I have changed to use wait.Poll
and left a TODO to replace wait.Poll
with assert.Eventually
after stretchr/testify#1396 is merged.
dc77645
to
9f848c7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, you just need to address the linter error
9f848c7
to
786d692
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, one comment.
test/e2e/nodenetworkpolicy_test.go
Outdated
re4 := regexp.MustCompile(fmt.Sprintf(`(?s)%s.*SRC=%s DST=%s`, expectedLogPrefix, srcIP4, dstIP4)) | ||
re6 := regexp.MustCompile(fmt.Sprintf(`(?s)%s.*SRC=%s DST=%s`, expectedLogPrefix, srcIP6, dstIP6)) | ||
|
||
// TODO: Replace wait.PollUntilContextTimeout with assert.EventuallyWithT after https://github.com/stretchr/testify/issues/1396 is resolved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems the issue is already resolved, please check if you want to update in this PR or with a new PR to resolve this TODO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems the issue is already resolved
Exactly, but the latest release of github.com/stretchr/testify is v1.9.0, which doesn't include the fix for the issue.
786d692
to
aef1720
Compare
/test-kind-all |
This commit introduces limited support for traffic logging in Node NetworkPolicy. The limitations are: - Traffic logs are written only to the system log (not managed by Antrea). Users can filter logs using syslog filters. - The `LogLabel` for Node NetworkPolicy is restricted to a maximum of 12 characters. Node NetworkPolicy's data path is implemented via iptables. An iptables "non-terminating target" `LOG` is added before the final matching rule to log packets to the system kernel log. The logs provide packet match details, such as: ``` Sep 2 10:31:07 k8s-node-control-plane kernel: [6657320.789675] Antrea:I:Allow:allow-http:IN=ens224 OUT= MAC=00:50:56:a7:fb:18:00:50:56:a7:23:47:08:00 SRC=10.10.0.10 DST=192.168.240.200 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=52813 DF PROTO=TCP SPT=57658 DPT=80 WINDOW=64240 RES=0x00 SYN URGP=0 Sep 2 10:31:11 k8s-node-control-plane kernel: [6657324.899219] Antrea:I:Drop:default-drop:IN=ens224 OUT= MAC=00:50:56:a7:fb:18:00:50:56:a7:23:47:08:00 SRC=192.168.240.201 DST=192.168.240.200 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=27486 DF PROTO=TCP SPT=33152 DPT=80 WINDOW=64240 RES=0x00 SYN URGP=0 ``` The log prefix (e.g., `Antrea:I:Allow:allow-http:`) is up to 29 characters long and includes a user-provided log label (up to 12 characters). The log prefix format: ``` |---1--| |2| |---3--| |----------4--------| |Antrea|:|I|:|Reject|:|user-provided label|:| |6 |1|1|1|4-6 |1|1-12 |1| ``` - Part 1: Fixed, "Antrea" - Part 2: Direction, "I" (In) or "O" (Out) - Part 3: Action, "Allow", "Drop", or "Reject" - Part 4: User-provided log label, up to 12 characters Signed-off-by: Hongliang Liu <hongliang.liu@broadcom.com>
aef1720
to
4feb47c
Compare
@hongliangl was the last push just a rebase? |
/test-kind-all |
A rebase to resolve conflicts |
/test-kind-all |
/test-all |
This commit introduces limited support for traffic logging for Node NetworkPolicy. The limitations are: - Traffic logs are written only to the system log (not managed by Antrea). Users can filter logs using syslog filters. - The `LogLabel` for Node NetworkPolicy is restricted to a maximum of 12 characters. Node NetworkPolicy's data path is implemented via iptables. An iptables "non-terminating target" `LOG` is added before the final matching rule to log packets to the system kernel log. The logs provide packet match details. The log prefix (e.g., `Antrea:I:Allow:allow-http:`) is up to 29 characters long and includes a user-provided log label (up to 12 characters). The log prefix format: - Part 1: Fixed, "Antrea" - Part 2: Direction, "I" (In) or "O" (Out) - Part 3: Action, "Allow", "Drop", or "Reject" - Part 4: User-provided log label, up to 12 characters Signed-off-by: Hongliang Liu <hongliang.liu@broadcom.com>
For #6525
This commit introduces limited support for traffic logging in Node
NetworkPolicy. The limitations are:
Users can filter logs using syslog filters.
LogLabel
for Node NetworkPolicy is restricted to a maximum of12 characters.
Node NetworkPolicy's data path is implemented via iptables. An iptables
"non-terminating target"
LOG
is added before the final matching rule tolog packets to the system kernel log. The logs provide packet match details,
such as:
The log prefix (e.g.,
Antrea:I:Allow:allow-http:
) is up to 29 characters longand includes a user-provided log label (up to 12 characters). The log prefix format: