updating policy event logs with metadata #296
base: event-log-enhancement
Conversation
Force-pushed from 552b7a3 to 992f481
Force-pushed from 992f481 to e46e6d4
pkg/ebpf/events/events.go
Outdated
func logPolicyEventsWithK8sMetadata(log logr.Logger, message *string, nodeName, srcIP string, srcPort int, destIP string, destPort int, protocol, verdict string) {
	srcName, srcNS, err := utils.GetPodMetadata(srcIP)
	if err != nil {
		log.Info("Failed to get source name and namespace metadata", "source IP:", srcIP)
Let's do away with this. We're already making an assumption that an unknown IP is an external IP to the cluster...
pkg/ebpf/events/events.go
Outdated
	}
	destName, destNS, err := utils.GetPodMetadata(destIP)
	if err != nil {
		log.Info("Failed to get destination name and namespace metadata for", "dest IP:", destIP)
Same as above. Let's directly call utils.LogFlowInfo instead from capturePolicyEvents.
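A minimal sketch of that suggestion, assuming LogFlowInfo's signature from the diff below; the helper name logPolicyEvent is hypothetical. On a lookup failure the fields are left empty, consistent with treating the unknown IP as external to the cluster:

func logPolicyEvent(log logr.Logger, message *string, nodeName, srcIP string, srcPort int, destIP string, destPort int, protocol, verdict string) {
	srcName, srcNS, err := utils.GetPodMetadata(srcIP)
	if err != nil {
		srcName, srcNS = "", "" // unknown IP: assume it is external to the cluster
	}
	destName, destNS, err := utils.GetPodMetadata(destIP)
	if err != nil {
		destName, destNS = "", ""
	}
	// Call utils.LogFlowInfo directly instead of logging a separate failure message.
	utils.LogFlowInfo(log, message, nodeName, srcIP, srcName, srcNS, srcPort,
		destIP, destName, destNS, destPort, protocol, verdict)
}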
pkg/rpc/rpc_handler.go
Outdated
	rpc.RegisterNPBackendServer(grpcServer, &server{policyReconciler: policyReconciler, log: rpcLog})

	// Connect to metadata cache service
	serviceIP, err := utils.GetServiceIP(clientset, "metadata-cache-service", "default", rpcLog)
We're making an assumption that the aws-k8s-metadata service is available, and we exit if we fail to find such a service on the cluster. We can't exit out of the process. We should make this optional, i.e., we print an enhanced log message if we find the service, and if not we print the 5-tuple info of the flow (SrcIP/SrcPort/DstIP/DstPort/Protocol) like we do today.
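A sketch of that fallback, assuming GetServiceIP's signature from the diff above; the helper name and the clientset/logger types are illustrative:

// lookupMetadataService probes for the metadata cache service once at startup.
// It returns ok=false instead of exiting, so callers can fall back to the
// current 5-tuple log format.
func lookupMetadataService(clientset kubernetes.Interface, log logr.Logger) (string, bool) {
	serviceIP, err := utils.GetServiceIP(clientset, "metadata-cache-service", "default", log)
	if err != nil {
		log.Info("Metadata cache service not found; policy event logs will use the 5-tuple format")
		return "", false
	}
	log.Info("Metadata cache service found", "serviceIP", serviceIP)
	return serviceIP, true
}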
metadata-cache-service --> let's make this a const and maybe add an aws prefix to it.
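For instance (the exact name is illustrative):

const awsMetadataCacheServiceName = "aws-metadata-cache-service"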
I see that we're creating a K8S client obj just to find the service IP of the metadata cache service, and there is no need to do it. We can just resolve the service name (DNS resolution) and that should provide us with the service IP.
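A self-contained sketch of that approach, using cluster DNS; the service name and default namespace are assumptions for illustration:

package main

import (
	"fmt"
	"net"
)

// resolveServiceIP resolves a Kubernetes service's ClusterIP via cluster DNS,
// removing the need for a Kubernetes API client.
func resolveServiceIP(service, namespace string) (string, error) {
	addrs, err := net.LookupHost(fmt.Sprintf("%s.%s.svc.cluster.local", service, namespace))
	if err != nil {
		return "", fmt.Errorf("resolving %s/%s: %w", namespace, service, err)
	}
	if len(addrs) == 0 {
		return "", fmt.Errorf("no addresses found for %s/%s", namespace, service)
	}
	return addrs[0], nil
}

func main() {
	ip, err := resolveServiceIP("aws-metadata-cache-service", "default")
	if err != nil {
		fmt.Println("service not found; falling back to 5-tuple logs:", err)
		return
	}
	fmt.Println("metadata cache service IP:", ip)
}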
pkg/ebpf/events/events.go
Outdated
"Proto", protocol, "Verdict", verdict) | ||
|
||
message = "Node: " + nodeName + ";" + "SIP: " + utils.ConvByteArrayToIP(rb.SourceIP) + ";" + "SPORT: " + strconv.Itoa(int(rb.SourcePort)) + ";" + "DIP: " + utils.ConvByteArrayToIP(rb.DestIP) + ";" + "DPORT: " + strconv.Itoa(int(rb.DestPort)) + ";" + "PROTOCOL: " + protocol + ";" + "PolicyVerdict: " + verdict | ||
logPolicyEventsWithK8sMetadata(log, &message, nodeName, srcIP, srcPort, destIP, destPort, protocol, verdict) |
We should call this only if we find the aws-k8s-metadata service running on the cluster. If not, we print the log in the current format and move on; otherwise we will see empty values for the metadata fields in the log messages, and that will look odd.
@@ -83,6 +85,18 @@ func main() {
		os.Exit(1)
	}

	k8sconfig, err := rest.InClusterConfig()
I don't see a need for this. I see that we rely on this for deriving the service IP of the metadata cache service; instead, we can just resolve the metadata service name to get the service IP.
pkg/utils/utils.go
Outdated
func LogFlowInfo(log logr.Logger, message *string, nodeName, srcIP, srcN, srcNS string, srcPort int, destIP, destN, destNS string, destPort int, protocol, verdict string) {
	switch {
	case srcN == "" && srcNS == "": // if no metadata for source IP
We might have to rearrange this flow. Let's say destN and destNS are empty as well; we will end up printing "" strings for DN and DNS. The same applies to the rest of the combinations below... Also, why are we using SHOSTIP for the Source IP field here?
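One way to avoid enumerating every empty/non-empty combination in a switch is to build each endpoint's fields independently. A hypothetical sketch; the package and helper names are illustrative:

package flowlog

import "github.com/go-logr/logr"

// endpointFields builds key/value pairs for one side of the flow, omitting
// Name/Namespace when no pod metadata was found, so "" values are never printed.
func endpointFields(prefix, ip, name, ns string, port int) []interface{} {
	fields := []interface{}{prefix + " IP", ip, prefix + " Port", port}
	if name != "" || ns != "" {
		fields = append(fields, prefix+" Name", name, prefix+" Namespace", ns)
	}
	return fields
}

// logFlow composes both endpoints plus protocol/verdict into a single log line.
func logFlow(log logr.Logger, srcIP, srcN, srcNS string, srcPort int,
	destIP, destN, destNS string, destPort int, protocol, verdict string) {
	kv := append(endpointFields("Src", srcIP, srcN, srcNS, srcPort),
		endpointFields("Dest", destIP, destN, destNS, destPort)...)
	kv = append(kv, "Proto", protocol, "Verdict", verdict)
	log.Info("Flow Info: ", kv...)
}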
If it is a host networking pod such as aws-node pods or kube-proxy pods, we agreed to just state that the IP is a host IP with no name or namespace.
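An illustrative fragment of that agreement; the check and the nodeIP variable are assumptions, since host-networking pods share the node's IP:

	// Host-networking pods (e.g., aws-node, kube-proxy) share the node's IP,
	// so report the endpoint as a host IP and skip the name/namespace lookup.
	if srcIP == nodeIP {
		log.Info("Flow Info: ", "Src Host IP", srcIP, "Src Port", srcPort,
			"Dest IP", destIP, "Dest Port", destPort, "Proto", protocol, "Verdict", verdict)
		return
	}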
pkg/utils/utils.go
Outdated
log.Info("Flow Info: ", "Src IP", srcIP, "Src Name", srcName, "Src Namespace", srcName, "Src Port", srcPort, | ||
"Dest Ext Srv IP", destIP, "Dest Port", destPort, "Proto", protocol, "Verdict", verdict) | ||
*message = "Node: " + nodeName + ";" + "SIP: " + srcIP + ";" + "SN" + srcName + ";" + "SNS" + srcName + ";" + "SPORT: " + strconv.Itoa(srcPort) + ";" + | ||
"DEXTSRVIP: " + destIP + ";" + "DPORT: " + strconv.Itoa(destPort) + ";" + "PROTOCOL: " + protocol + ";" + "PolicyVerdict: " + verdict |
nit - "DEXTSRVIP" -> "DESTSRVIP"
Issue #, if available:
When users enable policy event logs, they contain only IP and port information for the source and destination, and users have to look up the name and namespace of the pod behind a given IP address. This makes debugging network policy issues inefficient.
Description of changes:
This PR improves visibility by letting the Network Policy Agent fetch pod Kubernetes metadata through a gRPC call to a new deployment on the customer's cluster. This deployment maintains a local cache of pod and service IP address to name/namespace mappings from the Kubernetes API server, syncing every 10 seconds.
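A rough sketch of such a cache's sync loop, assuming client-go and illustrative type names; it indexes pods only for brevity, whereas the PR also caches service IPs:

package metadatacache

import (
	"context"
	"sync"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// podMeta is the name/namespace info served to the agent over gRPC.
type podMeta struct {
	Name      string
	Namespace string
}

// Cache keeps an IP -> metadata map refreshed from the API server.
type Cache struct {
	mu   sync.RWMutex
	byIP map[string]podMeta
}

// sync lists all pods and rebuilds the IP index.
func (c *Cache) sync(ctx context.Context, cs kubernetes.Interface) error {
	pods, err := cs.CoreV1().Pods("").List(ctx, metav1.ListOptions{})
	if err != nil {
		return err
	}
	fresh := make(map[string]podMeta, len(pods.Items))
	for _, p := range pods.Items {
		if p.Status.PodIP != "" {
			fresh[p.Status.PodIP] = podMeta{Name: p.Name, Namespace: p.Namespace}
		}
	}
	c.mu.Lock()
	c.byIP = fresh
	c.mu.Unlock()
	return nil
}

// Run refreshes the cache every 10 seconds, matching the sync interval
// described above, until the context is cancelled.
func (c *Cache) Run(ctx context.Context, cs kubernetes.Interface) {
	ticker := time.NewTicker(10 * time.Second)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			_ = c.sync(ctx, cs) // errors could be logged; omitted for brevity
		}
	}
}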
Manual Testing
I did some testing of each of the functions and now the policy event logs look like this:
{"level":"info","ts":"2024-07-23T23:03:50.718Z","logger":"ebpf-client","caller":"utils/utils.go:106","msg":"Flow Info: ","Src IP":"192.168.42.144","Src Name":"client-5467bc68f4-zcl6g","Src Namespace":"client-5467bc68f4-zcl6g","Src Port":34765,"Dest IP":"192.168.77.216","Dest Name":"coredns-787cb67946-blpcn","Dest Namespace":"kube-system","Dest Port":53,"Proto":"UDP","Verdict":"DENY"}
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.