Reduce item not found #439
Conversation
Force-pushed from 5710183 to cfec99e
Sorry for the late reply, and thanks for your contribution. The first issue concerns kindling/collector/pkg/metadata/kubernetes/k8scache.go, lines 153 to 166 (at be46fd6).
For the second issue, could you provide more details about the "multiple network cards" setup to help us validate your solution? How is the node used? Are there any more fields on such nodes? What would you like the data to look like? Maybe we should open a new issue to discuss this case. These questions are important for future reference.
```
# ip -4 a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
inet 10.10.40.25/24 brd 10.10.40.255 scope global em1
valid_lft forever preferred_lft forever
6: p5p1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
inet 10.10.66.25/24 brd 10.10.66.255 scope global p5p1
valid_lft forever preferred_lft forever
681: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
valid_lft forever preferred_lft forever
685: cilium_host@cilium_net: <BROADCAST,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
inet 171.0.1.42/32 scope link cilium_host
valid_lft forever preferred_lft forever
```

For example, we have the em1 network card as the business network, the p5p1 card as the cluster network, and the cilium_host card as the internal network. Since Kubernetes reports only one IP per node (in our case, the IP of p5p1), traffic on the other cards is marked as NOT_FOUND, which is not conducive to traffic analysis.
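To illustrate one way the multi-NIC situation could be handled, here is a minimal Go sketch using only the standard net package. It is not Kindling's actual code (localIPv4Addrs is a made-up name); it just shows how an agent could enumerate every IPv4 address on its node so that addresses on em1, p5p1, and cilium_host could all be matched instead of falling through to NOT_FOUND:

```go
// Hypothetical sketch (localIPv4Addrs is not Kindling's API): enumerate
// every IPv4 address on the local node so that traffic on secondary
// interfaces (em1, p5p1, cilium_host, ...) can be matched instead of
// falling through to NOT_FOUND.
package main

import (
	"fmt"
	"net"
)

// localIPv4Addrs returns a map from IPv4 address to the interface it
// belongs to, skipping loopback addresses.
func localIPv4Addrs() (map[string]string, error) {
	addrs := make(map[string]string)
	ifaces, err := net.Interfaces()
	if err != nil {
		return nil, err
	}
	for _, iface := range ifaces {
		ifAddrs, err := iface.Addrs()
		if err != nil {
			continue // skip interfaces we cannot query
		}
		for _, a := range ifAddrs {
			ipNet, ok := a.(*net.IPNet)
			if !ok {
				continue
			}
			if ip4 := ipNet.IP.To4(); ip4 != nil && !ip4.IsLoopback() {
				addrs[ip4.String()] = iface.Name
			}
		}
	}
	return addrs, nil
}

func main() {
	addrs, err := localIPv4Addrs()
	if err != nil {
		panic(err)
	}
	// On the node above this would print 10.10.40.25 -> em1,
	// 10.10.66.25 -> p5p1, 172.17.0.1 -> docker0, 171.0.1.42 -> cilium_host.
	for ip, name := range addrs {
		fmt.Printf("%s -> %s\n", ip, name)
	}
}
```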
Force-pushed from cfec99e to c059d72
Could you please marshal the metadata ...?
I'm guessing at the motivation for why you added the other interfaces; if there are mistakes, please feel free to point them out. First, let me explain what ... is. So when you see ...
For example, the kubelet heartbeat check is an HTTP request whose source address is the IP of cilium_host, not the IP of the cluster network. Traffic like this, which checks the availability of containers, should also be counted as internal traffic. In addition, traffic may enter the cluster at node1, jump to node2 through VXLAN, and then reach the target pod on node2; this kind of traffic is captured on node2, and its source IP is also cilium_host. Should it also be classified as internal traffic? In k8s, ...
Thanks for your explanation. What is your opinion on this? @NeJan2020
According to the definition of ..., in this implementation an agent can only get the interfaces on its own node; it doesn't know the interfaces on other nodes. This results in a situation where the same IP could be considered as having different statuses on different nodes. For example, if you have em1 ... The Kubernetes API doesn't provide the interfaces of nodes, so to fix the issue above we would have to introduce another API to share interfaces between agents. This would increase the complexity, and it is not what we want for Kindling right now.
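For concreteness, here is a purely hypothetical sketch of such an "another API", using client-go: each agent publishes its local interface IPs as a node annotation so that other agents can read them from the API server. The annotation key and the function name are made up, and Kindling does not do this; the sketch only illustrates the extra machinery that would be needed:

```go
// Hypothetical approach, not Kindling's implementation: share interface
// IPs between agents via node annotations.
package agent

import (
	"context"
	"strings"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

const ifaceAnnotation = "example.io/interface-ips" // hypothetical key

// publishLocalIPs writes the agent's interface IPs onto its own Node
// object, where agents on other nodes can list and read them.
func publishLocalIPs(ctx context.Context, cs *kubernetes.Clientset, nodeName string, ips []string) error {
	node, err := cs.CoreV1().Nodes().Get(ctx, nodeName, metav1.GetOptions{})
	if err != nil {
		return err
	}
	if node.Annotations == nil {
		node.Annotations = map[string]string{}
	}
	node.Annotations[ifaceAnnotation] = strings.Join(ips, ",")
	_, err = cs.CoreV1().Nodes().Update(ctx, node, metav1.UpdateOptions{})
	return err
}
```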
The ... But the current method can only get the network cards of the node where the agent is located. Would this cause agents on other nodes to mark the IP of ...?
Are you still working on this PR? We are eager for your response. |
I'm thinking about this too; let me think it through first. I need to sort out the scenarios in which this kind of traffic appears.
I figure it will take a long time to reach a conclusion, but your changes on ...
Force-pushed from 6475e1f to fbbeef4
@dxsup The commit has been updated.
Regarding whether it is necessary to share the network interface IPs of machine A with other machines, I thought through the traffic scenarios:
Cross-host traffic should look similar to this, and it will also depend on which network components are used.
https://docs.cilium.io/en/stable/gettingstarted/terminology/#reserved-labels
I don't quite know how Kindling handles traffic that is forwarded from machine A to machine B and then on to the pod. Is a finer-grained division like Cilium's necessary?
Signed-off-by: longhui.li <longhui.li@woqutech.com>
Force-pushed from fbbeef4 to 04eb1fd
@dxsup How about this?
Kindling captures the syscalls that transmit messages via sockets to create topologies. These syscalls differ between the client and server sides, and we analyze them to obtain socket information.
So in your case:
Back to your question: a more fine-grained division is good, of course. But unlike Cilium, which is a CNI plugin, Kindling doesn't have the networking metadata inherently, so it is hard to identify every type of traffic. One of the obstacles is the issue we talked about earlier.
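As a rough illustration of the client/server distinction described above (a sketch under my own naming, not Kindling's analysis code), the connection-establishing syscall already reveals which side of the socket a process is on, since connect() is issued by clients and accept()/accept4() by servers:

```go
// Illustrative sketch, not Kindling's real analysis: infer the socket
// role from the connection-establishing syscall.
package main

import "fmt"

type Role string

const (
	Client  Role = "client"
	Server  Role = "server"
	Unknown Role = "unknown"
)

// roleOf maps a connection-related syscall name to the side that issued it.
func roleOf(syscall string) Role {
	switch syscall {
	case "connect":
		return Client
	case "accept", "accept4":
		return Server
	default:
		// read/write/sendto/recvfrom occur on both sides, so they do not
		// determine the role by themselves.
		return Unknown
	}
}

func main() {
	for _, sc := range []string{"connect", "accept4", "sendto"} {
		fmt.Println(sc, "->", roleOf(sc))
	}
}
```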
The network is indeed as complex as you describe, but tracing tools are designed to make complex things simple. From the figure above, although different network components have different implementations, the most complicated case is traffic number 4, and most of the business traffic in production flows in this direction. Flow 4 in a ...
Here is an idea of mine; I don't know if it can be realized. For different CNI plugins, could Kindling expose some interfaces, the way kubelet exposes CNI, so that different packet-analysis strategies can be implemented for different CNIs? I would be happy to improve some packet collection and analysis methods for Cilium scenarios.
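To make the idea concrete, here is a hypothetical sketch of what such a plug-in point might look like in Go; none of these names exist in Kindling today. A CNI-specific analyzer (for example one for Cilium that knows about cilium_host addresses and VXLAN endpoints) would implement the interface and register itself:

```go
// Hypothetical plug-in point for CNI-specific traffic classification;
// every name here is invented for illustration.
package cnianalyzer

type TrafficClass string

const (
	ClassInternal TrafficClass = "internal"
	ClassExternal TrafficClass = "external"
	ClassUnknown  TrafficClass = "unknown"
)

// Analyzer classifies an IP using CNI-specific knowledge.
type Analyzer interface {
	Name() string // e.g. "cilium", "flannel"
	Classify(ip string) TrafficClass
}

var analyzers = map[string]Analyzer{}

// Register installs an analyzer under its CNI name.
func Register(a Analyzer) { analyzers[a.Name()] = a }

// Classify delegates to the analyzer for the cluster's CNI and falls
// back to unknown when no analyzer claims the IP.
func Classify(cni, ip string) TrafficClass {
	if a, ok := analyzers[cni]; ok {
		return a.Classify(ip)
	}
	return ClassUnknown
}
```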
I noticed you're also in Hangzhou; let's just communicate directly in Chinese.
Signed-off-by: Longhui li longhui.li@woqutech.com
Description
During testing, we found a lot of NOT_FOUND_EXTERNAL and NOT_FOUND_INTERNAL, which is not conducive to network analysis. Tracking this down, we found it is partly due to the exclusion of daemonset, and partly due to the fact that, when there are multiple network cards, only one IP per node is analyzed, so other traffic such as cilium_host is not clearly identified.

Related Issue
None
Motivation and Context
How Has This Been Tested?
Yes