[System] Make network_summary namespace/container aware #605
Pinging @elastic/integrations (Team:Integrations) |
This is definitely a cool idea, and it won't be difficult. I assume we'll want to keep this as an addition to the |
@elkargig, @matschaffer would love your input here as it will greatly help us in solving the use case. |
@ydubreuil may also be interested in this conversation around PID/mapping. |
Is it possible to configure a list of docker container names (which map to docker container IDs -> cgroup IDs) instead of process names? Building the docker container name -> docker container ID map doesn't sound too expensive to do on every metricbeat run. If it is, could we set a caching TTL (and suffer all the consequences of cache invalidation errors in the future)? At least for Elastic Cloud that would be preferred, I believe, as docker container names are static, but I fully understand why someone would prefer to run this for all processes matching a name, e.g. 'nginx', 'haproxy', etc. |
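For illustration, the TTL idea in the comment above could look roughly like the sketch below. This is not Metricbeat code: `ContainerNameCache` and its `loader` callback are hypothetical names, and the loader stands in for whatever call actually lists containers and returns a name -> ID map.

```python
import time
from typing import Callable, Dict, Optional


class ContainerNameCache:
    """Cache a container-name -> container-ID map for a fixed TTL (hypothetical helper)."""

    def __init__(
        self,
        loader: Callable[[], Dict[str, str]],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader  # assumption: returns {container_name: container_id}
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the expiry logic can be tested
        self._mapping: Dict[str, str] = {}
        self._expires_at: Optional[float] = None

    def lookup(self, name: str) -> Optional[str]:
        """Return the container ID for a name, reloading the map once the TTL has passed."""
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            # Rebuild the whole map so entries for removed containers disappear.
            self._mapping = dict(self._loader())
            self._expires_at = now + self._ttl
        return self._mapping.get(name)
```

The trade-off is the one already mentioned: a container recreated under the same name within the TTL window would still resolve to its old ID until the next refresh.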
Yah, I assume that mapping container names to cgroup and process info is something we're already doing somewhere else. Again, the issue then becomes if this kind of data is better suited to a different metricset, or perhaps a processor of some kind. |
I think we could potentially follow the same logic as https://github.com/google/cadvisor. |
At the very least, we would need the cgroup id as it is reported in |
So, I'm leaning towards adding this to |
So we could just report cgroups network_summary, making sure the id is included. Then enable the I found this issue from @andrewkroh that may help here: elastic/beats#2483 |
@exekias what exactly are you imagining here? |
tbh I don't have a clear picture of how this could look 😇. Right now If we follow the pattern that we have for other cgroup metrics this would go under the My biggest concern is that reporting processes with cgroup data is becoming expensive in terms of storage. I also see how @elkargig is more interested in having these metrics per cgroup (/container) as compared to process. I wonder if you folks would prefer a different organization, something like having a We can definitely enrich both cases with docker metadata (including container name). |
Yah, that's why I was thinking of adding it to docker/network, since |
The end goal for us is to be able to graph "various network metrics per (docker) container name". I'm not sure I fully understand the proposal here, but capturing the same network metrics for each process pid seems a little wasteful in terms of storage, as multiple processes will be reporting the same network metrics when they belong to the same cgroup. Hope it helps. |
Thanks, I think that helps answer the question! You are more interested in metrics at the container level than in fine-grained process details. @fearful-symmetry I'm actually wondering if docker offers these metrics already and we are just not retrieving them? |
@exekias if they're somewhere in the stats APIs, I don't see them. I started experimenting with this last week, and I think the best way to do this is to grab the PIDs associated with the container via the docker API, and then query the network stats for each of those PIDs. |
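As a rough sketch of the "query the network stats for each of those PIDs" step, the snippet below parses the header-row/value-row layout used by `/proc/<pid>/net/netstat`. It is only an illustration, not the approach taken in the Beats PR: `parse_netstat` and `read_pid_netstat` are hypothetical names, and how the PID list is obtained from Docker is left out.

```python
from typing import Dict


def parse_netstat(text: str) -> Dict[str, Dict[str, int]]:
    """Parse netstat-style text into {protocol: {counter_name: value}}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    stats: Dict[str, Dict[str, int]] = {}
    # Rows come in pairs: a row of counter names, then a row of values,
    # both prefixed with the same protocol label (e.g. "TcpExt:").
    for header, values in zip(lines[0::2], lines[1::2]):
        proto, _, names = header.partition(":")
        value_proto, _, nums = values.partition(":")
        if proto != value_proto:
            raise ValueError(f"mismatched rows: {proto!r} vs {value_proto!r}")
        stats[proto] = dict(zip(names.split(), (int(n) for n in nums.split())))
    return stats


def read_pid_netstat(pid: int, proc_root: str = "/proc") -> Dict[str, Dict[str, int]]:
    """Read the counters as seen from the given PID's network namespace (Linux only)."""
    with open(f"{proc_root}/{pid}/net/netstat", encoding="ascii") as fh:
        return parse_netstat(fh.read())
```

Calling `read_pid_netstat(pid)` for one PID of each container would then yield that container's view of the counters, assuming the process is still alive and readable when it is polled.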
The PR has been merged here: elastic/beats#25354. Unfortunately this probably won't make it into the next release unless we want to backport this late, as I had a perfect storm of complexity in the PR itself and CI issues that resulted in it just sitting in review for a week. |
Thanks a lot @fearful-symmetry! Is there a beats build we can use to test this before it gets released? |
@elkargig It'll be in the 7.14 snapshot builds, which should be available by Jun 29th-ish. |
Hi! We just realized that we haven't looked into this issue in a while. We're sorry! We're labeling this issue as |
Elastic Cloud SREs want to be able to export network summary metrics for all namespaces/docker containers.
Right now `network_summary` works for the host(/namespace) metricbeat is running on, but it would be really useful for debugging networking issues if they could get the same kind of network statistics from all namespaces/containers running on a host without the need to run a separate metricbeat inside each container.
These network metrics are accessible from a pid perspective, e.g. in `/proc/XXXX/net/netstat` or `/proc/XXXX/net/snmp`.
Since we know which cgroup each pid belongs to from `/proc/XXXX/cgroup`, it should be feasible to only grab these metrics once per `cgroup` and not for every `pid` per metricbeat poll.
These metrics should then have an additional field with the container/namespace id.
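The "once per cgroup" idea can be sketched as below. This is a minimal illustration under stated assumptions, not an existing Metricbeat feature: the function names are hypothetical, the first line of the cgroup file is taken as representative for the process, and it follows the issue's premise that all PIDs in one cgroup report the same network counters (strictly, the counters belong to the network namespace).

```python
from typing import Callable, Dict, Iterable


def cgroup_path(cgroup_text: str) -> str:
    """Return the cgroup path from the first line of /proc/<pid>/cgroup content."""
    first = cgroup_text.splitlines()[0]
    # Each line is "hierarchy-ID:controller-list:path"; limit the split so a
    # ':' inside the path is preserved.
    return first.split(":", 2)[2]


def read_cgroup_file(pid: int, proc_root: str = "/proc") -> str:
    """Read the raw cgroup membership file for a PID (Linux only)."""
    with open(f"{proc_root}/{pid}/cgroup", encoding="ascii") as fh:
        return fh.read()


def one_pid_per_cgroup(
    pids: Iterable[int],
    read_cgroup: Callable[[int], str] = read_cgroup_file,
) -> Dict[str, int]:
    """Map each cgroup path to the first PID seen in it."""
    chosen: Dict[str, int] = {}
    for pid in pids:
        # setdefault keeps the first PID and ignores later ones in the same cgroup.
        chosen.setdefault(cgroup_path(read_cgroup(pid)), pid)
    return chosen
```

Each poll would then read the network counters only for the PIDs in the returned map and attach the cgroup path (or a container ID derived from it) as the extra field.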