otel cpu utilization is high #25815

Closed
strive-after opened this issue Aug 15, 2023 · 7 comments · Fixed by #26474
Labels
bug Something isn't working receiver/hostmetrics

Comments

@strive-after

strive-after commented Aug 15, 2023

Component(s)

cmd/otelcontribcol, receiver/hostmetrics

What happened?

Deploying the otel collector on a layer-4 proxy server to collect hostmetrics data drives CPU usage extremely high. After disabling network collection, it returns to normal. A flame graph shows the time is spent in gopsutil, which scans files under the /proc directory: the more connections there are, the more CPU it consumes. Can this be adjusted? Falcon, which uses ss, does not have this problem.
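For illustration, this is roughly the gopsutil call the network scraper relies on (a minimal standalone sketch, not the scraper's actual code). On Linux, gopsutil walks every /proc/&lt;pid&gt;/fd to attach PIDs to sockets, so the cost grows with the number of open connections:

package main

import (
	"fmt"
	"log"

	"github.com/shirou/gopsutil/v3/net"
)

func main() {
	// Enumerate TCP connections much like the hostmetrics network
	// scraper does; on a busy L4 proxy this can burn a lot of CPU.
	conns, err := net.Connections("tcp")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("enumerated %d TCP connections\n", len(conns))
}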

Collector version

0.61.x

@strive-after strive-after added bug Something isn't working needs triage New item requiring triage labels Aug 15, 2023
@github-actions
Contributor

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@strive-after
Author

// From the hostmetrics network scraper: this call runs on every
// scrape, even when the connections metric is disabled.
err = s.recordNetworkConnectionsMetrics()
if err != nil {
	errors.AddPartial(connectionsMetricsLen, err)
}

Even if I turn off the metric, this part of the logic still runs and still consumes a lot of CPU.
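For context, "turning off the metric" means disabling it in the hostmetrics receiver configuration, roughly like this (a sketch; the rest of the pipeline and the collection interval are placeholders):

receivers:
  hostmetrics:
    collection_interval: 60s
    scrapers:
      network:
        metrics:
          system.network.connections:
            enabled: false   # disabled, yet the /proc scan above still runs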

@crobert-1
Member

Hello @strive-after, can you post the configuration you're using?

@crobert-1
Member

Even if I turn off the metric, this part of the logic still runs and still consumes a lot of CPU.

Can you expand on what you mean by "turn off the metric"?

@dmitryax
Member

dmitryax commented Sep 5, 2023

@strive-after, thanks for reporting the issue. I submitted #26474 to stop collecting the data from the host when the metric is disabled.

I don't see anything else we can do on the collector side; the underlying cost of the scan should probably be raised in the gopsutil repo.
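The shape of that fix, as a simplified sketch (the struct and field names here are assumed for illustration, not the exact upstream diff): consult the metric's enabled flag before doing any host collection at all.

package main

import "fmt"

// networkScraper is a stand-in for the hostmetrics network scraper.
type networkScraper struct {
	connectionsEnabled bool // mirrors metrics.system.network.connections.enabled
}

func (s *networkScraper) recordNetworkConnectionsMetrics() error {
	// In the real scraper this is where gopsutil walks /proc.
	fmt.Println("walking /proc for connection states...")
	return nil
}

func (s *networkScraper) scrapeConnections() error {
	if !s.connectionsEnabled {
		return nil // disabled: skip the /proc walk entirely
	}
	return s.recordNetworkConnectionsMetrics()
}

func main() {
	s := &networkScraper{connectionsEnabled: false}
	_ = s.scrapeConnections() // prints nothing; the scan never happens
}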

@crobert-1
Member

/label -needs-triage

@github-actions github-actions bot removed the needs triage New item requiring triage label Sep 5, 2023
dmitryax added a commit that referenced this issue Sep 6, 2023
…#26474)

If the `system.network.connections` metric is disabled, don't collect the
information from the host, to avoid wasting CPU cycles.

Fixes #25815
@cforce

cforce commented Oct 21, 2023

@dmitryax How can one effectively cap the number of threads that run concurrently at each host metric collection interval, given that all the metrics are gathered at once and can cause CPU spikes? On small embedded IoT devices it would be very helpful to smooth the load by avoiding simultaneous execution in favor of a more serialized approach. Additionally, when CPU time is scarce because the system is being throttled, the collector must stay well behaved: it should not end up in states it cannot recover from, or accumulate even more threads instead of shedding them. To improve resource efficiency in general, it would also be useful if the collector could respond to external signals and adjust its resource utilization accordingly.
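One way to get the serialized behavior described above, as a hypothetical sketch (this is not an existing collector option; runBounded and scrapeFunc are made-up names): bound scraper concurrency with a semaphore, and finish each tick before the next starts so passes cannot pile up under throttling.

package main

import (
	"context"
	"fmt"
	"sync"
)

// scrapeFunc stands in for one scraper's collection pass.
type scrapeFunc func(context.Context) error

// runBounded runs one collection tick, allowing at most maxConcurrent
// scrapers at a time; maxConcurrent = 1 fully serializes them, which
// smooths CPU spikes on small devices.
func runBounded(ctx context.Context, maxConcurrent int, scrapers []scrapeFunc) {
	sem := make(chan struct{}, maxConcurrent)
	var wg sync.WaitGroup
	for _, s := range scrapers {
		sem <- struct{}{} // blocks until a slot frees up
		wg.Add(1)
		go func(s scrapeFunc) {
			defer wg.Done()
			defer func() { <-sem }()
			_ = s(ctx)
		}(s)
	}
	wg.Wait() // the tick ends only when every pass has finished
}

func main() {
	scrapers := []scrapeFunc{
		func(context.Context) error { fmt.Println("cpu"); return nil },
		func(context.Context) error { fmt.Println("network"); return nil },
	}
	// Serialize everything: one scraper at a time per tick.
	runBounded(context.Background(), 1, scrapers)
}

Because a new tick cannot begin until wg.Wait returns, a throttled system slows collection down instead of accumulating goroutines.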
