[Metricbeat] Add support for cpuinfo metricset #25471

Open
anpag opened this issue Apr 30, 2021 · 10 comments · Fixed by #31643
Labels
Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team

Comments

@anpag

anpag commented Apr 30, 2021

Describe the enhancement:
Add Sigar's cpuinfo implementation (CPU usage in MHz) to Metricbeat:
https://github.com/hyperic/sigar/blob/ad47dc3b494e9293d1f087aebb099bdba832de5e/go_bindings/gotoc/cpuInfo.go

Describe a specific use case for the enhancement or feature:
The cgo bindings to libsigar in the official Sigar repo support collecting CPU usage in MHz. Would it be possible to add this to Metricbeat as an extra metricset in the system module? Monitoring CPU usage in MHz is useful for tracking CPU turbo boost.

Thanks in advance for taking the time to read this and thanks for all the hard work with the Beats!

@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Apr 30, 2021
@anpag anpag changed the title Add cpuinfo to Metricbeat [Metricbeat] Add support for cpuinfo metricset May 1, 2021
@ChrsMark ChrsMark added the Team:Integrations Label for the Integrations team label May 5, 2021
@elasticmachine
Collaborator

Pinging @elastic/integrations (Team:Integrations)

@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label May 5, 2021
@gingerwizard

I would also like to see this enhanced with the processor name and model, specifically the information from /proc/cpuinfo:

processor       : 23
vendor_id       : GenuineIntel
cpu family      : 6
model           : 85
model name      : Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
stepping        : 7
microcode       : 0x5003005
cpu MHz         : 3088.589
cache size      : 36608 KB
physical id     : 0
siblings        : 24
core id         : 11
cpu cores       : 12
apicid          : 23
initial apicid  : 23
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves ida arat pku ospke
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
bogomips        : 4999.99
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:
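
As a rough illustration of how the fields above could be read, here is a minimal Go sketch that parses the "key : value" layout shown in this comment. The `CPUInfo` struct, its field names, and `ParseCPUInfo` are hypothetical and are not taken from Metricbeat or Sigar. The sketch assumes the x86 layout quoted here; other architectures use different keys.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// sample is a shortened copy of the /proc/cpuinfo entry quoted in this thread.
const sample = `processor       : 23
vendor_id       : GenuineIntel
model           : 85
model name      : Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
cpu MHz         : 3088.589
power management:
`

// CPUInfo holds the subset of fields discussed in this issue.
// Hypothetical type: the names are illustrative, not Metricbeat's schema.
type CPUInfo struct {
	Processor int
	Model     string
	ModelName string
	MHz       float64
}

// ParseCPUInfo splits each line at the first colon. A "processor" key
// starts a new entry; later keys fill in the most recent entry.
func ParseCPUInfo(text string) ([]CPUInfo, error) {
	var cpus []CPUInfo
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		key, val, ok := strings.Cut(sc.Text(), ":")
		if !ok {
			continue // blank separator line between processors
		}
		key, val = strings.TrimSpace(key), strings.TrimSpace(val)
		if key == "processor" {
			n, err := strconv.Atoi(val)
			if err != nil {
				return nil, fmt.Errorf("bad processor id %q: %w", val, err)
			}
			cpus = append(cpus, CPUInfo{Processor: n})
			continue
		}
		if len(cpus) == 0 {
			continue // ignore keys that appear before any processor line
		}
		cur := &cpus[len(cpus)-1]
		switch key {
		case "model":
			cur.Model = val
		case "model name":
			cur.ModelName = val
		case "cpu MHz":
			mhz, err := strconv.ParseFloat(val, 64)
			if err != nil {
				return nil, fmt.Errorf("bad cpu MHz %q: %w", val, err)
			}
			cur.MHz = mhz
		}
	}
	return cpus, sc.Err()
}

func main() {
	cpus, err := ParseCPUInfo(sample)
	if err != nil {
		panic(err)
	}
	for _, c := range cpus {
		fmt.Printf("cpu %d: model=%s name=%q mhz=%.3f\n", c.Processor, c.Model, c.ModelName, c.MHz)
	}
}
```

On a Linux host the same function could be fed the contents of `/proc/cpuinfo` (for example via `os.ReadFile`) instead of the embedded sample.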

We see very high variance in CPU models within cloud providers; consider, for example, the N1 machine type in GCE:

N1 machine types are Compute Engine's first-generation general-purpose machine types. The N1 machines are available on Skylake, Broadwell, Haswell, Sandy Bridge, and Ivy Bridge CPU platforms. N1 machine types provide the following benefits:

These platforms can differ widely in performance under some workloads. When we are monitoring Elasticsearch, a single slow node can limit the performance of the whole cluster.

@ebadyano

@exekias
Contributor

exekias commented May 6, 2021

Thank you for the explanations on why this is important.

I'm wondering, would it be enough if we report total CPU MHz? I guess the current CPU usage in MHz can be calculated based on existing cpu usage metrics (%)?
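
The calculation suggested here can be sketched as follows, assuming each core reports a clock speed in MHz and a usage fraction between 0 and 1. The `Core` type and `UsageMHz` function are hypothetical names used only for illustration. Whether the clock value should be the nominal or the current frequency is not settled in this thread, so with turbo boost the result is only an estimate.

```go
package main

import "fmt"

// Core pairs a clock speed with a usage fraction for one logical CPU.
// Hypothetical type, not part of Metricbeat.
type Core struct {
	MHz   float64 // clock speed in MHz
	Usage float64 // fraction of time busy, 0.0 to 1.0
}

// UsageMHz estimates total CPU usage in MHz as the sum of
// clock speed times usage fraction over all cores.
func UsageMHz(cores []Core) float64 {
	total := 0.0
	for _, c := range cores {
		total += c.MHz * c.Usage
	}
	return total
}

func main() {
	cores := []Core{
		{MHz: 3000, Usage: 0.25},
		{MHz: 2500, Usage: 0.5},
	}
	fmt.Printf("estimated usage: %.1f MHz\n", UsageMHz(cores))
}
```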

@fearful-symmetry any thoughts on this one?

@fearful-symmetry
Contributor

I think we had this conversation in the past, and it didn't happen because it wasn't metrics-y enough. That being said, I think stuff like MHz information is a good potential addition, maybe as an enhancement to system/cpu or system/core.

@sorantis
Contributor

@fearful-symmetry in addition to MHz information it looks like @gingerwizard is looking for these specific fields:

model           : 85
model name      : Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz

Can we also add these to system/cpu or system/core?

@fearful-symmetry
Contributor

@sorantis I think it would make sense to add that as well. For the sake of keeping track, so far we want to add the following fields to system/cpu:

  • model number
  • model name
  • MHz
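
One way the three fields listed above might look when serialized is sketched below. The struct, the JSON key names (`model_number`, `model_name`, `mhz`), and `EncodeCPUFields` are assumptions made for illustration; the thread does not specify the final field names, and this is not the schema that was merged.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CPUFields groups the three proposed additions to system/cpu.
// Hypothetical type and key names, chosen only for this sketch.
type CPUFields struct {
	ModelNumber string  `json:"model_number"`
	ModelName   string  `json:"model_name"`
	MHz         float64 `json:"mhz"`
}

// EncodeCPUFields serializes the fields to a JSON object string.
func EncodeCPUFields(f CPUFields) (string, error) {
	b, err := json.Marshal(f)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// Values are taken from the /proc/cpuinfo excerpt earlier in the thread.
	out, err := EncodeCPUFields(CPUFields{
		ModelNumber: "85",
		ModelName:   "Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz",
		MHz:         3088.589,
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```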

@jlind23 jlind23 added Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team and removed Team:Integrations Label for the Integrations team labels Jan 4, 2022
@elasticmachine
Collaborator

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

@yuvielastic

yuvielastic commented Apr 20, 2022

@jlind23 @fearful-symmetry @sorantis I want to raise the urgency of making this information available in Metricbeat.

We see variance across CPU models in cloud providers, such as N2 on GCP and Edsv4/Ddv4 on Azure. Performance varies even within the same instance generation, because when capacity is provisioned we are allocated instances with a random CPU generation. We have started to hear concerns about predictability, as customers are seeing wide performance differences between two clusters. This information would therefore help us monitor CPU variance across hosts and identify the magnitude of these issues as we scale our infrastructure in the cloud.

Please let me know when we can expect CPU info to be added to Metricbeat.

@belimawr
Contributor

Re-opening because it is only implemented on Linux.

@botelastic

botelastic bot commented Apr 3, 2024

Hi!
We just realized that we haven't looked into this issue in a while. We're sorry!

We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1:.
Thank you for your contribution!

@botelastic botelastic bot added the Stalled label Apr 3, 2024
@jlind23 jlind23 removed the Stalled label Apr 3, 2024
10 participants