Available CPU MHz Varying Wildly for Same Instance Type #7681
Comments
Hi @herter4171 and thanks for the detail in this issue. In order to help diagnose this problem would you be able to provide the output of the following two commands from a couple of the instances where you are seeing this behaviour?
|
Seems like parsing
|
Hi @jrasell, thank you for the response! Output for those For the first instance,
For the second instance,
For the third instance,
Given the
|
Nomad uses |
Hi @dvusboy, the lay of the land is that I'm using Amazon Linux 2 pretty much out of the box. That platform has |
@herter4171 By |
@dvusboy, I belatedly picked up on that and edited my last comment accordingly. Can I do something to make Amazon Linux 2 play ball for Nomad, or can something be done on the Nomad side of things to fix this? One idea I have is spawning |
I suppose you can use cpu_total_compute in the client configuration to override the fingerprinted values. |
@dvusboy, I'm aware of that option, and I don't think it addresses the core issue. Nomad should be capable enough to set available MHz. |
Hi @dvusboy and company, after rooting around a bit, I can see the difficulty in getting rated clock speed on Amazon Linux 2 without assumed access to
I'd still like to see this functionality become native instead of depending on my hacky Bash, but I'm equipped to move on if there's no interest in pursuing this. Thanks for the help so far. |
Hey @shoenig, I'm having a bit of additional difficulty in spite of my fix. Even though I've set the client stanza like I described and verified the updated value is reflected in Nomad, jobs still fail to be placed due to this other hidden limit shown in my screenshot. I'm a bit confused, because |
I'm thinking this is actually a problem on all EC2 instances, not just
ubuntu@ip-172-31-82-121:~$ cpupower frequency-info
analyzing CPU 0:
no or unknown cpufreq driver is active on this CPU
CPUs which run at the same hardware frequency: Not Available
CPUs which need to have their frequency coordinated by software: Not Available
maximum transition latency: Cannot determine or is not supported.
Not Available
available cpufreq governors: Not Available
Unable to determine current policy
current CPU frequency: Unable to call hardware
current CPU frequency: Unable to call to kernel
boost state support:
Supported: no
Active: no
ubuntu@ip-172-31-82-121:~$ # there is no cpufreq/cpuinfo_max_freq
ubuntu@ip-172-31-82-121:~$ ls /sys/devices/system/cpu/cpu0
cache crash_notes crash_notes_size driver firmware_node hotplug node0 power subsystem topology uevent
ubuntu@ip-172-31-82-121:~$ ls /sys/devices/system/cpu/cpufreq # empty
If there's any good news, the CPU cgroup management seems unaffected
[ec2-user@ip-172-31-94-218 proc]$ cat /proc/cgroups
#subsys_name hierarchy num_cgroups enabled
cpuset 11 3 1
cpu 9 3 1
cpuacct 9 3 1
blkio 10 3 1
memory 6 3 1
devices 5 25 1
freezer 4 3 1
net_cls 2 3 1
perf_event 8 3 1
net_prio 2 3 1
hugetlb 7 3 1
pids 3 3 1
I'm going to keep researching and asking around, but I suspect this may boil down to parsing the rated CPU speed out of the CPU |
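As a rough illustration of the parsing idea, here is a minimal shell sketch. It assumes the `model name` line in `/proc/cpuinfo` ends with a rated clock such as `@ 3.00GHz`, which is common on Intel parts but not guaranteed (many AMD and ARM model strings omit it). The function names `rated_mhz_from_cpuinfo` and `total_compute_from_cpuinfo` are hypothetical helpers for this sketch, not part of Nomad.

```shell
#!/bin/sh
# Hypothetical helper: print the rated per-core clock in MHz, parsed from the
# first "model name" line of a cpuinfo-style file (default: /proc/cpuinfo).
# Assumes the model string contains something like "@ 3.00GHz"; returns
# non-zero when no such suffix is present.
rated_mhz_from_cpuinfo() {
    cpuinfo="${1:-/proc/cpuinfo}"
    awk '
        /^model name/ {
            if (match($0, /[0-9.]+GHz/)) {
                # Drop the trailing "GHz", convert to MHz, round to an integer.
                printf "%d\n", substr($0, RSTART, RLENGTH - 3) * 1000 + 0.5
                found = 1
            }
            exit
        }
        END { exit !found }
    ' "$cpuinfo"
}

# Hypothetical helper: logical core count multiplied by the rated clock.
total_compute_from_cpuinfo() {
    cpuinfo="${1:-/proc/cpuinfo}"
    mhz=$(rated_mhz_from_cpuinfo "$cpuinfo") || return 1
    cores=$(grep -c '^processor' "$cpuinfo")
    echo $(( cores * mhz ))
}

# Usage (prints nothing and returns non-zero if the model name has no GHz suffix):
#   total_compute_from_cpuinfo /proc/cpuinfo
```

The resulting number is the kind of value one could pass to `cpu_total_compute` as a workaround. Whether the rated clock or the turbo clock is the right basis is a separate judgment call.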
Hey @shoenig, thanks for the digging. One thing about using |
Another possibility might be to modify
I put together a quick demo to check if this works, before submitting the idea upstream
$ for i in {1..10}; do ./loadcpu && sleep 3 && echo ""; done
read current speed: 800.04
loaded max speed: 3900.70
read current speed: 1924.65
loaded max speed: 3901.08
read current speed: 1495.16
loaded max speed: 3900.33
read current speed: 2826.81
loaded max speed: 3900.00
read current speed: 3400.18
loaded max speed: 3902.43
read current speed: 1979.91
loaded max speed: 3900.95
read current speed: 2627.13
loaded max speed: 3900.19
read current speed: 889.96
loaded max speed: 3901.62
read current speed: 3391.65
loaded max speed: 3902.97
read current speed: 906.17
loaded max speed: 3900.63 |
Fixes #7681
The current behavior of the CPU fingerprinter in AWS is that it reads the **current** speed from `/proc/cpuinfo` (`CPU MHz` field). This is because the max CPU frequency is not available by reading anything on the EC2 instance itself. Normally on Linux one would look at e.g. `/sys/devices/system/cpu/cpuN/cpufreq/cpuinfo_max_freq` or perhaps parse the values from the `CPU max MHz` field in `/proc/cpuinfo`, but those values are not available.
Furthermore, no metadata about the CPU is made available in the EC2 metadata service.
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-data-categories.html
Since `go-psutil` cannot determine the max CPU speed, it defaults to the current CPU speed, which could be basically any number between 0 and the true max. This is particularly bad on large, powerful reserved instances, which often idle at ~800 MHz while Nomad does its fingerprinting (typically IO bound). Nomad then uses that idle reading as the max, which results in a severe loss of available resources.
Since the CPU specification is unavailable programmatically (at least not without sudo), use a best-effort lookup table. This table was generated by going through every instance type in AWS documentation and copy-pasting the numbers.
https://aws.amazon.com/ec2/instance-types/
This approach obviously is not ideal, as future instance types will need to be added as they are introduced to AWS. However, using the table should only be an improvement over the status quo, since right now Nomad miscalculates available CPU resources on all instance types.
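The lookup-table idea can be sketched in a few lines of shell. This is only an illustration of the approach, not the code from the fix (which lives in Nomad's Go fingerprinter). The function names are hypothetical, and the table holds a single entry, taken from the figures quoted in this issue (`c5.24xlarge`, 3 GHz stock).

```shell
#!/bin/sh
# Hypothetical lookup: print the rated per-core MHz for a known instance type,
# or return non-zero for types missing from the table.
rated_mhz_for_instance_type() {
    case "$1" in
        # Only entry grounded in this thread: c5.24xlarge at a 3 GHz stock clock.
        c5.24xlarge) echo 3000 ;;
        *) return 1 ;;
    esac
}

# Hypothetical fingerprint step: prefer the table value, and fall back to the
# current reading (the pre-fix behavior) when the type is unknown.
#   $1 = instance type, $2 = current MHz as read from the running system
fingerprint_mhz() {
    rated_mhz_for_instance_type "$1" || echo "$2"
}

fingerprint_mhz c5.24xlarge 800    # table hit: the idle reading is ignored
fingerprint_mhz unknown.type 800   # table miss: falls back to the idle reading
```

The fallback keeps unknown instance types no worse off than before, which matches the "should only be an improvement" argument above.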
I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues. |
Nomad version
Nomad v0.10.4 (f750636)
Operating system and Environment details
Amazon Linux 2 with a fixed head node and an auto-scaling group of `c5.24xlarge` instances, with scaling driven by Nomad state using a custom cloud metric.
Issue
The number of MHz available on a node varies wildly. For the exact same instance type (96 cores, 3 GHz stock, 3.9 GHz max), I'm seeing as low as 1.6E5 MHz all the way up to 3.4E5 MHz. Just now, I've launched 3 `c5.24xlarge` nodes, and their max MHz are
I'd rather not hard-wire `cpu_total_compute` in the client config, and everything else I've read claims Nomad sets the MHz based on core count multiplied by rated clock speed rather than current.
Having MHz vary like this causes jobs to not be placed, even when the node actually has the capacity. Would a short-term fix be forcing all but one core to 100%, launching the Nomad client, and taking the load off of CPU? The docs I've read claim Nomad uses stock clock speed, so I'm kind of at a loss here.
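To make the numbers above concrete, the following shell sketch works through the arithmetic: the expected totals for 96 cores at the stock and max clocks, and the per-core clock implied by the lowest and highest totals observed. The helper names are hypothetical, and integer division truncates the per-core figures.

```shell
#!/bin/sh
# total_mhz CORES MHZ_PER_CORE: total compute in MHz (cores x per-core clock).
total_mhz() {
    echo $(( $1 * $2 ))
}

# per_core_mhz TOTAL_MHZ CORES: per-core clock implied by a total (truncated).
per_core_mhz() {
    echo $(( $1 / $2 ))
}

total_mhz 96 3000        # expected at the 3 GHz stock clock: 288000
total_mhz 96 3900        # expected at the 3.9 GHz max clock: 374400
per_core_mhz 160000 96   # implied by the lowest observed total (1.6E5): 1666
per_core_mhz 340000 96   # implied by the highest observed total (3.4E5): 3541
```

Both observed totals correspond to per-core clocks between idle and max turbo, which fits a fingerprint taken from the current clock speed rather than the rated one.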
Reproduction steps
Launch a few instances of the same type with the Nomad client running on boot (I'm using `systemctl`). Rated MHz for each client in the web UI should vary appreciably.