Energy monitoring
Starting with RAxML-NG v1.0.0, you might find a line like this at the end of the log file:
Consumed energy: 35.958 Wh
This value is measured using RAPL (Running Average Power Limit) and gives you a very rough idea of the electricity usage of your analysis. Usually it is an underestimate, since it includes the energy consumption of the CPU and DRAM only. However, it could also be an overestimate, e.g. if you run RAxML-NG on a shared server using just a subset of CPU cores, because we report system-wide energy consumption here. Unfortunately, energy monitoring on current systems turns out to be surprisingly tricky.
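To put the reported number into perspective, here is a minimal sketch of the unit arithmetic. It assumes the raw counter is in microjoules (the unit used by the Linux powercap interface); the function names are made up for this illustration and are not part of RAxML-NG.

```python
UJ_PER_JOULE = 1_000_000   # microjoules per joule
JOULES_PER_WH = 3600       # 1 Wh = 1 W sustained for 3600 s


def uj_to_wh(energy_uj):
    """Convert an energy amount in microjoules to watt-hours."""
    return energy_uj / UJ_PER_JOULE / JOULES_PER_WH


def wh_to_joules(energy_wh):
    """Convert watt-hours back to joules."""
    return energy_wh * JOULES_PER_WH


if __name__ == "__main__":
    # The example log line reports 35.958 Wh, i.e. roughly 129 kJ.
    print(f"{wh_to_joules(35.958) / 1000:.1f} kJ")
```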
Typically, energy monitoring induces very low overhead. Still, if you suspect it slows down your analysis, or causes problems on your system, or makes you sleep worse, just add --extra energy-off to disable it. In the first two cases, please also let us know.
Currently, energy monitoring is only available under Linux. Multi-node MPI runs and checkpointing are supported.
Because climate change. Because writing faster programs does not help. Because energy-efficient hardware does not help either.
Because all software is crap. If you have a better answer, please let me know. Statements below are based on my rather limited research into the topic.
First of all, we should distinguish application-level vs. system-level energy consumption.
Clearly, developers and users are primarily interested in the energy consumption of their particular application, but such measurements are conceptually difficult. This is because multiple running programs share the same energy-consuming hardware components. For instance, how should we attribute the power consumed by the system fan? To my understanding, this is not an impossible task, but it would certainly require some processing power, implementation effort, and making "dirty" pragmatic decisions. Apparently, it was just never considered important enough. Maybe Greta should talk to her fellow Scandinavian :)
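To illustrate what one such "dirty" pragmatic decision could look like, here is a deliberately crude sketch that splits system-wide energy in proportion to CPU time. This is purely an assumption for illustration, not something RAxML-NG does: it ignores idle power, DRAM traffic, fans, and everything else that does not scale with CPU time, and the function name is hypothetical.

```python
def naive_app_share_wh(system_wh, app_cpu_seconds, total_cpu_seconds):
    """Attribute system-wide energy to one application by its CPU-time share.

    Crude by design: assumes energy scales linearly with CPU time and that
    shared components (fan, idle power, DRAM) are split the same way.
    """
    if total_cpu_seconds <= 0:
        raise ValueError("total_cpu_seconds must be positive")
    # Clamp the share so measurement noise cannot push it above 100%.
    share = min(max(app_cpu_seconds / total_cpu_seconds, 0.0), 1.0)
    return system_wh * share


if __name__ == "__main__":
    # Hypothetical numbers: the application used 3000 of 4000 busy CPU seconds.
    print(naive_app_share_wh(35.958, 3000, 4000))
```

A different but equally defensible rule (e.g. splitting by core count, or charging idle power to nobody) would give a different answer, which is exactly the conceptual difficulty described above.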
In contrast, system-level energy can be measured with a common AC power meter. Or it could be, in principle. Often, there is no physical access to the server, and we usually do not want to stay in the server room and take pen-and-paper notes all day :) On many systems, there are means to get energy consumption programmatically. But historically, only system administrators were interested in this kind of data. Consequently, the respective interfaces (e.g. IPMI) were not designed to be used by application developers: they require root privileges, have poor temporal resolution, etc. So we are left with RAPL, which is available on most systems and can be used without root access (on Linux). Unfortunately, on most systems RAPL measurements do not include all system components (usually CPU and DRAM only, although some modern laptops report full power). Furthermore, there is no portable way (e.g. a POSIX function) to read RAPL measurements. Why? I have no idea...
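For the curious, here is a minimal sketch of one way RAPL counters can be read on Linux, via the powercap sysfs tree. This is not necessarily how RAxML-NG does it; the function names are hypothetical, and the wraparound handling assumes the counter wrapped at most once between two readings. Note that some kernels and distributions restrict energy_uj to root, in which case the affected zones are simply skipped here.

```python
from pathlib import Path

POWERCAP_ROOT = Path("/sys/class/powercap")  # Linux powercap sysfs tree


def read_rapl_zones(root=POWERCAP_ROOT):
    """Return {zone_id: (name, energy_uj, max_energy_range_uj)}.

    Zones are reported individually and not summed, because which zones
    overlap (e.g. subzones inside a package) depends on the platform.
    """
    zones = {}
    for zone_dir in sorted(Path(root).glob("intel-rapl:*")):
        try:
            name = (zone_dir / "name").read_text().strip()
            energy_uj = int((zone_dir / "energy_uj").read_text())
            max_uj = int((zone_dir / "max_energy_range_uj").read_text())
        except (OSError, ValueError):
            continue  # unreadable (e.g. root-only) or malformed zone
        zones[zone_dir.name] = (name, energy_uj, max_uj)
    return zones


def consumed_uj(start_uj, end_uj, max_uj):
    """Energy used between two readings, allowing one counter wraparound."""
    if end_uj >= start_uj:
        return end_uj - start_uj
    return end_uj + max_uj - start_uj


if __name__ == "__main__":
    for zone_id, (name, energy_uj, _) in read_rapl_zones().items():
        print(zone_id, name, energy_uj)
```

Taking one reading at program start and one at the end, then passing both to consumed_uj per zone, gives a system-wide figure of the kind discussed above.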
For even more technical background and rants, please read this thread.