powerman: when status and status_all are defined, use status_all only on full pluglist #156

chu11 · 2024-03-01T23:54:55Z

Problem: When selecting a script to use for an action, the "all" version of a script is always used if the action is a query action (e.g. power status query, beacon query, etc.).

This is a problem when hardware such as a chassis is not fully populated with blades/nodes/etc. We do not want to query "all" things in that case.

Solution: Only call the "all" script if the action is a query and a singlet version of the script does not exist. A device file for hardware where unpopulated nodes may exist can define both an "all" script and a singlet script to cover each scenario.

Built on top of #155

garlick · 2024-03-02T05:51:03Z

Nice! This probably needs test coverage though.

chu11 · 2024-03-02T05:54:01Z

> Nice! This probably needs test coverage though.

Ahh, yeah. I guess it was sort of there in the bigger PR, but could use some specific tests targeted just to this.

chu11 · 2024-03-02T06:13:07Z

re-pushed, added some coverage in t0004-status-query.t.

Edit: oops bash is ok with == but not dash.

garlick · 2024-03-02T14:31:10Z

Thanks for the test!

Other things:

- Title should be more descriptive like, "powerman: when status and status_all are defined, use status_all only on full pluglist"
- Any existing dev specs that define both should be updated to only define status_all to avoid using formerly dead and probably untested code. See appro-gb2.dev, appro-greenblade.dev and swpdu.dev, all of which contain multiple specifications.
- This behavior should be explained in powerman.dev(5).

BTW which script does powerman choose when the device spec contains both status and status_all and does not define a plug name list?

garlick

Oh just realized that this applies to status_beacon_all.
Hmm, looks like there is a status_temp_all but no status_temp?

garlick · 2024-03-02T14:51:00Z

t/t0004-status-query.t

@@ -145,6 +145,72 @@ test_expect_success 'stop powerman daemon and device server' '
 wait &&
 wait
 '
+test_expect_success 'create new powerman.conf with no status and status_all script' '


garlick · 2024-03-02T14:59:29Z

src/powerman/device.c

@@ -683,8 +683,12 @@ static int _enqueue_targeted_actions(Device * dev, int com, hostlist_t hl,
 pluglist_iterator_destroy(itr);


Commit message:

> Solution: Only call the "all" script if it is a query and a singlet
> version of the script does not exist.

Makes it sound like "all" is never called if the singlet exists. Maybe "only call the all query script if all plugs are requested OR the singlet version of the script is not defined."?

> For device files where
> unpopulated nodes may exist, it can define an all script and singlet
> script for each scenario.

s/nodes/plugs/

chu11 · 2024-03-02T22:43:37Z

> BTW which script does powerman choose when the device spec contains both status and status_all and does not define a plug name list?

TBH I wasn't sure. After adding some printf debugging to _enqueue_targeted_actions(), it appears that dev->plugs is simply all the plugs defined by the node configuration when the plug name list isn't explicitly listed. And if the user doesn't specify any targets, the input hostlist parameter hl is everything specified in the config file.
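To make that observation concrete, here is a minimal sketch of a "does the request cover every plug on the device" check. It is an illustration only: the function names and the plain string arrays are hypothetical stand-ins, and powerman's real code works on its own pluglist and hostlist_t types, so the actual comparison in device.c may look quite different.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if `name` appears in `list` (n entries). */
bool plug_in_list(const char *name, const char *const *list, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(name, list[i]) == 0)
            return true;
    return false;
}

/*
 * Hypothetical helper: true if every plug known to the device is also
 * present in the requested set, i.e. the request is a "full pluglist".
 * Order of the requested plugs does not matter.
 */
bool request_covers_all_plugs(const char *const *dev_plugs, size_t ndev,
                              const char *const *requested, size_t nreq)
{
    for (size_t i = 0; i < ndev; i++)
        if (!plug_in_list(dev_plugs[i], requested, nreq))
            return false;
    return true;
}
```

Under this sketch, a device spec without an explicit plug name list combined with a query that names no targets would count as a full-pluglist request, since both sets are then derived from the same configuration.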

Problem: The "temp" command was missing a "goto ok" statement after completing the temperature request. It led to unexpected output. Add the missing "goto ok".

chu11 · 2024-03-02T23:39:44Z

re-pushed ... tweaked nits per the comments above and:

- added beacon and temperature equivalent script tests; it turns out "status_temp" was never covered in vpcd and there was a corner case in there :P
- added documentation in powerman.dev(5)
- removed unused/untested singleton scripts in device files

garlick

LGTM! Just one commit message suggestion.

garlick · 2024-03-02T23:46:44Z

etc/devices/appro-gb2.dev

- script status {
- send "pmnode %s\r"
- expect "node([0-9]+): (on|off|n/a)"


In the commit message for this, it would probably be good to mention that the current behavior is changing so that the untested singletons would start to get used, so this change just preserves the existing behavior.

Problem: The current powerman implementation always calls the "all" status script instead of a "singleton" status script for power status, beacon status, and temperature status. In the future this behavior will change. Some devices specify both "all" and "singleton" status scripts. This means the "singleton" status scripts are never used and are untested, but with future changes they will begin to be used. Solution: Remove the singleton status scripts in appro-gb2.dev, appro-greenblade.dev, and swpdu.dev. This will preserve the current behavior of those device files.

Problem: When selecting a script to use for an action, the "all" version of a script is always used if the action is a query action (e.g. power status query, beacon query, etc.). This can be problematic in situations where blades/nodes/etc. are not fully populated, such as in a chassis. We do not want to query "all" things in that case. Solution: Only call the all query script if all plugs are requested OR the singlet version of the script is not defined. A device file for hardware where unpopulated plugs may exist can define both an all script and a singlet script to cover each scenario.
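The selection rule in this commit message can be written as a small decision function. This is a sketch under stated assumptions, not the code from src/powerman/device.c: the enum, the function name, and the boolean parameters are all hypothetical, and the fallback when neither script is defined is my own choice for the example.

```c
#include <stdbool.h>

/* Hypothetical result type; powerman itself does not define this enum. */
enum script_choice {
    SCRIPT_NONE,     /* neither script is defined in the device spec */
    SCRIPT_SINGLET,  /* per-plug script, e.g. "status" */
    SCRIPT_ALL       /* whole-device script, e.g. "status_all" */
};

/*
 * Pick the query script as described in the commit message: use the
 * "all" script if all plugs are requested OR the singlet script is not
 * defined; otherwise use the singlet script.
 */
enum script_choice choose_query_script(bool has_singlet, bool has_all,
                                       bool all_plugs_requested)
{
    if (has_all && (all_plugs_requested || !has_singlet))
        return SCRIPT_ALL;
    if (has_singlet)
        return SCRIPT_SINGLET;
    return SCRIPT_NONE;
}
```

With both scripts defined, a query for a subset of plugs selects the singlet script, so unpopulated slots are never touched, while a query for every plug still takes the cheaper "all" path.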

Problem: There are no tests to see which status script (status vs status_all) is chosen when both are specified in a device file. Add coverage in t0004-status-query.t, t0007-temperature.t, and t0008-beacon.t.

Problem: Recent updates altered when an "all" vs "singleton" status script is called. However, this change is not documented. Add information on why a user might want to specify both an "all" and a "singleton" status script for power status, beacon status, and temperature status.

chu11 · 2024-03-03T01:41:15Z

thanks, pushed a commit message tweak and setting MWP

chu11 force-pushed the redfishpower_status_selection branch from 71d07fa to a97bd5e March 2, 2024 05:44

chu11 mentioned this pull request Mar 2, 2024

redfishpower: support setplugs configuration #157

Merged

chu11 force-pushed the redfishpower_status_selection branch from a97bd5e to 653f335 March 2, 2024 06:12

chu11 force-pushed the redfishpower_status_selection branch from 653f335 to 6793494 March 2, 2024 06:23

garlick reviewed Mar 2, 2024

chu11 changed the title ~~powerman: do not always call "all" query script~~ powerman: when status and status_all are defined, use status_all only on full pluglist Mar 2, 2024

t/simulators: add missing "goto" in vpcd simulator

db02fcc

Problem: The "temp" command was missing a "goto ok" statement after completing the temperature request. It led to unexpected output. Add the missing "goto ok".

chu11 force-pushed the redfishpower_status_selection branch from 6793494 to 4ca5822 March 2, 2024 23:37

garlick approved these changes Mar 2, 2024

chu11 added 4 commits March 2, 2024 16:49

t: test which status script is used

9992f96

Problem: There are no tests to see which status script (status vs status_all) is chosen when both are specified in a device file. Add coverage in t0004-status-query.t, t0007-temperature.t, and t0008-beacon.t.

chu11 force-pushed the redfishpower_status_selection branch from 4ca5822 to ac7726c March 3, 2024 01:41

chu11 added the merge-when-passing label Mar 3, 2024

mergify bot merged commit 7d62cad into chaos:master Mar 3, 2024
8 checks passed

chu11 deleted the redfishpower_status_selection branch March 3, 2024 05:23
