-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add redfishpower HPE Cray EX chassis device file #173
Conversation
b95676d
to
074be36
Compare
e5ef2ba
to
85d3660
Compare
re-pushed, rebasing on #172. mostly just updating the tests, "diagnostic" error messages are now by default (no |
The dependent PRs are approved so I'll wait to review this until it's rebased on master after those are merged if that's OK. |
85d3660
to
2faf3a9
Compare
re-pushed, rebasing on master and fixing up tests given new error message output possibilities |
For a start, could we just propose the one spec in this PR, for the fully populated chassis, drop the inline documentation, and shorten the name to "cray-ex" or similar? |
2faf3a9
to
118a764
Compare
ok, re-pushed per your comments above. We'll split out rabbit config into it's own PR later. |
118a764
to
4def2a9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So nice to see this coming together!
etc/devices/redfishpower-cray-ex.dev
Outdated
# include "/etc/powerman/redfishpower-cray-ex.dev" | ||
# device "redfishpower" "redfishpower-CrayEX" "/usr/sbin/redfishpower -h -h cmm0,pmynode[0-15] |&" | ||
# node "cmm0,myperif[0-7],myblade[0-7],mynode[0-15]" "redfishpower" "Enclosure,Perif[0-7],Blade[0-7],Node[0-15]" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that an extra -h
on the device line?
Suggest you don't name the example device "redfishpower" but something more useful like "chassis0".
Finally, since these device lines are going to be repeated quite a lot in a large config, can we shorten the spec name further and drop the capital letters?
E.g. instead of
device "redfishpower" "redfishpower-CrayEX" "/usr/sbin/redfishpower -h -h cmm0,pmynode[0-15] |&"
go with
device "chassis0" "cray-ex" "/usr/sbin/redfishpower -h cmm0,pmynode[0-15] |&"
send "setplugs Enclosure 0\n" | ||
expect "redfishpower> " | ||
send "setplugs Perif[0-7],Blade[0-7] 0 Enclosure\n" | ||
expect "redfishpower> " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might improve readability a bit if on the first occurrence of setplugs
there was a comment listing the positional arguments, e.g.
# setplugs pluglist hostindex [parent plug]
send "setpath Enclosure,Perif[0-7],Blade[0-7] stat redfish/v1/Chassis/{{plug}}\n" | ||
expect "redfishpower> " | ||
send "setpath Enclosure,Perif[0-7],Blade[0-7] on redfish/v1/Chassis/{{plug}}/Actions/Chassis.Reset {\"ResetType\":\"On\"}\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same with setpath
e.g.
# setpath pluglist {stat|on|off} path
send "settimeout 75\n" | ||
expect "redfishpower> " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This timeout deserves quick comment too like
# settimeout [seconds] - on/off operations fail on expiration
t/t0029-redfish.t
Outdated
# | ||
|
||
test_expect_success 'create powerman.conf for chassis w/ 16 redfish nodes (crayex)' ' | ||
cat >powerman_hpe_cray_supercomputing_ex_chassis.conf <<-EOT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just a temp file so maybe we could shorten the name a bit :-)
t/t0029-redfish.t
Outdated
# | ||
# redfishpower hpe cray supercomputing ex chassis test | ||
# |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably be its own test script since its likely we'll want to add tests as things come up on the real system. Plus it allows for the tests to run in parallel.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good idea, and we can add your "big redfish test" in afterwards. (#127)
4def2a9
to
de6f32b
Compare
re-pushed per comments above! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome! Just one more thought.
etc/devices/redfishpower-cray-ex.dev
Outdated
# - If your chassis is not fully populated, put placeholder hosts in | ||
# the redfishpower hosts configuration. Adjust plugs when | ||
# configuring specific targets. For example, let's say perif[4-7] | ||
# and blades[4-7] are not populated (thus nodes[8-15] also do not exist). | ||
# | ||
# device "chassis0" "cray-ex" "/usr/sbin/redfishpower -h cmm0,pnode[0-7],unused[0-7] |&" | ||
# node "cmm0,myperif[0-3],myblade[0-3],mynode[0-7]" "redfishpower" "Enclosure,Perif[0-3],Blade[0-3],Node[0-7]" | ||
# |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might be slightly clearer if the example focused on the unused elements of the hostlist. So just depopulate Blade[4-7]
hence Node[8-15]
and leave Perif[4-7]
out of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh one other quick thing. I assume auth USER:PASS
is not the published vendor default password. Do we want to put the default in the dev script ? Then we wouldn't require people to immediatley copy the dev script.
It seems like this came up and we found the default on a public web document somewhere? (need to confirm)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have to check if this is accurate, but if so, it's out there searchable with a google so IMHO is fair game to include in a dev script.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might be slightly clearer if the example focused on the unused elements of the hostlist. So just depopulate Blade[4-7] hence Node[8-15] and leave Perif[4-7] out of it.
I guess in my example I wanted to show how to config if switches aren't populated too. Would it be clearer if I did two examples?
Oh one other quick thing. I assume auth USER:PASS is not the published vendor default password. Do we want to put the default in the dev script ? Then we wouldn't require people to immediatley copy the dev script.
Good point!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess in my example I wanted to show how to config if switches aren't populated too. Would it be clearer if I did two examples?
Eh, I guess if you want although that part is just standard powerman stuff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm gonna punt on the default username/password to a follow up PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eh, I guess if you want although that part is just standard powerman stuff.
Ohhh, i see your point. It's normal powerman conf stuff. makes sense.
de6f32b
to
01e9953
Compare
re-pushed, tweaking the one thing commented on above. if we decide to change default username/password will put into a follow up PR. |
etc/devices/redfishpower-cray-ex.dev
Outdated
# are not populated (thus nodes[8-15] also do not exist). | ||
# | ||
# device "chassis0" "cray-ex" "/usr/sbin/redfishpower -h cmm0,pnode[0-7],unused[0-7] |&" | ||
# node "cmm0,myperif[0-7],myblade[0-3],mynode[0-7]" "redfishpower" "Enclosure,Perif[0-7],Blade[0-3],Node[0-7]" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
device name should be "chassis0" here
t/t0037-cray-ex.t
Outdated
|
||
# Use port = 11000 + test number | ||
# That way there won't be port conflicts with make -j | ||
testaddr=localhost:11029 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be 11000 + 37 = 11037
Add device file for a HPE Cray Supercomputing EX Chassis. Fixes chaos#128
Problem: There is no testing for the new HPE Cray Supercomputing EX Chassis device file. Add new tests in new t0037-cray-ex.t test file.
01e9953
to
c951691
Compare
doh! good catches ... re-pushed with those two tweaks above |
Built on top of #167 and #172, but those don't have to go in first. #167 is a "pretty nice to have" (would like to have before release), #172 can be debated before release if it should go in.