ci: print annotations for key package versions in transformers test #1184
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Annotations are available on a summary page of executed workflow. Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
echo "| jobs.$GITHUB_JOB.versions.python | $(python --version | cut -f2 -d' ') |" >> $GITHUB_STEP_SUMMARY
packages=" \
level-zero \
libigc1 \
Note that this accounts for both cases: LTS and rolling drivers. Some of their components ship in packages with different names, such as libigc1 vs. libigc2, and the same applies to level-zero. As a result, packages that are not installed will be printed with empty versions.
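The behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the helper names `pkg_version` and `pkg_row`, the `versions.<package>` key, and the use of `dpkg-query` are assumptions (the key format is only extrapolated from the python row in the diff). A local temporary file stands in for `$GITHUB_STEP_SUMMARY` when the script runs outside GitHub Actions.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the version dpkg-query reports for a package,
# or nothing if the query fails (for example, the package is not installed).
pkg_version() {
  dpkg-query -W -f='${Version}' "$1" 2>/dev/null || true
}

# Hypothetical helper: format one summary table row for a package.
# The "versions.<package>" key is an assumption modeled on the python row.
pkg_row() {
  echo "| jobs.${GITHUB_JOB:-local}.versions.$1 | $(pkg_version "$1") |"
}

# Outside GitHub Actions, fall back to a temporary file.
summary="${GITHUB_STEP_SUMMARY:-$(mktemp)}"

# Names from both driver flavors are listed; whichever one of libigc1/libigc2
# is not installed simply gets an empty version cell.
for pkg in level-zero libigc1 libigc2; do
  pkg_row "$pkg" >> "$summary"
done
```

Listing both names keeps a single code path for LTS and rolling drivers, at the cost of a few empty cells in the summary table.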
LGTM. @chuanqi129 Could you please double check?
Overall LGTM. I added 2 minor comments, FYI.
echo "pip installed packages:"
pip list | tee ${{ github.workspace }}/transformers/tests_log/pip_list.txt
echo "lspci gpu devices:"
lspci -d ::0380 | tee ${{ github.workspace }}/transformers/tests_log/lspci_0380.txt
Can we use xpu-smi discovery to get the GPU device information? The runner sets ZE_AFFINITY_MASK by default, whereas lspci always shows information for all devices.
Good idea, but let's run xpu-smi in a stand-alone step. As I recall, its output is lengthy on its own, and it is of special interest to us, so it is better kept separate from this step. I will submit another PR for this, since I need to experiment with xpu-smi a little to get better output.
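The body of such a stand-alone step might look something like the sketch below. The function name `report_xpu_devices` is hypothetical, and nothing here is taken from the follow-up PR. Whether the output of `xpu-smi discovery` actually reflects ZE_AFFINITY_MASK is the reviewer's suggestion and is not verified here, so the mask is printed alongside for context.

```shell
#!/usr/bin/env bash

# Hypothetical step body: report the affinity mask and, if the tool is
# installed, the devices listed by xpu-smi.
report_xpu_devices() {
  # Print the mask first so the device list can be read in context.
  echo "ZE_AFFINITY_MASK=${ZE_AFFINITY_MASK:-}"
  if command -v xpu-smi >/dev/null 2>&1; then
    xpu-smi discovery
  else
    # Keep the step non-fatal on runners without the tool.
    echo "xpu-smi not found; skipping device discovery"
  fi
}

report_xpu_devices
```

Guarding on `command -v` means a missing tool does not fail the job, which may or may not be the desired policy for this CI.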
echo "| jobs.$GITHUB_JOB.torch.xpu.device_count | $var |" >> $GITHUB_STEP_SUMMARY
# printing annotations with key environment variables
echo "| jobs.$GITHUB_JOB.env.ZE_AFFINITY_MASK | $ZE_AFFINITY_MASK |" >> $GITHUB_STEP_SUMMARY
echo "| jobs.$GITHUB_JOB.env.NEOReadDebugKeys | $NEOReadDebugKeys |" >> $GITHUB_STEP_SUMMARY
What do you think about converting this step into a standalone script instead of keeping it as a long step in the workflow?
In my opinion, the step is not long enough at the moment to strongly require converting it to a script. Eventually, complicated, frequently reused, or lengthy logic does need to move into a script. At the same time, I personally try to avoid calling scripts from GitHub Actions, since this reduces workflow transparency and ultimately complicates the workflow. So the right balance needs to be found here.
For this particular step, I actually wonder whether we should convert it to a stand-alone action rather than a script. That way, other workflows in the repo could reuse it in their steps, which would bring a unified format for reporting this information.
Overall, let's merge this PR as is and discuss these proposals.
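Whichever form is chosen, the unified reporting format could be captured in one small helper. The sketch below is only an illustration of that idea: `summary_row` is a hypothetical name, and a temporary file stands in for `$GITHUB_STEP_SUMMARY` when the code runs outside GitHub Actions.

```shell
#!/usr/bin/env bash

# Outside GitHub Actions, fall back to a temporary file.
summary="${GITHUB_STEP_SUMMARY:-$(mktemp)}"

# Hypothetical helper: append one "| key | value |" row to the job summary,
# prefixing the key with the job name as the rows in this PR do.
summary_row() {
  echo "| jobs.${GITHUB_JOB:-local}.$1 | $2 |" >> "$summary"
}

# The same environment variables the step reports, via the helper.
summary_row env.ZE_AFFINITY_MASK "${ZE_AFFINITY_MASK:-}"
summary_row env.NEOReadDebugKeys "${NEOReadDebugKeys:-}"
```

A helper like this keeps the row format in one place, so a script or a stand-alone action could both reuse it without changing what the summary page shows.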
This PR did not modify the scope of the failing CI test (pull/preci-ut/Torch-XPU-UT-Tests). It can be merged.