ci: print annotations for key package versions in transformers test #1184

Merged
merged 3 commits into from
Dec 20, 2024

@dvrogozh dvrogozh commented Dec 19, 2024

Annotations are available on the summary page of the executed workflow.

@dvrogozh dvrogozh force-pushed the ci branch 7 times, most recently from 366461a to 2a843d5 Compare December 19, 2024 21:38
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Annotations are available on a summary page of executed workflow.

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
@dvrogozh dvrogozh marked this pull request as ready for review December 19, 2024 21:44
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
echo "| jobs.$GITHUB_JOB.versions.python | $(python --version | cut -f2 -d' ') |" >> $GITHUB_STEP_SUMMARY
packages=" \
level-zero \
libigc1 \
Contributor Author

Note that this accounts for both cases: LTS and rolling drivers. Some of their components ship in packages with different names, such as libigc1 vs. libigc2, and the same goes for level-zero. As a result, packages that are not installed will be printed with empty versions.
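
The behavior described above can be sketched as follows. This is a minimal illustration, not the code from the PR: it assumes a Debian/Ubuntu runner where `dpkg-query` is available, the `pkg_version` helper and the shortened package list are hypothetical, and the summary table header is assumed to be written by an earlier command.

```shell
#!/usr/bin/env bash
# Sketch only: assumes a Debian/Ubuntu runner with dpkg-query.
# Outside GitHub Actions, fall back to a scratch file and a placeholder job name.
summary="${GITHUB_STEP_SUMMARY:-$(mktemp)}"
job="${GITHUB_JOB:-local}"

# Hypothetical helper: print the installed version of a package, or an
# empty string if the package is not installed or dpkg-query is missing.
pkg_version() {
  dpkg-query --show --showformat='${Version}' "$1" 2>/dev/null || true
}

# Hypothetical subset of the package list. libigc1 and libigc2 are
# alternatives, so at least one of them is expected to come back empty.
packages="level-zero libigc1 libigc2"

for pkg in $packages; do
  # One table row per package; the version cell stays empty when not installed.
  echo "| jobs.$job.versions.$pkg | $(pkg_version "$pkg") |" >> "$summary"
done
```

Listing both package-name variants keeps a single step usable on either driver flavor, at the cost of a few empty cells in the summary.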

@RUIJIEZHONG66166
Contributor

LGTM. @chuanqi129 Could you please double check?

Contributor

@chuanqi129 chuanqi129 left a comment

Overall LGTM. I added 2 minor comments, FYI.

echo "pip installed packages:"
pip list | tee ${{ github.workspace }}/transformers/tests_log/pip_list.txt
echo "lspci gpu devices:"
lspci -d ::0380 | tee ${{ github.workspace }}/transformers/tests_log/lspci_0380.txt
Contributor

Can we use xpu-smi discovery to get the GPU device information? The runner processes run with ZE_AFFINITY_MASK set by default, whereas lspci always shows information for all devices.

Contributor Author

Good idea. But let's run xpu-smi in a stand-alone step. I remember it gives lengthy output by itself, and it is of special interest to us, so it is good to separate it from this step. I will submit another PR for this, since I need to experiment with xpu-smi a little to get better output.
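
A separate step along these lines might look like the sketch below. It is only an illustration under stated assumptions: `XPU_SMI_LOG` is a hypothetical variable for the log location, and the guard exists so the step does not fail on runners where `xpu-smi` is not installed. How the output should be trimmed or formatted is left open, as in the comment above.

```shell
#!/usr/bin/env bash
# Sketch only: XPU_SMI_LOG is a hypothetical placeholder, not a path from the PR.
log="${XPU_SMI_LOG:-$(mktemp)}"

if command -v xpu-smi >/dev/null 2>&1; then
  # Keep the (possibly lengthy) device listing in its own log file.
  xpu-smi discovery | tee "$log"
else
  # Do not fail the step on runners without the tool.
  echo "xpu-smi not found; skipping device discovery" | tee "$log"
fi
```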

echo "| jobs.$GITHUB_JOB.torch.xpu.device_count | $var |" >> $GITHUB_STEP_SUMMARY
# printing annotations with key environment variables
echo "| jobs.$GITHUB_JOB.env.ZE_AFFINITY_MASK | $ZE_AFFINITY_MASK |" >> $GITHUB_STEP_SUMMARY
echo "| jobs.$GITHUB_JOB.env.NEOReadDebugKeys | $NEOReadDebugKeys |" >> $GITHUB_STEP_SUMMARY
Contributor

What do you think about converting this step into a standalone script instead of a long step in the workflow?

Contributor Author

In my opinion, the step is not long enough yet to strongly require converting it to a script. Eventually, complicated, frequently reused, or lengthy steps do need to be converted to scripts. At the same time, I personally try to avoid calling scripts from GitHub Actions, since this reduces workflow transparency and ultimately complicates the workflow. So we need to find the right balance here.

For this particular step, I actually wonder whether we should convert it to a stand-alone action rather than a script. That way, other workflows in the repo could reuse it in their steps, which would bring a unified format for reporting this information.

Overall, let's merge this PR as is and discuss these proposals.
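
To make the trade-off concrete, the shared logic that either a script or a stand-alone action would wrap could be as small as the sketch below. The helper names `summary_row` and `summary_env` are hypothetical and do not appear in the PR; the row format follows the `echo` lines quoted above, and the table header is assumed to be written elsewhere.

```shell
#!/usr/bin/env bash
# Sketch only: summary_row and summary_env are hypothetical helper names.
# Outside GitHub Actions, fall back to a scratch file and a placeholder job name.
summary="${GITHUB_STEP_SUMMARY:-$(mktemp)}"
job="${GITHUB_JOB:-local}"

# Append one "| key | value |" table row to the step summary.
summary_row() {
  echo "| jobs.$job.$1 | $2 |" >> "$summary"
}

# Report an exported environment variable by name.
# Unset variables produce an empty value cell instead of an error.
summary_env() {
  summary_row "env.$1" "$(printenv "$1" || true)"
}

summary_env ZE_AFFINITY_MASK
summary_env NEOReadDebugKeys
```

Every workflow that calls such a helper reports in the same format, which is the unification argument made above; the cost is that the workflow file no longer shows exactly what gets printed.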

@dvrogozh
Contributor Author

This PR did not modify anything in the scope of the failing CI test (pull/preci-ut/Torch-XPU-UT-Tests), so it can be merged.

@dvrogozh dvrogozh added this pull request to the merge queue Dec 20, 2024
Merged via the queue into intel:main with commit e2e802f Dec 20, 2024
3 of 4 checks passed
@dvrogozh dvrogozh deleted the ci branch December 20, 2024 16:39
3 participants