CI: move prerelease_deps_coverage_64bit_blas to GitHub actions. #15958

Merged
rgommers merged 5 commits into scipy:main from azure_gha
Apr 18, 2023

Conversation

Member

@tupui tupui commented Apr 8, 2022

Part of #15814

For now I did not use any templating, first trying to run this correctly in the CI.

  • The linux job was mostly copied and adapted with what was in the job on Azure.
  • Also skip a problematic test.

@tupui tupui added the CI label ("Items related to the CI tools such as CircleCI, GitHub Actions or Azure") Apr 8, 2022
@tupui tupui self-assigned this Apr 8, 2022
@tupui tupui force-pushed the azure_gha branch 2 times, most recently from 0df322c to 8e87a46 Compare April 8, 2022 17:27
@rgommers
Member

@tupui I think you want to set this job up on your own fork first. That is by far the best way to work iteratively on CI jobs; only once it works on your own fork should you switch to a PR here.

@tupui
Member Author

tupui commented Apr 11, 2022

ok sorry for the noise+CI

@tupui
Member Author

tupui commented Apr 13, 2022

See tupui#6 for discussions.

@tupui tupui marked this pull request as ready for review April 13, 2022 16:56
@tupui tupui requested a review from larsoner as a code owner April 13, 2022 16:56
@tupui
Member Author

tupui commented Apr 13, 2022

CI failure is not related. (I opened an issue for that.)

@tupui
Member Author

tupui commented Apr 14, 2022

PR is ready on my side. This PR copies the linux job and adjusts it with what was in the job on Azure.

@mdhaber I am skipping a test you added in gh-10495. Apparently this test sometimes makes the CI hang, and when it does not hang, it fails with NaNs here.
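
For readers unfamiliar with the mechanism, skipping a test can be sketched as follows. This is only an illustration: it uses the standard library's `unittest` so that it runs without extra dependencies, whereas SciPy's suite is pytest-based (the analogous decorator there is `pytest.mark.skip`). The class name, test names and skip reason are hypothetical stand-ins, not the actual test from gh-10495.

```python
import unittest


class TestExample(unittest.TestCase):
    # Hypothetical stand-in for the problematic test: the decorator
    # prevents the body from running and records the reason instead.
    @unittest.skip("hangs or produces NaNs on this CI configuration")
    def test_flaky_case(self):
        raise AssertionError("never reached while the skip is in place")

    # An unrelated test in the same class still runs normally.
    def test_stable_case(self):
        self.assertEqual(1 + 1, 2)


def run_suite():
    """Run the example test case and return the collected result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestExample)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

A skipped test is reported separately from failures, so the job stays green while the skip reason remains visible in the test report.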

@tupui tupui requested a review from AnirudhDagar April 14, 2022 09:46
@tupui
Member Author

tupui commented Apr 14, 2022

cc @tirthasheshpatel since you also looked at coverage, this might interest you.

@tupui
Member Author

tupui commented Apr 19, 2022

@rgommers @AnirudhDagar anything else I should do here?

Member

@tirthasheshpatel tirthasheshpatel left a comment

This looks like a big improvement over Azure: the CI time has dropped to 39 minutes from 60+ minutes on Azure. I am OK to merge this as is. The comments I have left can be addressed in a follow-up. Let me know if you want to address those here @tupui. Otherwise I think this is good to merge.

@tupui
Member Author

tupui commented Apr 21, 2022

Thanks @tirthasheshpatel! I am fine getting this in since it would make the CI happy again. But I know very little about some of the build deps, as you noted. I would let @rgommers have a look/merge.

@tirthasheshpatel
Member

tirthasheshpatel commented Apr 21, 2022

Sure! I looked at the job. Other than caching, I can say that it is exactly equivalent to the Azure job. So, I am +1 to merge this. Maybe @HarshCasper can confirm whether caching is done correctly here.

Member

@rgommers rgommers left a comment

Thanks @tupui. There are a few minor things to fix, but overall this looks good. It does drop the ILP64 test completely though, so I'm not sure we want to merge this before we add that support back. We missed that when saying this job should be moved to Meson I think.

That decision depends on whether we can easily fix the current Azure job that's timing out I'd say. Does adding the one test skip to that job fix things there?
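
As background for the ILP64 concern above: an ILP64 BLAS/LAPACK build uses 64-bit integers for sizes and indices, while the more common LP64 build uses 32-bit ones. The sketch below is plain arithmetic rather than a call into BLAS; the helper name `fits_blas_int` and the example sizes are hypothetical and only show which values each interface can represent.

```python
# Largest values representable by signed 32-bit and 64-bit integers.
INT32_MAX = 2**31 - 1
INT64_MAX = 2**63 - 1


def fits_blas_int(value, ilp64):
    """Return True if `value` fits the integer type of the BLAS interface.

    `ilp64=True` models a 64-bit-integer (ILP64) build,
    `ilp64=False` a 32-bit-integer (LP64) build.
    """
    limit = INT64_MAX if ilp64 else INT32_MAX
    return 0 <= value <= limit


# Hypothetical size: a vector with three billion elements.
n = 3_000_000_000
print(fits_blas_int(n, ilp64=False))  # too large for a 32-bit integer
print(fits_blas_int(n, ilp64=True))   # representable with 64-bit integers
```

This is why a dedicated CI job matters: code paths that pass such sizes are only exercised when SciPy is built against a 64-bit-integer BLAS.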

.github/workflows/linux_meson.yml Outdated Show resolved Hide resolved
Member Author

@tupui tupui left a comment

Thanks @rgommers

> It does drop the ILP64 test completely though, so I'm not sure we want to merge this before we add that support back. We missed that when saying this job should be moved to Meson I think.

My understanding was that ILP64 was just for the windows job. I don't see this elsewhere in the Azure config.

> That decision depends on whether we can easily fix the current Azure job that's timing out I'd say. Does adding the one test skip to that job fix things there?

You mean, stay on Azure and just skip the test? I can try in another PR if you want.

.github/workflows/linux_meson.yml Outdated Show resolved Hide resolved
@tupui tupui requested a review from rgommers May 4, 2022 12:00
@tupui tupui added this to the 1.9.0 milestone May 21, 2022
@tylerjereddy tylerjereddy added this to the 1.10.0 milestone May 27, 2022
@tylerjereddy
Contributor

I've bumped the milestone based on the feedback above, and to keep the "review queue" sane for branching. Sounds like we might be "ok" with skipping that test before we branch, if that's desired.

@tupui
Member Author

tupui commented May 27, 2022

> I've bumped the milestone based on the feedback above, and to keep the "review queue" sane for branching. Sounds like we might be "ok" with skipping that test before we branch, if that's desired.

Thanks Tyler. I think it's fine to leave as it is if it does not block the release. Since we updated the timeout on Azure it's mostly passing now.

But it would just take me 2 seconds to make a PR with that so let me know if you want me to do it.

@tylerjereddy
Contributor

So we're back to this PR still being open before another release/branch cutting. Looking through some comments:

  • I think we want to get rid of the ATLAS install since we're using OpenBLAS
  • can we leave test_bug_10466() alone now? the original form still seems to be present in main as far as I can tell
  • the main blocker seems to be the same, related to #17244 ("Improve BLAS/LAPACK support: dependency detection, UX, and ILP64"), meson not supporting ILP64 OpenBLAS yet--my guess is that won't change in the next two weeks, so options are perhaps:
    • accept a short-medium period where ILP64 builds are not tested in CI at all and proceed with merging this soon
    • move the ILP64-enabled build to another setuptools-distutils-based build in the CI?
    • do nothing here for now, and bump the milestone so we don't worry about it on current release cycle--the original Azure job is still performing "ok" I think?

cc: @eli-schwartz @rgommers maybe

@eli-schwartz
Contributor

eli-schwartz commented Nov 20, 2022

> meson not supporting ILP64 OpenBLAS yet--my guess is that won't change in the next two weeks, so options are perhaps:

@rgommers has been working on an implementation of that in mesonbuild/meson#10921 but it's still marked as WIP.

If he has more time to work on it in the next two or three weeks, then we might be able to get it merged inside the current Meson release cycle (currently projected to graduate to RC on December 9, which I guess is after SciPy's current cycle completes).

@rgommers rgommers removed this from the 1.10.0 milestone Nov 20, 2022
@rgommers
Member

I would like to work on ILP64 support in Meson again soon - right after getting the initial NumPy Meson build merged into NumPy main. But so far this is still blocked, and there's no impact of having this in a particular release so I removed the milestone.

@andyfaff
Contributor

@tupui I'm going to rebase on main and start pushing to your feature branch.

@andyfaff
Contributor

@rgommers @tupui this should be in a good state now. I squashed the commits to make history a bit cleaner.

  • coverage has been removed
  • cython is currently pinned but can be changed to pre release when the issues with cython 3 are ironed out.
  • there's no ILP64 currently; once meson can do that build, it should be straightforward to activate.

@rgommers rgommers added this to the 1.11.0 milestone Apr 18, 2023
Member

@rgommers rgommers left a comment

Thanks @andyfaff, this is looking pretty good, some minor comments to clean things up a bit more. The "avoid LD_LIBRARY_PATH" one is the only substantial one.

.github/workflows/linux_meson.yml Outdated Show resolved Hide resolved

- name: Test SciPy
  run: |
    export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
Member

We shouldn't ever need this, that makes me think something is wrong. I guess that's because of the custom OpenBLAS location? Can you try export PKG_CONFIG_PATH=/usr/local/lib instead?

Member

Ah wait, not awake yet - we're testing not building.

Member

Self-correcting again, we are building as well as testing. Then PKG_CONFIG_PATH should work and is preferred.
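
To make the suggestion concrete, here is a minimal sketch of setting a search-path variable such as `PKG_CONFIG_PATH` for the build step only, instead of exporting it for the whole job. The helper name `with_prepended_path` is hypothetical. It also avoids the trailing separator that `/usr/local/lib:$LD_LIBRARY_PATH` produces when the variable was previously unset (for `LD_LIBRARY_PATH`, an empty entry is treated as the current directory).

```python
import os


def with_prepended_path(env, name, directory):
    """Return a copy of `env` with `directory` prepended to the path list
    stored in variable `name`, without adding an empty trailing entry."""
    new_env = dict(env)  # leave the caller's mapping untouched
    current = new_env.get(name, "")
    if current:
        new_env[name] = directory + os.pathsep + current
    else:
        new_env[name] = directory
    return new_env


# Build an environment for a child process; os.environ itself is not modified.
build_env = with_prepended_path(os.environ, "PKG_CONFIG_PATH", "/usr/local/lib")
print(build_env["PKG_CONFIG_PATH"])
```

The resulting mapping could then be passed as `env=build_env` to `subprocess.run` for the `python dev.py build` step, so that only dependency discovery at build time is affected.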

Contributor

The build occurs on L324, and that works without any issue.
Without LD_LIBRARY_PATH the test fails/errors because libopenblas can't be found by the dynamic loader. Does PKG_CONFIG_PATH help with that?

Member

@rgommers rgommers Apr 18, 2023

Ah yes, there's python dev.py build higher up. It doesn't quite make sense to me that the build works but import fails at test time - probably something else is wrong.

The way this works is that if a library is in a non-default path, the build should write RPATH entries for those non-default paths into the extension modules so things work at test time. PKG_CONFIG_PATH helps only with discovering dependencies at build time. If that works but then afterwards a library path goes missing, that's a bug somewhere. LD_LIBRARY_PATH is an anti-pattern that should never be needed. The only reason it's kinda okay (but still not great) in the wheel builds is that afterwards we run auditwheel/delocate to vendor the relevant shared libraries.
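
One way to check the RPATH behaviour described above is to look at the dynamic section of a built extension module. The following is a sketch under assumptions: it relies on the binutils `readelf -d` command being available on the Linux runner, and the helper names `parse_rpaths` and `rpaths_of` are hypothetical.

```python
import re
import subprocess

# Matches the RPATH / RUNPATH entries printed by `readelf -d`, for example:
#  0x000000000000001d (RUNPATH)   Library runpath: [/usr/local/lib]
_RPATH_RE = re.compile(
    r"\((?:RPATH|RUNPATH)\)\s+Library (?:rpath|runpath): \[(.*?)\]"
)


def parse_rpaths(readelf_dynamic_output):
    """Extract the run-time search directories from `readelf -d` output."""
    paths = []
    for match in _RPATH_RE.finditer(readelf_dynamic_output):
        # Several directories may be joined with ':' inside one entry.
        paths.extend(p for p in match.group(1).split(":") if p)
    return paths


def rpaths_of(shared_object):
    """Run `readelf -d` on a shared object and return its RPATH entries."""
    completed = subprocess.run(
        ["readelf", "-d", shared_object],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_rpaths(completed.stdout)
```

If `/usr/local/lib` is missing from the list returned for an extension module that links against OpenBLAS, the dynamic loader has no recorded location for the library, which would be consistent with the import failure seen without `LD_LIBRARY_PATH`.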

Member

We could merge this once CI finishes and clean it up later though, to get on with the Azure migration.

Member Author

CI was green after 283def4

The following commits just removed comments.

Member Author

Can we merge @rgommers?

.github/workflows/linux_meson.yml Outdated Show resolved Hide resolved
azure-pipelines.yml Outdated Show resolved Hide resolved
andyfaff and others added 2 commits April 18, 2023 21:23
Member

@rgommers rgommers left a comment

All right, in it goes. Thanks @tupui, @andyfaff & reviewers!

@rgommers rgommers merged commit 09246e4 into scipy:main Apr 18, 2023
@tupui tupui deleted the azure_gha branch April 18, 2023 13:49
Labels
CI Items related to the CI tools such as CircleCI, GitHub Actions or Azure
7 participants