
MAINT: master->main refs #166

Merged (18 commits, Jun 13, 2022)

Conversation

tylerjereddy
Collaborator

  • there are no plans I'm aware of to change `master` to
    `main` on the wheels repo, but I've started seeing cron
    errors related to `master` checkouts of the main repo

  • so, try to fix up cases where `master` is incorrectly used
    to reference the main SciPy repo, but leave `master` in for
    those cases where we are referring to the wheels repo proper
    (at least for now); a sketch of the kind of change follows
    this list

  • reference CI failure:
    https://app.travis-ci.com/github/MacPython/scipy-wheels/jobs/558549551

  • I do realize we plan to replace the wheels infrastructure here eventually, but until then...
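
A minimal sketch of the kind of edit involved, for illustration only -- the clone/checkout commands and variable handling in this repo's CI scripts may differ, so treat the exact lines as assumptions rather than the actual diff:

# before: CI checks out a branch name that the main SciPy repo has renamed
git clone https://github.com/scipy/scipy.git
git -C scipy checkout master

# after: follow the renamed default branch of the main repo
git -C scipy checkout main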

@rgommers
Contributor

rgommers commented Feb 7, 2022

The Appveyor failure is Pythran; the Linux 32-bit wheel failure may be related, since this PR pulls in a newer numpy.distutils version.

@tylerjereddy
Collaborator Author

Close/re-open--let's see where we stand a few months later. I'm almost certain we'll need more `master` adjustments in the wheels repo than this before I can tackle 1.9.0rc1, but it is a start, and I need to see what fails beyond the current nightly crons that get stuck on master-vs-main issues per https://app.travis-ci.com/github/MacPython/scipy-wheels/jobs/571740251

tylerjereddy added a commit to tylerjereddy/scipy that referenced this pull request May 31, 2022
* replicated scipygh-16139 on the latest maintenance branch
because the `master` branch of the wheels repo will
encounter the issues described in that PR (for example, see:
MacPython/scipy-wheels#166 which
has Travis and Azure failures caused by those same
versioning issues)

* I think the `cwd` is still correct even though the patch
is being applied to a different file this time (used to be
`setup.py`), though we could double check this by pointing
the wheels PR at the commit hash of this PR if we want

* any reason not to forward port this as well at this point,
if we're going to need to keep backporting it?
rgommers pushed a commit to scipy/scipy that referenced this pull request Jun 1, 2022
replicated gh-16139 on the latest maintenance branch
because the `master` branch of the wheels repo will
encounter the issues described in that PR (for example, see:
MacPython/scipy-wheels#166 which
has Travis and Azure failures caused by those same
versioning issues)

[ci skip]
@tylerjereddy reopened this Jun 1, 2022
* try pinning setuptools on Appveyor
@tylerjereddy
Collaborator Author

For the Windows failures, I'll try pinning setuptools, which is pinned on the v1.8.x branch but not on master of this repo.
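
A minimal sketch of what such a pin might look like in the Appveyor install step -- the version string is a placeholder that should mirror whatever v1.8.x uses, and the exact location in appveyor.yml is an assumption:

# hypothetical: pin setuptools before building, mirroring the v1.8.x branch
python -m pip install "setuptools==<version pinned on v1.8.x>"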

I'm also not going to be surprised if there is yet another complication with this merged in now: scipy/scipy#16335

@tylerjereddy
Collaborator Author

@mdhaber @mckib2 I'm seeing a failure to compile `scipy\_lib\highs\src\simplex\HEkkDual.obj` on Windows. From a careful dissection of the differences in the compile line for this source file here vs. in the main repo, the best explanation I can come up with is that we are using Visual Studio 2017 here in the wheels repo, which lags behind the main repo at 2019.

Here's a colored "word diff" between the two compile lines; they look almost identical beyond the tooling versions, so I'll give the bump a try...
[screenshot: word diff of the two compile lines]

* bump the Visual Studio tool version used in Appveyor
to better match the main repo, because I'm seeing a failure
to compile `HEkkDual` source in the wheels repo
@tylerjereddy
Collaborator Author

I may need to ask for some more Travis credits too, based on the banner I'm seeing there this evening.

@tylerjereddy
Collaborator Author

Ah, for Appveyor I see things hardcoded to `C:\Program Files (x86)\Microsoft Visual Studio 14.0` in a few places, and one of the directory listings doesn't necessarily show the new version (unless it is the unversioned entry):

05/27/2022  03:35 PM    <DIR>          Microsoft Visual Studio
05/27/2022  10:39 PM    <DIR>          Microsoft Visual Studio 10.0
05/27/2022  10:39 PM    <DIR>          Microsoft Visual Studio 11.0
05/27/2022  10:39 PM    <DIR>          Microsoft Visual Studio 12.0
05/27/2022  04:13 PM    <DIR>          Microsoft Visual Studio 14.0
05/27/2022  03:53 PM    <DIR>          Microsoft Web Tools

I may need to debug that on a fork or something. The use of both years and version numbers for Visual Studio releases doesn't make things any clearer either.
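
For reference, the hard-coded entries look roughly like the first line below, and on a VS 2019 image the equivalent tooling lives under a year-based directory; the edition (Community) and the vcvarsall.bat example are assumptions for illustration, not lines taken from this repo's appveyor.yml:

REM old style, hard-coded to the 14.0 toolchain
call "C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\vcvarsall.bat" x86_amd64

REM assumed equivalent location on the VS 2019 image (Community edition)
call "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Auxiliary\Build\vcvarsall.bat" x64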

@tylerjereddy
Collaborator Author

tylerjereddy commented Jun 4, 2022

32-bit Windows jobs are successfully building with VS 2019 now (though failing a few tests), but 64-bit Windows jobs need some checking.

Making progress on my fork now: tylerjereddy#1

* fixup the 64-bit mingw path needed for new Appveyor
image
@tylerjereddy
Collaborator Author

tylerjereddy commented Jun 7, 2022

The Travis CI team filled up our credits again, but I've temporarily disabled Travis CI here while I iterate.

The most annoying issue at the moment is that the Linux 32-bit jobs seem to have recently started running forever at the test stage, ignoring timeout cancellation requests and not providing any usable log output. I'm hoping that changes with the latest push, but if it doesn't, I may need to check this in a 32-bit container locally.

@tylerjereddy
Collaborator Author

Another thing to consider is that stuff like this may not be forward compatible now that I've bumped the compiler version:
copy "C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\redist\x64\Microsoft.VC140.CRT\msvcp140.dll" ?

I suspect I should try to copy in the DLL from the matching version of Visual Studio.
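
A hedged sketch of the matching copy -- the VS 2019 redist directory contains a toolset-specific version folder that has to be read off the Appveyor image, so <msvc-version> is a placeholder, and the Community-edition path is an assumption rather than something confirmed in this PR:

REM hypothetical replacement for the 14.0 copy line above
copy "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Redist\MSVC\<msvc-version>\x64\Microsoft.VC142.CRT\msvcp140.dll" .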

@tylerjereddy
Collaborator Author

I'll iterate/debug the DLL updates on my fork until that works: tylerjereddy#3

@tylerjereddy
Collaborator Author

The DLL updates have been cherry-picked in. I'm still investigating the 32-bit Linux test-time hang/infinite run.

@tylerjereddy
Collaborator Author

@isuruf @matthew-brett @charris @mattip have you seen this issue where the 32-bit Linux jobs literally run for days instead of timing out and giving useful output? https://github.com/MacPython/scipy-wheels/runs/6785487807


@tylerjereddy
Collaborator Author

tylerjereddy commented Jun 10, 2022

Appveyor was fully passing so I've temporarily disabled that and also simplified the Azure matrix to focus on debugging the 32-bit Linux builds that keep running indefinitely.

As a start, I'll see what happens if I bump DOCKER_TEST_IMAGE to a newer base image than xenial. The main repo uses a newer base, I think i386/ubuntu:bionic, in Azure over there. I expect this to fail, since no such image exists at https://hub.docker.com/u/multibuild, but hopefully it will not run indefinitely--the current status is that with DOCKER_TEST_IMAGE: "multibuild/xenial_{PLAT}" the test stage will not even time out; it just runs forever with no feedback, which is rather hard to debug.

If I finally see a proper error/termination with that change, then I'll probably have to switch the image back and experiment a bit. I could also try bumping the multibuild submodule, just in case.
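
For context, the knob being toggled is just a variable in the Azure config; roughly as follows, where the exact YAML location is assumed and the "newer base" value is a placeholder rather than a real multibuild image:

# current value -- the test stage runs forever with no feedback
DOCKER_TEST_IMAGE: "multibuild/xenial_{PLAT}"

# temporary experiment -- expected to fail fast precisely because no such image exists
DOCKER_TEST_IMAGE: "<newer-base>_{PLAT}"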

@tylerjereddy
Collaborator Author

tylerjereddy commented Jun 11, 2022

Ok, using an invalid DOCKER_TEST_IMAGE did indeed cause a proper failure for the 32-bit Linux jobs, rather than running forever. I've switched DOCKER_TEST_IMAGE back to the only option we have for 32-bit Linux, but also updated the multibuild submodule to the latest hash to see if that might help.

It feels like a bit of a stretch, but worth a try.

@tylerjereddy
Collaborator Author

Looks like the "hang" happens fairly early in the 32-bit Linux testsuite:

[screenshot of the test log at the point of the hang]

One option, then, is a higher-verbosity test run to see which part of the suite gets reached, as a hint at where the freeze happens.

Before I do that, though, let me temporarily point this branch at the tip of maintenance/1.9.x to see if that avoids the hang--if it does, I would gladly avoid dealing with it on main. If it still hangs after that, I'll need to narrow things down a bit more.
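
If it comes to that, one way to get a more verbose run -- assuming the suite is driven through scipy.test(), which is an assumption about this repo's test script rather than something shown in this PR -- would be something like:

# hypothetical higher-verbosity invocation; exits non-zero if the suite fails
python -c "import sys, scipy; sys.exit(0 if scipy.test(verbose=2) else 1)"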

@tylerjereddy
Collaborator Author

Pointing to the tip of the maintenance branch allowed the 32-bit Linux job with Python 3.9 to fail the testsuite in the normal way, but 3.8 is running indefinitely even on that branch.

@tylerjereddy
Collaborator Author

Watching the problematic 32-bit Linux jobs more carefully, in real time, I see 20+ minutes stuck here for both Python 3.8/3.9:

3.8:

+ python -m pip install /io/wheelhouse/SciPy-1.10.0.dev0+0.da7a602-cp38-cp38-manylinux_2_12_i686.manylinux2010_i686.whl
Processing ./wheelhouse/SciPy-1.10.0.dev0+0.da7a602-cp38-cp38-manylinux_2_12_i686.manylinux2010_i686.whl
Collecting numpy>=1.19.5
  Downloading numpy-1.22.4.zip (11.5 MB)
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Building wheels for collected packages: numpy
  Building wheel for numpy (PEP 517): started

3.9:

+ python -m pip install /io/wheelhouse/SciPy-1.10.0.dev0+0.da7a602-cp39-cp39-manylinux_2_12_i686.manylinux2010_i686.whl
Processing ./wheelhouse/SciPy-1.10.0.dev0+0.da7a602-cp39-cp39-manylinux_2_12_i686.manylinux2010_i686.whl
Collecting numpy>=1.19.5
  Downloading numpy-1.22.4.zip (11.5 MB)
  Installing build dependencies: started

@tylerjereddy
Collaborator Author

tylerjereddy commented Jun 12, 2022

Ok, both jobs have been stuck there for 40 minutes now; better touch base with the NumPy team about NumPy getting built from source and taking forever/hanging, versus the binary available in the 1.21.x series. I'll touch base with the SciPy mailing list as well, in case we want to follow NumPy in dropping 32-bit Linux support.

@charris
Contributor

charris commented Jun 12, 2022

Have you tried with manylinux2014?

* try forcing binary NumPy install when installing
the pre-built SciPy wheel, to avoid the hang when
building NumPy `1.22.x` from source on 32-bit Linux

* this was suggested by Matti on the mailing list
@tylerjereddy
Collaborator Author

Haven't tried that; let me first see if Matti's suggestion of `install_wheel --only-binary :numpy:` lets me squeak by.

* local testing suggests `--prefer-binary` will work
for forcing an older/binary NumPy for 32-bit builds
when `pip` installing the pre-built SciPy wheel, so try
that

* revert some higher-verbosity testing that was used for
debugging
@tylerjereddy
Collaborator Author

That didn't work, but a generic --prefer-binary did, so I'll proceed with cleaning this PR up a bit now.
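
In pip terms, the effect of the change is roughly the line below; install_wheel is multibuild's wrapper, so the exact plumbing of the extra flag is glossed over here:

# prefer a binary NumPy (the 1.21.x series still ships 32-bit Linux wheels)
# over building 1.22.4 from source when installing the freshly built SciPy wheel
python -m pip install --prefer-binary /io/wheelhouse/SciPy-*.whl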

@tylerjereddy
Collaborator Author

I think this is in pretty good shape now @rgommers. Here's a summary of what has changed:

  • master -> main in a bunch of places for SciPy and NumPy (distutils); the latter shim may not even be needed anymore, but things seem "ok", so it stays for now
  • appveyor.yml gets a bigger set of changes:
    • it needed some newer C++ things, so the base image is bumped to a newer Visual Studio version (already done in the main repo quite some time ago, albeit on Azure instead)
    • bump build-time version of NumPy to 1.18.5 (not using pyproject.toml for Windows builds here yet); not going as high as main yet to facilitate the branching here
    • DLL/mingw toolchain adjustments for VS 2019
  • the install_wheel command has been adjusted to `install_wheel --prefer-binary` so that the 32-bit Linux Azure jobs don't hang/run for 40+ minutes trying to build NumPy 1.22.4 from source; other command suggestions failed for whatever reason, maybe because installing directly from a local wheel file is slightly different
  • bump multibuild submodule to latest upstream version; no real benefit to this--it was one of many attempts to deal with the 32-bit Linux wheel issues, but I'm not sure it is worth reverting either since we usually bump liberally there anyway

What is still failing in CI?

  • 32-bit Linux wheels have 3 failures, 2 of which are handled by "TST, MAINT: adjust 32-bit xfails for HiGHS" (scipy/scipy#16390); the other is new as of today, so I'm less worried about it for branching purposes
  • that's it, everything else is passing, which is a huge change from where we were before this PR (and with the help of several recent PRs on main in the main repo)

@rgommers (Contributor) left a comment


Thanks Tyler! Everything in this PR LGTM, and I agree that the test failures on the two 32-bit Linux jobs are not a blocker. In it goes
