openmpi v5.0.0 #128

Merged
merged 5 commits from 5.0.0_h822a68 into conda-forge:main on Dec 21, 2023

Conversation

regro-cf-autotick-bot (Contributor)

It is very likely that the current package version for this feedstock is out of date.

Checklist before merging this PR:

  • Dependencies have been updated if changed: see upstream
  • Tests have passed
  • Updated license if changed and license_file is packaged

Information about this PR:

  1. Feel free to push to the bot's branch to update this PR if needed.
  2. The bot will almost always only open one PR per version.
  3. The bot will stop issuing PRs if more than 3 version bump PRs generated by the bot are open. If you don't want to package a particular version please close the PR.
  4. If you want these PRs to be merged automatically, make an issue with @conda-forge-admin, please add bot automerge in the title and merge the resulting PR. This command will add our bot automerge feature to your feedstock.
  5. If this PR was opened in error or needs to be updated, please add the bot-rerun label to this PR. The bot will close this PR and schedule another one. If you do not have permissions to add this label, you can use the phrase @conda-forge-admin, please rerun bot in a PR comment to have the conda-forge-admin add it for you.

Pending Dependency Version Updates

Here is a list of all the pending dependency version updates for this repo. Please double check all dependencies before merging.

Name      Upstream Version   Current Version
openmpi   5.0.0              (see Anaconda-Server badge)

Dependency Analysis

Please note that this analysis is highly experimental. The aim here is to make maintenance easier by inspecting the package's dependencies. Importantly, this analysis does not support optional dependencies, so please double-check those before making changes. If you never want hinting of this kind, add bot: inspection: false to your conda-forge.yml. If you encounter issues with this feature, please ping the bot team conda-forge/bot.
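A sketch of that opt-out, run from the feedstock root (the nested YAML layout is an assumption about how the bot: inspection: false key quoted above is spelled in conda-forge.yml; rerender afterwards):

# append the hinting opt-out to the feedstock's conda-forge.yml
cat >> conda-forge.yml <<'EOF'
bot:
  inspection: false
EOF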

Analysis by source code inspection shows a discrepancy between the detected dependencies and the package's stated requirements in the meta.yaml.

Packages found by source code inspection but not in the meta.yaml:

  • cython

This PR was created by the regro-cf-autotick-bot. The regro-cf-autotick-bot is a service to automatically track the dependency graph, migrate packages, and propose package version updates for conda-forge. Feel free to drop us a line if there are any issues! This PR was generated by https://github.com/regro/cf-scripts/actions/runs/6655727034; please use this URL for debugging.

@conda-forge-webservices (Contributor)

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in excellent condition.

@leofang (Member) commented Oct 26, 2023

This week is full of MPI 😂 @dalcinl Since you've been testing upstream Open MPI, anything we should know before merging?

cc: @jakirkham @pentschev for vis

@dalcinl (Contributor) commented Oct 26, 2023

anything we should know before merging?

Ahhahahaha... did you truly expect it was going to be that easy?

@dalcinl (Contributor) commented Oct 26, 2023

@leofang I can take care of this update. However, I believe we should have a branch v4.1.x to track updates to the v4.1 series. I'm not really sure how that process works, but if you know, could you please take care of it or ask someone for pointers?

@leofang (Member) commented Oct 26, 2023

Yes, it's easy: just create a new branch (say, v4.x) based on main. You can use the web UI to do it.

What would happen is that the CI would run on that branch too, but at package upload time the upload would be blocked, because nothing has changed in the recipe and the package hashes collide with existing ones, so we're safe.
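For reference, a minimal sketch of the same step from a clone instead of the web UI (the feedstock URL and the branch name v4.x are taken from this thread; push rights on the feedstock are required):

# create a long-lived branch for the 4.1 series off the current main and publish it
git clone https://github.com/conda-forge/openmpi-feedstock.git
cd openmpi-feedstock
git checkout -b v4.x origin/main
git push -u origin v4.x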

@dalcinl (Contributor) commented Oct 26, 2023

@leofang This feedstock has become a dumpster. The build scripts have hundreds of hacks accumulated over the years. Maybe this is a good opportunity for a cleanup, but it could be a long rodeo.

@leofang (Member) commented Oct 26, 2023

I think a big part of it was to enable cross-compiling for osx-arm (#69 (comment)), but TBH I feel those hacks should be removed if they can be auto-generated. IIRC @erykoff suggested these changes back then, for both openmpi and mpich. Not sure if @erykoff can share with us how those values were obtained.

@dalcinl (Contributor) commented Oct 26, 2023

I just did a local build from the release tarball. I'm getting the following:

$ readelf -d /home/devel/mpi/openmpi/5.0.0/bin/ompi_info

Dynamic section at offset 0x6d08 contains 40 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libmpi.so.40]
 0x0000000000000001 (NEEDED)             Shared library: [libpsm2.so.2]
 0x0000000000000001 (NEEDED)             Shared library: [libopen-pal.so.80]
 0x0000000000000001 (NEEDED)             Shared library: [libfabric.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libucp.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libucs.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libucm.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libuct.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libnl-route-3.so.200]
 0x0000000000000001 (NEEDED)             Shared library: [libnl-3.so.200]
 0x0000000000000001 (NEEDED)             Shared library: [libpmix.so.2]
 0x0000000000000001 (NEEDED)             Shared library: [libm.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libevent_core-2.1.so.7]
 0x0000000000000001 (NEEDED)             Shared library: [libevent_pthreads-2.1.so.7]
 0x0000000000000001 (NEEDED)             Shared library: [libhwloc.so.15]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000001d (RUNPATH)            Library runpath: [/home/devel/mpi/openmpi/5.0.0/lib]
...

@jsquyres This looks awfully wrong. Why would ofi and ucx libraries get injected into ompi_info? Was this change done on purpose? Am I missing some configure flag? I did not notice the issue before because in my local debug builds I disable everything and only enable internally-available components like hwloc/pmix/etc. (after all, I'm an mpi4py developer, not an Open MPI developer). This definitely looks like a regression with respect to 4.1.6.

@dalcinl (Contributor) commented Oct 26, 2023

Perhaps users are now expected to pass LDFLAGS=-Wl,--as-needed explicitly?
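For concreteness, a sketch of what that would look like at configure time (the prefix matches the local build above; -j8 is arbitrary, and --as-needed is the only flag under test here):

$ ./configure --prefix=/home/devel/mpi/openmpi/5.0.0 LDFLAGS="-Wl,--as-needed"
$ make -j8 install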

@dalcinl marked this pull request as draft on October 27, 2023 09:22
@dalcinl force-pushed the 5.0.0_h822a68 branch 2 times, most recently from 57b86e6 to 9b4ea55, on October 27, 2023 10:39
@jsquyres

@dalcinl This is probably expected. We made a change from v4.x to v5.x in that all plugins are now -- by default -- slurped into their respective libraries (vs. being built as DSOs). This has turned out to be fairly important for scalability (vs. hammering on a shared / network filesystem by calling dlopen() a bazillion times on each of a large number of nodes).

For example, both of these were built with Libfabric support:

# For Open MPI v4.1.6
root@acf210f5a701:/outside/openmpi-5.0.0# ldd /tmp/bogus-4.1/lib/libmpi.so
	linux-vdso.so.1 (0x0000ffffae54a000)
	libopen-rte.so.40 => /tmp/bogus-4.1/lib/libopen-rte.so.40 (0x0000ffffae2f0000)
	libopen-pal.so.40 => /tmp/bogus-4.1/lib/libopen-pal.so.40 (0x0000ffffae1d0000)
	libm.so.6 => /lib/aarch64-linux-gnu/libm.so.6 (0x0000ffffae130000)
	libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000ffffadf80000)
	/lib/ld-linux-aarch64.so.1 (0x0000ffffae511000)

# For Open MPI v5.0.0
root@acf210f5a701:/outside/openmpi-5.0.0# ldd /tmp/bogus-5.0/lib/libmpi.so
	linux-vdso.so.1 (0x0000ffff9edea000)
	libopen-pal.so.80 => /tmp/bogus-5.0/lib/libopen-pal.so.80 (0x0000ffff9e940000)
	libfabric.so.1 => /tmp/bogus/lib/libfabric.so.1 (0x0000ffff9e820000)
	libpmix.so.2 => /tmp/bogus-5.0/lib/libpmix.so.2 (0x0000ffff9e610000)
	libevent_core-2.1.so.7 => /tmp/bogus-5.0/lib/libevent_core-2.1.so.7 (0x0000ffff9e5c0000)
	libevent_pthreads-2.1.so.7 => /tmp/bogus-5.0/lib/libevent_pthreads-2.1.so.7 (0x0000ffff9e5a0000)
	libhwloc.so.15 => /tmp/bogus-5.0/lib/libhwloc.so.15 (0x0000ffff9e530000)
	libm.so.6 => /lib/aarch64-linux-gnu/libm.so.6 (0x0000ffff9e490000)
	libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000ffff9e2e0000)
	/lib/ld-linux-aarch64.so.1 (0x0000ffff9edb1000)
	libatomic.so.1 => /lib/aarch64-linux-gnu/libatomic.so.1 (0x0000ffff9e2c0000)

That being said, I could have sworn that we documented this change somewhere, specifically because it could cause some surprise to some people (e.g., packagers). All I can find easily is this, https://docs.open-mpi.org/en/v5.0.x/installing-open-mpi/packagers.html#components-plugins-dso-or-no, where we mention that building components as DSOs is not the default -- but it doesn't specifically mention that this is a change compared to v4.x. ☹️

...ah, if you go to https://docs.open-mpi.org/en/v5.0.x/installing-open-mpi/configure-cli-options/installation.html#installation-options and scroll all the way down to the --enable-mca-dso description, there's a "Note" in there that says that this default is new in v5.0.0. Hmph. Feels like we should make this a bit more prominent somehow.

Regardless, all of this means that if you'd like to restore the v4.x behavior of building DSOs, and therefore not have many external dependencies in libmpi (and friends), you can pass --enable-mca-dso (or --enable-mca-dso=yes) to build them all as DSOs (I note that the docs don't directly state that this option with no argument builds everything as a DSO -- I'll get that fixed). For example:

root@acf210f5a701:/outside/openmpi-5.0.0# ldd /tmp/bogus-5.0/lib/libmpi.so
	linux-vdso.so.1 (0x0000ffff9bcb8000)
	libopen-pal.so.80 => /tmp/bogus-5.0/lib/libopen-pal.so.80 (0x0000ffff9ba00000)
	libpmix.so.2 => /tmp/bogus-5.0/lib/libpmix.so.2 (0x0000ffff9b870000)
	libevent_core-2.1.so.7 => /tmp/bogus-5.0/lib/libevent_core-2.1.so.7 (0x0000ffff9b820000)
	libevent_pthreads-2.1.so.7 => /tmp/bogus-5.0/lib/libevent_pthreads-2.1.so.7 (0x0000ffff9b800000)
	libhwloc.so.15 => /tmp/bogus-5.0/lib/libhwloc.so.15 (0x0000ffff9b790000)
	libm.so.6 => /lib/aarch64-linux-gnu/libm.so.6 (0x0000ffff9b6f0000)
	libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000ffff9b540000)
	/lib/ld-linux-aarch64.so.1 (0x0000ffff9bc7f000)

Note that libfabric is not in that list.
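For reference, a minimal sketch of the configure invocation behind that last ldd output (only --enable-mca-dso is confirmed by this discussion; the prefix and -j value are placeholders):

$ ./configure --prefix=/tmp/bogus-5.0 --enable-mca-dso
$ make -j8 install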

Does that make sense?

@dalcinl (Contributor) commented Oct 27, 2023

This is probably expected.
...
Does that make sense?

OK. Honestly, I did not read the documentation for v5.0.x in full, so I was not aware of these changes. The option --enable-mca-dso working properly should be enough for packagers that want to use the former behavior.

HOWEVER, I still think things are a bit off.

Look here:

$ readelf -d /home/devel/mpi/openmpi/5.0.0/bin/ompi_info 

Dynamic section at offset 0x6d08 contains 40 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libmpi.so.40]
 0x0000000000000001 (NEEDED)             Shared library: [libpsm2.so.2]
 0x0000000000000001 (NEEDED)             Shared library: [libopen-pal.so.80]
 0x0000000000000001 (NEEDED)             Shared library: [libfabric.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libucp.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libucs.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libucm.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libuct.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libnl-route-3.so.200]
 0x0000000000000001 (NEEDED)             Shared library: [libnl-3.so.200]
 0x0000000000000001 (NEEDED)             Shared library: [libpmix.so.2]
 0x0000000000000001 (NEEDED)             Shared library: [libm.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libevent_core-2.1.so.7]
 0x0000000000000001 (NEEDED)             Shared library: [libevent_pthreads-2.1.so.7]
 0x0000000000000001 (NEEDED)             Shared library: [libhwloc.so.15]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000001d (RUNPATH)            Library runpath: [/home/devel/mpi/openmpi/5.0.0/lib]

Are the explicit dependencies (DT_NEEDED) on UCX, libevent, PSM2, etc. really needed? Or would it be enough for ompi_info to depend on libmpi.so and libopen-pal.so (and maybe a few others)? Please note that I'm not using ldd here, but readelf.

Anyway, maybe my complaint is ultimately inconsequential, as libmpi.so will pull in (albeit indirectly) a dependency on UCX, PSM2, etc. But perhaps you could consider some linking hygiene anyway. Maybe an easy fix would be to try adding LDFLAGS=-Wl,--as-needed.
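To make the readelf/ldd distinction concrete (same local install as above):

# direct dependencies recorded in the binary itself (the DT_NEEDED entries)
$ readelf -d /home/devel/mpi/openmpi/5.0.0/bin/ompi_info | grep NEEDED
# full transitive closure resolved by the dynamic linker at load time
$ ldd /home/devel/mpi/openmpi/5.0.0/bin/ompi_info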

@dalcinl (Contributor) commented Oct 28, 2023

@rhc54 This is what I got from --prtemca plm_base_verbose 10
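For context, the log below came from an invocation along these lines (a sketch; hostname stands in for the actual program, which isn't shown here):

$ mpiexec --prtemca plm_base_verbose 10 -n 2 hostname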

[77135c70da85:179920] mca: base: component_find: searching NULL for plm components
[77135c70da85:179920] mca: base: find_dyn_components: checking NULL for plm components
[77135c70da85:179920] pmix:mca: base: components_register: registering framework plm components
[77135c70da85:179920] pmix:mca: base: components_register: found loaded component slurm
[77135c70da85:179920] pmix:mca: base: components_register: component slurm register function successful
[77135c70da85:179920] pmix:mca: base: components_register: found loaded component ssh
[77135c70da85:179920] pmix:mca: base: components_register: component ssh register function successful
[77135c70da85:179920] mca: base: components_open: opening plm components
[77135c70da85:179920] mca: base: components_open: found loaded component slurm
[77135c70da85:179920] mca: base: components_open: component slurm open function successful
[77135c70da85:179920] mca: base: components_open: found loaded component ssh
[77135c70da85:179920] mca: base: components_open: component ssh open function successful
[77135c70da85:179920] mca:base:select: Auto-selecting plm components
[77135c70da85:179920] mca:base:select:(  plm) Querying component [slurm]
[77135c70da85:179920] mca:base:select:(  plm) Querying component [ssh]
[77135c70da85:179920] [[INVALID],0] plm:ssh_lookup on agent ssh : rsh path NULL
[77135c70da85:179920] [[INVALID],0] plm:ssh: unable to be used: cannot find path for launching agent "ssh : rsh"
[77135c70da85:179920] mca:base:select:(  plm) No component selected!

As the build is happening within a container, I guess this is just a missing ssh command.
Is there any equivalent of the old plm=isolated component I've been using with v4.x?

@dalcinl (Contributor) commented Oct 28, 2023

@rhc54 After playing a bit on my machine, it looks like the ssh command is not actually used for anything if all processes are launched on localhost. However, the ssh agent has to be set to something that can be found in the filesystem, otherwise component selection/setup fails. Maybe this is an oversight in PRRTE?

$ mpiexec --prtemca plm_base_verbose 10 -prtemca prte_ssh_agent /usr/bin/true -n 2 hostname
[kw61149:2840787] mca: base: component_find: searching NULL for plm components
[kw61149:2840787] mca: base: find_dyn_components: checking NULL for plm components
[kw61149:2840787] pmix:mca: base: components_register: registering framework plm components
[kw61149:2840787] pmix:mca: base: components_register: found loaded component slurm
[kw61149:2840787] pmix:mca: base: components_register: component slurm register function successful
[kw61149:2840787] pmix:mca: base: components_register: found loaded component ssh
[kw61149:2840787] pmix:mca: base: components_register: component ssh register function successful
[kw61149:2840787] mca: base: components_open: opening plm components
[kw61149:2840787] mca: base: components_open: found loaded component slurm
[kw61149:2840787] mca: base: components_open: component slurm open function successful
[kw61149:2840787] mca: base: components_open: found loaded component ssh
[kw61149:2840787] mca: base: components_open: component ssh open function successful
[kw61149:2840787] mca:base:select: Auto-selecting plm components
[kw61149:2840787] mca:base:select:(  plm) Querying component [slurm]
[kw61149:2840787] mca:base:select:(  plm) Querying component [ssh]
[kw61149:2840787] [[INVALID],0] plm:ssh_lookup on agent /usr/bin/true path NULL
[kw61149:2840787] mca:base:select:(  plm) Query of component [ssh] set priority to 10
[kw61149:2840787] mca:base:select:(  plm) Selected component [ssh]
[kw61149:2840787] mca: base: close: component slurm closed
[kw61149:2840787] mca: base: close: unloading component slurm
[kw61149:2840787] [prterun-kw61149-2840787@0,0] plm:ssh_setup on agent /usr/bin/true path NULL
[kw61149:2840787] [prterun-kw61149-2840787@0,0] plm:base:receive start comm
[kw61149:2840787] [prterun-kw61149-2840787@0,0] plm:base:setup_vm
[kw61149:2840787] [prterun-kw61149-2840787@0,0] plm:base:setup_vm creating map
[kw61149:2840787] [prterun-kw61149-2840787@0,0] setup:vm: working unmanaged allocation
[kw61149:2840787] [prterun-kw61149-2840787@0,0] using default hostfile /home/devel/mpi/openmpi/5.0.0/etc/prte-default-hostfile
[kw61149:2840787] [prterun-kw61149-2840787@0,0] plm:base:setup_vm only HNP in allocation
[kw61149:2840787] [prterun-kw61149-2840787@0,0] plm:base:setting slots for node kw61149 by core
[kw61149:2840787] [prterun-kw61149-2840787@0,0] plm:base:receive processing msg
[kw61149:2840787] [prterun-kw61149-2840787@0,0] plm:base:receive job launch command from [prterun-kw61149-2840787@0,0]
[kw61149:2840787] [prterun-kw61149-2840787@0,0] plm:base:receive adding hosts
[kw61149:2840787] [prterun-kw61149-2840787@0,0] plm:base:receive calling spawn
[kw61149:2840787] [prterun-kw61149-2840787@0,0] plm:base:receive done processing commands
[kw61149:2840787] [prterun-kw61149-2840787@0,0] plm:base:setup_job
[kw61149:2840787] [prterun-kw61149-2840787@0,0] plm:base:setup_vm
[kw61149:2840787] [prterun-kw61149-2840787@0,0] plm:base:setup_vm no new daemons required
[kw61149:2840787] [prterun-kw61149-2840787@0,0] complete_setup on job prterun-kw61149-2840787@1
[kw61149:2840787] [prterun-kw61149-2840787@0,0] plm:base:launch_apps for job prterun-kw61149-2840787@1
[kw61149:2840787] [prterun-kw61149-2840787@0,0] plm:base:send launch msg for job prterun-kw61149-2840787@1
[kw61149:2840787] [prterun-kw61149-2840787@0,0] plm:base:launch wiring up iof for job prterun-kw61149-2840787@1
kw61149
kw61149
[kw61149:2840787] [prterun-kw61149-2840787@0,0] plm:base:prted_cmd sending prted_exit commands
[kw61149:2840787] [prterun-kw61149-2840787@0,0] plm:base:receive stop comm
[kw61149:2840787] mca: base: close: component ssh closed
[kw61149:2840787] mca: base: close: unloading component ssh

@rhc54 commented Oct 28, 2023

Not really - PRRTE is designed to be a DVM. Pretty poor DVM if there is no way to launch across nodes 😄

You also have to admit that it's a pretty sad cluster that doesn't include even rsh or ssh in it, or some other launch subsystem. So this feels like a rather niche test system.

My concern would be that allowing the DVM to not have any way of launching across nodes is inviting trouble for the "real world" situation where there is a true configuration issue. I've seen it before: someone complains about having spent many hours trying to figure out why the cluster won't start, only to discover that they cannot launch across nodes (for whatever reason). Perhaps some user-required directive (like what we do for running as root) might work - would have to ponder it.

Meantime, just add the ssh package to your container.

@dalcinl (Contributor) commented Oct 28, 2023

You also have to admit that it's a pretty sad cluster that doesn't include even rsh or ssh in it,
...
is inviting trouble for the "real world" situation when there is a true configuration issue.

Definitely. I just asked to be sure that's the intended behavior. However, maybe the failure could be delayed to the point where the ssh/rsh command is actually needed? Anyway, this is a minor nit.

Meantime, just add the ssh package to your container.

Even better, I found a very classy workaround:

export OMPI_MCA_plm_ssh_agent=false

That way, if the ssh command is never actually used, all is good and things run on localhost. But if it is used, then I would expect the failure from false not to go unnoticed and to propagate up to the caller. UNLESS this behavior is actually unintended...
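In other words, something along these lines (hostname stands in for the real program):

# "false" resolves to /usr/bin/false, which exists on the filesystem, so the ssh
# plm component can be selected; the agent is never actually invoked for
# purely-local launches
$ export OMPI_MCA_plm_ssh_agent=false
$ mpiexec -n 2 hostname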

PS: One final comment: I find the ssh_agent name confusing. I actually thought that option was somehow related to the ssh-agent authentication utility.

@jsquyres

@dalcinl Well, shoot. Let me know what else you need to do to see if our upstream fix was not correct (or not complete).

@dalcinl (Contributor) commented Nov 16, 2023

Let me know what else you need to do to see if our upstream fix was not correct (or not complete).

@jsquyres I left a comment in the PR. My guess is that all the MCA plugins that use CUDA may need a similar fix, not just the accelerator plugin. But you guys know your stuff better. For the time being, I can get this conda-forge update going via global CFLAGS/LDFLAGS.

@jsquyres

@dalcinl I think we'll have this fixed for v5.0.1; sorry for the hassle. I think your workaround for the recipe here is fine.

@dalcinl (Contributor) commented Nov 22, 2023

@leofang @jakirkham I'm done with this. Is there anything left?

@dalcinl (Contributor) commented Nov 28, 2023

@conda-forge-admin rerender

1 similar comment
@dalcinl (Contributor) commented Dec 7, 2023

@conda-forge-admin rerender

github-actions bot (Contributor) commented Dec 7, 2023

Hi! This is the friendly automated conda-forge-webservice.

I tried to rerender for you, but it looks like there was nothing to do.

This message was generated by GitHub actions workflow run https://github.com/conda-forge/openmpi-feedstock/actions/runs/7129333418.

@leofang self-assigned this on Dec 14, 2023
@dalcinl (Contributor) commented Dec 19, 2023

@leofang The ppc64le builds have been timing out consistently. How should we move forward?

@leofang (Member) commented Dec 19, 2023

If this keeps happening, we'll have no choice but to disable ppc...

@leofang (Member) left a comment

LGTM. Tested locally (the full sequence is consolidated as a script after this list):

  1. Download the linux-64 artifact and unpack the zip.
  2. Create & activate a new env with Open MPI 5 installed:
     conda create -n my_env -c path/to/build_artifacts -c conda-forge python=3.10 openmpi=5 cupy gcc
     conda activate my_env
  3. Clone the mpi4py source and do pip install -v .
  4. Test CUDA awareness through CuPy:
    • mpirun -n 2 --mca pml ^ucx --mca osc ^ucx --mca opal_cuda_support 1 python -m mpi4py.bench pingpong -a cupy works
    • mpirun -n 2 --mca pml ucx --mca osc ucx python -m mpi4py.bench pingpong -a cupy works
    • mpirun -n 2 --mca opal_cuda_support 1 python -m mpi4py.bench pingpong -a cupy works
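The same steps as one runnable block (a sketch: the artifact channel path and the mpi4py repository URL are assumptions, and a CUDA-capable machine is required):

# create the test environment from the CI artifact plus conda-forge
conda create -n my_env -c path/to/build_artifacts -c conda-forge python=3.10 openmpi=5 cupy gcc
conda activate my_env
# build mpi4py from source against the freshly installed Open MPI 5
git clone https://github.com/mpi4py/mpi4py.git
cd mpi4py && pip install -v .
# CUDA-awareness checks through CuPy, with and without UCX
mpirun -n 2 --mca pml ^ucx --mca osc ^ucx --mca opal_cuda_support 1 python -m mpi4py.bench pingpong -a cupy
mpirun -n 2 --mca pml ucx --mca osc ucx python -m mpi4py.bench pingpong -a cupy
mpirun -n 2 --mca opal_cuda_support 1 python -m mpi4py.bench pingpong -a cupy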

@leofang (Member) commented Dec 19, 2023

@dalcinl Looks like ppc is behaving today, merge?

@regro-cf-autotick-bot mentioned this pull request on Dec 21, 2023
@jakirkham (Member)

Just glancing at the conda-forge.yml, it looks like the linux_ppc64le builds are emulated. Looking at the build time for one of the linux_aarch64 jobs makes me think those are also emulated:

build_platform:
  linux_aarch64: linux_64
  osx_arm64: osx_64

provider:
  linux_aarch64: default
  linux_ppc64le: azure

Probably the long-term sustainable answer is cross-compiling. Maybe we can look into that in the new year?

@dalcinl (Contributor) commented Dec 21, 2023

@dalcinl Looks like ppc is behaving today, merge?

@leofang I'll remove artifact upload, and then merge. If ppc64 fails, I'll keep restarting the build until I get lucky.

Probably the long-term sustainable answer is cross-compiling. Maybe we can look into that in the new year?

@jakirkham Definitely. It would be a pity not to test the aarch64/ppc64 builds, but perhaps we can set up some post-mortem testing after the package is uploaded, just to make sure the thing actually works.

@jsquyres

@dalcinl @leofang FYI v5.0.1 is imminent.

@leofang (Member) commented Dec 21, 2023

Thanks for the heads-up, Jeff! Our bot also informed us about the update (#131), but we need to get this PR in first, due to the infra update to support v5.x.

@dalcinl (Contributor) commented Dec 21, 2023

@dalcinl @leofang FYI v5.0.1 is imminent.

@jsquyres Thanks for the pointer. The conda-forge bot already submitted a PR to update to 5.0.1. I'll try to merge this one anyway. In the past we've been a bit sloppy and skipped some patch-level updates, but it is useful to have a build for every release.

@leofang (Member) commented Dec 21, 2023

I see the ppc CI almost completed before timing out. Let's merge and retry on main. Thanks a lot for the help, @dalcinl @jsquyres @rhc54 and all!

@leofang merged commit cfad5e8 into conda-forge:main on Dec 21, 2023
8 of 10 checks passed
@regro-cf-autotick-bot deleted the 5.0.0_h822a68 branch on December 21, 2023 20:31