Why is Big-endian powerpc no longer supported ? #4349

Closed
amckinstry opened this issue Oct 17, 2017 · 30 comments

@amckinstry

Sorry about the "non-bug" nature of this issue, but I'm working on OpenMPI 3.0.0 for Debian, and see BE powerpc is no longer supported.
Given that other powerpc and other BE systems are supported, I'm curious about this, and wondering what's going on.

@gpaulsen
Member

Thanks for asking your question. I'm sorry that dropping support for Big-endian powerpc has affected you. The community makes every effort to support as wide a selection of platforms as possible with our limited volunteer resources.

During the supported-platforms discussion for Open MPI v3.0.0, we asked who might be willing to work on and support Open MPI on Big-Endian PowerPC platforms. IBM committed to support the ppc64le (Little Endian) platform on Linux, however all IBM roadmaps only have Little Endian products. Due to perceived lack of interest, IBM chose not to support Big Endian Platforms.

Another aspect to the removal of support is due to our new improved continuous integration testing automation. Whenever anyone creates a pull request to either master branch or a release branch, the automation automatically begins a build and test of that pull request on all supported platforms (existing across various different member organizations). This automation has drastically improved quality to the point where we now define our support statement for Open MPI such that we only support platforms that we have added to this infrastructure and test regularly.

The Open MPI community is interested in supporting as many platforms as practical, and would be interested in working with anyone who is interested in supporting a platform, provided we can obtain a testing platform, and a contact person or organization.

If you are still interested, please reply, and we can continue discussing.

@amckinstry
Author

Thanks for this reply.

My concern is that, from a Debian Linux perspective, we support and test multiple architectures that are not part of the CI infrastructure upstream (see https://buildd.debian.org/status/package.php?p=openmpi); we then run our own tests on these.

PPC on Big-endian systems is arguably one of the better supported of the lesser archs. It seems strange that we will support alpha, hppa, sparc64, etc (without the latest fabrics, of course) but not powerpc.

This testing on other archs has two main benefits beyond supporting the tiny number of users doing HPC on minority archs. It shakes out the bugs latent in the codebase, and in doing so, it makes future ports to new archs (arm64, etc) possible as developers know the codebase actually supports the standards rather than having many assumptions ("All the world's a VAX", as they used to say).

Can we downgrade the failure to build on BE PPC from a hard error to a warning - "WARNING: this architecture is not tested or officially supported"?

@gpaulsen
Member

I think the work required is a bit more involved than just changing the error to a warning. I thought I remembered some serious bug reports against PowerPC Big Endian that would need to be resolved.

@glaubitz

glaubitz commented Jan 7, 2018

> IBM committed to support the ppc64le (Little Endian) platform on Linux, however all IBM roadmaps only have Little Endian products. Due to perceived lack of interest, IBM chose not to support Big Endian Platforms.

Of course, it was IBM. They also forcefully removed PowerPC BE support in Golang for no reason (I still have a working fork which I regularly rebase against master) and also almost objected to my work on zfs on linux on PowerPC BE.

> I think the work required is a bit more involved than just changing the error to a warning. I thought I remembered some serious bug reports against PowerPC Big Endian that would need to be resolved.

We in Debian are happy to support big-endian PowerPC targets and if there are bugs to be fixed, there are many people who are happy to help. It would be great if this could be turned into a warning again and if there are issues, we will be seeing them on our infrastructure and provide patches to fix them.

Thanks!

@opoplawski
Contributor

Fedora still builds for big-endian powerpc. I'm not sure what will be more painful - dealing with a buggy open-mpi on BE powerpc or putting in all of the ExcludeArch ppc64 statements in all of the dependent libraries. FWIW - BE powerpc64 is one of the three architectures that Fedora's CI system runs builds for, see https://apps.fedoraproject.org/koschei/package/openmpi. You'll also see there that there are sporadic test failures for ppc64.

@opoplawski
Contributor

Hmm, really surprised to see this change introduced in 2.1.2 - ostensibly a bug-fix release.

@gpaulsen
Member

gpaulsen commented Feb 7, 2018

The Open MPI community has a policy to discontinue support in OpenMPI v3.0.0 for any platform that does not have regular continuous integration testing for new pull requests, and also a clearly identified maintainer who is willing to step up and help debug and fix problems found on that platform. Unfortunately ppc64 Big Endian was one of the architectures that no one in the community stepped up to this level of support.

The Open MPI Community welcomes additional continuous integration platforms, help maintaining those platforms, submitting CI test results to the community jenkins server, and help in debugging issues found on those platforms during continuous integration. If you're interested, please let us know.

@rhc54
Contributor

rhc54 commented Feb 7, 2018

I believe the question wasn't about v3.0.0, but rather why it disappeared in v2.1.2 - which was supposed to be a bug-fix release and should have fallen under the compatibility promise.

@glaubitz

glaubitz commented Feb 7, 2018

> The Open MPI community has a policy to discontinue support in OpenMPI v3.0.0 for any platform that does not have regular continuous integration testing for new pull requests, and also a clearly identified maintainer who is willing to step up and help debug and fix problems found on that platform.

What's wrong with keeping the code if it doesn't hurt the other architectures?

I don't understand a policy like this. There is code that people are using, so why not just keep it?

I would understand this argument if the code would actually hurt other targets. But as long as that doesn't happen, why not just leave it in and accept drive-by patches to fix issues.

These super-strict policies just result in lots of frustration in downstream projects.

Debian and other downstreams are willing to help and provide patches, but please understand that OpenMPI isn't the only upstream project on this planet and if every other project posed such high standards on their downstream projects, we could throw away all ports except x86_64 and arm64.

We are helping wherever we can, contributing patches to almost any important project in the community. Heck, I have even become an upstream committer in OpenJDK and Firefox because I am so busy contributing such patches. But we (I) can't be upstream maintainer and committer in every project on the planet.

Really, please don't do that. It's incredibly frustrating.

@amckinstry
Author

@gpaulsen , for reference you can see the build systems we use here:
https://buildd.debian.org/status/package.php?p=openmpi&suite=experimental
These are the build logs for Debian for various archs; change "experimental" to "sid" to see the 2.1* track. Typically we triage bugs and report non-Debian specific ones (we maintain a handful of Debian-specific patches, eg. for multiarch support) upstream here.

In building packages, we also run any test suites contained within these. Periodically, but in particular as part of transitions (major library changes) we rebuild everything and test everything - eg. for openmpi2 -> openmpi3 testing I've rebuilt all MPI packages in Debian against openmpi3 - minor bugs ensued - eg. python-escript presumed Openmpi version numbers are always dotted integers, which 3.0.1rc1 broke - minor patch sent to them for this, etc.
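
To illustrate the dotted-integer assumption mentioned above, here is a minimal sketch of a version parser that tolerates a pre-release suffix such as `rc1`. The helper `parse_version` and its regular expression are hypothetical; the thread does not show python-escript's actual code, so this is only one way such a fix might look.

```python
import re

# Hypothetical helper, not taken from python-escript or Open MPI.
# Accepts "MAJOR.MINOR[.PATCH][SUFFIX]", e.g. "2.1.2" or "3.0.1rc1".
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)(?:\.(\d+))?([A-Za-z][A-Za-z0-9]*)?$")


def parse_version(text):
    """Return ((major, minor, patch), suffix) for an Open MPI-style version."""
    match = _VERSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    major, minor, patch, suffix = match.groups()
    # A missing patch level is treated as 0; a missing suffix as "".
    return (int(major), int(minor), int(patch or 0)), (suffix or "")


if __name__ == "__main__":
    for candidate in ("2.1.2", "3.0.1rc1"):
        print(candidate, "->", parse_version(candidate))
```

Comparing only the numeric tuple keeps ordering checks such as "at least 3.0" working, while the suffix stays available for callers that care about release candidates.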

So we are typically doing a lot of testing that is not seen upstream (here) unless bugs are found and the change is in openmpi not the client package (or below, on a lib used by openmpi).

It would be useful if there was a reference here to the changeset that dropped BE support, and the bugs that this closed.

@jsquyres
Member

jsquyres commented Feb 8, 2018

Thanks for bringing up all these issues and keeping us honest. Let me tell you what we have been doing over the past 36 hours (reminder that "Power BE" in this conversation effectively means "Power 7"):

  1. We reviewed why we removed Power 7 support from Open MPI 3.0.0:

    • It came down to: we no longer had a maintainer. So we snipped the code out.
    • Removing support for a platform at an x.0.0 release is seen as acceptable timing.
    • The rationale is that IBM EOL'ed Power 7 near the end of 2015. And, realistically, the last sale of a Power 7 machine for HPC was probably quite a long time ago. Such customers are almost certainly using IBM's MPI or an older version of Open MPI -- both of which will continue to work. Because a) there's zero new work going on for Power 7 (and their related compilers), b) we got reports of a few bugs, and c) no one wanted to fix them -- not even IBM -- it made sense to snip off support from the open source community perspective. I'm sorry to say that this means that we disagree with @glaubitz a bit here: maintaining old code that may or may not work does have a material cost in terms of long term code complexity, maintenance, etc. Hence, we do remove support for old platforms that are unlikely to be used in HPC platforms any more because it consumes the finite resources that the core Open MPI developer community has -- particularly when even the vendor of that platform indicates that they are no longer interested in supporting it.
    • To be clear, as of today (early Feb 2018): IBM has stated that they are only interested in supporting Power 8/9 LE on Linux.
    • That being said, we were unaware that anyone (e.g., downstream packagers) would be interested in maintaining support on the Power 7 platform. If someone wants to maintain it, great! Note that this means a little more than just ensuring that Open MPI compiles -- there is some level of multi-node testing that needs to occur, too. We don't have strict policies here -- but if someone can honestly look the Open MPI community in the eye and say, "Yes, I'm compiling and testing on at least a semi-regular basis (particularly during ramp-up to releases) and will commit to fixing bugs when possible," then sweet! Yes, we're an open source community -- we know that many people are volunteers and can't absolutely commit to timelines/etc. But if there's an honest "yes, we'll do our best", cool. We'll be happy to restore the Power 7 BE functionality.
    • But that being said, my $0.02 is that it is ok to let Power 7 go. There is almost certainly no one using Open MPI 3.x.y on a Power 7 system. Maintaining Open MPI 3.x.y Power 7 support would -- IMNSHO -- literally just be so that we can say that Power 7 is supported. I do not think that it would be genuinely useful.
  2. We reviewed why we removed Power 7 support from the v2.0.x and v2.1.x series.

    • Unfortunately, we suck. ☹️ We absolutely cannot remember why we did this. We cannot find emails, github issues/PRs, commit messages, or even weekly/face-to-face meeting notes explaining why we did this. We agree with you: it's highly irregular that we removed Power 7 support in the middle of the v2.0.x and v2.1.x series. We suck. ☹️ ☹️ ☹️
    • Our best guess is that:
      1. We found some bugs (based on #4053, "Add XL compiler version check to configure") in Open MPI v3.0.0 pre-release testing
      2. We therefore decided to remove Power BE support from Open MPI v3.0.0
      3. We then either tested/checked/assumed that the same bugs were also in Open MPI v2.0.x and v2.1.x.
      4. We therefore decided to remove Power BE support from Open MPI v2.0.x and v2.1.x.
    • So we asked a colleague to run some Power 7 BE tests last night with Open MPI v2.0.0 and v2.1.0. Surprisingly, it looks like Open MPI v2.0.0 and v2.1.0 work fine on Power 7 BE with the Red Hat provided gcc-4.8.5, locally-built gcc-7.2.0, and locally-built clang-3.7.1. Aside from one case with clang (which we're pretty sure is a problem with clang itself), everything seemed to work ok: Open MPI compiled/installed ok, and then some simple MPI tests seemed to work ok. This unfortunately further confirms the "shame on us" determination from above.
    • This means we're in a bit of a bind for the v2.0.x and v2.1.x series, and we're open to discussion here. Specifically: do the downstream packagers want Power 7/BE support put back in? I.e., revert #4104 and #4105 (both titled "config: Remove support for big endian PPC, XL compiler older than 13.1").
      • Keep in mind that I still believe what I said above in the context of Open MPI v3.x.y: I think very few people -- if any -- are using Open MPI on Power 7 machines. Meaning: even though we goofed in removing support for it in the middle of the Open MPI v2.0.x and v2.1.x series, it may be a "shame on us" moment, but not a tragedy -- and potentially not worth putting it back.
  3. Finally, I literally just learned that Power 8 and 9 can run in either LE or BE mode. I previously thought that "Power BE" really meant "Power 7" (i.e., old platform yadda yadda yadda), and that Power 8 and 9 were LE-only architectures. This new-to-me information may therefore change the equation: it is possible -- albeit probably pretty unlikely -- that someone out there is running their Power 8 or 9 machine in BE mode, and has a BE Linux running on it.

    • What do the distros support in terms of BE on Power 8/9?

@opoplawski
Contributor

Fedora is definitely moving away from BE ppc64 as well - see https://lists.fedoraproject.org/archives/list/ppc@lists.fedoraproject.org/message/C23EQYITA4DQWM7CQF6LJC5ABXY2XIEM/

Although unfortunately we haven't just dropped it yet.

@glaubitz

glaubitz commented Feb 8, 2018

Well, there are other companies besides IBM making PowerPC hardware. Many users are running Linux ppc64 Big-Endian on FreeScale processors, for example. Does keeping the ppc64 big-endian code hurt the rest of the OpenMPI codebase? Or what exactly is the reason it should get removed?

I understand that IBM wants to deprecate anything older than POWER8 because they have a strong interest in selling new hardware instead of keeping old hardware supported. But I don't think it's fair that IBM alone gets to decide about the PowerPC support status in free software projects.

It might surprise some people of upstream projects, but distributions like Debian and Gentoo support hardware as old as DEC Alpha, Motorola 68000 and HP PA-RISC. And as long as keeping the code for these architectures around doesn't hurt any of the other architectures, I don't see why this should be of any problem.

Other projects like systemd, OpenJDK, LibreOffice and so on don't have problems with supporting old architectures. Why is it so much of a problem for OpenMPI?

@jsquyres
Member

jsquyres commented Feb 8, 2018

@opoplawski Ok, good to know. So just to be clear: are you saying that Fedora doesn't care if we put Power BE support back?

@glaubitz It feels like you didn't read my entire post. Can you re-read, see where I already answered your questions, and then answer the questions that I asked of distros? (by your GitHub profile, it looks like you're a SuSE employee) Thanks!

@glaubitz

glaubitz commented Feb 8, 2018

@jsquyres You posed a single question which is:

"What do the distros support in terms of BE on Power 8/9?"

And the answer is none. No one who owns hardware which is capable of running little-endian code will run anything big-endian on it; there is simply no market for it. Therefore there is no distribution which has a POWER8/9 port in big-endian.

And I am not sure what my employment status with SUSE has to do with this thread. I am talking in my role as a Debian Developer here, not as a SUSE employee.

> I'm sorry to say that this means that we disagree with @glaubitz a bit here: maintaining old code that may or may not work does have a material cost in terms of long term code complexity, maintenance, etc. Hence, we do remove support for old platforms that are unlikely to be used in HPC platforms any more because it consumes the finite resources that the core Open MPI developer community has -- particularly when even the vendor of that platform indicates that they are no longer interested in supporting it.

And many other projects prove that this is possible. It's perfectly fine to mark old platforms as "Tier 2" and mark them as not officially supported but still allow people to use the code. It works well for many projects and the community often steps up to provide patches when things break.

@opoplawski
Contributor

@jsquyres Here's the thing - there is a lot more distribution work required to build packages for only certain architectures. All packages that depend on said package need to have conditionals to work around the fact that a certain dependency is not present in certain locations. I'm not looking forward to doing this for the openmpi stack in Fedora. So giving up openmpi on ppc64 in Fedora, while certainly consistent with the "best effort" level of support, will be hard. That said, there are certainly times when the ppc64 build has failed and required work to fix, so that's work too. At this point I would prefer it if Fedora dropped ppc64, but not sure that's going to happen any time soon. I'm sure some old ppc mac owners would complain, both of them. :)

@jsquyres
Member

jsquyres commented Feb 8, 2018

@glaubitz Sure. I assumed you were asking such questions because of your employer. My mistake. However, if you're asking for any distro (e.g., Debian instead of SuSE), for the purposes of this conversation, it doesn't really matter which one.

I also asked if someone would maintain Open MPI on Power BE platforms. If so, we'll re-enable it.

Right now, we have no one to test+maintain our code base on Power BE platforms, and we're deeply concerned about shipping code for a platform that we have no one watching over at all: we don't know when we break it, we don't have anyone to fix bugs, etc. (it's pretty clear that you and I disagree on this point; there's probably not a lot of point in going back and forth about it). If that changes, and someone helps us out with maintaining Open MPI on Power BE platforms, great.

But let's also keep in mind the truly practical point of: who on earth is running Open MPI on Power BE platforms? I'm all for free software, but I'm not in favor of doing work to support a platform on which Open MPI will realistically never be used. Specifically: I really don't want to re-enable support for a platform that is probably only compile-tested (e.g., in distro automated build farms) but largely -- or even entirely -- run-time untested just for the sake of filling in a check box on a support matrix.

@jsquyres
Member

jsquyres commented Feb 8, 2018

@opoplawski Fair points. While we're talking through all the options here, let me ask this (not saying this is the final solution -- I just want to ask a question here): is it viable to carry a local patch in your package that reverts #4104 / #4105?

@amckinstry
Author

amckinstry commented Feb 9, 2018

@jsquyres The concept of "supported" for a distribution like Debian is fuzzier than for commercial distributions: if you look at the logs (eg. https://buildd.debian.org/status/package.php?p=openmpi&suite=experimental) the grayed out architectures are "unofficial", 2nd tier: they are not candidates for the next official release and bugs for these archs will not block releases (we commit to keep the supported archs in sync). They may be included in the next release if their quality and support by the HPPA maintainer(s) is good enough, but bugfixes for them are lower priority for the package maintainer.

This is the approach I advocate: keep the current set as release archs / officially supported; mark any bugs for PPC BE etc as 'unsupported' and to be fixed as best-effort.

In practice the "supported" is a matter for the package maintainer: the majority of bugs for such archs are latent standards bugs: BE /LE , 32-vs-64 bit bugs, etc. Fixing these is good for the codebase not just the architecture.
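
As an illustration of the BE/LE class of latent bug mentioned above, here is a minimal Python sketch using the standard `struct` module. It is not taken from Open MPI; the helper name `decode_wire_u32` and the sample value are hypothetical. Code that serialises with the host's native byte order produces different bytes on big-endian and little-endian machines, while code that names the byte order explicitly behaves the same everywhere.

```python
import struct
import sys

value = 0x01020304

# "=" means host byte order with standard sizes: the bytes differ between
# a little-endian and a big-endian machine.
native = struct.pack("=I", value)

# ">" and "<" pin the byte order, so the bytes are identical on every host.
big = struct.pack(">I", value)
little = struct.pack("<I", value)


def decode_wire_u32(data):
    """Decode a 4-byte big-endian unsigned integer, regardless of host."""
    return struct.unpack(">I", data)[0]


if __name__ == "__main__":
    print("host byte order:", sys.byteorder)
    print("native:", native.hex())
    print("big:   ", big.hex())
    print("little:", little.hex())
    print("decoded:", hex(decode_wire_u32(big)))
```

A test that only ever runs on little-endian hosts cannot tell `native` from `little`, which is why building and running on a big-endian architecture tends to expose this kind of assumption.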

My intro to Debian came as a systems developer tasked with bringing up a Unix-like userspace on a handheld device with a new mips-based ASIC. The well-worn path at Debian for bringing up new archs meant this was a feasible task for a single engineer in a few months. Today in HPC I see a similar issue when exploring e.g. new cpus on FPGA-based accelerators, getting netcdf to work on the accelerators. I see the work I do in Debian as systems integration and testing, relevant here: I don't seriously expect to see anyone use OpenFOAM on m68k just because it compiles and runs, but keeping the codebase healthy enables new developments.

If you look at the list of Debian archs above you'll see that OpenMPI 3.0 works on HPPA. That's because Debian maintains a trivial patch. Now HPPA is pretty quixotic, and Debian's inclusion of it is mostly humouring some hobbyists who don't use HPC, but as a systems integrator, I see OpenMPI working on HPPA simply as a matter of course: it should work on HPPA because it's well engineered and all the patch does is enable gcc atomics. I would consider it not working on HPPA as a problem in OpenMPI to be investigated, not a HPPA problem. This is why dropping PPC support irks so much: it's a newer, better architecture and should just work.

@opoplawski "dependency contagion" is a real problem but the answer has to be to use it to drive the engineering - eg keep dependencies controlled within as few packages as possible and build abstractions accordingly (eg. layers above the mpi layer should be oblivious to fabrics; packages using netcdf/hdf5 should be oblivious to whether the netcdf layer is serial/mpi or which version of mpi is used, etc.).

In summary, "re-enable support" should, at the OpenMPI level, come down to a compile-time warning of "THIS IS NOT A SUPPORTED ARCHITECTURE" and no checks there, rather than a compile failure, and labelling bugs as 'unsupported/low priority'. OpenMPI shouldn't worry about whether distros/hobbyists/researchers are using their code on unsupported archs; they just shouldn't break it.
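
The warning-instead-of-failure behaviour proposed above could be sketched roughly as follows. This is not Open MPI's configure logic (which lives in configure.ac); the `TESTED_ARCHS` set, the `check_arch` function and its `strict` flag are all hypothetical names chosen for illustration.

```python
import warnings

# Assumption: an illustrative list of tested architectures, not Open MPI's
# real support matrix.
TESTED_ARCHS = {"x86_64", "aarch64", "ppc64le"}


def check_arch(machine, strict=False):
    """Return True if `machine` is tested; otherwise warn (or fail if strict).

    strict=True models the hard configure error discussed in this thread;
    strict=False models the proposed downgrade to a warning.
    """
    if machine in TESTED_ARCHS:
        return True
    message = f"THIS IS NOT A SUPPORTED ARCHITECTURE: {machine}"
    if strict:
        raise RuntimeError(message)
    warnings.warn(message)
    return False


if __name__ == "__main__":
    for arch in ("x86_64", "ppc64"):
        print(arch, "tested:", check_arch(arch))
```

With `strict=False` the build would proceed on an untested architecture and the caller can still record the result, for example to label resulting bug reports as low priority.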

@jsquyres
Member

@amckinstry You make valid points. We'll discuss this in the developer community.

My question still stands, though: do you downstream packagers / distro-representing people on this issue want Power BE re-enabled in the v2.0.x and v2.1.x series?

I think whether we put a "This is not supported!" output and/or whether we re-enable Power BE for v3.0.x and v3.1.x are separate questions.

@amckinstry
Author

For Debian we have 2.1.1 in stable and will not be moving to 2.1.2; I'm planning 3.0* for the next release, so no opinion on re-enabling Power BE for the 2.* series.

@jsquyres
Member

@amckinstry @opoplawski @glaubitz We had a lengthy discussion about this stuff yesterday in our face-to-face Open MPI development meeting. Let me report the results to you...

Short version

  1. We sucked in our communication to you when we decided to disable POWER7/BE platforms (i.e., we didn't communicate with you at all). We need to do better in the future. Please subscribe to the (low volume!) ompi-packagers mailing list so that we can initiate dialogue with you / ask you questions / etc. if (when!) issues like this come up again.
  2. We have finally (re)discovered the reason that we actually disabled POWER7/BE platforms: we thought we had silent data corruption. Later, we figured out that it wasn't silent data corruption, but we didn't put two-and-two together at the time. A fix for the real issue is coming in #4563 ("opal_fifo check hanging on aarch64 / PowerPC Big Endian"), and then we'll remove the POWER7/BE block in configure in all of v2.x, v3.0.x, and v3.1.x.

More detail

Earlier in this issue, I said that we could not remember why we had removed POWER7/BE support in v2.x. After much discussion today, we remembered: we thought we had a silent data corruption issue. In our world, that's about the most serious kind of bug that there is (i.e., you run and simply get wrong answers, but no obvious error occurs).

Also earlier in this issue, I stated that we dropped POWER7/BE in v3.0.x because there was no one to maintain it. That is true, but the (much) more serious issue at the time was the silent data corruption -- that's why we took the extraordinary step of blocking it in configure. I.e., we thought it was seriously broken and no one was going to fix it.

At the time, we did not understand the exact problem. We thought there was a strong possibility of silent data corruption (i.e., run an MPI program and get wrong results). No one could fix it, so we decided just to turn it off. This was deemed better than shipping known-seriously-broken code. Hence, we didn't want the casual user to be able to build/install Open MPI at all on this platform (because they would silently get wrong answers), so we added the configure block to all of v2.x, v3.0.x, and v3.1.x.

In hindsight, we really should have contacted you before doing this. That's what the new ompi-packagers mailing list is for. More on this below. Also, we did not document the decision or rationale anywhere. Shame on us. That would have made a much less frustrating and confusing discussion earlier in this issue (and we would not have conveyed at least partially-incorrect information above). Sorry about that. 😦

Later, we figured out that we had an atomics issue that would lead to deadlocks (#4563). I don't think we put two-and-two together at the time to realize that what we thought was a POWER7/BE silent data corruption was this atomic/deadlocking issue.

Regardless, we now understand the issue much better and @hjelmn has said that he will work on the fix for #4563 this week. We'll get that back ported to v2.x, v3.0.x, and v3.1.x. Then we'll remove the POWER7/BE block in configure in all those branches. This doesn't mean that POWER7/BE is supported -- it just means that it is no longer known to be bad (even though it's not as bad as we thought it was). We'll add something to NEWS about this as well (that the configure block for POWER7/BE was removed, but it doesn't mean it is supported, yadda yadda yadda).

That being said, this issue has basically highlighted the fact that we need to communicate with you, our downstream packagers, better. E.g., perhaps you could have helped us debug / fix this issue. To that end, we have set up a new mailing list: ompi-packagers. We'd like to use this list (and intentionally try to keep it low volume) to communicate with you about such issues in the future.

@amckinstry
Author

OK, thanks, this is a good outcome.

I agree with the distinction 'not supported' vs 'known to be bad'. And yes, the problem was when the cause/nature of the 'silent data corruption' issue accidentally got dropped; if there was a trail that led back to a bugreport, then better decisions could be made (on all parts).

I'm signing up to the ompi-packagers,

thanks
Alastair

jsquyres added a commit to jsquyres/ompi that referenced this issue Apr 6, 2018
We thought there was a silent data corruption issue on POWER 7/BE
systems, so we blocked building on POWER 7/BE systems altogether.  We
later figured out that it was just data hangs -- not silent data
corruption.  So in hindsight, the configure block probably wasn't
necessary -- but we didn't know it at the time.

Regardless, the hangs have now been fixed, and we're removing the
POWER 7/BE block in configure.

For more detail on the entire saga, see
open-mpi#4349 (comment).

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
jsquyres added a commit to jsquyres/ompi that referenced this issue Apr 10, 2018
We thought there was a silent data corruption issue on POWER 7/BE
systems, so we blocked building on POWER 7/BE systems altogether.  We
later figured out that it was just data hangs -- not silent data
corruption.  So in hindsight, the configure block probably wasn't
necessary -- but we didn't know it at the time.

Regardless, the hangs have now been fixed, and we're removing the
POWER 7/BE block in configure.

For more detail on the entire saga, see
open-mpi#4349 (comment).

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 3f0ccff)
@barracuda156

I am late here, but we in Macports support PowerPC on MacOS and actively develop for it (that is, not only maintain the code, but improve and fix new software for PPC).

@barracuda156

> What's wrong with keeping the code if it doesn't hurt the other architectures?
>
> I don't understand a policy like this. There is code that people are using, so why not just keep it?
>
> I would understand this argument if the code would actually hurt other targets. But as long as that doesn't happen, why not just leave it in and accept drive-by patches to fix issues.

@glaubitz OMG, how I share your feelings. I have also been astonished and disappointed at how some upstreams just throw away working code simply because no one happened to comment in time in an obscure GitHub thread…

And then it takes months sometimes to restore it and non-trivial efforts to convince that “yes, it is actually used”.

@ggouaillardet
Contributor

@barracuda156 you cannot reasonably expect to put the burden of supporting recent software on obsolete hardware on anyone but the hobbyists who do that in their spare time.

@glaubitz

glaubitz commented Feb 19, 2023 via email

@ggouaillardet
Contributor

ggouaillardet commented Feb 19, 2023 via email

@barracuda156

> And keep shipping code that has not even been compiled for 5+ years?

Multiple platforms support PowerPC presently and compile code for it: Macports, FreeBSD, some versions of Linux.

@jsquyres
Member

@barracuda156 From my reading of this issue (and the PRs that were cross-linked at the end), the configure.ac block on PPC big endian was removed years ago. Did you try to build the code?
