VERSION: Changing master to v3.2.0 #4401

gpaulsen · 2017-10-25T19:54:25Z

There are currently no known binary incompatible changes
on master that would require a first digit change.

Signed-off-by: Geoffrey Paulsen <gpaulsen@us.ibm.com>

bwbarrett · 2017-10-26T16:16:17Z

I think the intermittent Cray connectivity issue struck again. Bad Jenkins.

bot:ompi:retest

bwbarrett · 2017-10-30T19:30:14Z

bot:ompi:retest

Not sure why Jenkins got angry there...

bwbarrett · 2017-10-31T15:04:56Z

Not sure what's going on with the pull request checker; the build was successful. It looks like the API call from Jenkins to GitHub just didn't set the status properly. @gpaulsen, please go ahead and merge this.

gpaulsen · 2017-12-20T22:47:33Z

bot:retest

gpaulsen · 2017-12-20T22:49:06Z

I'd love to get this into master soon, before we forget that our intent is to release a v3.2 and not a v4.0.

bwbarrett · 2017-12-20T23:20:18Z

Are we sure that the next version will be 3.2 instead of 4.0? I believe we wanted to remove the mxm mtl, since no one is supporting or testing it, but that would require bumping the version to 4.0. We should have removed it for 3.0, but apparently I sucked. So we can make it 3.2 and then bump it to 4.0 if we do the removal, but I'm not sure how that plays in your internal release.

rhc54 · 2017-12-20T23:41:28Z

Both @jsquyres and I seem to recall that we needed to go to 4.0 next time. I don't believe we can do a 3.2.

gpaulsen · 2017-12-21T15:43:20Z

Well, we should have some way to tell users whether the API has changed and they need to recompile, versus whether we just no longer support a certain interconnect.

IBM very much wants to maintain "forwards compatibility" for users' compiled applications (on our platforms), but is also in favor of culling older interconnect support that is no longer tested or used.

If we can distinguish between the two, we can help ensure the former while not precluding the latter.

rhc54 · 2017-12-21T16:21:52Z

I think what we need is a better understanding of why IBM seems so concerned about staying at 3.x instead of moving to 4.0. I get maintaining ABI for SpectrumMPI users, but that surely is just a corporate decision on what level to base your code on, and not a general OMPI community issue.

Are there things in master that you want in a 3.x release, but aren't scheduled for 3.1 inclusion? If so, why doesn't IBM just backport them in SpectrumMPI? We know you are maintaining your own patches (bug fix and feature) - isn't this just another set?

gpaulsen · 2018-01-02T14:28:19Z

I believe this is an issue for the Open MPI community as a whole. Every time the MPI library soname changes, end users must relink (and probably recompile, to be safe) their applications, which in turn generally requires re-validating their entire software stack.

In production environments, even if it's trivial to rebuild/relink an end user's application, policies can require days or weeks of validation after a rebuild. Policies usually allow for minor version updates which include changes that don't change .so version numbers, thereby allowing customers to upgrade to the next minor version at a much lower cost (in terms of validation testing).

Therefore I'm advocating a "let's not rev the major version numbers (or the user MPI library .so versions) unless it's absolutely needed and planned for" strategy. Even in cases where we thought we needed to break ABI, we've found creative solutions to prevent that breakage or to delay the ABI break until a planned-for release.

rhc54 · 2018-01-02T14:41:17Z

Now I am further confused. We specifically did deliberately plan to make a 4.0 break in 2018 - it was openly discussed in the last devel meeting. The feeling was that enough changes were occurring to justify it, and that the 3.x series was completing its life with the 3.1.x releases (which are expected to continue throughout 2018).

Since many of us come from the national lab environment, we fully grok the validation issue, though I think you overstate it here 😄 Production codes never picked up the latest x.0 release right away, but stay back one from there. So in this case, the labs will likely stay in the 3.0 series for at least 2018, and then move to 3.1 in 2019, going to 4.x (x > 0) in 2019/2020. Note that we (at least while I was there) always posted the newer releases so those wanting/needing access to the new features could use them.

So what precisely is your point of concern? You were one of the orgs pushing for a time-based release schedule - why is 2+ years of 3.x not adequate? Why would we want to distort the code base with workarounds simply to avoid a major release? And why is the labs' strategy not adequate for the customers you are concerned about?

bwbarrett · 2018-01-02T16:38:36Z

I think I agree with everyone on this thread, which means my head has exploded :).

A couple of notes / thoughts...

Bumping the major version number of Open MPI does not mean we have to bump the shared library version to force a recompile.
We have generally used the major version of Open MPI to indicate backwards-incompatible changes to either the library interface (i.e., bumping the shared library version to force a recompile) or the user interface (mpirun, removing transports, etc.).

So I'm not sure what we want to do with the release that follows 3.1.x. It seems like we're going around and around here; perhaps we should bring this topic up at the next telecon and see if we can make more progress there?

rhc54 · 2018-01-02T16:47:41Z

Yeah, I think that makes sense. IIRC, the rationale here was that we planned to remove some things (e.g., the sm btl, mxm mtl) and have new options. One could argue that these could be delayed, but I'm trying to understand why as the historical way of dealing with these version changes has seemed adequate and acceptable.

jsquyres · 2018-01-03T18:06:46Z

I was still out of the office yesterday; I don't know if you had the Tuesday webex this week or not to discuss this stuff.

Here's my $0.02:

If we make backwards-incompatible changes, we need to bump the major version.
If we do not make backwards-incompatible changes, we (really really) should not change the major version.
Removing components and/or changing CLI or MCA parameters are backwards-incompatible changes.

Meaning: as @rhc54 pointed out, if we remove those components and/or change CLI/MCA params, then the next series needs to be 4.0.x. If we delay all those things (and no other backwards-incompatible changes occur relative to v3.0.x and v3.1.x), then the next series needs to be v3.2.x.
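The rule stated above can be sketched as a small decision function. This is only an illustration of the policy as described in this comment, not project tooling; the function name `next_series` and its keyword flags are hypothetical.

```python
def next_series(major, minor, *, removes_components=False,
                changes_cli_or_mca_params=False,
                other_incompatible=False):
    """Return the name of the next release series.

    Hypothetical helper: the flags summarize what is queued on master.
    Any backwards-incompatible change forces a major bump; otherwise
    only the minor number advances.
    """
    incompatible = (removes_components
                    or changes_cli_or_mca_params
                    or other_incompatible)
    if incompatible:
        # Removing components or changing CLI/MCA params counts as
        # backwards-incompatible, so the major version must change.
        return f"v{major + 1}.0.x"
    # No backwards-incompatible changes: stay on the same major version.
    return f"v{major}.{minor + 1}.x"


# Starting from the v3.1.x series discussed in this thread:
after_removal = next_series(3, 1, removes_components=True)
after_delay = next_series(3, 1)
```

With components removed, the sketch yields the 4.0.x series; with those removals delayed and nothing else incompatible, it yields 3.2.x.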

rhc54 · 2018-01-03T18:14:31Z

We did not meet this week, so this will get discussed next week (and likely run into the devel meeting before getting resolved). We all are in violent agreement over what you said. The issue is whether or not there should be a backwards-incompatible release in the first half of 2018. I think IBM is advocating for "no", but I still fail to grok the reasoning behind that request.

gpaulsen · 2018-01-03T19:41:38Z

@bwbarrett suggests that it's possible to rev to a new major version to incorporate backwards incompatible changes (like mpirun command line changes, or removal of components), but to NOT rev the user lib .so versions. This would support pre-built MPI apps, and more accurately describe that the change in Open MPI did not affect our ABI. It seems a somewhat confusing message, but perhaps this is a solution.

As @jsquyres said: If we do not make backwards-incompatible changes, we (really really) should not change the major version.

But how strong should that "really really" be? The beauty of Open MPI's component architecture is that there is a lot of flexibility to change the internals of Open MPI without affecting the layer above.

rhc54 · 2018-01-03T19:56:56Z

Again, I "really, really" want to understand what problem you are trying to solve. The user community has had a way of dealing with this that was considered acceptable and adequate for nearly 14 years. What precisely is the issue now driving us to modify our methods?

People argued (rather loudly) that our feature/stable release methods should be replaced by time-based releases, and that we would let the major version indicate breaks in compatibility. This was defined as broader than what is now being suggested - specifically, it included changes in command line options and behavioral mods that would be apparent to a user.

Revving the library is a totally different question - there are strict libtool rules that govern those versions, and they have absolutely nothing to do with the release versioning. So I don't understand why this conversation is even bringing those into the thread.
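For reference, the libtool rules mentioned here are the `current:revision:age` update steps from the GNU Libtool manual. A minimal sketch of those steps follows, assuming one update per public release; the function names and the starting numbers are hypothetical and are not Open MPI's actual library versions.

```python
def bump_version_info(current, revision, age, *, source_changed,
                      interfaces_added, interfaces_removed_or_changed):
    """Apply the libtool current:revision:age update steps once.

    Hypothetical helper; the flags describe what changed in the
    library since the last public release.
    """
    if source_changed:
        revision += 1          # any source change bumps revision
    if interfaces_added or interfaces_removed_or_changed:
        current += 1           # interface set changed: new 'current'
        revision = 0
    if interfaces_added:
        age += 1               # older interfaces still supported
    if interfaces_removed_or_changed:
        age = 0                # compatibility with older interfaces is lost
    return current, revision, age


def linux_soname_major(current, age):
    """On Linux, libtool derives the soname major number as current - age."""
    return current - age


# Hypothetical starting point 40:0:20, i.e. soname major 20.
bugfix = bump_version_info(40, 0, 20, source_changed=True,
                           interfaces_added=False,
                           interfaces_removed_or_changed=False)
added = bump_version_info(40, 0, 20, source_changed=True,
                          interfaces_added=True,
                          interfaces_removed_or_changed=False)
removed = bump_version_info(40, 0, 20, source_changed=True,
                            interfaces_added=False,
                            interfaces_removed_or_changed=True)
```

In this sketch only the last case changes the soname major number, which is the event that forces applications to relink. That is independent of the release version number carried in the VERSION file.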

jsquyres · 2018-01-03T21:28:21Z

BTW, #4635 and #4638 (currently pending for master) will definitely change the OSHMEM ABI.

gpaulsen · 2018-01-10T19:58:02Z

We discussed at our weekly meeting: https://github.com/open-mpi/ompi/wiki/WeeklyTelcon_20180109

The decision was to keep master/next release at v4.0, but not to break .so versioning unless an audit determines that it's needed, on a library-by-library basis.

VERSION: Changing master to v3.2.0

373298c

There are currently no known binary incompatible changes on master that would require a first digit change. Signed-off-by: Geoffrey Paulsen <gpaulsen@us.ibm.com>

gpaulsen requested a review from hppritcha October 25, 2017 19:54

bwbarrett added the Target: main label Oct 26, 2017

gpaulsen requested review from bwbarrett and removed request for hppritcha December 15, 2017 16:14

gpaulsen requested a review from jjhursey December 20, 2017 22:48

gpaulsen removed the Target: main label Jan 10, 2018

gpaulsen closed this Jan 10, 2018

gpaulsen deleted the version branch June 5, 2019 13:24
