Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VERSION: update for 2.1.3 release #4754

Closed
wants to merge 1 commit into from

Conversation

hppritcha
Copy link
Member

@hppritcha hppritcha commented Jan 24, 2018

Signed-off-by: Howard Pritchard howardp@lanl.gov

@hppritcha hppritcha added this to the v2.1.3 milestone Jan 24, 2018
@hppritcha hppritcha requested a review from jsquyres January 24, 2018 20:46
@hppritcha
Copy link
Member Author

hmmm...looks like AWS instances are having a bad day. sad pandas all over the place. let's try again.
bot:ompi:retest

@kawashima-fj
Copy link
Member

@hppritcha Changes of MPI_AINT_ADD_F08 and MPI_AINT_DIFF_F08 (#4667) affect only the mpi_f08 module, not libmpi_usempif08.so. So you don't need to bump the number of libmpi_usempif08_so_version.

@hppritcha
Copy link
Member Author

good point. I'll revert the mpi_f08 lib part.

@hppritcha hppritcha force-pushed the topic/version_for_2.1.3 branch from 39b572f to 3e491e1 Compare January 25, 2018 15:05
@hppritcha
Copy link
Member Author

@jsquyres could you check this morning? I"d like to kick off a build this afternoon.

@jsquyres
Copy link
Member

Yoinks, I missed this. Let me go check now...

@hppritcha
Copy link
Member Author

@jsquyres could you check this morning? I"d like to kick off a build this afternoon.

VERSION Outdated
@@ -84,15 +84,15 @@ date="Unreleased developer copy"
# Version numbers are described in the Libtool current:revision:age
# format.

libmpi_so_version=30:2:10
libmpi_so_version=30:3:10
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not see any changes to libmpi that were not in components since v2.1.2. Did I miss something?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh okay, i didn't realize that's how you were looking at the deltas.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

according to this line of thinking, I'm reverting the r update to libopen_rte as well since there were only changes in mca components.

VERSION Outdated
libmpi_cxx_so_version=30:0:10
libmpi_mpifh_so_version=31:1:11
libmpi_usempi_tkr_so_version=30:1:10
libmpi_usempi_ignore_tkr_so_version=30:1:10
libmpi_usempif08_so_version=30:1:10
libmpi_usempif08_so_version=30:2:10
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition to some other changes that are clearly in the F08 library, I notice that we changed the Fortran type of MPI_ERRCODES_IGNORE (which ends up in a header file that all the Fortran libraries include). Does this mean we need to bump the R for all the Fortran libraries?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so looking at the change to gni-mpi-mangling.pl

@jsquyres
Copy link
Member

@hppritcha I neglected to mention, this is what I used to check:

git checkout v2.0.x
git pull --rebase
git log --stat --topo-order --decorate v2.1.2..HEAD >! log.txt

@hppritcha
Copy link
Member Author

you meant

git checkout v2.x

right?

@jsquyres
Copy link
Member

Yes, I meant checkout v2.x, sorry.

I figured out those git commands a while ago and then put them in the release checklist (e.g., #4240).

@hppritcha hppritcha force-pushed the topic/version_for_2.1.3 branch from 3e491e1 to 81cbd2f Compare January 29, 2018 18:31
@hppritcha
Copy link
Member Author

@jsquyres check now

@jsquyres
Copy link
Member

@hppritcha One thing we've not been good about checking (or at least I haven't been good about checking) is checking where various .h files from lower layers get included into upper layers that then substantively change the upper layer. Two examples in 2.1.2 --> 2.1.3:

  1. ORTE notifier.h changed some enum values (in some cases). Does this constitute changing the ORTE r number?
  2. Some OPAL header files changed, which technically change ORTE and OMPI:
    • opal/include/opal/sys/powerpc/atomic.h changed. This only affects PPC. Is this enough to bump the r number?
    • The PPC-only change is somewhat moot here, because opal/util/if.h changed, which changes all platforms.
    • There may be other OPAL .h files that changed -- I didn't check any further.

...so do we need to bump the r number on all libraries that include atomic.h or if.h. This is similar to the Fortran header file question.

@hppritcha hppritcha force-pushed the topic/version_for_2.1.3 branch from 81cbd2f to 3939257 Compare January 29, 2018 19:18
@jsquyres
Copy link
Member

The more I think about this, the more I think: it's safer to bump r numbers if there could possibly be a change at all. I.e., even if we bump the r number unnecessarily, it's still ok. But it's not really ok if we don't bump the r number and there was a 2nd- or 3rd-order indirect effect (e.g., a header file changing two layers down) than caused a change.

This, however, is deeper than we have ever checked before. Thinking about this a little, I think there's at least a few ways we could go here:

Keep doing what we've always done

Human inspection each time we release, and manual, minimal tweaking of .so version numbers.

(Almost) Always bump all r numbers

As a rule of them, bump all r numbers unless:

  1. There's obvious bumping of c and/or a values (e.g., we added/removed interfaces, and changing the c and/or a values trumps bumping the r value).
  2. Nothing could possibly have changed. E.g., there were zero Fortran changes at all, so there's zero possibility that any of the Fortran libraries changed at all.

This would probably have us bump r numbers more often than is really needed, but is that a terrible crime?

Compared to the human time of trying to get the r-value bumps exactly correct, is (nearly) always bumping r a bigger crime than mistakenly not bumping r when we should have?

(Partial) Automation

If we want to keep minimally-updating .so version numbers (vs. (nearly) always bumping r numbers), this might be a prime candidate for at least partial automation. @bwbarrett and I have talked loosely about these kinds of things before, but I'm not sure we came to any conclusions.

Here's a (very loose and not fully thought out) idea:

  1. Take desired revision to be released and run all of the source through the preprocessor for all relevant languages (C, C++, Fortran), and save the output.
  2. Do the same thing for the previous release.
  3. Compare the two resulting trees:
    1. Our source code is quite well organized, directory-wise. Put .version files at the top level for any given tree that requires a .so version number; it can contain the library name (from VERSION) that is set by files in this tree. Anything in that tree (that isn't covered by a lower-level .version file and isn't a component) will cause a change to that version number.
    2. If any source code changed, log it for that library.
  4. The script can output a concise log at the end of which libraries had changes.
  5. A human can look closer at those libraries and determine if c, r, and/or a values need to change.

@hppritcha
Copy link
Member Author

these are the opal files that changed between v2.1.2 tag and HEAD of v2.x:

opal/datatype/opal_convertor.c
opal/datatype/opal_convertor.h
opal/include/opal/sys/powerpc/atomic.h
opal/mca/base/mca_base_open.c
opal/mca/btl/usnic/btl_usnic_recv.c
opal/mca/btl/usnic/btl_usnic_recv.h
opal/mca/btl/usnic/btl_usnic_util.c
opal/mca/btl/usnic/btl_usnic_util.h
opal/mca/btl/vader/btl_vader_fbox.h
opal/mca/btl/vader/btl_vader_frag.c
opal/mca/btl/vader/btl_vader_frag.h
opal/mca/btl/vader/btl_vader_module.c
opal/mca/btl/vader/btl_vader_send.c
opal/mca/hwloc/external/configure.m4
opal/mca/hwloc/hwloc1112/configure.m4
opal/mca/memory/patcher/memory_patcher_component.c
opal/mca/mpool/base/mpool_base_alloc.c
opal/mca/mpool/base/mpool_base_default.c
opal/mca/mpool/memkind/mpool_memkind_component.c
opal/mca/pmix/pmix112/pmix/Makefile.am
opal/mca/pmix/pmix112/pmix/NEWS
opal/mca/pmix/pmix112/pmix/VERSION
opal/mca/pmix/pmix112/pmix/config/distscript.sh
opal/mca/pmix/pmix112/pmix/config/pmix.m4
opal/mca/pmix/pmix112/pmix/config/pmix_functions.m4
opal/mca/pmix/pmix112/pmix/config/pmix_setup_cc.m4
opal/mca/pmix/pmix112/pmix/configure.ac
opal/mca/pmix/pmix112/pmix/src/buffer_ops/unpack.c
opal/mca/pmix/pmix112/pmix/src/client/pmi1.c
opal/mca/pmix/pmix112/pmix/src/client/pmi2.c
opal/mca/pmix/pmix112/pmix/src/client/pmix_client_get.c
opal/mca/pmix/pmix112/pmix/src/dstore/pmix_esh.c
opal/mca/pmix/pmix112/pmix/src/sec/pmix_sec.c
opal/mca/pmix/pmix112/pmix/src/util/output.c
opal/mca/rcache/grdma/rcache_grdma_module.c
opal/util/if.c
opal/util/if.h
opal/util/output.c
opal/util/proc.c

@jsquyres
Copy link
Member

@hppritcha Hmm. How did you get that list? I ran:

git checkout v2.x
git diff v2.1.2..HEAD | egrep '^--- a' | sort

and got a much bigger list (82 files, not all of them in opal).

@jsquyres
Copy link
Member

Blarg -- disregard my last comment. You said in your comment "these are all the opal files that changed..." Duh.

@hppritcha
Copy link
Member Author

from a checkout of v2.x I did
git diff --name-only v2.1.2..HEAD

I get 82 source file changes (excluding README, NEWS, VERSION) so I think we're in sync.
We should ask Brian what he things about the include file impact on R. I'll go along with
it but its kind of weird, unless we want to treat ompi/opal/orte as titghtly coupled.

Normally I wouldn't update software I work on just because external header files (i.e. ibverbs.h)
changes. But I can see the argument for within the OMPI project.

@jsquyres
Copy link
Member

Ok, I chatted w @bwbarrett about this. Conclusions:

  1. Let's bump C and A on all the Fortran libraries, and set R back to 0. I.e., go to:
libmpi_mpifh_so_version=32:0:12
libmpi_usempi_tkr_so_version=31:0:11
libmpi_usempi_ignore_tkr_so_version=31:0:11
libmpi_usempif08_so_version=31:0:11

This allows apps compiled against OMPI 2.1.x (x<=2) to link successfully with OMPI v2.1.2, but if you compile with OMPI v2.1.y (y>=2), it won't link against OMPI 2.1.x (x<2). Which is what we want.

  1. Bump the r number on ORTE and OPAL.
    • Jeff was overthinking the whole "let's bump the r number on everything" thing. Let's not do that.

Also, FWIW:

  1. In the v1.10, 2.0.x, and 2.1.x series, we have not bumped the OPAL, ORTE, OMPI r numbers if changes were in components.
  2. Brian did bump r numbers for OPAL, ORTE, OMPI in 3.0.1 when components changed. This actually seems a bit more correct -- let's do that with v3.0.0 going forward. But let's stick with the existing r value scheme for v2.0.x and v2.1.x -- just for consistency.
  3. We're somewhat safe because you really can't install 2 versions of OMPI in the same $prefix, which is where the r numbers would really matter. So it's not a major tragedy if we change how we do r numbers from 2.x to 3.x. But changing it in the middle of a single series would probably just confused our Future Selves if we ever went back in time and had to look at the progression of r numbers across our releases for some reason.

Cool?

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
@hppritcha hppritcha force-pushed the topic/version_for_2.1.3 branch from 3939257 to c70a8ec Compare January 29, 2018 22:27
@hppritcha hppritcha closed this Jan 30, 2018
@hppritcha hppritcha deleted the topic/version_for_2.1.3 branch May 2, 2018 03:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants