cannot MPI_File_open a one-character filename, deletes external file anyways #12619

Closed
jeffhammond opened this issue Jun 14, 2024 · 5 comments
@jeffhammond
Contributor

jeffhammond commented Jun 14, 2024

I have a trivial MPI_F08 program that opens, closes and deletes a file. When the filename is "a", this fails. When the filename is "aa", it succeeds. When I create files "a" and "aa" using touch, the program still fails, but Open MPI deletes both files, despite saying that MPI_File_delete on "a" has failed.

Background information

What version of Open MPI are you using? (e.g., v4.1.6, v5.0.1, git branch name and hash, etc.)

                 Package: Open MPI jehammond@oppenheimer Distribution
                Open MPI: 5.1.0a1
  Open MPI repo revision: v2.x-dev-11448-g55c0bda957
   Open MPI release date: Unreleased developer copy
                 MPI API: 3.1.0
            Ident string: 5.1.0a1
                  Prefix: /opt/ompi/llvm
 Configured architecture: x86_64-pc-linux-gnu
           Configured by: jehammond
           Configured on: Fri Jun 14 06:34:47 UTC 2024
          Configure host: oppenheimer
  Configure command line: 'FC=/opt/llvm/latest/bin/flang-new'
                          'CC=/opt/llvm/latest/bin/clang'
                          'CXX=/opt/llvm/latest/bin/clang++'
                          '--enable-fortran=all' '--prefix=/opt/ompi/llvm'
                          'CPPFLAGS=-I/opt/llvm/latest/include/flang'
                Built by: jehammond
                Built on: pe 14.6.2024 06.48.03 +0000
              Built host: oppenheimer
              C bindings: yes
             Fort mpif.h: yes (all)
            Fort use mpi: yes (full: ignore TKR)
       Fort use mpi size: deprecated-ompi-info-value
        Fort use mpi_f08: yes
 Fort mpi_f08 compliance: The mpi_f08 module is available, but due to
                          limitations in the /opt/llvm/latest/bin/flang-new
                          compiler and/or Open MPI, does not support the
                          following: array subsections, direct passthru
                          (where possible) to underlying Open MPI's C
                          functionality
  Fort mpi_f08 subarrays: no
           Java bindings: no
  Wrapper compiler rpath: runpath
              C compiler: /opt/llvm/latest/bin/clang
     C compiler absolute: /opt/llvm/latest/bin/clang
  C compiler family name: CLANG
      C compiler version: 19.0.0git (https://github.com/llvm/llvm-project.git
                          f2d215f572affc9ad73da07763ce1831de7f2d4d)
            C++ compiler: /opt/llvm/latest/bin/clang++
   C++ compiler absolute: /opt/llvm/latest/bin/clang++
           Fort compiler: /opt/llvm/latest/bin/flang-new
       Fort compiler abs: /opt/llvm/latest/bin/flang-new
         Fort ignore TKR: yes (!DIR$ IGNORE_TKR)
   Fort 08 assumed shape: yes
      Fort optional args: yes
          Fort INTERFACE: yes
    Fort ISO_FORTRAN_ENV: yes
       Fort STORAGE_SIZE: yes
      Fort BIND(C) (all): yes
      Fort ISO_C_BINDING: yes
 Fort SUBROUTINE BIND(C): yes
       Fort TYPE,BIND(C): yes
 Fort T,BIND(C,name="a"): yes
            Fort PRIVATE: yes
           Fort ABSTRACT: yes
       Fort ASYNCHRONOUS: yes
          Fort PROCEDURE: yes
         Fort USE...ONLY: yes
           Fort C_FUNLOC: yes
 Fort f08 using wrappers: yes
         Fort MPI_SIZEOF: yes
             C profiling: yes
   Fort mpif.h profiling: yes
  Fort use mpi profiling: yes
   Fort use mpi_f08 prof: yes
          Thread support: posix (MPI_THREAD_MULTIPLE: yes, OPAL support: yes,
                          OMPI progress: no, Event lib: yes)
           Sparse Groups: no
  Internal debug support: no
  MPI interface warnings: yes
     MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
              dl support: yes
   Heterogeneous support: no
       MPI_WTIME support: native
     Symbol vis. support: yes
   Host topology support: yes
            IPv6 support: no
          MPI extensions: affinity, cuda, ftmpi, rocm, shortfloat
 Fault Tolerance support: yes
          FT MPI support: yes
  MPI_MAX_PROCESSOR_NAME: 256
    MPI_MAX_ERROR_STRING: 256
     MPI_MAX_OBJECT_NAME: 64
        MPI_MAX_INFO_KEY: 36
        MPI_MAX_INFO_VAL: 256
       MPI_MAX_PORT_NAME: 1024
  MPI_MAX_DATAREP_STRING: 128
         MCA accelerator: null (MCA v2.1.0, API v1.0.0, Component v5.1.0)
           MCA allocator: basic (MCA v2.1.0, API v2.0.0, Component v5.1.0)
           MCA allocator: bucket (MCA v2.1.0, API v2.0.0, Component v5.1.0)
           MCA backtrace: execinfo (MCA v2.1.0, API v2.0.0, Component v5.1.0)
                 MCA btl: self (MCA v2.1.0, API v3.3.0, Component v5.1.0)
                 MCA btl: sm (MCA v2.1.0, API v3.3.0, Component v5.1.0)
                 MCA btl: tcp (MCA v2.1.0, API v3.3.0, Component v5.1.0)
                 MCA btl: smcuda (MCA v2.1.0, API v3.3.0, Component v5.1.0)
                  MCA dl: dlopen (MCA v2.1.0, API v1.0.0, Component v5.1.0)
                  MCA if: linux_ipv6 (MCA v2.1.0, API v2.0.0, Component
                          v5.1.0)
                  MCA if: posix_ipv4 (MCA v2.1.0, API v2.0.0, Component
                          v5.1.0)
         MCA installdirs: env (MCA v2.1.0, API v2.0.0, Component v5.1.0)
         MCA installdirs: config (MCA v2.1.0, API v2.0.0, Component v5.1.0)
              MCA memory: patcher (MCA v2.1.0, API v2.0.0, Component v5.1.0)
               MCA mpool: hugepage (MCA v2.1.0, API v3.1.0, Component v5.1.0)
             MCA patcher: overwrite (MCA v2.1.0, API v1.0.0, Component
                          v5.1.0)
              MCA rcache: grdma (MCA v2.1.0, API v3.3.0, Component v5.1.0)
              MCA rcache: gpusm (MCA v2.1.0, API v3.3.0, Component v5.1.0)
              MCA rcache: rgpusm (MCA v2.1.0, API v3.3.0, Component v5.1.0)
           MCA reachable: weighted (MCA v2.1.0, API v2.0.0, Component v5.1.0)
               MCA shmem: mmap (MCA v2.1.0, API v2.0.0, Component v5.1.0)
               MCA shmem: posix (MCA v2.1.0, API v2.0.0, Component v5.1.0)
               MCA shmem: sysv (MCA v2.1.0, API v2.0.0, Component v5.1.0)
                MCA smsc: cma (MCA v2.1.0, API v1.0.0, Component v5.1.0)
             MCA threads: pthreads (MCA v2.1.0, API v1.0.0, Component v5.1.0)
               MCA timer: linux (MCA v2.1.0, API v2.0.0, Component v5.1.0)
                 MCA bml: r2 (MCA v2.1.0, API v2.1.0, Component v5.1.0)
                MCA coll: accelerator (MCA v2.1.0, API v3.0.0, Component
                          v5.1.0)
                MCA coll: adapt (MCA v2.1.0, API v3.0.0, Component v5.1.0)
                MCA coll: basic (MCA v2.1.0, API v3.0.0, Component v5.1.0)
                MCA coll: han (MCA v2.1.0, API v3.0.0, Component v5.1.0)
                MCA coll: inter (MCA v2.1.0, API v3.0.0, Component v5.1.0)
                MCA coll: libnbc (MCA v2.1.0, API v3.0.0, Component v5.1.0)
                MCA coll: self (MCA v2.1.0, API v3.0.0, Component v5.1.0)
                MCA coll: sync (MCA v2.1.0, API v3.0.0, Component v5.1.0)
                MCA coll: tuned (MCA v2.1.0, API v3.0.0, Component v5.1.0)
                MCA coll: xhc (MCA v2.1.0, API v3.0.0, Component v5.1.0)
                MCA coll: ftagree (MCA v2.1.0, API v3.0.0, Component v5.1.0)
                MCA coll: monitoring (MCA v2.1.0, API v3.0.0, Component
                          v5.1.0)
                MCA fbtl: posix (MCA v2.1.0, API v2.0.0, Component v5.1.0)
               MCA fcoll: dynamic (MCA v2.1.0, API v3.0.0, Component v5.1.0)
               MCA fcoll: dynamic_gen2 (MCA v2.1.0, API v3.0.0, Component
                          v5.1.0)
               MCA fcoll: individual (MCA v2.1.0, API v3.0.0, Component
                          v5.1.0)
               MCA fcoll: vulcan (MCA v2.1.0, API v3.0.0, Component v5.1.0)
                  MCA fs: ufs (MCA v2.1.0, API v2.0.0, Component v5.1.0)
                MCA hook: comm_method (MCA v2.1.0, API v1.0.0, Component
                          v5.1.0)
                  MCA io: ompio (MCA v2.1.0, API v3.0.0, Component v5.1.0)
                  MCA io: romio341 (MCA v2.1.0, API v3.0.0, Component v5.1.0)
                  MCA op: avx (MCA v2.1.0, API v1.0.0, Component v5.1.0)
                 MCA osc: sm (MCA v2.1.0, API v4.0.0, Component v5.1.0)
                 MCA osc: monitoring (MCA v2.1.0, API v4.0.0, Component
                          v5.1.0)
                 MCA osc: rdma (MCA v2.1.0, API v4.0.0, Component v5.1.0)
                MCA part: persist (MCA v2.1.0, API v4.0.0, Component v5.1.0)
                 MCA pml: cm (MCA v2.1.0, API v2.1.0, Component v5.1.0)
                 MCA pml: monitoring (MCA v2.1.0, API v2.1.0, Component
                          v5.1.0)
                 MCA pml: ob1 (MCA v2.1.0, API v2.1.0, Component v5.1.0)
                 MCA pml: v (MCA v2.1.0, API v2.1.0, Component v5.1.0)
            MCA sharedfp: individual (MCA v2.1.0, API v3.0.0, Component
                          v5.1.0)
            MCA sharedfp: lockedfile (MCA v2.1.0, API v3.0.0, Component
                          v5.1.0)
            MCA sharedfp: sm (MCA v2.1.0, API v3.0.0, Component v5.1.0)
                MCA topo: basic (MCA v2.1.0, API v2.2.0, Component v5.1.0)
                MCA topo: treematch (MCA v2.1.0, API v2.2.0, Component
                          v5.1.0)
           MCA vprotocol: pessimist (MCA v2.1.0, API v2.0.0, Component
                          v5.1.0)

Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)

See above.

If you are building/installing from a git clone, please copy-n-paste the output from git submodule status.

 e32e0179bc6bd1637f92690511ce6091719fa046 3rd-party/openpmix (v1.1.3-4036-ge32e0179)
 1d867e84981077bffda9ad9d44ff415a3f6d91c4 3rd-party/prrte (psrvr-v2.0.0rc1-4783-g1d867e8498)
 dfff67569fb72dbf8d73a1dcf74d091dad93f71b config/oac (remotes/origin/HEAD)

Please describe the system on which you are running

  • Operating system/version:
  • Computer hardware:
  • Network type:
Linux oppenheimer 6.5.0-35-generic #35~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Tue May  7 09:00:52 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
AMD Ryzen 9 7950X 16-Core Processor

Details of the problem

$ make test_file.x && ls a aa ; echo "====" ; /opt/ompi/llvm/bin/mpirun -n 1 ./test_file.x ; echo "====" ; touch a aa ; ls a aa ; echo "====" ; /opt/ompi/llvm/bin/mpirun -n 1 ./test_file.x ; echo "====" ; ls a aa
make: 'test_file.x' is up to date.
ls: cannot access 'a': No such file or directory
ls: cannot access 'aa': No such file or directory
====
 I am  0  of  1  of WORLD
 filename = "a        "
 open failed
 why? MPI_ERR_OTHER: known error not in list



 close failed
 why? MPI_ERR_FILE: invalid file



 delete failed
 why? MPI_ERR_FILE: invalid file



 filename = "aa       "
 done
====
a  aa
====
 I am  0  of  1  of WORLD
 filename = "a        "
 open failed
 why? MPI_ERR_OTHER: known error not in list



 close failed
 why? MPI_ERR_FILE: invalid file



 delete failed
 why? MPI_ERR_FILE: invalid file



 filename = "aa       "
 done
====
ls: cannot access 'a': No such file or directory
ls: cannot access 'aa': No such file or directory
program main
    use mpi_f08
    implicit none
    integer :: ierror, slen
    integer :: me, np
    integer :: amode
    type(MPI_File) :: f
    character(len=9) :: filename
    character(len=MPI_MAX_ERROR_STRING) :: string

    call MPI_Init(ierror)

    call MPI_Comm_rank(MPI_COMM_WORLD,me)
    call MPI_Comm_size(MPI_COMM_WORLD,np)
    print*,'I am ',me,' of ',np,' of WORLD'

    filename = "a"
    print*,'filename = "',filename,'"'
    amode = IOR( MPI_MODE_CREATE, MPI_MODE_RDWR )

    call MPI_File_open(MPI_COMM_SELF,filename,amode,MPI_INFO_NULL,f,ierror)
    if (ierror.ne.MPI_SUCCESS) then
        print*,'open failed'
        call MPI_Error_string(ierror, string, slen)
        print*,'why? ',trim(string)
    endif

    call MPI_File_close(f,ierror)
    if (ierror.ne.MPI_SUCCESS) then
        print*,'close failed'
        call MPI_Error_string(ierror, string, slen)
        print*,'why? ',trim(string)
    endif

    call MPI_File_delete(filename,MPI_INFO_NULL)
    if (ierror.ne.MPI_SUCCESS) then
        print*,'delete failed'
        call MPI_Error_string(ierror, string, slen)
        print*,'why? ',trim(string)
    endif

    filename = "aa"
    print*,'filename = "',filename,'"'
    amode = IOR( MPI_MODE_CREATE, MPI_MODE_RDWR )

    call MPI_File_open(MPI_COMM_SELF,filename,amode,MPI_INFO_NULL,f,ierror)
    if (ierror.ne.MPI_SUCCESS) then
        print*,'open failed'
        call MPI_Error_string(ierror, string, slen)
        print*,'why? ',trim(string)
    endif

    call MPI_File_close(f,ierror)
    if (ierror.ne.MPI_SUCCESS) then
        print*,'close failed'
        call MPI_Error_string(ierror, string, slen)
        print*,'why? ',trim(string)
    endif

    call MPI_File_delete(filename,MPI_INFO_NULL)
    if (ierror.ne.MPI_SUCCESS) then
        print*,'delete failed'
        call MPI_Error_string(ierror, string, slen)
        print*,'why? ',trim(string)
    endif

    print*,'done'
    call MPI_Finalize(ierror)

end program main
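
A note on the "delete failed" lines in the output above: both MPI_File_delete calls in the reproducer omit the ierror argument (it is optional in mpi_f08), so the check that follows re-tests whatever value MPI_File_close left in ierror. That would be consistent with the delete message repeating the close error word for word while the file is nevertheless removed. The plain C sketch below imitates that pattern without MPI. Every name in it (sketch_close, sketch_delete, the SKETCH_* constants) is a hypothetical stand-in, not an Open MPI function.

```c
#include <stdbool.h>

enum { SKETCH_SUCCESS = 0, SKETCH_ERR_FILE = 1 };

/* Hypothetical stand-in for a close call that fails. */
static int sketch_close(void)
{
    return SKETCH_ERR_FILE;
}

/* Hypothetical stand-in for a delete call that succeeds and removes the file. */
static int sketch_delete(bool *file_exists)
{
    *file_exists = false;
    return SKETCH_SUCCESS;
}

/* Mirrors the reproducer: the result of the delete is dropped, so the
 * "delete failed" test looks at the status left over from the close.
 * Returns true when "delete failed" would be printed. */
static bool reports_delete_failed_unchecked(bool *file_exists)
{
    int status = sketch_close();
    (void) sketch_delete(file_exists);   /* status is not updated here */
    return status != SKETCH_SUCCESS;
}

/* Same sequence, but the status is refreshed from the delete itself. */
static bool reports_delete_failed_checked(bool *file_exists)
{
    int status = sketch_close();
    (void) status;                       /* close result handled elsewhere */
    status = sketch_delete(file_exists);
    return status != SKETCH_SUCCESS;
}
```

In the unchecked variant the file is gone and a failure is still reported; in the checked variant the report matches what the delete actually did. In the Fortran program the equivalent change would be passing ierror to MPI_File_delete.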
@ggouaillardet
Contributor

Thanks Jeff for the report, I will have a look.

I am able to reproduce the issue (with GNU compilers, FWIW).
Fun fact: there is no error if I run in singleton mode and/or use ROMIO.

$ mpirun -np 1 ./d
 I am            0  of            1  of WORLD
 filename = "a        "
 open failed
 why? MPI_ERR_OTHER: known error not in list
 close failed
 why? MPI_ERR_FILE: invalid file
 delete failed
 why? MPI_ERR_FILE: invalid file
 filename = "aa       "
 done

$ mpirun -np 1 --mca io ^ompio ./d
 I am            0  of            1  of WORLD
 filename = "a        "
 filename = "aa       "
 done

$ ./d
 I am            0  of            1  of WORLD
 filename = "a        "
 filename = "aa       "
 done

@ggouaillardet
Contributor

Singleton vs. mpirun was fun, but unrelated to the root cause.

Here is a patch (opal_basename() does not correctly handle single-character filenames!). I will issue a PR later.

diff --git a/opal/util/basename.c b/opal/util/basename.c
index 0a57b07078..ad873f2c7c 100644
--- a/opal/util/basename.c
+++ b/opal/util/basename.c
@@ -77,16 +77,18 @@ char *opal_basename(const char *filename)
 
     /* Remove trailing sep's (note that we already know that strlen > 0) */
     tmp = strdup(filename);
-    for (i = strlen(tmp) - 1; i > 0; --i) {
-        if (sep == tmp[i]) {
-            tmp[i] = '\0';
-        } else {
-            break;
+    if (1 < strlen(tmp)) {
+        for (i = strlen(tmp) - 1; i > 0; --i) {
+            if (sep == tmp[i]) {
+                tmp[i] = '\0';
+            } else {
+                break;
+            }
+        }
+        if (0 == i) {
+            tmp[0] = sep;
+            return tmp;
         }
-    }
-    if (0 == i) {
-        tmp[0] = sep;
-        return tmp;
     }
 
     /* Look for the final sep */
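
To make the effect of the patch concrete, here is a minimal, self-contained C sketch of the trailing-separator logic before and after the change. It is not the Open MPI source: the function names, the copy_string helper, and the simplified "final sep" tail built on strrchr are assumptions made for illustration, and the special cases that precede this code in the real function are not reproduced. As in the original, the filename is assumed to be non-empty.

```c
#include <stdlib.h>
#include <string.h>

/* Heap copy of a string (portable replacement for strdup). */
static char *copy_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *out = malloc(n);
    if (out != NULL) {
        memcpy(out, s, n);
    }
    return out;
}

/* Assumed, simplified tail: return whatever follows the last separator. */
static char *after_last_sep(char *tmp, char sep)
{
    char *last = strrchr(tmp, sep);
    if (last == NULL) {
        return tmp;                  /* no separator: the name is the basename */
    }
    char *ret = copy_string(last + 1);
    free(tmp);
    return ret;
}

/* Pre-patch logic from the diff. For a one-character name the loop body
 * never runs, i is still 0, and the name is overwritten with sep. */
static char *sketch_basename_before(const char *filename, char sep)
{
    size_t i;
    char *tmp = copy_string(filename);
    if (tmp == NULL) {
        return NULL;
    }
    for (i = strlen(tmp) - 1; i > 0; --i) {
        if (sep == tmp[i]) {
            tmp[i] = '\0';
        } else {
            break;
        }
    }
    if (0 == i) {
        tmp[0] = sep;
        return tmp;
    }
    return after_last_sep(tmp, sep);
}

/* Patched logic from the diff: the stripping step is skipped entirely
 * when the name is a single character long. */
static char *sketch_basename_after(const char *filename, char sep)
{
    size_t i;
    char *tmp = copy_string(filename);
    if (tmp == NULL) {
        return NULL;
    }
    if (1 < strlen(tmp)) {
        for (i = strlen(tmp) - 1; i > 0; --i) {
            if (sep == tmp[i]) {
                tmp[i] = '\0';
            } else {
                break;
            }
        }
        if (0 == i) {
            tmp[0] = sep;
            return tmp;
        }
    }
    return after_last_sep(tmp, sep);
}
```

With '/' as the separator, sketch_basename_before("a", '/') yields "/" while sketch_basename_after("a", '/') yields "a"; both return "aa" for "aa". The caller owns the returned string and should free() it.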

@edgargabriel
Member

@ggouaillardet thank you for identifying the issue. Can you file a PR with the fix?

rhc54 added a commit to rhc54/openpmix that referenced this issue Jun 14, 2024
Modify pmix_basename to handle single character filenames.

Ported from comment by @ggouaillardet in open-mpi/ompi#12619

Signed-off-by: Ralph Castain <rhc@pmix.org>
rhc54 added a commit to openpmix/openpmix that referenced this issue Jun 14, 2024
Modify pmix_basename to handle single character filenames.

Ported from comment by @ggouaillardet in open-mpi/ompi#12619

Signed-off-by: Ralph Castain <rhc@pmix.org>
rhc54 added a commit to rhc54/openpmix that referenced this issue Jun 16, 2024
Modify pmix_basename to handle single character filenames.

Ported from comment by @ggouaillardet in open-mpi/ompi#12619

Signed-off-by: Ralph Castain <rhc@pmix.org>
(cherry picked from commit 1ab5ece)
rhc54 added a commit to openpmix/openpmix that referenced this issue Jun 16, 2024
Modify pmix_basename to handle single character filenames.

Ported from comment by @ggouaillardet in open-mpi/ompi#12619

Signed-off-by: Ralph Castain <rhc@pmix.org>
(cherry picked from commit 1ab5ece)
ggouaillardet added a commit to ggouaillardet/ompi that referenced this issue Jun 19, 2024
Thanks Jeff Hammond for the bug report

Refs. open-mpi#12619

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
@ggouaillardet
Contributor

Sorry for the delay, I just issued #12632

janjust pushed a commit to janjust/ompi that referenced this issue Jun 27, 2024
Thanks Jeff Hammond for the bug report

Refs. open-mpi#12619

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
janjust pushed a commit to janjust/ompi that referenced this issue Jun 27, 2024
Thanks Jeff Hammond for the bug report

Refs. open-mpi#12619

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
(cherry picked from commit dd34ecf)
@wenduwan
Contributor

wenduwan commented Jul 3, 2024

Should be fixed in 5.0.4, scheduled for 7/2024.

wenduwan closed this as completed Jul 3, 2024