This repository has been archived by the owner on Oct 23, 2020. It is now read-only.

threaded ocean core timestamp issues #1289

Open
mark-petersen opened this issue Apr 7, 2017 · 5 comments

@mark-petersen
Contributor

I tested threaded MPAS-O stand-alone on Edison. The second thread prints a blank instead of a proper timeStamp value.

@mark-petersen
Contributor Author

mark-petersen commented Apr 7, 2017

@Larofeticus and @philipwjones
In this section here:

vi src/core_ocean/*/mpas_ocn_forward_mode.F

         !$omp parallel default(firstprivate) shared(domain, dt, timeStamp)

         print *, 'xtime ocn_timestep',timeStamp,'end'
         call ocn_timestep(domain, dt, timeStamp)

         !$omp end parallel

The added print statement shows that, when running with two threads:

 xtime ocn_timestep
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@




                                        end

                                       end

With one thread it has the correct time. Is it clear that timeStamp should be on the shared list?

@mark-petersen
Contributor Author

BTW, currently there is also a write statement within that omp parallel region:

#ifdef MPAS_DEBUG
         write(stderrUnit,*) '   Computing forward time step'
#endif

If I run with two threads in debug mode, the second thread also has trouble writing to the stderr file (unit 0). This is the output:

    Computing forward time step
forrtl: severe (40): recursive I/O operation, unit 0, file unknown
Image              PC                Routine            Line        Source
ocean_model        00000000037998EB  Unknown               Unknown  Unknown
ocean_model        000000000211AAA2  ocn_forward_mode_         563  mpas_ocn_forward_mode.F

If I move the write statement outside the omp parallel region, the problem goes away. But we may want to print within threaded sections elsewhere.
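
For printing inside a threaded section, one option is to let only the master thread issue the diagnostic write. The following is a minimal standalone sketch, not MPAS code: the program name and the nWrites counter are hypothetical, and error_unit from iso_fortran_env stands in for stderrUnit. It assumes the message does not need to come from every thread.

```fortran
program master_write_demo
   ! error_unit is a stand-in for MPAS's stderrUnit (assumption)
   use iso_fortran_env, only: error_unit
   implicit none
   integer :: nWrites   ! hypothetical counter, only here to show how many writes happen

   nWrites = 0

   !$omp parallel default(firstprivate) shared(nWrites)

   ! Only the master thread of the team executes this block, so the
   ! unit is never written by two threads at once. There is no implied
   ! barrier at the end of a master region.
   !$omp master
   write(error_unit,*) '   Computing forward time step'
   nWrites = nWrites + 1
   !$omp end master

   !$omp end parallel

   print *, 'diagnostic writes issued:', nWrites
end program master_write_demo
```

Built with OpenMP enabled, this should issue the write once regardless of OMP_NUM_THREADS. It does not help when per-thread output is the point of the diagnostic.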

@mark-petersen
Contributor Author

mark-petersen commented Apr 7, 2017

I tested MPAS-O standalone on LANL grizzly using OMP_NUM_THREADS=1 and OMP_NUM_THREADS=2, and it passes the bit-for-bit (bfb) thread comparison test without any of these problems. I compiled with
make ifort CORE=ocean OPENMP=true DEBUG=true

@philipwjones
Contributor

@mark-petersen - just to help diagnose, can you try putting these print/write statements within a critical region, i.e.:
!$omp critical
print *...
!$omp end critical
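
A fuller version of that diagnostic, as a standalone sketch rather than the actual MPAS code: the program name, the named critical section debug_io, the tid and nPrinted variables, and the timeStamp value are all hypothetical. It keeps the same default(firstprivate) and shared(...) clause pattern as the region in mpas_ocn_forward_mode.F and tags each line with the thread number so the output of the two threads can be told apart.

```fortran
program critical_print_demo
   !$ use omp_lib, only: omp_get_thread_num
   implicit none
   character(len=64) :: timeStamp   ! stand-in for the MPAS timeStamp (assumption)
   integer :: tid                   ! thread number, stays 0 if built without OpenMP
   integer :: nPrinted              ! hypothetical counter of threads that printed

   timeStamp = '0001-01-01_00:00:00'
   tid = 0
   nPrinted = 0

   !$omp parallel default(firstprivate) shared(timeStamp, nPrinted)

   ! tid is firstprivate here, so each thread sets its own copy
   !$ tid = omp_get_thread_num()

   ! Only one thread at a time may execute the critical region, so the
   ! print statements (and the counter update) cannot overlap.
   !$omp critical (debug_io)
   print *, 'thread', tid, 'xtime ocn_timestep ', trim(timeStamp), ' end'
   nPrinted = nPrinted + 1
   !$omp end critical (debug_io)

   !$omp end parallel

   print *, 'threads that printed:', nPrinted
end program critical_print_demo
```

If the timestamp prints correctly for every thread here but not in the model, the problem is more likely in how timeStamp is set or passed than in the I/O itself.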

@mark-petersen
Contributor Author

Probably related: on a threaded run on grizzly with the Intel compiler and OMP_NUM_THREADS=2, I get very large numbers on some timers. I spent a few hours on this and couldn't figure it out. The ******** entries are format overflows.

/lustre/scratch3/turquoise/mpeterse/runs/t44g/ocean/baroclinic_channel/10km/threads_test/4thread_run

    timer_name                                            total       calls        min            max            avg      pct_tot   pct_par     par_eff
  1 total time                                          31.60624         1       31.60615       31.60624        1.75590   100.00       0.00       0.06
  2  initialize                                          1.67186         1        1.53690        1.67186        0.08885     5.29       5.29       0.05
  3   io_read                                            0.35304         2        0.05418        0.29886        0.00968     1.12      21.12       0.05
  4    analysis_bootstrap                                0.00006         1        0.00005        0.00006        0.00000     0.00       0.02       0.05
  3   reset_io_alarms                                    0.00014         2        0.00001        0.00013        0.00000     0.00       0.01       0.05
  3   diagnostic solve                                   0.06254         1        0.02893        0.06254        0.00241     0.20       3.74       0.04
  4    equation of state                                 0.00112         3        0.00027        0.00041        0.00002     0.00       1.79       0.05
  3   analysis_init                                      0.00016         1        0.00013        0.00016        0.00001     0.00       0.01       0.05
  2  analysis_compute_startup                            0.00004         1        0.00003        0.00004        0.00000     0.00       0.00       0.05
  2  io_shortwave                                        0.00017         4        0.00003        0.00005        0.00000     0.00       0.00       0.05
  2  io_read                                             0.00008         3        0.00002        0.00003        0.00000     0.00       0.00       0.05
  2  reset_io_alarms                                     0.00064        12        0.00001        0.00012        0.00000     0.00       0.00       0.05
  2  land_ice_build_arrays                               0.00000         3        0.00000        0.00000        0.00000     0.00       0.00       0.05
  2  time integration                                    5.87513         3        1.57887        2.27402        0.10807    18.59      18.59       0.06
  3   se timestep                                  4139315.21997         3        1.55496  1379782.39758   268292.53881 ********   ********       0.19
  4    se prep                                           0.10929         3        0.01071        0.04455        0.02953     0.35       0.00       0.81
  4    se loop                                           5.22887         6        0.66651        1.00913        0.84954    16.54       0.00       0.97
  5     se halo diag                                     0.48093         6        0.02135        0.18459        0.05196     1.52       9.20       0.65
  5     se bcl vel                                 8278622.24341         6        0.03206  1379781.46736   734601.37084 ********   ********       0.53
  6      se bcl vel tend                                 0.10941         6        0.01273        0.02403        0.01734     0.35       0.00       0.95
  7       ocn_tend_vel                             8278622.04411         6        0.01199  1379781.42544   542967.92072 ********   ********       0.39
  8        coriolis                                      0.02440         6        0.00087        0.00525        0.00326     0.08       0.00       0.80
  8        vel vadv                                      0.00696         6        0.00013        0.00193        0.00071     0.02       0.00       0.61
  8        pressure grad                                 0.00592         6        0.00011        0.00197        0.00069     0.02       0.00       0.70
  8        vel hmix                                8278622.01003         6        0.00214  1379781.41770    89432.41184 ********     100.00       0.06
  9         vel del2                                     0.00941         6        0.00011        0.00318        0.00098     0.03       0.00       0.62
  8        vel surface stress                            0.01264         6        0.00039        0.00515        0.00150     0.04       0.00       0.71
  6      bcl iters on linear Coriolis             12417934.53092         9        0.00996  1379781.45871   159699.29634 ********     150.00       0.12
  7       ocn_fuperp                                     0.03319         9        0.00213        0.00475        0.00340     0.11       0.00       0.92
  7       se halo normalBaroclinicVelocity               0.12757         9        0.00479        0.02013        0.01140     0.40       0.00       0.80
  6      se halo barotropicForcing                       0.05310         6        0.00664        0.01090        0.00863     0.17       0.00       0.98
  5     se btr vel                                 8278625.75274         6        0.47477  1379782.02420   680305.90091 ********   ********       0.49
  6      btr vel se init                                 0.00198         6        0.00017        0.00050        0.00028     0.01       0.00       0.84
  6      btr se subcycle loop                            3.39628         6        0.45993        0.68095        0.56535    10.75       0.00       1.00
  7       se halo subcycle                               2.97804       240        0.00805        0.01888        0.01227     9.42      87.69       0.99
  6      btr se norm                                     0.00100         6        0.00005        0.00024        0.00014     0.00       0.00       0.85
  6      se halo F and btr vel                           0.08599         6        0.00992        0.01842        0.01365     0.27       0.00       0.95
  6      btr se ssh verif                                0.00454         6        0.00019        0.00114        0.00060     0.01       0.00       0.80
  5     se thick tend                              8278625.80751         6        0.00568  1379782.03243   533385.92463 ********   ********       0.39
  6      thick vert trans vel top                        0.02930         6        0.00224        0.00590        0.00413     0.09       0.00       0.85
  6      ocn_tend_thick                            8278625.80577         6        0.00205  1379782.03204   632397.70516 ********     100.00       0.46
  7       thick hadv                                     0.00458         6        0.00015        0.00125        0.00061     0.01       0.00       0.80
  7       thick vadv                                     0.00167         6        0.00007        0.00092        0.00018     0.01       0.00       0.65
  7       thick surface flux                             0.00229         6        0.00011        0.00062        0.00032     0.01       0.00       0.84
  5     se halo thickness                                0.07832         6        0.00659        0.02145        0.01158     0.25       1.50       0.89
  5     se tracer tend                             8278626.22063         6        0.05450  1379782.10071   159698.42585 ********   ********       0.12
