Binary output proposal #237

donnaaboise · 2018-12-26T23:06:12Z

I've included an option in the 2d/3d valout.f90 files that strips ghost cells from binary output. Seems to be significantly faster than writing with ghost cells.

In 2d, the ghost cells are included in binary output, since Python/visclaw depends on this. They are easy to strip, though (see end of valout.f90).
In 3d, ghost cells are stripped.
The Matlab code makes the same assumptions for the 2d/3d code.
The options are easy to change (although this requires recompilation).
I haven't done anything with aux array output.

rjleveque · 2019-01-02T21:58:06Z

@donnaaboise: Since you reverted to writing out the ghost cells, why not just revert to the original versions of valout.f90 in 2d, rather than this new version with an additional write_slice subroutine call?

But more generally we should resolve the issue of how to handle ghost cells following the discussion in #236 and be consistent in 2d and 3d.

If we decide not to write ghost cells in 3d, or allow the user to specify how many ghost cells to include, then there has to be a corresponding change made to $CLAW/pyclaw/src/pyclaw/fileio/binary.py since it currently assumes ghost cells are included (in any number of dimensions).

Finally, I would avoid using the term "slice" when talking about the interior grid cells without ghost cells. This doesn't correspond to what I think of as a slice of the data, and conflicts with the way @cjvogl and I are using the term in another branch that I've started using again for 3d output (and that we should try to clean up and merge in soon). This version has a slices_module that prints out only 2d planar slices of the 3d solution (coordinate-aligned, with a fixed value of x, y, or z on each slice). These are output in the format of a 2d amrclaw solution (from whatever grid patches intersect the slice) so that they can be plotted using the 2d Python plotting routines. For some purposes this is sufficient and gives much less output than the full 3d solution.

donnaaboise · 2019-01-03T11:33:51Z

I could have left the original 2d valout.f90 code - I just included the write_slice routine, in case there was a consensus to not include ghost cells. If ghost cells should be included, then we should revert to the original code.

The main reason I see for not including the ghost cells is that they take more storage and it takes longer to output the data. This is particularly noticeable in 3d. In 2d, it makes less of a difference, and in any case, it sounds like the 2d Python graphics expect ghost cells, so it may not be worth stripping them before outputting data.
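To illustrate why the overhead grows with dimension, here is a rough sketch of the fraction of a patch's written values that are interior cells. The patch size of 16 cells per side and the ghost-cell width of 2 are assumptions chosen for illustration, not values taken from the runs in this thread, and `interior_fraction` is a hypothetical helper.

```python
def interior_fraction(cells_per_side, num_ghost, ndim):
    """Fraction of a square/cubic patch's cells that are interior (non-ghost)."""
    padded = cells_per_side + 2 * num_ghost  # ghost layers on both sides
    return (cells_per_side / padded) ** ndim

# Assumed values for illustration only: 16 interior cells per side, 2 ghost cells.
for ndim in (1, 2, 3):
    frac = interior_fraction(16, 2, ndim)
    print(f"{ndim}d: interior cells are {frac:.1%} of what is written")
```

Under these assumed sizes the interior is 80% of the written data in 1d, 64% in 2d, and about 51% in 3d. Smaller patches or wider ghost layers would make the 3d overhead larger still.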

I used the term "slice" only because that is the term I use in the Matlab graphics. I have no particular attachment to that term, and can change it.

In any case, the Matlab graphics can read the 2d/3d binary data with or without ghost cells. Currently I have hardwired the selection, but can easily make this a user choice, depending on what amrclaw does.

In any case, I would think it is okay if 2d wrote out ghost cells (since, at the very least, the Python graphics depend on them) but 3d didn't. In 3d, there is a big performance hit, and there are no corresponding plotting routines that depend on ghost cells.

rjleveque · 2019-01-03T20:43:18Z

Just a point of clarification about reading binary files and plotting with visclaw:

The reading is done by pyclaw.fileio.binary, which can read in 1d, 2d, or 3d data. It currently always assumes ghost cells are included (though this could be changed pretty easily). Ghost cells are stripped off in creating the pyclaw.solution.Solution object that is returned.
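As a rough illustration of that stripping step (not the actual pyclaw.fileio.binary code), the sketch below pulls the interior values out of a flat 2d patch. It assumes the values are stored in Fortran (column-major) order as q(meqn, mx + 2*num_ghost, my + 2*num_ghost); the function name `strip_ghost_2d` and its parameters are hypothetical.

```python
from array import array

def strip_ghost_2d(flat, meqn, mx, my, num_ghost):
    """Return interior values q[m][i][j] from a flat column-major patch.

    Assumes the flat sequence holds q(meqn, mx + 2*num_ghost, my + 2*num_ghost)
    in Fortran (column-major) order, i.e. the equation index varies fastest.
    """
    mx_tot = mx + 2 * num_ghost
    my_tot = my + 2 * num_ghost
    if len(flat) != meqn * mx_tot * my_tot:
        raise ValueError("buffer size does not match patch dimensions")

    def at(m, i, j):
        # Column-major offset of element (m, i, j), all indices 0-based.
        return flat[m + meqn * (i + mx_tot * j)]

    return [[[at(m, i + num_ghost, j + num_ghost) for j in range(my)]
             for i in range(mx)]
            for m in range(meqn)]

# Tiny synthetic patch: 1 equation, 2x2 interior, 1 ghost layer -> 4x4 values.
# Ghost cells are marked -1.0; interior cells carry distinct positive values.
meqn, mx, my, num_ghost = 1, 2, 2, 1
flat = array("d", [-1.0] * 16)
for j in range(my):
    for i in range(mx):
        flat[(i + num_ghost) + 4 * (j + num_ghost)] = 10.0 * (i + 1) + (j + 1)

interior = strip_ghost_2d(flat, meqn, mx, my, num_ghost)
print(interior)
```

Calling the same function with `num_ghost=0` would cover a file written without ghost cells, which is the kind of reader option a change on the Fortran side would require.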

The 2d plotting routines in visclaw use this stripped down version.

So it is not the plotting that depends on the ghost cells being in the binary files, but the fileio.binary routines, which already handle 3d (you can read in the solution and do your own thing with it) even though we don't have Python plotting routines in visclaw for 3d yet.

So I suggest that whatever we do should be consistent between 2d and 3d. (And 1d, which also supports binary in the same way.)

donnaaboise · 2019-01-04T11:27:44Z

Thanks @rjleveque for clarification - I had been assuming that Visclaw didn't read 3d files at all.

I still think it might be worth considering the option of not printing out ghost cells, especially in 3d. In the one example I have looked at, the ratio of timing/storage without versus with ghost cells is about 60%.

Timing (without ghost cells)

============================== Timing Data ==============================

Integration Time (stepgrid + BC + overhead)
Level           Wall Time (seconds)    CPU Time (seconds)   Total Cell Updates
  1                     8.251                  8.241            0.288E+07
  2                    84.831                 84.642            0.300E+08
total                  93.082                 92.884            0.328E+08

All levels:
stepgrid               91.240                 91.045    
BC/ghost cells          1.378                  1.375
Regridding              3.963                  3.957  
Output (valout)         0.091                  0.090  

Total time:            97.494                 97.280  
Using  1 thread(s)

Note: The CPU times are summed over all threads.
      Total time includes more than the subroutines listed above

=========================================================================

Storage (without ghost cells)

(bash) ~/.../amrclaw/examples/advection_3d_swirl (donna_valout) % du -hsc _output/fort.b*
2.3M	_output/fort.b0000
3.2M	_output/fort.b0001
3.7M	_output/fort.b0002
4.3M	_output/fort.b0003
5.0M	_output/fort.b0004
4.4M	_output/fort.b0005
23M	total

Timing (with ghost cells)

============================== Timing Data ==============================

Integration Time (stepgrid + BC + overhead)
Level           Wall Time (seconds)    CPU Time (seconds)   Total Cell Updates
  1                     8.019                  8.008            0.288E+07
  2                    82.827                 82.699            0.300E+08
total                  90.846                 90.707            0.328E+08

All levels:
stepgrid               89.117                 88.978    
BC/ghost cells          1.306                  1.303
Regridding              3.858                  3.843  
Output (valout)         0.153                  0.150  

Total time:            95.200                 95.048  
Using  1 thread(s)

Note: The CPU times are summed over all threads.
      Total time includes more than the subroutines listed above

=========================================================================

Storage (with ghost cells)

(bash) ~/.../amrclaw/examples/advection_3d_swirl (donna_valout) % du -hsc _output/fort.b*
3.3M	_output/fort.b0000
5.1M	_output/fort.b0001
5.9M	_output/fort.b0002
6.4M	_output/fort.b0003
8.0M	_output/fort.b0004
7.6M	_output/fort.b0005
36M	total

So while the time spent in output is small either way (0.091s vs. 0.153s, or about 59%), the storage difference is significant (23M vs. 36M, or about 64%).
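For reference, the ratios quoted here can be reproduced from the listings above with a few lines of Python. This is only a sketch of the arithmetic; the variable names are illustrative, and the storage figures are the totals as printed by `du -h`.

```python
# Values copied from the 3d timing and storage listings above.
valout_without, valout_with = 0.091, 0.153   # seconds spent in valout
storage_without, storage_with = 23.0, 36.0   # totals in M as printed by du -h

time_ratio = valout_without / valout_with
storage_ratio = storage_without / storage_with

print(f"valout time ratio: {time_ratio:.1%}")
print(f"storage ratio:     {storage_ratio:.1%}")
```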

Regardless of what AMRClaw decides, I'll go ahead and make an option in Matlab so the user can decide whether or not to read in ghost cells.

donnaaboise · 2019-01-04T11:41:21Z

The timing results for the 2d swirl example show that valout takes about 54% as long when not writing out ghost cells (0.052s vs. 0.096s), and the output needs about 80% of the storage (5.0MB vs. 6.2MB).

rjleveque · 2019-01-06T01:44:31Z

It seems misleading to say it is faster by such a huge margin when ghost cells aren't included since you're only looking at the valout time, which is a tiny fraction of the total run time. In fact for the 3d example you show, the total time without ghost cells was greater than the total time with ghost cells by a couple seconds, so the uncertainty in the timings is much greater than the total time spent in valout.
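To put numbers on this, the following sketch uses the wall times from the two 3d listings above to compute valout's share of the total run time and to compare the valout saving with the run-to-run difference in total time. The dictionary layout and variable names are illustrative only.

```python
# Wall times (seconds) copied from the two 3d timing listings above.
runs = {
    "without ghost cells": {"valout": 0.091, "total": 97.494},
    "with ghost cells": {"valout": 0.153, "total": 95.200},
}

for label, t in runs.items():
    share = t["valout"] / t["total"]
    print(f"{label}: valout is {share:.2%} of total wall time")

# Time saved inside valout versus the difference in total run time.
valout_saving = runs["with ghost cells"]["valout"] - runs["without ghost cells"]["valout"]
total_diff = runs["without ghost cells"]["total"] - runs["with ghost cells"]["total"]
print(f"valout saving: {valout_saving:.3f} s; total-time difference: {total_diff:.3f} s")
```

In both runs valout accounts for well under 1% of the total wall time, and the difference between the two totals is far larger than the time saved in valout.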

But I agree the storage saving is considerable and as long as it doesn't significantly slow down valout to strip out the ghost cells, I think it's great to include this as an option.

But we do need to modify pyclaw.fileio.binary to allow this option before people can use it, if they want to read the resulting binary files into Python.

donnaaboise · 2019-01-06T11:01:12Z

In this example (the 3d advection example), the time in valout is negligible, but the several runs I did were surprisingly consistent in these timings, so I didn't see much uncertainty. What I was originally checking was to see if using the F90 slicing to strip the ghost cells would slow down the code significantly. I was surprised to see that it didn't, so proposed it as a way to avoid printing out ghost cells. I assumed that the only reason for printing out the ghost cells was to output contiguous memory. As @rjleveque pointed out, there are several other reasons why the ghost cells might be useful, though.

As a second data point, I did a run with outstyle=3, nout=200, nstep=1. Here, the valout times are longer, but still show that stripping the ghost cells is faster than printing them out (4.581s vs. 7.405s). But it is also the case that the overall time is longer when the ghost cells are stripped (595.360s vs. 587.754s). I can't really explain this - is the printing somehow asynchronous? The other times (regridding, BC) were essentially the same between the two runs.

Another consideration is that I am running this on my i7 laptop, with a fast SSD hard drive. On other machines, I/O might be slower.

The storage savings are about the same as in the first set of simulations: 1016M vs. 1.6G, or about 63%.

Timing (without ghost cells)

============================== Timing Data ==============================

Integration Time (stepgrid + BC + overhead)
Level           Wall Time (seconds)    CPU Time (seconds)   Total Cell Updates
  1                    38.518                 38.419            0.128E+08
  2                   528.565                527.249            0.189E+09
total                 567.083                565.668            0.201E+09

All levels:
stepgrid              560.387                558.992    
BC/ghost cells          3.747                  3.724
Regridding             21.589                 21.522  
Output (valout)         4.581                  4.399  

Total time:           595.360                593.680  
Using  1 thread(s)

Note: The CPU times are summed over all threads.
      Total time includes more than the subroutines listed above

=========================================================================

Storage (without ghost cells)

......
5.4M    _output/fort.b0193
5.4M    _output/fort.b0194
5.4M    _output/fort.b0195
5.4M    _output/fort.b0196
5.4M    _output/fort.b0197
5.4M    _output/fort.b0198
5.4M    _output/fort.b0199
4.4M    _output/fort.b0200
1016M   total

Timing (with ghost cells)

============================== Timing Data ==============================

Integration Time (stepgrid + BC + overhead)
Level           Wall Time (seconds)    CPU Time (seconds)   Total Cell Updates
  1                    37.756                 37.681            0.128E+08
  2                   518.820                517.725            0.189E+09
total                 556.576                555.406            0.201E+09

All levels:
stepgrid              549.949                548.787    
BC/ghost cells          3.697                  3.690
Regridding             21.682                 21.599  
Output (valout)         7.405                  7.297  

Total time:           587.754                586.401  
Using  1 thread(s)

Note: The CPU times are summed over all threads.
      Total time includes more than the subroutines listed above

=========================================================================

Storage (with ghost cells)

.....
8.6M    _output/fort.b0192
8.6M    _output/fort.b0193
8.6M    _output/fort.b0194
8.6M    _output/fort.b0195
8.6M    _output/fort.b0196
8.6M    _output/fort.b0197
8.6M    _output/fort.b0198
8.6M    _output/fort.b0199
7.6M    _output/fort.b0200
1.6G    total

donnaaboise · 2019-01-06T11:13:50Z

And as a final data point, here are the timing results using the ascii format (without ghost cells, I assume) for the 3d advection example, with outstyle=3, nout=200, nstep=1.

The binary output is about 27x faster than the ascii (4.581s vs. 122.280s in valout), and the binary files are roughly a third of the size (1016M vs. 2.8G).

Timing (without ghost cells)

============================== Timing Data ==============================

Integration Time (stepgrid + BC + overhead)
Level           Wall Time (seconds)    CPU Time (seconds)   Total Cell Updates
  1                    38.053                 37.970            0.128E+08
  2                   531.170                530.122            0.189E+09
total                 569.223                568.091            0.201E+09

All levels:
stepgrid              562.409                561.294    
BC/ghost cells          3.830                  3.809
Regridding             21.721                 21.672  
Output (valout)       122.280                120.765  

Total time:           715.239                712.529  
Using  1 thread(s)

Note: The CPU times are summed over all threads.
      Total time includes more than the subroutines listed above

=========================================================================

Storage (without ghost cells)

15M	_output/fort.q0192
15M	_output/fort.q0193
15M	_output/fort.q0194
15M	_output/fort.q0195
15M	_output/fort.q0196
15M	_output/fort.q0197
15M	_output/fort.q0198
15M	_output/fort.q0199
15M	_output/fort.q0200
2.8G	total

donnaaboise · 2019-01-22T15:45:49Z

Leave valout.f as is for now and Matlab code will strip ghost cells from binary output.

rjleveque · 2019-01-22T16:59:37Z

Let's discuss further at SIAM CSE and figure out how best to handle.

donnaaboise added 6 commits December 26, 2018 15:48

Don't write out ghost cells

5266220

Revert to writing out ghost cells

1236209

Add option to print only interior values in binary; ghost values are still output in 2d, however

8db4ae1

Add option to print only interior values; ghost values are stripped in output in 3d

84389a6

(advection_3d_wirl) UPdate Matlab files

2a31501

(advection_2d_swirl) Update Matlab files

9c8d8a2

donnaaboise mentioned this pull request Dec 26, 2018

Binary output : writing out ghost cells? #236

Open

Merge branch 'master' into donna_valout

939b104

donnaaboise closed this Jan 22, 2019
