Variable-Based iteration layout #855

franzpoeschel · 2020-12-21T16:25:35Z

Based on topic-adios-variables-for-attributes since "variable attributes" are needed for this.
For comparing the changes: franzpoeschel/openPMD-api@topic-adios-variables-for-attributes...franzpoeschel:topic-stepbased

This adds a third iteration layout to the openPMD API: Next to file-based and group-based, a step-based layout.

Idea: ADIOS prefers consuming data in steps. Until now, the openPMD API created entirely new datasets for each iteration, while ADIOS(2) actually allows reusing datasets across steps. A simple dataset may now look like this in ADIOS2:

  string    /basePath                        scalar
  uint64_t  /data/__step__                   10*scalar
  uint8_t   /data/closed                     10*scalar
  double    /data/dt                         10*scalar
  int8_t    /data/meshes/E/axisLabels        10*{1, 2}
  string    /data/meshes/E/dataOrder         10*scalar
  string    /data/meshes/E/geometry          10*scalar
  double    /data/meshes/E/gridGlobalOffset  10*{1}
  double    /data/meshes/E/gridSpacing       10*{1}
  double    /data/meshes/E/gridUnitSI        10*scalar
  float     /data/meshes/E/timeOffset        10*scalar
  double    /data/meshes/E/unitDimension     10*{7}
  int32_t   /data/meshes/E/x/__data__        10*{1000}
  double    /data/meshes/E/x/position        10*{1}
  double    /data/meshes/E/x/unitSI          10*scalar
  int32_t   /data/meshes/E/y/__data__        10*{__}
  double    /data/meshes/E/y/position        10*{1}
  double    /data/meshes/E/y/unitSI          10*scalar
  double    /data/time                       10*scalar
  double    /data/timeUnitSI                 10*scalar
  string    /date                            scalar
  string    /iterationEncoding               scalar
  string    /iterationFormat                 scalar
  string    /meshesPath                      scalar
  string    /openPMD                         scalar
  uint32_t  /openPMDextension                scalar
  string    /software                        scalar
  string    /softwareVersion                 scalar

This helps control the amount of metadata by reusing attributes and variables across steps.
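
To make the metadata argument concrete, here is a minimal toy model in C++. It is not openPMD-api code: the helper names `groupBasedNames` and `variableBasedNames` are made up for illustration, and the group-based path form `/data/<iteration>/...` is assumed from the usual openPMD base path. It only counts how many distinct variable names each layout would need.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Toy model only: these helpers are hypothetical and not part of openPMD-api.
// They return the distinct variable names a backend would need to define.

// Group-based: every iteration gets its own group, e.g. /data/100/meshes/E/x.
std::set<std::string> groupBasedNames(
    std::vector<std::uint64_t> const &iterations,
    std::vector<std::string> const &records)
{
    std::set<std::string> names;
    for (std::uint64_t iteration : iterations)
        for (std::string const &record : records)
            names.insert("/data/" + std::to_string(iteration) + "/" + record);
    return names;
}

// Variable-based: the iteration index is not part of the name, so every
// iteration maps onto the same variable and only the step differs.
std::set<std::string> variableBasedNames(
    std::vector<std::uint64_t> const &iterations,
    std::vector<std::string> const &records)
{
    std::set<std::string> names;
    for (std::size_t step = 0; step < iterations.size(); ++step)
        for (std::string const &record : records)
            names.insert("/data/" + record);
    return names;
}
```

With three iterations and the two records `meshes/E/x` and `meshes/E/y`, the group-based helper yields six names and the variable-based one yields two, independent of the number of iterations.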

TODO:

franzpoeschel · 2020-12-22T11:00:07Z

The latest commit removes the currentIteration group:

  string    /basePath                        scalar
  uint8_t   /data/closed                     10*scalar
  double    /data/dt                         10*scalar
  uint64_t  /data/iterationIndex             10*scalar
  int8_t    /data/meshes/E/axisLabels        10*{1, 2}
  string    /data/meshes/E/dataOrder         10*scalar
  string    /data/meshes/E/geometry          10*scalar
  double    /data/meshes/E/gridGlobalOffset  10*{1}
  double    /data/meshes/E/gridSpacing       10*{1}
  double    /data/meshes/E/gridUnitSI        10*scalar
  float     /data/meshes/E/timeOffset        10*scalar
  double    /data/meshes/E/unitDimension     10*{7}
  int32_t   /data/meshes/E/x/__data__        10*{1000}
  double    /data/meshes/E/x/position        10*{1}
  double    /data/meshes/E/x/unitSI          10*scalar
  double    /data/time                       10*scalar
  double    /data/timeUnitSI                 10*scalar
  string    /date                            scalar
  string    /iterationEncoding               scalar
  string    /iterationFormat                 scalar
  string    /meshesPath                      scalar
  string    /openPMD                         scalar
  uint32_t  /openPMDextension                scalar
  string    /software                        scalar
  string    /softwareVersion                 scalar

I'm not sure how to feel about this.
Pro: Looks nicer, is simpler
Con: It's no longer possible to distinguish attributes written to series.iterations and series.iterations[0].

Would this lead to problems? Are attributes commonly written to series.iterations?
Also if we go with this:

Even if the other backends don't support step-based iteration layout, make sure they deal correctly with openPath(path=""). The ADIOS2 backend didn't.

ax3l · 2021-02-08T03:03:41Z

Hi @franzpoeschel, can you please rebase this PR for review? :)

> #855 (comment)
>
> Pro: Looks nicer, is simpler
> Con: It's no longer possible to distinguish attributes written to series.iterations and series.iterations[0].
>
> Would this lead to problems? Are attributes commonly written to series.iterations?

The layout without the extra group is exactly what we want. Attributes that were on each /data/<iteration>/ should not clash with anything in /data by design of openPMD-standard, so we are good to go. This is similar to how we handle scalar and vector records already.
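
The point about attribute paths can be illustrated with a small sketch. This is a toy model under stated assumptions, not the backend implementation: `Layout`, `containerAttributePath` and `iterationAttributePath` are hypothetical names, and the path forms are taken from the listings earlier in this thread.

```cpp
#include <cstdint>
#include <string>

enum class Layout { GroupBased, VariableBased };

// Hypothetical helpers (not openPMD-api functions) that compute where an
// attribute ends up in the backend.

// Attribute set on the `series.iterations` container itself.
std::string containerAttributePath(std::string const &name)
{
    return "/data/" + name;
}

// Attribute set on a single iteration, e.g. `series.iterations[0]`.
std::string iterationAttributePath(
    Layout layout, std::uint64_t iteration, std::string const &name)
{
    if (layout == Layout::GroupBased)
        return "/data/" + std::to_string(iteration) + "/" + name;
    // Variable-based without the extra group: same path as the container.
    return "/data/" + name;
}
```

In the variable-based case both helpers return the same path, so the two attribute sources are only distinguishable as long as they never use the same attribute name, which is what the standard is said to guarantee above.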

I would rename iterationIndex to __steps__ as we already do for __data__.

For tests, I would say we can add as we do the "ext"(ension) argument for many steps another "encoding" argument and just increase write/read coverage by running many more unit tests through it.

franzpoeschel · 2021-02-08T10:28:38Z

> For tests, I would say we can add as we do the "ext"(ension) argument for many steps another "encoding" argument and just increase write/read coverage by running many more unit tests through it.

This iteration layout requires use of the streaming API, so the number of tests we can hijack is limited by that. But testing can surely be extended there.
Currently, picking this iteration layout is supported only via Series::setIterationLayout. I think we should add some kind of pattern for this as well, so applications won't have to explicitly add support for this layout. I propose something like series%V.bp. That would also simplify hijacking tests.
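
A minimal sketch of what such filename-based selection could look like. Everything here is hypothetical: `IterationLayout` and `layoutFromFilename` are illustration-only names, `%V` is only the pattern proposed in this comment (the commit list of this PR later removes the `%V` shorthand again), and padded patterns such as `%06T` are deliberately ignored.

```cpp
#include <string>

enum class IterationLayout { FileBased, GroupBased, VariableBased };

// Sketch of filename-based layout selection. `%T` is the existing expansion
// pattern for file-based series; `%V` is only the proposal from this comment.
// Simplification: padded forms such as `%06T` are not handled here.
IterationLayout layoutFromFilename(std::string const &filename)
{
    if (filename.find("%T") != std::string::npos)
        return IterationLayout::FileBased;
    if (filename.find("%V") != std::string::npos)
        return IterationLayout::VariableBased;
    // No pattern in the name: fall back to the group-based layout.
    return IterationLayout::GroupBased;
}
```

With this sketch, a test that already takes a filename could switch layouts by passing `series%V.bp` instead of `series.bp`, without any extra call in the application code.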

franzpoeschel · 2021-02-08T10:32:34Z

> Hi @franzpoeschel, can you please rebase this PR for review? :)

Done

> > #855 (comment)
> > Pro: Looks nicer, is simpler
> > Con: It's no longer possible to distinguish attributes written to series.iterations and series.iterations[0].
> > Would this lead to problems? Are attributes commonly written to series.iterations?
>
> The layout without the extra group is exactly what we want. Attributes that were on each /data/<iteration>/ should not clash with anything in /data by design of openPMD-standard, so we are good to go. This is similar to how we handle scalar and vector records already.

It's also a bit helpful for implementation. This way, I can also query Iteration::getAttribute("__step__") via Series::iterations::getAttribute("__step__"), so we know which iteration to open in the first place. Otherwise, things get a bit chicken-and-egg ;)
So, since that one is currently implemented, I'm keeping it.
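
As a rough illustration of why having the index attribute directly under /data helps, here is a toy reader loop. It is a sketch under assumptions, not the PR's code: `StepAttributes` and `iterationsInStepOrder` are hypothetical, and each step is modeled as a plain map of the integer attributes visible under /data.

```cpp
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Toy model: the attributes visible directly under /data in one step.
using StepAttributes = std::map<std::string, std::uint64_t>;

// Returns, for each step in order, the openPMD iteration index it holds.
// `indexAttribute` is the name of the attribute carrying that index
// ("__step__" in this comment; the name is kept as a parameter).
std::vector<std::uint64_t> iterationsInStepOrder(
    std::vector<StepAttributes> const &steps,
    std::string const &indexAttribute)
{
    std::vector<std::uint64_t> result;
    result.reserve(steps.size());
    for (StepAttributes const &attributes : steps)
    {
        auto found = attributes.find(indexAttribute);
        if (found == attributes.end())
            throw std::runtime_error("Step lacks attribute " + indexAttribute);
        result.push_back(found->second);
    }
    return result;
}
```

The lookup needs no knowledge of the iteration index beforehand, which is the chicken-and-egg problem the flat layout avoids.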

> I would rename iterationIndex to __steps__ as we already do for __data__.

Done

ax3l · 2021-02-08T18:09:06Z

Should we default to the new attribute layout #813 with stepBased from the beginning?

Maybe I overlook it in the review, but I cannot spot where this is exactly triggered in this PR if it is.

franzpoeschel · 2021-04-08T12:59:45Z

Think I found an ancient bug: ADIOS1 reports the datatype of unitDimension as VEC_DOUBLE. This never came up since the unitDimension attributes are default-initialized and the defaults were not overwritten by what was actually in the file. They now are.

src/Series.cpp

franzpoeschel · 2021-04-09T09:33:15Z

> Should we default to the new attribute layout #813 with stepBased from the beginning?
>
> Maybe I overlook it in the review, but I cannot spot where this is exactly triggered in this PR if it is.

The ADIOS2 backend will now automatically switch to the new layout when selecting step-based iteration encoding.

Otherwise, this is ready for review from my side :) @ax3l

ax3l · 2021-04-21T08:51:05Z

Awesome, can you please rebase out the little merge conflicts from the last merge and I'll get to it before I am on PTO :-)

franzpoeschel · 2021-04-22T10:01:37Z

Passing apart from a CI run that fails to initialize. I think this one is ready @ax3l

ax3l · 2021-04-23T06:44:44Z

Awesome! Addressing the brew macOS CI thingy in #970

ax3l

Awesome! 🚀 ✨

ax3l · 2021-04-24T01:25:52Z

@franzpoeschel I just realized: did we intentionally call this now variableBased? We designed this as stepBased: openPMD/openPMD-standard#250
I can change this in the standards as well, I just cannot recall if we intentionally renamed it.

franzpoeschel · 2021-04-29T16:55:55Z

I initially named the new encoding "step-based", but in an offline discussion you suggested the use of "variable-based" encoding to stress the fact that we are using variables, i.e. datasets with version numbers, to encode openPMD iterations. @ax3l

ax3l · 2021-05-20T21:40:56Z

@franzpoeschel Thanks. I looked at this again from an openPMD perspective, where we don't define the term "variable".
We currently have a "Series of Iterations", which will likely be called "Series of Snapshots" in 2.0 openPMD/openPMD-standard#148

It's generally ok-ish to go with variableBased since it's ADIOS specific, but as I said it introduces a new term (stepBased would do so as well). Have no better idea at the moment either, though :)

franzpoeschel force-pushed the topic-stepbased branch 3 times, most recently from 82777da to 73cb618 Compare December 22, 2020 15:06

ax3l self-requested a review December 22, 2020 18:07

ax3l self-assigned this Dec 22, 2020

ax3l added the backend: ADIOS2 label Dec 22, 2020

franzpoeschel force-pushed the topic-stepbased branch from 73cb618 to 0c1b5b8 Compare December 23, 2020 11:55

franzpoeschel force-pushed the topic-stepbased branch from 2191931 to 201c9db Compare January 4, 2021 12:40

franzpoeschel force-pushed the topic-stepbased branch 2 times, most recently from 6d5b3bf to d397cce Compare January 19, 2021 11:43

ax3l mentioned this pull request Feb 4, 2021

ADIOS2: New Layout in openPMD 2 #920

Open

5 tasks

franzpoeschel force-pushed the topic-stepbased branch from d397cce to ea327c4 Compare February 8, 2021 10:15

franzpoeschel force-pushed the topic-stepbased branch 6 times, most recently from 12b6827 to 261ac39 Compare February 12, 2021 09:33

franzpoeschel mentioned this pull request Feb 12, 2021

ADIOS2: v2.7.0+ #927

Merged

3 tasks

franzpoeschel mentioned this pull request Feb 22, 2021

Large memory consumption in BufferSTL when using PerformPuts and Flush instead of Begin/EndStep (BP4) ornladios/ADIOS2#1891

Closed

franzpoeschel mentioned this pull request Mar 8, 2021

Use schema-based versioning in ADIOS2 backend #941

Merged

7 tasks

franzpoeschel force-pushed the topic-stepbased branch 2 times, most recently from 497ae3e to d05a222 Compare March 8, 2021 11:40

ax3l mentioned this pull request Mar 8, 2021

IterationEncoding: variableBased openPMD/openPMD-standard#250

Open

10 tasks

franzpoeschel force-pushed the topic-stepbased branch from d05a222 to 3bf4aec Compare March 9, 2021 17:43

franzpoeschel force-pushed the topic-stepbased branch 2 times, most recently from 52f311e to 480d343 Compare April 8, 2021 12:55

franzpoeschel force-pushed the topic-stepbased branch from a9e5d9c to 21418ce Compare April 9, 2021 09:30

franzpoeschel added 11 commits April 22, 2021 10:44

Step-based iteration layout

75f436f

Add eager parsing test

bec97cb

Datasets with changing dimensions require re-parsing

Fully re-read an Iteration in stepping mode

a2817dc

Remove %V shorthand to select variable-based layout

2a0accf

Hijack some other tests for variable-based encoding

bb9cfd4

Refine re-parsing

7330e33

When re-parsing a Series, we must keep old handles valid, but we might need to delete stale map members.

Test that attributes don't occur in the wrong step

1f9369a

Test still failing

Specify more precisely when to re-read attributes

1b322d3

Fix ADIOS1 bug: Wrong datatype reported for unitDimension

1636c31

Rename __step__ -> snapshot

c12d8fc

Code cleanup and in-code documentation

db5f0a7
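
The note on the "Refine re-parsing" commit (keep old handles valid, delete stale map members) can be sketched as follows. This is a toy model with assumed types, not the code from the commit: `Record`, `RecordMap` and `reparse` are hypothetical, and a "handle" is modeled as a `std::shared_ptr` that user code may still hold.

```cpp
#include <map>
#include <memory>
#include <set>
#include <string>

// Toy stand-in for a parsed object that user code may hold a handle to.
struct Record
{
    int timesParsed = 0;
};

using RecordMap = std::map<std::string, std::shared_ptr<Record>>;

// Re-parse `records` against the names currently found in the step.
void reparse(RecordMap &records, std::set<std::string> const &namesInStep)
{
    // 1. Delete stale map members that no longer exist in this step.
    for (auto it = records.begin(); it != records.end();)
    {
        if (namesInStep.count(it->first) == 0)
            it = records.erase(it);
        else
            ++it;
    }
    // 2. Reuse existing objects so old handles stay valid; create new ones
    //    only for names that were not known before.
    for (std::string const &name : namesInStep)
    {
        std::shared_ptr<Record> &slot = records[name];
        if (!slot)
            slot = std::make_shared<Record>();
        ++slot->timesParsed;
    }
}
```

A handle obtained before the second `reparse` call still points at the same object afterwards, while entries that vanished from the step are no longer reachable through the map.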

franzpoeschel force-pushed the topic-stepbased branch from 21418ce to db5f0a7 Compare April 22, 2021 08:53

ax3l added the api: new additions to the API label Apr 23, 2021

ax3l approved these changes Apr 23, 2021

ax3l merged commit 86a15ba into openPMD:dev Apr 23, 2021

ax3l mentioned this pull request Apr 23, 2021

[WIP] openPMD: variableBased IterationEncoding ECP-WarpX/WarpX#1909

Closed

ax3l changed the title ~~Step-Based iteration layout~~ Variable-Based iteration layout Apr 24, 2021

franzpoeschel mentioned this pull request Jul 19, 2021

Extend global single value variables to behave more like variable attributes ornladios/ADIOS2#2792

Open
