Use ADIOS variables for openPMD attributes #813

Merged

franzpoeschel
Contributor

@franzpoeschel franzpoeschel commented Nov 2, 2020

Note: Don't be scared by the diff. This PR is based on my topic-streaming branch, since many of the issues leading to it became obvious only while implementing streaming mode.
For comparing: franzpoeschel/openPMD-api@topic-streaming...franzpoeschel:topic-adios-variables-for-attributes

It turns out that ADIOS attributes are too restrictive for modeling openPMD attributes. ADIOS attributes are independent of ADIOS steps, whereas in openPMD we create a new set of attributes for each iteration (which in our Streaming API is equivalent to an ADIOS step). This is not the preferred way to model scientific data in ADIOS: ADIOS prefers metadata to be set up only once and then reused across steps. The mid-term goal might be to investigate a data layout in openPMD that agrees better with this concept. Until then, the short-term solution is to use ADIOS variables for openPMD attributes.

ADIOS provides a zero-dimensional variable shape called global single value for storing scalars, intended for metadata that changes over steps. Since our metadata is not exclusively scalar, we cannot fully rely on those, but need to use regular global array variables as well.

In order to distinguish openPMD datasets from openPMD attributes, openPMD datasets will from now on be suffixed by /__data__. As an example, an E field created by PIConGPU:

int8_t    /data/0/fields/E/axisLabels                                      {3, 2}                                                                           
  string    /data/0/fields/E/dataOrder                                       scalar                                                                           
  string    /data/0/fields/E/fieldSmoothing                                  scalar                                                                           
  string    /data/0/fields/E/geometry                                        scalar                                                                           
  double    /data/0/fields/E/gridGlobalOffset                                {3}    
  float     /data/0/fields/E/gridSpacing                                     {3}   
  double    /data/0/fields/E/gridUnitSI                                      scalar      
  float     /data/0/fields/E/timeOffset                                      scalar
  double    /data/0/fields/E/unitDimension                                   {7}   
  float     /data/0/fields/E/x/__data__                                      {56, 56, 56}
  float     /data/0/fields/E/x/position                                      {3}    
  double    /data/0/fields/E/x/unitSI                                        scalar
  float     /data/0/fields/E/y/__data__                                      {56, 56, 56}
  float     /data/0/fields/E/y/position                                      {3}   
  double    /data/0/fields/E/y/unitSI                                        scalar 
  float     /data/0/fields/E/z/__data__                                      {56, 56, 56}
  float     /data/0/fields/E/z/position                                      {3}    
  double    /data/0/fields/E/z/unitSI                                        scalar

With bpls, the -A switch can no longer be used to display only openPMD attributes without datasets. As a workaround, the following regexp may be used: bpls --regexp '^(.(?!/__data__))*$'.
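
The naming rule described above can be sketched in a few lines of C++. This is only an illustration, not the code from this PR: the helper names `isDataset` and `attributesOnly` are hypothetical, and the check looks at the `/__data__` suffix only (the bpls regexp is slightly stricter, since it rejects names containing `/__data__` anywhere after the first character).

```cpp
#include <string>
#include <vector>

// Suffix that marks an openPMD dataset in the new layout (taken from the PR text).
static const std::string dataSuffix = "/__data__";

// Hypothetical helper: true if the ADIOS variable name denotes an openPMD dataset.
inline bool isDataset( std::string const & name )
{
    return name.size() >= dataSuffix.size() &&
        name.compare(
            name.size() - dataSuffix.size(),
            dataSuffix.size(),
            dataSuffix ) == 0;
}

// Hypothetical helper: keep only the variable names that model openPMD attributes.
inline std::vector< std::string >
attributesOnly( std::vector< std::string > const & names )
{
    std::vector< std::string > res;
    for( auto const & n : names )
    {
        if( !isDataset( n ) )
        {
            res.push_back( n );
        }
    }
    return res;
}
```

Applied to the listing above, `/data/0/fields/E/x/__data__` would be classified as a dataset, while `/data/0/fields/E/x/unitSI` and `/data/0/fields/E/gridSpacing` would be kept as attributes.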

TODO/DONE:

  • Feature-complete, yet possibly not fully efficient implementation of openPMD attributes via ADIOS variables.
  • A toggle between the old and new ADIOS layout in order to keep this breaking change optional and experimental for a while. (Thinking that this one is going to be ugly..)
  • Documentation
  • Add a CI test that sets OPENPMD_NEW_ATTRIBUTE_LAYOUT=1
  • Wait for Streaming Support #570 to be merged
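
A minimal sketch of how an environment-variable toggle such as OPENPMD_NEW_ATTRIBUTE_LAYOUT could be read. The helper `envFlag` is hypothetical and not the openPMD-api implementation. It assumes that the value "1" (as set in the CI test mentioned above) enables the switch; how other values are treated is not stated in this thread.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: interpret an environment variable as a boolean switch.
// Assumption: only the value "1" enables it; an unset variable yields the default.
inline bool envFlag( char const * name, bool defaultValue )
{
    char const * raw = std::getenv( name );
    if( raw == nullptr )
    {
        return defaultValue;
    }
    return std::string( raw ) == "1";
}
```

Since the new layout is opt-in, a caller would use `false` as the default, e.g. `bool useNewLayout = envFlag( "OPENPMD_NEW_ATTRIBUTE_LAYOUT", false );`.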

Reimplementations of things that would come out of the box by using ADIOS attributes:

  • Buffered writing at the right times.
  • Buffered reading (missing only for the vector<string> type)
  • The vector<string> type. This one is implemented as a two-dimensional char-array where each line represents a zero-terminated (zero-padded) string. While this works fine as a serialization for openPMD data using ADIOS2, self-descriptiveness is somewhat limited by this approach as the output of bpls on such an attribute shows:
# new:
  int8_t       /vecString          {3, 8}
    (0,0)    118 101 99 116 111 114
    (0,6)    0 0 111 102 0 0
    (1,4)    0 0 0 0 115 116
    (2,2)    114 105 110 103 115 0
# vs. old:
  string       /vecString                    attr   = {"vector", "of", "strings"}                                                                             

Update: The -S flag on bpls solves this:

int8_t /vecString {3, 8}
(0,0) "vector"
(1,0) "of"
(2,0) "strings"
  • Pre-reading attributes upon adios2::Engine::BeginStep. Especially in the SST engine, each performed variable read requires communication back to the data source (writing application). Hence, we might want to (lazily) read and buffer attributes (= ADIOS variables without /__data__ suffix) upon each BeginStep.
    Update: I've pushed an implementation for this. There are some evil typecasts involved in order to allow buffering everything into one contiguous slab of memory.
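
The vector<string> serialization described in the list above (a two-dimensional char array with one zero-padded row per string) can be sketched as follows. This is an illustration under assumptions, not the code from this PR: the names `StringMatrix`, `encode` and `decode` are hypothetical, and the row width is taken as the longest string plus one terminating zero, which is consistent with the {3, 8} shape shown in the bpls output.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Row-major 2D char buffer: `height` rows of `width` bytes each.
struct StringMatrix
{
    std::size_t height = 0;
    std::size_t width = 0;
    std::vector< char > data;
};

// Pack strings into zero-padded rows. The width is the longest string
// plus one byte, so that every row ends with at least one zero.
inline StringMatrix encode( std::vector< std::string > const & strings )
{
    StringMatrix res;
    res.height = strings.size();
    for( auto const & s : strings )
    {
        res.width = std::max( res.width, s.size() + 1 );
    }
    res.data.assign( res.height * res.width, '\0' );
    for( std::size_t i = 0; i < strings.size(); ++i )
    {
        std::copy(
            strings[ i ].begin(),
            strings[ i ].end(),
            res.data.begin() + i * res.width );
    }
    return res;
}

// Unpack: read each row up to its first zero byte.
inline std::vector< std::string > decode( StringMatrix const & m )
{
    std::vector< std::string > res;
    res.reserve( m.height );
    for( std::size_t i = 0; i < m.height; ++i )
    {
        auto rowBegin = m.data.begin() + i * m.width;
        auto rowEnd = rowBegin + m.width;
        res.emplace_back( rowBegin, std::find( rowBegin, rowEnd, '\0' ) );
    }
    return res;
}
```

For {"vector", "of", "strings"} this yields a 3 x 8 buffer whose bytes match the bpls dump above. One limitation of this sketch is that strings containing embedded zero bytes are truncated when decoded.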

@ax3l ax3l changed the title Use ADIOS variables for openPMD attributes [WIP] Use ADIOS variables for openPMD attributes Nov 3, 2020
@franzpoeschel franzpoeschel force-pushed the topic-adios-variables-for-attributes branch from b66faab to a9f258f Compare November 5, 2020 14:30
@franzpoeschel franzpoeschel force-pushed the topic-adios-variables-for-attributes branch from cb5120d to 74df9b9 Compare November 24, 2020 12:04
@franzpoeschel
Contributor Author

franzpoeschel commented Nov 24, 2020

  • A toggle between the old and new ADIOS layout in order to keep this breaking change optional and experimental for a while. (Thinking that this one is going to be ugly..)

Trying that one on a new branch currently.

Things are mostly running, but I still need to hack some workarounds back in for some edge cases.

EDIT: Merged back into this branch here, things are working now

@franzpoeschel franzpoeschel force-pushed the topic-adios-variables-for-attributes branch 4 times, most recently from 1bb5ec1 to 64c55bb Compare December 9, 2020 16:57
@franzpoeschel franzpoeschel force-pushed the topic-adios-variables-for-attributes branch 4 times, most recently from 3070ff3 to 96dd562 Compare December 16, 2020 16:20
@franzpoeschel franzpoeschel force-pushed the topic-adios-variables-for-attributes branch 4 times, most recently from 072fa93 to 9b4767f Compare December 23, 2020 11:55
@franzpoeschel franzpoeschel force-pushed the topic-adios-variables-for-attributes branch 2 times, most recently from 599c439 to 955cfa2 Compare January 4, 2021 16:17
@franzpoeschel
Contributor Author

franzpoeschel commented Jan 4, 2021

I suspect that the BP engine has issues with scalar string variables on Windows.

3: C:\projects\openpmd-api\test\SerialIOTest.cpp(3032): FAILED:
3:   {Unknown expression after the reported line}
3: due to unexpected exception with message:
3:   Unknown iterationEncoding: �*��L

Possibly related: ornladios/ADIOS2#2561

@franzpoeschel franzpoeschel force-pushed the topic-adios-variables-for-attributes branch 2 times, most recently from 77309e6 to 17b5ee0 Compare January 4, 2021 22:29
@franzpoeschel
Contributor Author

Greetings from Windows. Consider the following example program:

#include <adios2.h>
#include <iostream>
#include <string>
#include <vector>

int
main( int argsc, char ** argsv )
{

    std::string engine_type = "bp4";

    adios2::ADIOS adios{ };
    adios2::IO IO = adios.DeclareIO( "IO" );
    IO.SetEngine( engine_type );

#define PUT_INT true

    {
        // write
        adios2::Engine engine = IO.Open( "stream", adios2::Mode::Write );
        engine.BeginStep();
#if PUT_INT
        auto othervariable = IO.DefineVariable< int >(
            "someothertype", { 10 }, { 0 }, { 10 } );
        std::vector< int > v( 10, 1234 );
        engine.Put( othervariable, v.data() );
#endif
        auto var1 = IO.DefineVariable< std::string >( "firststring" );
        auto var2 = IO.DefineVariable< std::string >( "secondstring" );
        std::string firststring = "firststring";
        std::string secondstring = "secondstring";
        engine.Put( var1, firststring );
        engine.Put( var2, secondstring );
        engine.EndStep();
        engine.Close();
    }
    {
        // read
        adios2::Engine engine = IO.Open( "stream", adios2::Mode::Read );
        engine.BeginStep();
        auto var1 = IO.InquireVariable< std::string >( "firststring" );
        auto var2 = IO.InquireVariable< std::string >( "secondstring" );
        std::string firststring;
        std::string secondstring;
        engine.Get( var1, firststring );
        engine.Get( var2, secondstring );
#if PUT_INT
        auto othervariable = IO.InquireVariable< int >( "someothertype" );
        std::vector< int > v( 10 );
        engine.Get( othervariable, v.data() );
#endif
        engine.EndStep();

        std::cout << " First Variable:\t" << firststring
                    << "\nSecond Variable:\t" << secondstring << std::endl;
#if PUT_INT
        std::cout << "Int variable:\t" << v[ 0 ] << std::endl;
#endif
        engine.Close();
    }
}

With ADIOS 2.6.0, this code snippet gives the following error:
[Image "Unbenannt" (untitled): screenshot of the error, not reproduced here]

With ADIOS2 2.6.0, but with #define PUT_INT false:

First Variable:        firststring
Second Variable:        secondstring 

With ADIOS2 2.7.0 (and #define PUT_INT true):

First Variable:        firststring                                                                                                                          
Second Variable:        secondstring                                                                                                                         
Int variable:   1234

Conclusion: Scalar string variables are broken on Windows in ADIOS2 2.6.0. I'll deactivate the Windows tests until we move to ADIOS2 2.7.0 (which I propose we do soon).

franzpoeschel added a commit to franzpoeschel/openPMD-api that referenced this pull request Jan 18, 2021
@franzpoeschel franzpoeschel force-pushed the topic-adios-variables-for-attributes branch from 17b5ee0 to 51d57d3 Compare January 18, 2021 13:11
@franzpoeschel franzpoeschel changed the title [WIP] Use ADIOS variables for openPMD attributes Use ADIOS variables for openPMD attributes Jan 18, 2021
May be opted in to via OPENPMD_NEW_ATTRIBUTE_LAYOUT or
JSON parameter adios2.new_attribute_layout = true

Original commits:

Write datasets as /.../__data__ (don't read yet)

Read datasets written as /.../__data__ back correctly

Write attributes as global single values

Read attributes back from single global values

Make attribute/variable puts Sync for now

Enable stream-based processing of file-based engines

Keep track of attributes written during the running step

Write VEC_STRING as 2D variable

Read 2D string vector attributes

Fix indexing in VEC_STRING attributes

Buffer attribute writes

Still write them in sync mode tho

Use deferred attribute writes

Perform buffered attribute writes only upon advance/close_file

Preload attributes

Use preloaded attributes

Fix loading of complex types

Properly call destructors of all types

Remove special handling for strings

Fix struct visibility

Fix initialization order fiasco

Make the CI happy

1. const stuff
2. allow building when openPMD_HAVE_ADIOS2=0

Avoid some needless copying

Adhere to rule of 5

Somewhat fix Attribute loading

Parameterize flush by performputs implementation

Allow old and new mode via JSON

adios2.newAttributeLayout = true

Reimplement BP4 workaround

Some documentation

Some testing

Use OPENPMD_NEW_ATTRIBUTES_LAYOUT env var

Two little fixes

Undo unnecessary whitespace changes

Add some documentation

Add comments to ADIOS2PreloadAttributes header

Add CI run for new attribute layout

Support ADIOS2 SSC engine

Deactivate new layout tests on Windows

See here
openPMD#813 (comment)

Undo some unnecessary whitespace changes

Some cleanup

Don't use std::function destructors
@franzpoeschel franzpoeschel force-pushed the topic-adios-variables-for-attributes branch from c35deab to 1ee6193 Compare January 29, 2021 15:05
@@ -210,7 +224,7 @@ ADIOS2IOHandlerImpl::fileSuffix() const
static std::map< std::string, std::string > endings{
{ "sst", "" }, { "staging", "" }, { "bp4", ".bp" },
{ "bp3", ".bp" }, { "file", ".bp" }, { "hdf5", ".h5" },
-{ "nullcore", ".nullcore" }
+{ "nullcore", ".nullcore" }, { "ssc", ".ssc" }
Member

did this accidentally slip in? :)

@ax3l ax3l self-assigned this Feb 3, 2021
Member

@ax3l ax3l left a comment

Fantastic work, thank you! ✨

The comments I have are minor; I would merge this in already.

.github/workflows/unix.yml (resolved)
docs/source/backends/adios2.rst (resolved)
# endif
( std::is_same< T, rep >::value )
{
std::string metaAttr = "__is_boolean__" + name;
Member

memo to myself: standardize

( void )impl;
static std::set< std::string > streamingEngines = {
-"sst", "insitumpi", "inline", "staging", "nullcore"
+"sst", "insitumpi", "inline", "staging", "nullcore", "ssc"
Member

ssc slipped in here. We can generally leave it in or make it its own PR (that would be cleaner).

@ax3l ax3l merged commit ed2e057 into openPMD:dev Feb 3, 2021
@ax3l ax3l mentioned this pull request Feb 3, 2021
5 tasks
ax3l added a commit that referenced this pull request Feb 3, 2021
Minor updates to CI and docs from the #813 PR
@ax3l ax3l mentioned this pull request Feb 8, 2021
13 tasks