Split Series into an internal and an external class #886

Merged
merged 8 commits into from
Feb 25, 2021

Conversation

franzpoeschel
Contributor

@franzpoeschel franzpoeschel commented Jan 6, 2021

For comparing: franzpoeschel/openPMD-api@47d504d...franzpoeschel:topic-internal-series

This is a suggestion to address the issues with our current C++ object model of the openPMD hierarchy, observed in #534, #804 and #814.

The interface is now feature-complete; existing applications using the openPMD API should work as before. (Exception: the class Attributable no longer exists and is now a template AttributableImpl<T>.)

The major downside of the approach that I am proposing is that it is a very heavy change, moving large chunks of code to different places. Updating this to the current dev will likely be a project of its own.

I've tried to avoid breaking our API, but given that this is a heavy change to our frontend classes, I still consider this API breaking.

Idea: Split Series into several classes:
1. An internal class internal::SeriesData that holds the actual data members, but has no logic otherwise:

// simplified
class SeriesData : public AttributableData
{
public:
    explicit SeriesData() = default;

    SeriesData(SeriesData const &) = delete;
    SeriesData(SeriesData &&) = delete;

    SeriesData & operator=(SeriesData const &) = delete;
    SeriesData & operator=(SeriesData &&) = delete;

    virtual ~SeriesData() = default;

    Container< Iteration, uint64_t > iterations;

OPENPMD_private:
    //[...]
    std::shared_ptr< std::string > m_filenamePrefix;
    std::shared_ptr< std::string > m_filenamePostfix;
    std::shared_ptr< int > m_filenamePadding;
    //[...]
}; // SeriesData

This class cannot be copied or moved, making it safe to reference by pointer. The thousands of shared_ptr members can be replaced by plain data members in a subsequent step.
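
The "safe to reference by pointer" property can be checked at compile time. The following is a minimal sketch, not the actual openPMD-api code: PinnedData and its members are hypothetical stand-ins for internal::SeriesData, and the plain members only illustrate the proposed follow-up step.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical stand-in for internal::SeriesData: data members only, no logic.
struct PinnedData
{
    PinnedData() = default;

    PinnedData(PinnedData const &) = delete;
    PinnedData(PinnedData &&) = delete;

    PinnedData & operator=(PinnedData const &) = delete;
    PinnedData & operator=(PinnedData &&) = delete;

    virtual ~PinnedData() = default;

    // Plain members instead of shared_ptr members (assumed follow-up step).
    std::string filenamePrefix;
    int filenamePadding = 0;
};

// The object can never change its address through a copy or a move.
static_assert(
    !std::is_copy_constructible<PinnedData>::value, "must not be copyable");
static_assert(
    !std::is_move_constructible<PinnedData>::value, "must not be movable");
static_assert(
    !std::is_copy_assignable<PinnedData>::value, "must not be copy-assignable");
static_assert(
    !std::is_move_assignable<PinnedData>::value, "must not be move-assignable");

// A pointer taken once keeps observing later modifications of the object.
inline bool pointerStaysValid()
{
    PinnedData data;
    PinnedData * observer = &data;
    data.filenamePrefix = "simData_";
    return observer->filenamePrefix == "simData_";
}
```

Any accidental copy or move of such a class becomes a compile error rather than a dangling pointer at runtime.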

2. A class SeriesImpl that has no data members, but holds only the implementation and interface of Series and a single pointer to SeriesData. This pattern works only because SeriesData is not moveable and the pointer to the instance will hence not change:

// simplified
class SeriesImpl : AttributableImpl
{
friend class Iteration;

public:
    explicit SeriesImpl() = default;

    std::string
    openPMD() const;
    /** Set the version of the enforced <A
     * HREF="https://github.com/openPMD/openPMD-standard/blob/latest/STANDARD.md#hierarchy-of-the-data-file">openPMD
     * standard</A>.
     *
     * @param   openPMD   String <CODE>MAJOR.MINOR.REVISION</CODE> of the
     * desired version of the openPMD standard.
     * @return  Reference to modified series.
     */
    SeriesImpl &
    setOpenPMD( std::string const & openPMD );

    /**
     * @return  32-bit mask of applied extensions to the <A
     * HREF="https://github.com/openPMD/openPMD-standard/blob/latest/STANDARD.md#hierarchy-of-the-data-file">openPMD
     * standard</A>.
     */
    uint32_t
    openPMDextension() const;

    // […]

    /** Execute all required remaining IO operations to write or read data.
     */
    void flush();

protected:
    // […]
    internal::SeriesData * m_series;

    inline internal::SeriesData &
    get()
    {
        return *m_series;
    }

    inline internal::SeriesData const &
    get() const
    {
        return *m_series;
    }
    // […]
};

These can then be combined to define an internal and an external Series class:
3. Internal Series:

// simplified
   class SeriesInternal
        : public SeriesData
        , public SeriesImpl
    {
    public:
#if openPMD_HAVE_MPI
        SeriesInternal(
            std::string const & filepath,
            Access at,
            MPI_Comm comm,
            std::string const & options = "{}" );
#endif

        SeriesInternal(
            std::string const & filepath,
            Access at,
            std::string const & options = "{}" );
        ~SeriesInternal();

        inline
        SeriesData & getSeries()
        {
            return *this;
        }

        inline
        SeriesData const & getSeries() const
        {
            return *this;
        }
    };

This class is neither copyable nor movable since it derives directly from SeriesData. It is merely a SeriesData with implementation.
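
How the data class and the implementation class combine can be sketched as follows, assuming the pointer-based variant of SeriesImpl. All names (DemoData, DemoImpl, DemoInternal) are hypothetical and only mirror the roles of SeriesData, SeriesImpl and SeriesInternal.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical stand-in for internal::SeriesData: state only, pinned in memory.
struct DemoData
{
    DemoData() = default;
    DemoData(DemoData const &) = delete;
    DemoData(DemoData &&) = delete;
    DemoData & operator=(DemoData const &) = delete;
    DemoData & operator=(DemoData &&) = delete;
    virtual ~DemoData() = default;

    std::string version = "1.1.0";
};

// Hypothetical stand-in for SeriesImpl: logic only, reaches state via a pointer.
class DemoImpl
{
public:
    std::string openPMD() const
    {
        return get().version;
    }

    DemoImpl & setOpenPMD(std::string const & version)
    {
        get().version = version;
        return *this;
    }

protected:
    explicit DemoImpl(DemoData * data) : m_data(data)
    {}

    DemoData & get()
    {
        return *m_data;
    }

    DemoData const & get() const
    {
        return *m_data;
    }

    DemoData * m_data;
};

// Hypothetical stand-in for SeriesInternal: data and implementation in one object.
class DemoInternal
    : public DemoData
    , public DemoImpl
{
public:
    // DemoData is listed first, so it is constructed before DemoImpl
    // receives the pointer to it.
    DemoInternal() : DemoData(), DemoImpl(static_cast<DemoData *>(this))
    {}
};

// Deriving from the pinned data class makes the combined class pinned as well.
static_assert(
    !std::is_copy_constructible<DemoInternal>::value, "must not be copyable");
static_assert(
    !std::is_move_constructible<DemoInternal>::value, "must not be movable");

inline std::string demoRoundTrip()
{
    DemoInternal series;
    series.setOpenPMD("1.0.1");
    return series.openPMD();
}
```

Since DemoInternal cannot be copied or moved, the pointer stored in DemoImpl refers to the same subobject for the whole lifetime of the instance.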

4. An external class Series with (nearly) the same interface as the current Series that can be safely copied.

// simplified
class Series : public SeriesImpl< Series >
{
private:
    std::shared_ptr< internal::SeriesInternal > m_series;

public:
#if openPMD_HAVE_MPI
    Series(
        std::string const & filepath,
        Access at,
        MPI_Comm comm,
        std::string const & options = "{}" );
#endif

    Series(
        std::string const & filepath,
        Access at,
        std::string const & options = "{}" );

    Container< Iteration, uint64_t > iterations;

    /**
     * @brief Entry point to the reading end of the streaming API.
     *
     * Creates and returns an instance of the ReadIterations class which can
     * be used for iterating over the openPMD iterations in a C++11-style for
     * loop.
     * Look for the ReadIterations class for further documentation.
     *
     * @return ReadIterations
     */
    ReadIterations readIterations();

    /**
     * @brief Entry point to the writing end of the streaming API.
     *
     * Creates and returns an instance of the WriteIterations class which is a
     * restricted container of iterations which takes care of
     * streaming semantics.
     * The created object is stored as member of the Series object, hence this
     * method may be called as many times as a user wishes.
     * Look for the WriteIterations class for further documentation.
     *
     * @return WriteIterations
     */
    WriteIterations writeIterations();

    // @todo make these private
    inline
    internal::SeriesData & getSeries()
    {
        return m_series->getSeries();
    }

    inline
    internal::SeriesData const & getSeries() const
    {
        return m_series->getSeries();
    }
};

Note that the actual data members are hidden behind the shared_ptr<SeriesInternal> m_series. The destructor of SeriesInternal will only run once all copies of Series are gone.
The classes ReadIterations and WriteIterations can now safely hold copies of the Series; pointers are no longer necessary. (This is the reason why those methods are part of Series and not of SeriesImpl.)
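
The lifetime behaviour of the external class can be sketched in isolation. This is a simplified illustration under assumed names: SharedState plays the role of SeriesInternal, Handle the role of Series, and the destructor counter exists only for demonstration.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for internal::SeriesInternal.
// The counter records how often the destructor runs.
struct SharedState
{
    explicit SharedState(int * destroyed) : destroyedCounter(destroyed)
    {}
    SharedState(SharedState const &) = delete;
    SharedState & operator=(SharedState const &) = delete;
    ~SharedState()
    {
        ++*destroyedCounter;
    }

    std::string name = "unnamed";
    int * destroyedCounter;
};

// Hypothetical stand-in for the external Series: a cheap, copyable handle.
class Handle
{
public:
    explicit Handle(int * destroyed)
        : m_state(std::make_shared<SharedState>(destroyed))
    {}

    void setName(std::string const & name)
    {
        m_state->name = name;
    }

    std::string const & name() const
    {
        return m_state->name;
    }

private:
    std::shared_ptr<SharedState> m_state;
};

// Returns the number of destructor calls after all handles are gone.
inline int destructorRunsOnce()
{
    int destroyed = 0;
    {
        Handle original(&destroyed);
        {
            Handle copy = original; // shares the state, does not duplicate it
            copy.setName("simData");
            assert(original.name() == "simData");
        }
        // The copy is gone, but the original still keeps the state alive.
        assert(destroyed == 0);
    }
    return destroyed;
}
```

A helper object such as ReadIterations could store a Handle by value in the same way and would keep the shared state alive for as long as it exists.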

Note: A similar split was necessary for Attributable. I have left those parts out of the code snippets above. We want the interface of Attributable available both for SeriesInternal and for Series. The data members should again be present only in the internal class.
The template pattern that I have used for this corresponds to interfaces/traits in other languages (e.g. implement SeriesImpl for SeriesInternal) and is easily composable, so this was no issue.

TODO

  • Discuss whether we should continue this or go for another approach.
  • Look into compile times
  • Make WriteIterations/ReadIterations take a full copy
  • Rebase onto Series copy test thingy
  • Naming. AttributableImpl -> Attributable, LegacyAttributable -> ?
  • Cleanup
  • Idea: I think we can keep roughly the same design by using the PIMPL pattern: Instead of passing SeriesWrapper as a type parameter to SeriesImpl, we could instead just directly pass a pointer to SeriesData. This would heavily reduce the amount of templates (and hence compile time), and the new design ensures that passing a pointer is safe.

@franzpoeschel franzpoeschel force-pushed the topic-internal-series branch 3 times, most recently from f78a2c7 to d419923 Compare February 18, 2021 17:05
@franzpoeschel franzpoeschel changed the title [WIP] [Draft] Split Series into an internal and an external class [WIP] Split Series into an internal and an external class Feb 18, 2021
@franzpoeschel franzpoeschel force-pushed the topic-internal-series branch 3 times, most recently from 7866418 to c68def1 Compare February 19, 2021 10:11
@franzpoeschel franzpoeschel changed the title [WIP] Split Series into an internal and an external class Split Series into an internal and an external class Feb 19, 2021
@franzpoeschel franzpoeschel requested a review from ax3l February 19, 2021 14:28
namespace internal
{
/**
* @brief Data members for Series. Pinned at one memory location.
Member

Note: maybe it's good to describe that Impl and Data are mixin classes and that ...Internal is used as the fixed in-memory storage for each interface class.

Contributor Author

I have added that description to design.rst.

@franzpoeschel franzpoeschel mentioned this pull request Feb 24, 2021
2 tasks
Make AttributableImpl destructor virtual

Rename Attributable -> LegacyAttributable

Use Attributable::retrieveSeries for Writable::flushSeries
Make BASEPATH a member of SeriesImpl

Documentation for Series
This is stolen from PR openPMD#804.
This PR fixes the issues from that one, so those tests are passing now.
Member

@ax3l ax3l left a comment

This looks really good, thank you for spearheading this!

I left a small question, and we should definitely document the new structure, which can also go with further PRs that refactor other classes.

Do you plan a follow-up that removes LegacyAttributable?


Container< Iteration, uint64_t > iterations;
internal::SeriesData * m_series;
Member

I wonder whether a raw pointer is the best pattern at this point?

Contributor Author

The …Impl classes are pure interface classes, with the explicit goal of allowing us to be generic over resource management. The reason why we need the Impl classes is because we want a unified implementation for classes that manage their resources in different ways:

  • As part of the same object, pinned in one memory location, as in SeriesInternal.
  • Managed as a shared resource with a shared_ptr, as in Series.

So, the concrete approach to resource management depends on the class deriving from SeriesImpl, and SeriesImpl must support any approach. Two ways to do this:

  1. Use templates to specify the kind of resource management to be used by SeriesImpl. That's what I had implemented previously. Avoids using pointers, at the cost of introducing templates, nearly doubling compile times in my tests.
  2. Relieve SeriesImpl of the task of resource management altogether and just use a pointer. It's the task of the deriving class to deal with resource management and ensure via RAII mechanisms that this is done in a safe manner.
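
The two options can be compared side by side in a reduced sketch. All class names here are hypothetical, and the template variant is written as a plain CRTP base, which is only an assumption about how the earlier template-based implementation worked.

```cpp
#include <cassert>
#include <memory>

// Pinned state shared by both variants (hypothetical).
struct CounterData
{
    CounterData() = default;
    CounterData(CounterData const &) = delete;
    CounterData & operator=(CounterData const &) = delete;

    int value = 0;
};

// Option 1: the interface is a template over the deriving class and asks
// that class where the data lives. No pointer member, more templates.
template <typename Derived>
class TemplatedImpl
{
public:
    int increment()
    {
        return ++static_cast<Derived &>(*this).data().value;
    }
};

class PinnedTemplated
    : public CounterData
    , public TemplatedImpl<PinnedTemplated>
{
public:
    CounterData & data()
    {
        return *this;
    }
};

class SharedTemplated : public TemplatedImpl<SharedTemplated>
{
public:
    CounterData & data()
    {
        return *m_owner;
    }

private:
    std::shared_ptr<CounterData> m_owner = std::make_shared<CounterData>();
};

// Option 2: the interface only holds a raw pointer. The deriving class
// owns the data and guarantees that the pointer stays valid.
class PointerImpl
{
public:
    int increment()
    {
        return ++m_data->value;
    }

protected:
    explicit PointerImpl(CounterData * data) : m_data(data)
    {}

    CounterData * m_data;
};

class PinnedPointer
    : public CounterData
    , public PointerImpl
{
public:
    PinnedPointer() : CounterData(), PointerImpl(static_cast<CounterData *>(this))
    {}
};

class SharedPointer : public PointerImpl
{
public:
    // The base is constructed before m_owner, so the pointer is set in the body.
    SharedPointer() : PointerImpl(nullptr), m_owner(std::make_shared<CounterData>())
    {
        m_data = m_owner.get();
    }

private:
    std::shared_ptr<CounterData> m_owner;
};

inline int sumOfIncrements()
{
    PinnedTemplated a;
    SharedTemplated b;
    PinnedPointer c;
    SharedPointer d;
    return a.increment() + b.increment() + c.increment() + d.increment();
}
```

In the pointer variant, copying SharedPointer copies the raw pointer together with the shared_ptr that keeps its target alive, so the copy remains valid. The pinned variant cannot be copied at all because its data base class forbids it.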

* Access via stepStatus() method to automatically select the correct
* one among both flags.
*/
std::shared_ptr< StepStatus >
Member

Memo to ourselves: as mentioned in the PR description, those shared_ptr members will become regular members

@franzpoeschel
Contributor Author

Do you plan a follow-up that removes LegacyAttributable?

As explained in the documentation now, LegacyAttributable is a class that adds the Attributable mixin to classes that do not yet follow the new design. So, if we apply the new design to all our classes, we should be able to remove that one again some day.
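
A rough sketch of what such a bridging class could look like, assuming a pointer-based Attributable interface. The names AttrData, AttrInterface, LegacyAttr and OldStyleRecord are hypothetical and do not reproduce the actual openPMD-api classes.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for the Attributable data class.
struct AttrData
{
    AttrData() = default;
    AttrData(AttrData const &) = delete;
    AttrData & operator=(AttrData const &) = delete;

    std::map<std::string, std::string> attributes;
};

// Hypothetical stand-in for the pointer-based Attributable interface.
class AttrInterface
{
public:
    void setAttribute(std::string const & key, std::string const & value)
    {
        m_attri->attributes[key] = value;
    }

    bool containsAttribute(std::string const & key) const
    {
        return m_attri->attributes.count(key) > 0;
    }

protected:
    explicit AttrInterface(AttrData * data) : m_attri(data)
    {}

    AttrData * m_attri;
};

// Hypothetical stand-in for LegacyAttributable: bundles the interface with
// its own storage, so classes not yet split into Data/Impl can keep using it.
class LegacyAttr : public AttrInterface
{
public:
    // m_storage is initialized after the base class, so the pointer
    // is set in the constructor body.
    LegacyAttr() : AttrInterface(nullptr)
    {
        m_attri = m_storage.get();
    }

private:
    std::shared_ptr<AttrData> m_storage = std::make_shared<AttrData>();
};

// A class that has not been migrated to the new design yet (hypothetical).
class OldStyleRecord : public LegacyAttr
{};

inline bool legacyCopySharesAttributes()
{
    OldStyleRecord record;
    OldStyleRecord copy = record; // copies pointer and owning shared_ptr
    copy.setAttribute("unitSI", "1.0");
    return record.containsAttribute("unitSI");
}
```

Once a class such as OldStyleRecord gets its own Data/Impl split, it would no longer need this bridge, which is why the bridging class could eventually be removed.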

Member

@ax3l ax3l left a comment

Hurray! ✨
