Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apply new frontend design to Container and deriving classes #1115

Merged
merged 7 commits into from
Dec 16, 2021

Conversation

franzpoeschel
Copy link
Contributor

@franzpoeschel franzpoeschel commented Sep 27, 2021

Follow-up to #886.

This splits Container and deriving classes into a data and an interface class.

TODO:

  • Do this to all those classes. Already done: Container, BaseRecord, RecordComponent, BaseRecordComponent, PatchRecordComponent, Iteration
  • Don't split those classes as they don't have extra data members: PatchRecord, ParticlePatches, ParticleSpecies (its only member ParticlePatches already has handle semantics), Mesh
  • Fix Python bindings
  • Update documentation

This PR prepares for a necessary UX frontend redesign that will be a follow-up:

  • The SCALAR trick for instantiating datasets that have no x,y,z components is confusing for users, I've had to explain this several times now and it doesn't really make sense. Make it so that instead of auto E_x = series.iterations[0].meshes["temperature"][SCALAR] one can equivalently write auto E_x = series.iterations[0].meshes["temperature"].
  • Allow users to create custom subgroups

To do this, probably do something like:

  • Separate the functionality to create subgroups into a class (maybe reuse the class Record for that as it has a similar purpose
  • Separate the functionality to create datasets into a class (now part of RecordComponent)

@franzpoeschel franzpoeschel force-pushed the topic-container-new-design branch 5 times, most recently from d0396be to 4747b28 Compare October 1, 2021 08:49
@franzpoeschel
Copy link
Contributor Author

franzpoeschel commented Oct 1, 2021

I have now also simplified the whole design again a bit.

First approach: Split Series into SeriesInterface and SeriesData. Have two classes that put these two together:

  • internal::SeriesInternal: This has a field SeriesData data; and simply adds SeriesInterface to it. Basically, nothing else but a SeriesData with an interface. Non-copiable, non-movable.
  • Series: The public class. It has a field std::shared_ptr<SeriesData> data; and also adds SeriesInterface to it.

While this works alright, it's quite a lot of classes.
Second approach: I've tried to reduce it to two classes now:

  • internal::SeriesData: unchanged compared to previous approach
  • Series: Now contains all the logic of the Series, and a shared pointer to internal::SeriesData.

This is now quite close to the old implementation again, with one key difference: All stateful data held by the Series class is found at one object that stays at the same place in memory. So, our classes still have one unique internal address which was the goal of the whole redesign.

The class internal::SeriesInternal is now completely gone. A non-owning instance of Series can be created from an instance of internal::SeriesData by defining a shared_ptr without destructor:

Series nonOwningInternalSeries{ std::shared_ptr< internal::SeriesData >{ &seriesData, []( auto const * ){} } };

One downside of this is that for class hierarchies, you now have one redundant shared_ptr per layer in the hierarchy. They are redundant, since they point to the same data, but static_casted to each layer in the hierarchy. There is still only one internal address per object.

@franzpoeschel
Copy link
Contributor Author

I've added documentation for this now. Otherwise, this PR is finished from my side and ready for review @ax3l
It's likely that #1025 will be slightly broken (SeriesInterface and AttributableInterface are now gone again)

@franzpoeschel franzpoeschel force-pushed the topic-container-new-design branch from 8e5e205 to ce67036 Compare October 22, 2021 09:15
@franzpoeschel
Copy link
Contributor Author

@eschnett Hi Erik,
this PR will likely lead to conflicts with your Julia bindings: It simplifies the design of Series and Attributable, by e.g. removing the SeriesInterface class and retaining only two classes: SeriesData for all data members and Series for the implementation. Similar for Attributable.
The same separation between data and implementation classes is applied to several other classes in the openPMD hierarchy, see comments above.

How should we go forward on this? Would it be ok for you to check how much this breaks the Julia bindings? In which order would you prefer your and this PR to be applied?

There's no hurry as I'm in vacation next week.

(Also, some time after your PR is merged, we generally should synchronize on how to maintain the bindings)

@eschnett
Copy link
Contributor

@franzpoeschel I think the ball is currently in my court to add CI tests for the Julia bindings. Before that it'd be difficult for you to maintain the Julia bindings, or to update them when necessary.

Given this, I'd say: go ahead and merge the changes. The branch with the Julia bindings will continue to work (since it doesn't have the new changes), and I'll merge the changes and update the Yggdrasil build in one go. This way there won't be any breakage seen from Julia.

I'd love to have a preview on your changes, but realistically not in the next week, so don't let me slow you down.

@franzpoeschel
Copy link
Contributor Author

Ok, then we'll go forward with this first :)

If you need any help merging, just ask

Copy link
Member

@ax3l ax3l left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really good to me. Thanks a lot for this!

Just one comment about the implementation of the get()ers and one follow-up idea about catching up documenting the m_ members in ...Data classes.

Comment on lines +135 to +136
return const_cast< internal::BaseRecordComponentData & >(
static_cast< BaseRecordComponent const * >( this )->get() );
Copy link
Member

@ax3l ax3l Dec 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For these getters, wouldn't return *m_baseRecordComponentData; work here as well and be more legible? :)

I think duplicating the single line does no harm as they are close by and is more readable then the current construct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah .. maybe you're right about this :D
bit of tunnel vision on my side there

Copy link
Member

@ax3l ax3l Dec 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries, feel free to fix in a separate follow-up.

RecordComponentData & operator=( RecordComponentData const & ) = delete;
RecordComponentData & operator=( RecordComponentData && ) = delete;

std::queue< IOTask > m_chunks;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For all the ...Data classes, it would be really good if we try to catch up and add Doxygen strings to the m_ member variables.

Can be a follow-up and we can split the work if we are unsure about some. I remember we found them hard to follow at times and it's good to document the purpose of individual m_ member variables.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: Inside #1154 I began aliasing these data classes (needed that for something else in there) which made the whole thing a lot more readable in my opinion, e.g.:

namespace internal {
    class RecordComponentData{ /**/ };
}

class RecordComponent {
    using DataClass = internal::RecordComponentData;
    std::shared_ptr< DataClass > m_recordComponentData;
    /**/
};

Copy link
Member

@ax3l ax3l Dec 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is very nice, yes 💖

But my comment was actually about adding doxygen to the members in the data class, so we know what they are for:

struct RecordComponentData
{
    // ...

    //! container for holding data chunks
    std::queue< IOTask > m_chunks;

    //! this sets things on fire
    bool m_fire;
};

I recall that all the original m_dirty and m_flushed members among others were often hard to understand for us without searching through their usage.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, got you
Yep, these classes are probably a good chance to have documentation for our internal state at one central place

@ax3l
Copy link
Member

ax3l commented Dec 15, 2021

Starting CI again because we haven't pushed to the branch in a bit (to see if merge is still green).

@ax3l ax3l closed this Dec 15, 2021
@ax3l ax3l reopened this Dec 15, 2021
Copy link
Member

@ax3l ax3l left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If CI passes then I'll merge this already, since my comments can all be addressed in follow-ups.

@ax3l ax3l self-assigned this Dec 15, 2021
@ax3l ax3l enabled auto-merge (squash) December 15, 2021 23:39
@ax3l ax3l merged commit ad6cab1 into openPMD:dev Dec 16, 2021
@ax3l ax3l mentioned this pull request May 25, 2022
2 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants