Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set the properties of KpointsData class in the constructor. #5942

Conversation

AhmedBasem20
Copy link
Contributor

Fixes #3287

@ltalirz
Copy link
Member

ltalirz commented Mar 26, 2023

@unkcpz Would you be able to review this?

@unkcpz unkcpz self-requested a review March 27, 2023 09:31
Copy link
Member

@unkcpz unkcpz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AhmedBasem20, thanks for your contribution. This class is more complex than it looks like, can you look at the https://github.com/aiidateam/aiida-quantumespresso/blob/01e759319552645072b7a89dd0347225ab14d635/src/aiida_quantumespresso/calculations/functions/create_kpoints_from_distance.py#L7, an implementation of how we use it in production use case and come back to the PR.

aiida/orm/nodes/data/array/kpoints.py Outdated Show resolved Hide resolved
tests/orm/nodes/data/test_kpoints.py Outdated Show resolved Hide resolved
@AhmedBasem20 AhmedBasem20 requested a review from unkcpz April 14, 2023 01:30
@unkcpz
Copy link
Member

unkcpz commented Apr 17, 2023

The new change however not address the original issue #3287.
I'd propose adding a new class method such as KpointsData.from_mesh(mesh=[2,2,2], offset=[0,0,0]) to be called and create the Data. Otherwise, you can also add a bit complex logic in the __init__ function to avoid the object create with cell, mesh and the same time (the problem I mentioned in #5942 (comment)).

@AhmedBasem20 AhmedBasem20 force-pushed the fix/3287/add-kpointsData-constructor branch from 69096ee to 31cdb8d Compare April 22, 2023 05:53
@AhmedBasem20
Copy link
Contributor Author

Hi @unkcpz I'm sorry for the delay.

I modified the __init__ to avoid setting mesh or cell twice. The current behavior is if the user provides cell and structuredata at the same time, the structuredata will be ignored. Same for mesh and distance.

@mbercx
Copy link
Member

mbercx commented Apr 22, 2023

Hi both, and thanks a lot @AhmedBasem20 for the contribution! The KpointsData class is used by pretty much 90% of AiiDA plugins, so a bold choice to tackle that. 😄

Since I just did quite a rant on data types and their constructors in the discussions, I also wanted to join in here for the review. Let me begin with the most important points from that discussion, adapted for this PR:

  1. Data type constructors should be complete but minimal.

And adapt point 4. from the TL;DR 1 of that discussion's OP:

  1. If we want to create an instance of the class from other larger data structures, we should use a from_X-type method.

So to respond to Jusong's comments:

I'd propose adding a new class method such as KpointsData.from_mesh(mesh=[2,2,2], offset=[0,0,0]) to be called and create the Data.

I'm 💯% on board when it comes to creating a KpointsData from a StructureData node, even though that also add complications, see #3287 (comment). However, one could argue that the mesh input is exactly part of the "minimal and complete" input as I describe above for a KpointsData that describes a kpoints mesh. The fact that you can't properly define a minimal set of properties that represent both a kpoints list and kpoints mesh is one of the reasons I would move towards using separate classes, but that's way beyond the scope of this PR. Still, we should try our best, see below.

Otherwise, you can also add a bit complex logic in the __init__ function to avoid the object create with cell, mesh and the same time (the problem I mentioned in #5942 (comment)).

Your comment immediately highlights one of the reasons I'm against adding superfluous input arguments to a constructor's signature: you have to somehow deal with all the cases, and often it's not entirely clear how to set precedence. Let's avoid that "complex logic" and simply add from_X methods.

Minimal but complete inputs

Now how can we define the set of minimal and complete inputs for KpointsData? Besides my statement that we can't because we try to put different concepts into one class, let's go over the inputs and see what we can do:

  • 🛑 structuredata
  • cell
  • pbc

I think it's already clear what I'm going to say here: structuredata has to go, implement a from_structuredata classmethod instead. Next to this, I'm not sure if the KpointsData needs pbc if we remove the set_kpoints_mesh_from_density method, but that's a backwards-incompatible change beyond the scope of this PR.

  • mesh
  • 🛑 distance
  • offset
  • 🛑 force_parity

mesh and offset have to stay, since it's how we define the k-points mesh concept in the KpointsData class. distance and force_parity can be moved to the from_X method.

  • kpoints
  • 🛑cartesian
  • labels
  • weights
  • 🛑 fill_values

Looking at these I'm getting more and more convinced we should have two separate classes, but I will stop my moaning. kpoints has to stay (obviously ^^), so do labels and weights. cartesian and fill_values can be moved to the corresponding classmethod, e.g. from_list.

@unkcpz what do you think? Since we still have to deal with both kpoints "list" and "mesh" being the same class, we should add some validatation. From the three groups above, the first needs to be set for both "list' and "mesh", the second should only be set for "mesh" and the third only for "list".

Not to beat a dead horse, but this validation again reaffirms my belief that we should separate the two concepts in different classes. I'm frankly confused myself: do we also want an offset for "list"-type KpointsData? Should we validate against providing that input in the constructor? A plugin developer might or might not use offset for some weird reason for "list" types. The clarity that separate classes will bring will go a long way towards making them more user-friendly and avoiding unexpected behaviour.

Conclusion

In short, in this PR I think we should:

  1. Remove the structuredata, distance, force_parity, cartesian and fill_values inputs from the __init__() constructor.
  2. Do some basic validation that the user doesn't specify e.g. both mesh and kpoints in the constructor.
  3. Add from_structure(distance, offset=None, force_parity=False) method, basically combining the empty constructor call with set_kpoints_mesh_from_density().
  4. Add from_list(self, kpoints, cartesian=False, labels=None, weights=None, fill_values=0) method, basically combining the empty constructor call with set_kpoints().

I also think we should add type hinting, but I'm happy to do that in a separate PR. Sorry for the long read, let me know if you have any questions, objections, suggestions!

Footnotes

  1. Which I can't link to directly apparently, -2 for discussions

@mbercx mbercx self-requested a review April 22, 2023 08:21
@AhmedBasem20
Copy link
Contributor Author

Hi @mbercx, thanks a lot for the detailed explanation and suggestions. I don't really get why we should keep the kpoints parameter in __init__ without using it, besides that, I applied the suggested changes (hopefully!).
I noticed that some methods need to be run sequentially (i.e. setting the cell, before setting the kpoints).
So I think chaining the methods like this: kpt=KpointsData.from_structure(structuredata).from_list(kpoints) would be nice.
What do you think? I'll be waiting for your feedback.

@unkcpz
Copy link
Member

unkcpz commented Apr 25, 2023

@mbercx Thanks a lot for the long reading comment, very helpful!
I am second on having a separate class for kpoint to make things more clearly defined, but in this PR, it may be too overwhelming for @AhmedBasem20. I think what he refactored is very close to what you proposed, correct?

Since we still have to deal with both kpoints "list" and "mesh" being the same class, we should add some validatation.

What do you mean by validation? I think for the class it self the type hint should be enough for developers. For users, maybe validate on is the mesh and offset is a three elements list/tuple?

@unkcpz
Copy link
Member

unkcpz commented Apr 25, 2023

I noticed that some methods need to be run sequentially (i.e. setting the cell, before setting the kpoints).
So I think chaining the methods like this: kpt=KpointsData.from_structure(structuredata).from_list(kpoints) would be nice.

@AhmedBasem20 good point, I think this is why @mbercx suggest combining the empty constructor call with set_kpoints_mesh_from_density() for from_structure. In my opinion, it is not a good idea to chaining the methods, I'd rather separate these two and make sure every constructor works independently.

@AhmedBasem20
Copy link
Contributor Author

I think this is why @mbercx suggest combining the empty constructor call with set_kpoints_mesh_from_density() for from_structure.

@unkcpz I'm sorry, I don't seem to get it. shouldn't the set_cell_from_structure() be added in from_structure?

I'm struggling a bit with this PR. I got from you and @mbercx 's detailed review that:

  • set_kpoints() used for kpoints list.
  • set_kpoints_mesh() and set_kpoints_mesh_from_density() used for kpoints mesh.
  • set_cell() and set_cell_from_structure() used for both types of kpoints.

I attempt to apply what @mbercx proposed in my last commit but it seems that I made some mistakes. So I think a thorough code review would be helpful.
Thank you both for your time.

@sphuber
Copy link
Contributor

sphuber commented May 17, 2023

Thanks for all the work you have done so far. I agree with @mbercx that the real problem here is the fact that a single class is serving two different purposes. But we cannot and should not change that here and now. This should be left for v3.0 since it will be backwards incompatible. This is being tracked in #2686 and aiidateam/team-compass#21

That being said, I think we can move forward with the changes proposed in this PR if the following conditions are met:

  1. They provide a useful feature
  2. The changes are fully backwards-compatible
  3. The changes do no unnecessarily complicate the future migration/refactoring of KpointsData to another plugin for aiida-core==3.0.

I think 1 is pretty clearly satisfied and I think so is 2, since only classmethods are added. The constructor being overridden should also be fine.

That leaves 3. The only potential problem that I can see is that we are now adding new methods, that will have to be deprecated and changed in the future. Are these new methods really necessary? Would the improved constructor not be sufficient for most cases?

  • from_structure: As it stands, it is neither complete nor very useful.
  • from_density: This should currently raise an AttributeError because it will try to get the cell which is not defined. Really this should be merged with from_structure to something like
    @classmethod
    def from_density(structure: StructureData, density: int | float, offset: list[int] | None = None, force_parity: bool = False):
    But then again this method is currently provided by aiida_quantumespresso.calculations.functions.create_kpoints_from_distance with a bunch of additional checks. So is now the right time to be adding this here?
  • from_list: Is also incomplete, because it would require cell and pbc.

I think we should maybe keep this PR to the minimum scope of the original issue and just allow complete initialization through the constructor. As it stands, the current constructor in this PR is not complete I think. There are two varieties that should be constructable:

  1. Kpoints based on mesh
  2. Kpoints based on explicit list

Both cases require the cell and pbc attributes (with a default for pbc possible). Then each case requires:

  1. mesh and optional offset
  2. kpoints, with optional cartesian, labels, weights and fill_values

So maybe a better constructor would be:

def __init__(
    self,
    cell: list[int | float],
    pbc: list[bool] | bool = True,
    mesh: list[int | float] | None = None,
    offset: list[int] | None = None,
    kpoints: list[list[int | float]] = None,
    cartesian: bool | None = None,
    labels: list[str] | None = None,
    weights: list[int | float] | None = None,
    fill_values: int | None = None
):
    if mesh is None and kpoints is None:
        raise ValueError('either `mesh` or `kpoints` should be specified.')

    blocked_args_kpoints = ('offset',)
    if kpoints is not None and and any(arg is not None for arg in blocked_args_kpoints):
        raise ValueError(
            f'following arguments cannot be specified when `kpoints` is defined: {", ".join(blocked_args_kpoints)}.'
        )

    blocked_args_mesh = ('cartesian', 'labels', 'weights', 'fill_values')
    if mesh is not None and any(arg is not None for arg in blocked_args_mesh):
        raise ValueError(
            f'following arguments cannot be specified when `mesh` is defined: {", ".join(blocked_args_mesh)}.'
        )

I think that would be all the validation that would be necessary. The defaults of the arguments besides mesh and kpoints will be taken care off by the call to set_kpoints_from_mesh and set_kpoints.

Thoughts @mbercx @unkcpz ?

@AhmedBasem20 I suggest you don't change the code now, but we first discuss and agree on what we want. Only once we have made a decision, should we continue implementing or you might end up doing work for nothing going back and forth.

@mbercx
Copy link
Member

mbercx commented May 22, 2023

Thanks for the comments, @sphuber! Going back through my own comment above, it seems I was indeed tripped up a bit by the design of the KpointsData, which admittedly I had not gone through entirely, e.g. I wasn't aware from_list would also require the cell and pbc. We can debate whether these are required when redesigning the data types as discussed in aiidateam/team-compass#21.

Although I still believe the from_X methods could be useful, as they would allow us to create a more intuitive and user-friendly API for the two kpoints varieties KpointsData represents, I agree that it might be best if we stick to only adding the constructor here for simplicity. It's a useful addition on its own, and the discussion on its addition is separate from the from_X methods.

At first glance the constructor and validation you propose seems good. 👍 We should of course also still add a clear but as-succinct-as-possible docstring, as well as adapt/extend the documentation of KpointsData.

@AhmedBasem20 apologies for complicating the merging of this PR, but I did say it was a bold choice to tackle KpointsData. 🙃

@AhmedBasem20
Copy link
Contributor Author

@mbercx No worries, you're right it seems like it is not a very good first issue xD
But I find the discussion here about different design choices is quite interesting. I'll wait until you reach an agreement to tackle this:)

@sphuber
Copy link
Contributor

sphuber commented May 23, 2023

Shall we close this PR for the time being then? @mbercx do we have a design document where we can summarize discussions like this? This will be important for the planning of the redesign and don't want to loose the insights from this thread.

@unkcpz
Copy link
Member

unkcpz commented May 23, 2023

I agree to close the PR for the moment, @AhmedBasem20 is focusing more on the GSoC project which has high priority. I discussed this with him yesterday and we want to set a discussion and pair coding to come back to this PR when he got spare time from GSoC.

@mbercx do we have a design document where we can summarize discussions like this?

Maybe discourse is a good place to put all such things together when we have one 😉

@mbercx
Copy link
Member

mbercx commented May 23, 2023

Ah, I thought I was clear that I think adapting the constructor is still a valuable contribution? Not sure why we wouldn't just go ahead with this now?

Maybe discourse is a good place to put all such things together when we have one 😉

Not sure I agree with that, but that's a discussion for another day. ^^ It does remind me I still have to start that free trial! 🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Define KpointsData via constructor
5 participants