Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add (optional) support for the AiiDA Gaussian Datatypes #104

Merged
merged 13 commits into from
Sep 2, 2020

Conversation

dev-zero
Copy link
Contributor

@dev-zero dev-zero commented Apr 21, 2020

No description provided.

@dev-zero dev-zero force-pushed the feature/integrate-datatypes branch 2 times, most recently from 7fb5157 to 6871391 Compare April 22, 2020 16:39
Copy link
Contributor

@yakutovicha yakutovicha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @dev-zero, very nice work. I've made the first path through and I really liked it! I am not 100% sure if I understood everything, but if you can reply to my comments, things would get more clear. Essentially, the two big requests I have are:

  • Can you please provide a plain example that allows see how a user is supposed to define pseudo/basisset taking from the AiiDA Gaussian object (see my comment below).
  • Would it be possible to avoid the validate_basissets and validate_pseudos checks for the time being? I believe after we mature the API of Cp2kInput class (or replace it with cp2k-input-tools) they can be made way simpler and easier to maintain.

aiida_cp2k/calculations/__init__.py Outdated Show resolved Hide resolved
aiida_cp2k/calculations/datatype_helpers.py Outdated Show resolved Hide resolved
aiida_cp2k/calculations/__init__.py Show resolved Hide resolved
aiida_cp2k/calculations/__init__.py Show resolved Hide resolved
test/test_gaussian_datatypes.py Outdated Show resolved Hide resolved
test/test_gaussian_datatypes.py Outdated Show resolved Hide resolved
return _validate_gdt_namespace(basissets, DataFactory("gaussian.basisset"), "basis set")


def validate_basissets(inp, basissets, structure):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see a lot of code duplication with the validate_pseudos function below. Can we merge them? It would make a lot of sence to me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I am tempted to suggest to not use this check. It appears to be quite a difficult one to maintain. Let's maybe work a bit more on the Cp2kInput class to make it more robust and with a nice/powerful API. Once this is done, we introduce those checks back but we base them on the new Cp2kInput API. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see a lot of code duplication with the validate_pseudos function below. Can we merge them? It would make a lot of sence to me.

I actually tried it once (similar to the _validate_gdt_namespace), but I ended up with highly parametrized code which was barely readable. and not much less code. Which is why opted for some duplication in the end instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I am tempted to suggest to not use this check. It appears to be quite a difficult one to maintain. Let's maybe work a bit more on the Cp2kInput class to make it more robust and with a nice/powerful API. Once this is done, we introduce those checks back but we base them on the new Cp2kInput API. What do you think?

I don't like it for several reasons:

  • for the fully automated case where the user does not specify a KIND section anymore we have to generate the missing input at some place, hence we need some mechanism for that. Side-note: This could be done in Cp2kInput (or the respective cp2k-input-tools), but is also something which could be refactored later.
  • for the complex manual assignment cases we should be validating to avoid bogus links in the provenance graph and ensure CP2K is picking up only the basissets/pseudos passed by AiiDA.
  • loosening constraints (as done by these checks) should always be possible without breaking existing user-side code, while the other way round is most often not. So I would rather like to have it more complex on the developer side for now but safer and easier for the user to do the right thing, rather than the other way round

Copy link
Contributor

@yakutovicha yakutovicha Sep 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or the fully automated case where the user does not specify a KIND section anymore we have to generate the missing input at some place, hence we need some mechanism for that. Side-note: This could be done in Cp2kInput (or the respective cp2k-input-tools), but is also something which could be refactored later.

Will it work for multiple FORCE_EVAL sections? That was my primary concern. See an example here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I definitely agree that such a check is important. I am just worried that it may limit the generality of the plugin. My secondary concern is that we may need to reimplement things when we move to the cp2k-input-tools. I didn't try it yet, but I believe you have a nice navigation mechanism there.

Copy link
Contributor Author

@dev-zero dev-zero Sep 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will it work for multiple FORCE_EVAL sections? That was my primary concern. See an example here

In the "manual" mode (user specifies basisset/pseudo names explicitly and passes the respective objects as inputs): yes. Here the validation is:

  • for all KIND sections, see whether we have the basisset/pseudo in the basissets/pseudos inputs
  • for all passed basissets/pseudos verify that they have been referenced in a KIND section

In the fully automated mode: not yet. The difficulty is that we might have to parse the fragments and the types of the force_evals to figure out where to create a subsys/kind section and for which symbols, which is indeed better suited in cp2k-input-tools since there we can directly validate against the schema.

The problem is that without those checks, the user can essentially pass in whatever and it may or may not work depending on what is present on the target node, which gives a false sense of security and bogus provenance graphs at best, and calculations with unintended basissets/pseudos at worst.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests for multiple FORCE_EVAL added.

test/test_gaussian_datatypes.py Outdated Show resolved Hide resolved
test/test_gaussian_datatypes.py Outdated Show resolved Hide resolved
aiida_cp2k/calculations/datatype_helpers.py Outdated Show resolved Hide resolved
@dev-zero dev-zero force-pushed the feature/integrate-datatypes branch from 10a20af to 78a11ba Compare August 25, 2020 09:37
@dev-zero dev-zero force-pushed the feature/integrate-datatypes branch from 78a11ba to c24fec9 Compare August 31, 2020 09:17
@dev-zero
Copy link
Contributor Author

@yakutovicha thanks for the review, I am working on the test fixtures now (and trying to figure out why tests now fail even though I didn't change the functionality). Can you please give feedback on whether the examples are clean enough and the move of _unpack?

@yakutovicha
Copy link
Contributor

@yakutovicha thanks for the review, I am working on the test fixtures now (and trying to figure out why tests now fail even though I didn't change the functionality). Can you please give feedback on whether the examples are clean enough and the move of _unpack?

sure, I will do it by tomorrow morning.

@yakutovicha
Copy link
Contributor

@dev-zero fyi: I am testing things right now.

@dev-zero
Copy link
Contributor Author

dev-zero commented Sep 1, 2020

@yakutovicha cool, thanks. The test failure should now be fixed as well.

@dev-zero dev-zero force-pushed the feature/integrate-datatypes branch from d87273c to d3ed164 Compare September 2, 2020 12:01
@dev-zero dev-zero force-pushed the feature/integrate-datatypes branch from d3ed164 to ddeca70 Compare September 2, 2020 13:25
@yakutovicha yakutovicha self-requested a review September 2, 2020 13:45
Copy link
Contributor

@yakutovicha yakutovicha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's merge it! Thanks for the nice work @dev-zero!

@yakutovicha yakutovicha merged commit 7db3189 into aiidateam:develop Sep 2, 2020
@dev-zero dev-zero deleted the feature/integrate-datatypes branch September 3, 2020 12:24
@oschuett
Copy link
Collaborator

oschuett commented Sep 4, 2020

The dashboard test started failing after this PR. The error message says:

aiida.common.exceptions.MissingEntryPointError: Entry point 'gaussian.basisset' not found in group 'aiida.data'. Try running `reentry scan` to update the entry point cache.

However, we are already running reentry after installing cp2k-aiida.

@yakutovicha
Copy link
Contributor

Thanks for the report Ole. I will fix that.

@yakutovicha
Copy link
Contributor

@oschuett, I think #110 should fix the problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants