Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BasePwCpInputGenerator: Remove superfluous validation #926

Merged
merged 2 commits into from
May 5, 2023

Conversation

mbercx
Copy link
Member

@mbercx mbercx commented May 5, 2023

In the prepare_for_submission script of the BasePwCpInputGenerator,
there still is some validation for the pseudopotentials (see that the elements
match those in the provided structure) as well as the FIXED_COORDS
setting.

In general, we want to do all validation using validators of the appropriate port
or port name space. In the case of these two inputs, however, this validation is
already done by proper validators, so we can simply remove it.

Even in case the structure port is not in the name space because a wrapping
work chain excluded it when exposing the inputs, the validation will still occur when
the process is submitted in the steps of the work chain.

@mbercx
Copy link
Member Author

mbercx commented May 5, 2023

@sphuber I just opened this while working on #922. Looking into this raised an interesting question regarding validation. Just a few notes to discuss:

  1. Ideally validation should happen before a process is run, instead of excepting it, i.e. via a proper validator.
  2. Validation via a validator is only possible if the input is provided (obviously).
  3. Some inputs are only provided during runtime by work chains that wrap the process.

This has already led to issues in #722. Here we wanted to check that a parent_folder is provided for PwCalculations where a restart from the charge density is needed (nscf/bands). This is a somewhat unique type of validation, since we want to check if a certain input is present. Our solution there was a bit imperfect because it requires every work chains that wraps the PwCalculation to override its validator to PwCalculation.validate_inputs_base. I think I may have a better approach for this now, see further below.


🤔 To validator or not to validator

Here I was wondering about a different conundrum. I wanted to remove the following code:

# Check that a pseudo potential was specified for each kind present in the `StructureData`
kinds = [kind.name for kind in self.inputs.structure.kinds]
if set(kinds) != set(self.inputs.pseudos.keys()):
formatted_pseudos = ', '.join(list(self.inputs.pseudos.keys()))
formatted_kinds = ', '.join(list(kinds))
raise exceptions.InputValidationError(
'Mismatch between the defined pseudos and the list of kinds of the structure.\n'
f'Pseudos: {formatted_pseudos};\nKinds: {formatted_kinds}'
)

Since we are already validating this in a proper validator here:

structure_kinds = set(value['structure'].get_kind_names())
pseudo_kinds = set(value['pseudos'].keys())
if structure_kinds != pseudo_kinds:
return f'The `pseudos` specified and structure kinds do not match: {pseudo_kinds} vs {structure_kinds}'

And in accordance with [1], using a validator is preferable. However, I realised that this is only checked in case the structure input is actually provided. So now I was thinking about what to do.

My first inclination was that a wrapping work chain should take care of the validation. But this would of course lead to a lot of duplication of code. I.e. we don't want that the PwBandsWorkChain also has to check that the top-level structure elements all have a pseudo defined in each wrapped PwCalculation. This logic should be contained on the PwCalculation.

But following [1] and only using the validator means that the validation doesn't happen. At first I then thought that we should perhaps keep the code I'm removing here, so the excepted wrapping process still shows a more informative exception traceback. However, that isn't necessary, since when the wrapping work chain tries to submit the process, the "proper" validation still occurs (see the long traceback below).

Full Traceback
verdi process report 121
2023-05-05 16:33:48 [8 | REPORT]: [121|PwRelaxWorkChain|on_except]: Traceback (most recent call last):
  File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/plumpy/process_states.py", line 228, in execute
    result = self.run_fn(*self.args, **self.kwargs)
  File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/aiida/engine/processes/workchains/workchain.py", line 314, in _do_step
    finished, stepper_result = self._stepper.step()
  File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/plumpy/workchains.py", line 295, in step
    finished, result = self._child_stepper.step()
  File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/plumpy/workchains.py", line 538, in step
    finished, result = self._child_stepper.step()
  File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/plumpy/workchains.py", line 295, in step
    finished, result = self._child_stepper.step()
  File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/plumpy/workchains.py", line 246, in step
    return True, self._fn(self._workchain)
  File "/Users/mbercx/project/qe/code/aiida-quantumespresso/src/aiida_quantumespresso/workflows/pw/relax.py", line 243, in run_relax
    running = self.submit(PwBaseWorkChain, **inputs)
  File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/aiida/engine/processes/process.py", line 534, in submit
    return self.runner.submit(process, **kwargs)
  File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/aiida/engine/runners.py", line 183, in submit
    process_inited = self.instantiate_process(process, **inputs)
  File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/aiida/engine/runners.py", line 169, in instantiate_process
    return instantiate_process(self, process, **inputs)
  File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/aiida/engine/utils.py", line 64, in instantiate_process
    process = process_class(runner=runner, inputs=inputs)
  File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/plumpy/base/state_machine.py", line 194, in __call__
    inst.transition_to(inst.create_initial_state())
  File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/plumpy/base/state_machine.py", line 339, in transition_to
    self.transition_failed(initial_state_label, label, *sys.exc_info()[1:])
  File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/plumpy/processes.py", line 1003, in transition_failed
    raise exception.with_traceback(trace)
  File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/plumpy/base/state_machine.py", line 324, in transition_to
    self._enter_next_state(new_state)
  File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/plumpy/base/state_machine.py", line 384, in _enter_next_state
    self._fire_state_event(StateEventHook.ENTERING_STATE, next_state)
  File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/plumpy/base/state_machine.py", line 300, in _fire_state_event
    callback(self, hook, state)
  File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/plumpy/processes.py", line 329, in <lambda>
    lambda _s, _h, state: self.on_entering(cast(process_states.State, state)),
  File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/plumpy/processes.py", line 683, in on_entering
    call_with_super_check(self.on_create)
  File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/plumpy/base/utils.py", line 29, in call_with_super_check
    wrapped(*args, **kwargs)
  File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/aiida/engine/processes/process.py", line 397, in on_create
    super().on_create()
  File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/plumpy/base/utils.py", line 16, in wrapper
    wrapped(self, *args, **kwargs)
  File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/plumpy/processes.py", line 750, in on_create
    raise ValueError(result)
ValueError: Error occurred validating port 'inputs.pw': The `pseudos` specified and structure kinds do not match: {'mama'} vs {'Si'}

2023-05-05 16:33:48 [9 | REPORT]: [121|PwRelaxWorkChain|on_terminated]: remote folders will not be cleaned

So this long rant is basically to reconfirm that:

We should never validate by raising exceptions in the prepare_for_submission script.

Unless there is anything I'm missing? I.e. is there a use case where having validation in the prepare_for_submission script is preferable?

As a final note, the traceback could be made more succinct/clear. Perhaps we can have validators return a generic AiiDA ValidationError, which is caught in submit/run calls and simply returns a nicely formatted (cough using rich cough) message.


🔀 Dealing with complex validations

I already mentioned in the intro that sometimes validations can be tricky because of [2-3]. To repeat the example (repurposing the number refs here):

  1. We want to check that a parent_folder is present when CONTROL.calculation is e.g. nscf.
  2. The parent_folder will not be present in the inputs of e.g. the PwBandsWorkChain bands step, since it is only created on runtime.
  3. The validation will hence always fail, raising a warning when the builder is put together using the get_builder_from_protocol() method.

Our solution at the time was to split up the validation in a base and full one, and override the validator. However, after seeing this nice example of how the port namespace can be inspected:

if any(key not in port_namespace for key in ('pseudos', 'structure')):
return

I now think it'd be better to check if the parent_folder is in the name space, and only do the restart validation in this case. Wrapping work chains that dynamically assign this input should exclude the parent_folder port when exposing the inputs of the PwBaseWorkChain anyways. I've implemented an example of this in #927.

I'm going to write down these different use cases and how to best deal with them in a tutorial. Happy to have your input @sphuber!

@sphuber
Copy link
Contributor

sphuber commented May 5, 2023

I fully agree with your thought process:

  1. Ideally, all validation is done in validators not in prepare_for_submission.
  2. Validators should use the ctx (i.e. PortNamespace that is passed) to ensure the relevant input port is still present, and otherwise skip validation of that port. This will make validation robust with respect to wrapping workchains excluding certain ports.
  3. This approach will indeed remove certain validation at launch time of the outermost layer, but as you said, eventually each subprocess will be fully validated when it is launched.

Conclusion: ok to remove this validation in prepare_for_submission.

@mbercx
Copy link
Member Author

mbercx commented May 5, 2023

Thanks @sphuber! Note that this does mean that the validation will only occur in the step where the wrapped process is launched, which means that potentially multiple calculations will have been run in the work chain before the user finds out that a provided input was wrong. In the case of the structure and pseudos, the work chain could have in principle check this. But I believe we both agree that trying to do this for each work chain is not sustainable, unless we can somehow automate this. That said, if the user is aware of get_builder_restart and caching, not much work will be lost.

More importantly though:

As a final note, the traceback could be made more succinct/clear. Perhaps we can have validators return a generic AiiDA ValidationError, which is caught in submit/run calls and simply returns a nicely formatted (cough using rich cough) message.

What do you think about this? Could we somehow make the traceback of validation failures more readable? 😇

@mbercx mbercx changed the title Remove superfluous(?) pseudo validation BasePwCpInputGenerator: Remove superfluous validation May 5, 2023
@mbercx mbercx marked this pull request as ready for review May 5, 2023 20:31
@mbercx mbercx requested a review from sphuber May 5, 2023 20:31
@sphuber
Copy link
Contributor

sphuber commented May 5, 2023

But I believe we both agree that trying to do this for each work chain is not sustainable, unless we can somehow automate this. That said, if the user is aware of get_builder_restart and caching, not much work will be lost.

@mbercx
Copy link
Member Author

mbercx commented May 5, 2023

One final question: do you agree we should no longer add the PR number to the commit titles, since this information (and link) is already found easily when looking at the commit (see bottom left below), and the PR number is actually GitHub-specific?

Screenshot 2023-05-05 at 23 01 42

@sphuber
Copy link
Contributor

sphuber commented May 5, 2023

Yeah, fine for me to start omitting those. We already transitioned to using commit hashes in the CHANGELOG right?

@mbercx
Copy link
Member Author

mbercx commented May 5, 2023

We already transitioned to using commit hashes in the CHANGELOG right?

Yup! And with the PR number still easily findable on the commit page, there is no reason to keep them in the commit title I think.

@mbercx mbercx merged commit 0b6476c into aiidateam:main May 5, 2023
@mbercx mbercx deleted the improve/pw-validation branch May 5, 2023 21:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants