add high-throughput screening workchain #146

federicazanca · 2024-07-05T09:34:03Z

This adds a workchain that takes the files in a folder and runs a given calculation on each of them. The intention is to use it later inside a training workchain.
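The folder-scanning step described above could look roughly like the following. This is a minimal sketch, not the code in the PR: it assumes the structures are CIF files sitting directly inside the folder, and the helper name `find_cif_files` is hypothetical.

```python
from pathlib import Path


def find_cif_files(folder):
    """Return the CIF files directly inside `folder`, sorted by path.

    Hypothetical helper: the name and the non-recursive behaviour are
    assumptions made for illustration only.
    """
    root = Path(folder)
    if not root.is_dir():
        raise NotADirectoryError(f"{folder!r} is not a directory")
    # Compare the suffix case-insensitively so "a.CIF" is picked up too,
    # and skip directories that happen to end in ".cif".
    return sorted(
        p for p in root.iterdir() if p.is_file() and p.suffix.lower() == ".cif"
    )
```

Each returned path would then be turned into a structure and handed to one sub-process; that part depends on the chosen calculation and is left out here.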
Problems:

janus-core and aiida-qe are incompatible; see issue 196 in janus-core:
https://github.com/stfc/janus-core/issues/196
Adding more codes with different entry points might make this more complex than expected. Ideally, I would like to pass the entry point of the calculation to run as an input and then use it to set up the calculation.
To explain better: this is how you load a CalcJob or a workflow, using the entry points mlip.opt and quantumespresso.pw.relax as examples:

from aiida.plugins import CalculationFactory, WorkflowFactory

geomopt_janus = CalculationFactory("mlip.opt")
geomopt_qe = WorkflowFactory("quantumespresso.pw.relax")

This is how the WorkChain then declares its inputs. spec.input declares the inputs used by the first steps of the workchain itself, and each sub-process gets its own inputs as a dictionary-like namespace through spec.expose_inputs.

from aiida.engine import WorkChain
from aiida.orm import Str


class HTSWorkChain(WorkChain):
    @classmethod
    def define(cls, spec):
        super().define(spec)

        spec.input("folder", valid_type=Str, help="Folder containing CIF files")

        # spec.input("entrypoint", valid_type=Str, help="calculation entry point")

        # exclude expects a sequence of port names, so pass a tuple rather than a bare string
        spec.expose_inputs(geomopt_janus, namespace="janus_inputs", exclude=("struct",), required=False)
        spec.expose_inputs(geomopt_qe, namespace="qe_inputs", exclude=("struct",), required=False)

At the moment I cannot use the "entrypoint" input to choose the type of calculation inside the workchain, because both spec.input and spec.expose_inputs have to be declared in define(), before any input values are known. So I basically need a separate set of keywords for every calculation I want to run in high-throughput mode (in this example, either the janus optimisation or the QE relax workchain).
I will ask about this on the AiiDA Discourse, but if it cannot be fixed I will have to choose manually which types of calculations to add to the workchain.
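If the manual route is taken, one way to keep it manageable is a small lookup table that maps each supported entry point to its plugin group and to the namespace exposed in define(). The sketch below is plain Python with no AiiDA imports so that it runs standalone. The names `SUPPORTED` and `select_inputs` are hypothetical, and the table only mirrors the two entry points and namespaces from the example above.

```python
# Hypothetical lookup table: entry point -> (plugin group, input namespace).
# The namespaces mirror those passed to spec.expose_inputs in the example.
SUPPORTED = {
    "mlip.opt": ("calculation", "janus_inputs"),
    "quantumespresso.pw.relax": ("workflow", "qe_inputs"),
}


def select_inputs(entrypoint, inputs):
    """Return (plugin group, sub-process inputs) for the chosen entry point."""
    try:
        group, namespace = SUPPORTED[entrypoint]
    except KeyError:
        raise ValueError(
            f"unsupported entry point {entrypoint!r}; choose from {sorted(SUPPORTED)}"
        ) from None
    if namespace not in inputs:
        raise ValueError(f"{entrypoint!r} needs inputs under {namespace!r}")
    return group, inputs[namespace]


# "some_option" is a placeholder key, not a real input of mlip.opt.
group, sub_inputs = select_inputs("mlip.opt", {"janus_inputs": {"some_option": 1}})
```

Inside a workchain step, the returned group would decide whether to call CalculationFactory or WorkflowFactory; that wiring is not shown and has not been tested against AiiDA. Adding a new code then means one new table row plus one new expose_inputs line.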

TLDR
Making a high-throughput screening workchain that supports different types of calculations (quantumespresso, janus opt, janus sp, etc.) might make it needlessly complex for me and for the user, so I need to decide what is worth doing:
What calculations do we want in the high-throughput screening workchain? Are janus optimisation and QE optimisation enough?
Should I skip the high-throughput workchain and proceed with the training?

federicazanca mentioned this issue Jul 5, 2024

High throughput wc #147

Open
