Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Utility calculators and a small Clarification to the interface #9

Closed
wants to merge 14 commits into from

Conversation

tjjarvinen
Copy link
Collaborator

This adds three "utility" calculators.

First is CombinationCalculator that can be used to combine group of calculators to one calculator. It allows serial or parallel execution of the calculators. Finally it has an option to edit keywords that are passed to the calculators.

Second is ReportingCalculator that can be used to collect information during calculation.

Third is SubSystemCalculator that splits the systtem to a smaller part and performs the calculation on that part.

Interface Clarification

There is also a clarification on the interface definition here. Non allocating force calculator has two options: 1) overwrite the given force array or 2) add to the given force array. The original definition did to clarify which of these two options the interface expects.

It should be noted that, if more than one calculator is needed option 1) becomes invalid. Also in option 1) the calculator has to overwrite the whole array, while in option 2) it can only write a part of the array.

This became a clear problem when implementing CombinationCalculator. So, I did the liberty of making option 2) the official interface. This of course needs approval from everyone @cortner, @mfherbst, @jgreener64 and @rkurchin

@cortner
Copy link
Member

cortner commented Dec 11, 2023

So you are saying that the in-place calculators are never allowed to zero-out the output array. It is always the responsibility of the caller. I'm ok with that, but it is a big potential source of bugs and therefore needs to be very prominently documented.

Copy link
Member

@cortner cortner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

first impressions. But more reviewing needed.

@@ -25,7 +25,7 @@ Each of the individual calls have two common inputs: `AtomsBase.AbstractSystem`
- First input is `AtomsBase.AbstractSystem` compatible structure
- Second input is `calculator` structure
- Method has to accept keyword arguments (they can be ignored)
- Non-allocating force call `force!` has an AbstractVector as the first input, to which the evaluated force values are stored (look for more details below)
- Non-allocating force call `force!` has an AbstractVector as the first input, to which the evaluated force values are **added to** (look for more details below)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we had agreed on forces, i.e. it should be forces!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes and it is. This is a typo

All of these calculators are defined in a submodule [UtilityCalculators](@ref). To use them just add

```julia
using AtomsCalculators.UntilityCalculators
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo

CombinationCalculator(calc1, calc2; multithreading=true)
```

By default multithreading is off, but giving kword `multithreading=true` toggles it on. This means that all calculators are launched at the same time.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you mean "are launched asynchronously"?


## Subsystem Calculator

[SubSystemCalculator](@ref) is used to split input system for a smaller parts. This is useful e.g. for QM/MM calculations.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

into smaller parts

```julia
sub_calc = SubSystemCalculator( mycalc, [1:10..., 15])
```

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from these docs, then functionality is actually unclear. It could mean that the system is split into clusters and the total energy on each cluster should be computed. Or it could mean that each system is split into domains, and the partial energy on each domain is computed. Or a combination of those, ...

test/runtests.jl Outdated
@@ -114,3 +117,68 @@ using AtomsCalculators.AtomsCalculatorsTesting
@test haskey(efv, :forces)
@test haskey(efv, :virial)
end


@testset "UntilityCalculators" begin
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo

@jgreener64
Copy link

Agree that the non-allocating force calculator should add not overwrite. We should document that clearly in the docs and relevant docstrings though, e.g. by saying that fill!(f, 0) can be called before forces!.

I also prefer ArgumentError to @assert for argument checking since asserts can be turned off.

@tjjarvinen
Copy link
Collaborator Author

This is a summary of our Zoom discussion

CombinationCalculator

This was agreed to be useful. A point was asked, should the multithreading be on by default or not?

A question about using a tuple of calculators, was also asked. This could cause some issues with dispatch and would need a solution on interface level.

Concern rouse on the keyword argument edit functionality. Is it needed? Would it hide details from user?...

ReportingCalculator

There was a discussion on how the messaging part should be done and is the name ReportingCalculator a correct one.

We discussed on possibility of implementing message calls to calculators. The issue with these is that every calculator would need to implement a call check.

The use of Channel in ReportingCalculator was questioned. Would a simple method call be a better one?

SubSystemCalculator

It was agreed that this is a needed feature.

Possible issue was pointed on that this could hide the type of the calculator that could be needed e.g. in geometry optimisation.

We came to a conclusion that we should look for SciML on how subsystem and messages are handled there.

Copy link
Member

@mfherbst mfherbst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments in writing, which I also expressed in the call last week.

The docs extensions are great and could be merged as is from my opinion. I'm still not sure about the wrapping pattern for adding functionality in calculators.

@@ -1,7 +1,7 @@
name = "AtomsCalculators"
uuid = "a3e0e189-c65a-42c1-833c-339540406eb1"
authors = ["JuliaMolSim contributors"]
version = "0.1.1"
version = "0.1.2-DEV"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the rationale here? I only modify ghis to bump versions (you release after pretty much every merge anyway)

1. Make sure that `AtomsCalculators.promote_force_type` returns correct force type and unit.
2. Make sure that `AtomsCalculators.zero_forces` returns correct array type and unit for forces.
3. Non allocating force call has to add to the force input array not overwrite it.
4. Make sure that your calculator supports nested multithreading. That is it can be called from inside multithreaded loop. This means you should not use static scheduling with `Threads.@threads`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very tricky for dftk or when calling third-party calculators (e.g. call quantum espresso). Also how do we want to deal with mpi ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am cocerned here on the upper level multithreading. Main point is that, if you look for exended doc for Threads.@threads

??Threads.@threads

Gives you

:static
  –––––––

  :static scheduler creates one task per thread and divides the iterations
  equally among them, assigning each task specifically to each thread. In
  particular, the value of threadid() is guaranteed to be constant within one
  iteration. Specifying :static is an error if used from inside another
  @threads loop or from a thread other than 1.

  │ Note
  │
  │  :static scheduling exists for supporting transition of code
  │  written before Julia 1.3. In newly written library functions,
  │  :static scheduling is discouraged because the functions using this
  │  option cannot be called from arbitrary worker threads.

This means that, if e.g. Molly spawns all calculators in a thread and one of the calculators itself uses static scheduler with Threads.@threads the code fails. So, it is possible to implement multithreading in a way that is harmful upstream. We also know there will be improvements to nested multithreading in future Julia version, and use of :static is heavily discouraged for the above reason.

Thus we should add this to the interface definition.

As for what this means for MPI, it does not affect it. The only issue is that the main MPI process can have threads doing other calculations, but it does not affect the MPI calculation itself.

sub_calc = SubSystemCalculator( mycalc, [1:10..., 15])
```

This calculator will then create a system with atoms 1 to 10 and 15 and pass that one into the calculator. Zero forces are given to all other atoms.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I already expressed in the zoom call, I think we should be very careful with this pattern of wrapping other calculators to obtain functionality. In my opinion this will make it harder down the road to define specialised methods for specific calculators. My main gripes is for example default parameters or choices in the algorithm for "cheap" calculators like empirical potentials versus "expensive" calculators like DFT computations. If you keep wrapping you loose the elegance of Julia to specialise behaviour by just implementing a new method for a specific calculator type.

Since in the discussion it became clear, that both the reporting as well as the subsystem calculator are needed for QMMM, I'd suggest to only implement a QMMM calculator and worry about coming up with more general primitives like how to extract parts of the data (reporting) or how to split into subsystems later as we have more understanding of what our ecosystem looks like.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This goes on what do we mean by QM/MM calculation. If the definition is one QM and one MM system. Then making separate QM/MM calculator is the way to go.

I am however thinking of much more complicated calculators, like one QM, one ML and one MM, or several ML and QM systems. In these cases the simple QM/MM calculator wont work.

Basicly these 3 calculators (Combination, Subsystem, and Reporting) are designed to make this kind of complicated calculations easily doable.

There is of course the issue you are pointing. But, I am not sure is using dispatch a correct way to handle it. We could e.g. implement a "calculation_hit" function that would allow upstream get info on, what to expect from the calculator.


```julia
CombinationCalculator(calc1, calc2)
CombinationCalculator(calc1, calc2, calc3...)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about allowing some weights ?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or potentially even more general ways that calculators can be combined...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good point and I will add it.

Comment on lines +30 to +42
function AtomsCalculators.UntilityCalculators.generate_keywords(
sys,
pp::PairPotential...;
kwargs...
)
cutoff = cutoff_radius(pp[begin])
if all( c -> cutoff_radius(c) ≈ cutoff, pp )
plist = PairList(sys, cutoff)
return (; :plist => plist, kwargs...)
else
return kwargs
end
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't this be solved, by the convention that all kwargs are passed to all calculators in a CombinationCalculator?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is the default behavior. This was just an example on how to customize it - it is in the docs after all.

@cortner
Copy link
Member

cortner commented May 24, 2024

can this now also be closed since it now lives in ACU?

@cortner
Copy link
Member

cortner commented May 30, 2024

@tjjarvinen -- I think utilitiy calculators now go elsewhere and interface clarification will happen in a separate PR - renewing Cedric's large PR. Can this be closed?

@tjjarvinen tjjarvinen closed this May 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants