Docs: Overview of gufe #200

Merged
merged 1 commit into from
Jun 15, 2023
1 change: 1 addition & 0 deletions docs/guide.rst
@@ -6,6 +6,7 @@ Guide to GUFE
:maxdepth: 2
:caption: Topics:

guide/overview
guide/components_and_systems
guide/protocols
guide/serialization
83 changes: 83 additions & 0 deletions docs/guide/overview.rst
@@ -0,0 +1,83 @@
GUFE Overview
=============

GUFE exists to define interoperable APIs for free energy calculations, which
can be used by an ecosystem of Python packages to develop tools that focus
on performing specific aspects of the free energy pipeline, while
benefitting from other tools for other aspects.

In order to do this, GUFE distinguishes several aspects of the process of
calculating free energies, and defines APIs for each of them. GUFE provides
the underlying infrastructure; the real science is done in packages that
implement parts of the GUFE APIs.

Core data models
----------------

In order to ensure interoperability, GUFE defines objects that represent the
core chemistry and alchemistry of a free energy pipeline, including
molecules, chemical systems, and alchemical transformations. This provides a
shared language that different tools use.

Contributor

Is it worth having a brief section on Components here?

Member Author

I think the next page in the guide is going to be a discussion of the Components approach (we already have a placeholder for it). My goal for this page is to be an extended table of contents for the rest of the "guide" section; details will come later.

Ligand network setup
--------------------

GUFE defines a basic API for the common case of performing alchemical
transformations between small molecules, either for relative binding free
energies or relative hydration free energies. The API covers how mappings
between different molecules are defined for alchemical transformations: the
:class:`.LigandAtomMapping` object contains the details of a specific
mapping, and the :class:`.AtomMapper` abstract API describes an object that
creates the mappings.
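
The split between a mapping object and a mapper can be sketched in plain
Python. This is a toy illustration only, not the GUFE API: ``ToyMolecule``,
``ToyAtomMapping``, ``ToyAtomMapper``, and the ``suggest_mappings`` method
name are all hypothetical stand-ins chosen for this example.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Iterator, Tuple


@dataclass(frozen=True)
class ToyMolecule:
    """Hypothetical stand-in for a small molecule: a name plus element symbols."""
    name: str
    elements: Tuple[str, ...]


@dataclass(frozen=True)
class ToyAtomMapping:
    """Hypothetical mapping object: the details of one specific mapping."""
    mol_a: ToyMolecule
    mol_b: ToyMolecule
    a_to_b: Tuple[Tuple[int, int], ...]  # pairs of (index in A, index in B)


class ToyAtomMapper(ABC):
    """Hypothetical abstract API: an object that creates mappings."""

    @abstractmethod
    def suggest_mappings(
        self, mol_a: ToyMolecule, mol_b: ToyMolecule
    ) -> Iterator[ToyAtomMapping]:
        ...


class SameElementInOrderMapper(ToyAtomMapper):
    """Naive mapper: pair atoms position by position when elements match."""

    def suggest_mappings(self, mol_a, mol_b):
        pairs = tuple(
            (i, i)
            for i, (elem_a, elem_b) in enumerate(zip(mol_a.elements, mol_b.elements))
            if elem_a == elem_b
        )
        yield ToyAtomMapping(mol_a, mol_b, pairs)


methane = ToyMolecule("methane", ("C", "H", "H", "H", "H"))
chloromethane = ToyMolecule("chloromethane", ("C", "H", "H", "H", "Cl"))
mapping = next(SameElementInOrderMapper().suggest_mappings(methane, chloromethane))
print(mapping.a_to_b)  # the last atom (H vs. Cl) is left unmapped
```

Keeping the mapping as a plain data object means that any tool can consume
it, regardless of which mapper produced it.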

Simulation settings
-------------------

In order to facilitate comparisons of different approaches, GUFE defines a
hierarchy of simulation settings. This allows certain settings (such as
temperature and pressure) to be consistent across different simulation
tools, while allowing additional custom settings specific to a given tool to
be defined.
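
One way to picture such a hierarchy is a shared settings object nested inside
tool-specific ones. The sketch below uses standard-library dataclasses; every
class and field name here (``ThermoSettings``, ``ToolASettings``,
``n_lambda_windows``, and so on) is hypothetical and is not taken from GUFE.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ThermoSettings:
    """Hypothetical shared settings that every simulation tool understands."""
    temperature_kelvin: float = 298.15
    pressure_bar: float = 1.0


@dataclass(frozen=True)
class ToolASettings:
    """Hypothetical settings for one tool: shared part plus its own options."""
    thermo: ThermoSettings
    n_lambda_windows: int = 11


@dataclass(frozen=True)
class ToolBSettings:
    """Hypothetical settings for a different tool with different custom options."""
    thermo: ThermoSettings
    switching_time_ps: float = 50.0


# The same thermodynamic state is handed to both tools.
shared = ThermoSettings(temperature_kelvin=300.0)
tool_a = ToolASettings(thermo=shared)
tool_b = ToolBSettings(thermo=shared, switching_time_ps=100.0)

print(tool_a.thermo == tool_b.thermo)
print(asdict(tool_a))
```

Because both tools read temperature and pressure from the same object, a
comparison between them cannot silently differ in thermodynamic state.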

Protocols
---------

The actual simulation of a free energy calculation is defined by a GUFE
:class:`.Protocol`. The :class:`.Protocol` is described as a set of tasks,
each a :class:`.ProtocolUnit`, which may depend on other tasks. Together,
the tasks and their dependencies form a directed acyclic graph.

GUFE does not implement any free energy protocols, but by providing the
abstract API, it allows protocol authors to create new simulation protocols
without needing to focus on the details of execution or storage.

Executors
---------

Executors actually run the simulations described by the :class:`.Protocol`.
GUFE does not define an executor API, although it includes a very simple
serial executor in :func:`.execute_DAG`.

The responsibilities of an executor include running the tasks (units) for a
:class:`.Protocol` and managing storage of output. GUFE contains some tools
to facilitate that, particularly around storage, but it is up to the
executor to determine whether and how to use them.
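
The core loop of a serial executor can be sketched with the standard
library. This is not how :func:`.execute_DAG` is implemented; it is a minimal
illustration that assumes each unit is a plain callable and that dependencies
are given as a dictionary. The function ``execute_serially`` and the unit
names are hypothetical.

```python
from graphlib import TopologicalSorter


def execute_serially(units, depends_on):
    """Run each unit after its dependencies; return results keyed by unit name.

    units: dict mapping unit name -> callable that takes a dict of the
        results of the units it depends on.
    depends_on: dict mapping unit name -> set of unit names it depends on.
    """
    results = {}  # stands in for whatever output storage an executor manages
    graph = {name: depends_on.get(name, set()) for name in units}
    # static_order() yields every unit after all of its dependencies and
    # raises graphlib.CycleError if the graph is not acyclic.
    for name in TopologicalSorter(graph).static_order():
        inputs = {dep: results[dep] for dep in graph[name]}
        results[name] = units[name](inputs)
    return results


units = {
    "setup": lambda inputs: {"n_windows": 3},
    "simulate_a": lambda inputs: inputs["setup"]["n_windows"] * 1.0,
    "simulate_b": lambda inputs: inputs["setup"]["n_windows"] * 2.0,
    "gather": lambda inputs: inputs["simulate_a"] + inputs["simulate_b"],
}
depends_on = {
    "simulate_a": {"setup"},
    "simulate_b": {"setup"},
    "gather": {"simulate_a", "simulate_b"},
}

results = execute_serially(units, depends_on)
print(results["gather"])  # 9.0
```

A more capable executor could run ``simulate_a`` and ``simulate_b`` in
parallel, since neither depends on the other; the description of the tasks
would not need to change.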

Strategies
----------

Strategies have yet to be implemented, but the GUFE design leaves a place
for an object that, at the scale of an alchemical network, can dynamically
decide where to focus more simulation effort based on the results that have
been received so far. This will be useful for adaptive approaches to
sampling a network.
Comment on lines +66 to +73
Contributor

I'd maybe leave this out if it doesn't exist? Or maybe have a separate page of planned future scope?

Member Author

I struggled with the same. We need to document this concept somewhere, because it informs other architectural decisions. I thought that putting a very brief paragraph here was the least-bad option, but I'd be open to other options. But I like this page being a 1-page overview of the whole architecture, as a place for people to start.


Core GUFE infrastructure
------------------------

Behind the scenes, GUFE implements a number of details common to all of its
objects. Nearly all objects in GUFE are subclasses of
:class:`.GufeTokenizable`, which sets a few requirements and behaviors on
these objects, including that they are immutable, serializable, and
hashable. These provide important guarantees to downstream packages that
facilitate repeatability and reproducibility.
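
The three guarantees can be illustrated with a frozen dataclass. This sketch
does not use or reproduce :class:`.GufeTokenizable`; ``ToyTokenizable``, its
fields, and the way its ``key`` is computed are assumptions made purely for
illustration.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ToyTokenizable:
    """Hypothetical object that is immutable, serializable, and hashable."""
    name: str
    charge: int = 0

    def to_dict(self):
        # Serializable: the full state is a plain, JSON-friendly dict.
        return asdict(self)

    @classmethod
    def from_dict(cls, dct):
        return cls(**dct)

    @property
    def key(self):
        # A stable identifier derived only from the serialized content.
        payload = json.dumps(self.to_dict(), sort_keys=True).encode()
        digest = hashlib.sha256(payload).hexdigest()[:16]
        return f"{type(self).__name__}-{digest}"


original = ToyTokenizable("benzene")
roundtrip = ToyTokenizable.from_dict(json.loads(json.dumps(original.to_dict())))

print(roundtrip == original)              # equal after a JSON round trip
print(roundtrip.key == original.key)      # same content, same key
print(hash(roundtrip) == hash(original))  # usable as dict keys or set members
```

Because the object cannot change after creation, an identifier computed from
its contents stays valid for its whole lifetime, which is what lets a
downstream package match stored results to the exact inputs that produced
them.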