Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: modernise HBA to use AG as objects internally instead of selection strings - code only PR (Issue #3933) #4533

Open
wants to merge 11 commits into
base: develop
Choose a base branch
from
3 changes: 1 addition & 2 deletions package/AUTHORS
Original file line number Diff line number Diff line change
Expand Up @@ -240,8 +240,7 @@ Chronological list of authors
- Sampurna Mukherjee
- Leon Wehrhan
- Valerij Talagayev


- Luna Morrow

External code
-------------
Expand Down
21 changes: 21 additions & 0 deletions package/CHANGELOG
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,27 @@ The rules for this file:


-------------------------------------------------------------------------------
04/25/24 lunamorrow

* 2.8.1

Fixes

Enhancements
* modernise `HydrogenBondAnalysis` to use `AtomGroups` as objects internally
instead of selection strings

Changes
* `guess_acceptors`, `guess_hydrogens` and `guess_donators` no longer return
selection strings (return `AtomGroups`instead)
* `_prepare` and `_single_frame` no longer pass selection strings to other
class methods (pass `AtomGroups` instead)

Deprecations
* Selection string return of `guess_acceptors`, `guess_hydrogens` and
`guess_donators` has been deprecated in favour of return of `AtomGroups`
and will removed in removed in version 3.0.0 (PR #4533)

??/??/?? IAlibay, HeetVekariya, marinegor, lilyminium, RMeli,
ljwoods2, aditya292002, pstaerk, PicoCentauri, BFedder,
tyler.je.reddy, SampurnaM, leonwehrhan, kainszs, orionarcher
Expand Down
105 changes: 49 additions & 56 deletions package/MDAnalysis/analysis/hydrogenbonds/hbond_analysis.py
Original file line number Diff line number Diff line change
Expand Up @@ -108,6 +108,10 @@

Alternatively, :attr:`hydrogens_sel` and :attr:`acceptors_sel` may be generated via the :attr:`guess_hydrogens` and
:attr:`guess_acceptors`. This selection strings may then be modified prior to calling :attr:`run`, or a subset of
the universe may be used to guess the atoms. For example, find hydrogens and acceptors belonging to a protein::

Alternatively, :attr:`hydrogens_ag` and :attr:`acceptors_ag` may be generated via the :attr:`guess_hydrogens` and
:attr:`guess_acceptors`. This selection strings may then be modified prior to calling :attr:`run`, or a subset of
the universe may be used to guess the atoms. For example, find hydrogens and acceptors belonging to a protein::

import MDAnalysis
Expand Down Expand Up @@ -406,9 +410,9 @@ def guess_hydrogens(self,

Returns
-------
potential_hydrogens: str
String containing the :attr:`resname` and :attr:`name` of all
hydrogen atoms potentially capable of forming hydrogen bonds.
potential_hydrogens: AtomGroup
AtomGroup containing all of the hydrogen atoms potentially
capable of forming hydrogen bonds.

Notes
-----
Expand All @@ -424,30 +428,24 @@ def guess_hydrogens(self,
If :attr:`hydrogens_sel` is `None`, this function is called to guess
the selection.

Alternatively, this function may be used to quickly generate a
:class:`str` of potential hydrogen atoms involved in hydrogen bonding.
This str may then be modified before being used to set the attribute
:attr:`hydrogens_sel`.


.. versionchanged:: 2.4.0
Added ability to use atom types
.. versionchanged:: 3.0.0
Returns AtomGroup instead of selection string

"""

if min_mass > max_mass:
raise ValueError("min_mass is higher than (or equal to) max_mass")

ag = self.u.select_atoms(select)
hydrogens_ag = ag[
np.logical_and.reduce((
ag.masses < max_mass,
ag.charges > min_charge,
ag.masses > min_mass,
))
]
hydrogens_ag = ag.select_atoms(f"prop charge > {min_charge} and prop \
mass > {min_mass} and prop mass < \
{max_mass}",
updating=self.update_selections)

return self._group_categories(hydrogens_ag)
return hydrogens_ag

def guess_donors(self, select='all', max_charge=-0.5):
"""Guesses which atoms could be considered donors in the analysis. Only
Expand All @@ -465,9 +463,9 @@ def guess_donors(self, select='all', max_charge=-0.5):

Returns
-------
potential_donors: str
String containing the :attr:`resname` and :attr:`name` of all atoms
that are potentially capable of forming hydrogen bonds.
potential_donors: AtomGroup
AtomGroup containing all of the atoms potentially capable of
forming hydrogen bonds.

Notes
-----
Expand All @@ -484,25 +482,23 @@ def guess_donors(self, select='all', max_charge=-0.5):
have bonding information, this function is called to guess the
selection.

Alternatively, this function may be used to quickly generate a
:class:`str` of potential donor atoms involved in hydrogen bonding.
This :class:`str` may then be modified before being used to set the
attribute :attr:`donors_sel`.


.. versionchanged:: 2.4.0
Added ability to use atom types
.. versionchanged:: 3.0.0
Returns AtomGroup instead of selection string

"""

# We need to know `hydrogens_sel` before we can find donors
# Use a new variable `hydrogens_sel` so that we do not set
# `self.hydrogens_sel` if it is currently `None`
if self.hydrogens_sel is None:
hydrogens_sel = self.guess_hydrogens()
hydrogens_sel = self._group_categories(self.guess_hydrogens())
else:
hydrogens_sel = self.hydrogens_sel
hydrogens_ag = self.u.select_atoms(hydrogens_sel)
hydrogens_ag = self.u.select_atoms(hydrogens_sel,
updating=self.update_selections)

# We're using u._topology.bonds rather than u.bonds as it is a million
# times faster to access. This is because u.bonds also calculates
Expand All @@ -521,9 +517,10 @@ def guess_donors(self, select='all', max_charge=-0.5):
)
)

donors_ag = donors_ag[donors_ag.charges < max_charge]
donors_ag = donors_ag.select_atoms(f"prop charge < {max_charge}",
updating=self.update_selections)

return self._group_categories(donors_ag)
return donors_ag

def guess_acceptors(self, select='all', max_charge=-0.5):
"""Guesses which atoms could be considered acceptors in the analysis.
Expand All @@ -542,9 +539,9 @@ def guess_acceptors(self, select='all', max_charge=-0.5):

Returns
-------
potential_acceptors: str
String containing the :attr:`resname` and :attr:`name` of all atoms
that potentially capable of forming hydrogen bonds.
potential_acceptors: AtomGroup
AtomGroup containing all of the atoms potentially capable of
forming hydrogen bonds.

Notes
-----
Expand All @@ -559,21 +556,20 @@ def guess_acceptors(self, select='all', max_charge=-0.5):
If :attr:`acceptors_sel` is `None`, this function is called to guess
the selection.

Alternatively, this function may be used to quickly generate a
:class:`str` of potential acceptor atoms involved in hydrogen bonding.
This :class:`str` may then be modified before being used to set the
attribute :attr:`acceptors_sel`.


.. versionchanged:: 2.4.0
Added ability to use atom types
.. versionchanged:: 3.0.0
Returns AtomGroup instead of selection string

"""

ag = self.u.select_atoms(select)
acceptors_ag = ag[ag.charges < max_charge]
#acceptors_ag = ag[ag.charges < max_charge]
acceptors_ag = ag.select_atoms(f"prop charge < {max_charge}",
updating=self.update_selections)

return self._group_categories(acceptors_ag)
return acceptors_ag

@staticmethod
def _group_categories(group):
Expand Down Expand Up @@ -632,15 +628,21 @@ def _get_dh_pairs(self):
'Please either: load a topology file with bond information; use the guess_bonds() '
'topology guesser; or set HydrogenBondAnalysis.donors_sel so that a distance cutoff '
'can be used.')

hydrogens = self.u.select_atoms(self.hydrogens_sel)
if self.hydrogens_sel is None:
hydrogens = self.guess_hydrogens()
else:
hydrogens = self.u.select_atoms(self.hydrogens_sel,
updating=self.update_selections)
donors = sum(h.bonded_atoms[0] for h in hydrogens) if hydrogens \
else AtomGroup([], self.u)

# Otherwise, use d_h_cutoff as a cutoff distance
else:

hydrogens = self.u.select_atoms(self.hydrogens_sel)
if self.hydrogens_sel is None:
hydrogens = self.guess_hydrogens()
else:
hydrogens = self.u.select_atoms(self.hydrogens_sel,
updating=self.update_selections)
donors = self.u.select_atoms(self.donors_sel)
donors_indices, hydrogen_indices = capped_distance(
donors.positions,
Expand Down Expand Up @@ -699,27 +701,18 @@ def _filter_atoms(self, donors, acceptors):
def _prepare(self):
self.results.hbonds = [[], [], [], [], [], []]

# Set atom selections if they have not been provided
if self.acceptors_sel is None:
self.acceptors_sel = self.guess_acceptors()
if self.hydrogens_sel is None:
self.hydrogens_sel = self.guess_hydrogens()

# Select atom groups
self._acceptors = self.u.select_atoms(self.acceptors_sel,
updating=self.update_selections)
if self.acceptors_sel is None:
self._acceptors = self.guess_acceptors()
else:
self._acceptors = self.u.select_atoms(self.acceptors_sel,
updating=self.update_selections)
self._donors, self._hydrogens = self._get_dh_pairs()

def _single_frame(self):

box = self._ts.dimensions

# Update donor-hydrogen pairs if necessary
if self.update_selections:
self._donors, self._hydrogens = self._get_dh_pairs()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is still needed for _single_frame because self._donors will be updated every frame with the change of self._hydrogens. You need to separate the part that creates self._hydrogens from it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lunamorrow The reason you get different results is that in the old code, self._donors is updated every frame based on self._hydrogens. However, if you don't execute _get_dh_pairs every frame, self._donors will become a non-updating AtomGroup based on the first frame.


# find D and A within cutoff distance of one another
# min_cutoff = 1.0 as an atom cannot form a hydrogen bond with itself
d_a_indices, d_a_distances = capped_distance(
self._donors.positions,
self._acceptors.positions,
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -227,7 +227,7 @@ def test_no_bond_donor_sel(self, universe):
u.add_TopologyAttr('mass', [15.999, 1.008, 1.008] * n_residues)
u.add_TopologyAttr('charge', [-1.04, 0.52, 0.52] * n_residues)
h = HydrogenBondAnalysis(u, **kwargs)
donors = u.select_atoms(h.guess_donors())
donors = h.guess_donors()

def test_first_hbond(self, hydrogen_bonds):
assert len(hydrogen_bonds.results.hbonds) == 2
Expand Down Expand Up @@ -554,7 +554,8 @@ def test_guess_donors(self, h):

ref_donors = "(resname TIP3 and name OH2)"
donors = h.guess_donors(select='all', max_charge=-0.5)
assert donors == ref_donors
donors_sel = h._group_categories(donors)
assert donors_sel == ref_donors


class TestHydrogenBondAnalysisTIP3P_GuessHydrogens_NoTopology(object):
Expand Down Expand Up @@ -586,7 +587,8 @@ def test_guess_hydrogens(self, h):

ref_hydrogens = "(resname TIP3 and name H1) or (resname TIP3 and name H2)"
hydrogens = h.guess_hydrogens(select='all')
assert hydrogens == ref_hydrogens
hydrogens_sel = h._group_categories(hydrogens)
assert hydrogens_sel == ref_hydrogens

pytest.mark.parametrize(
"min_mass, max_mass, min_charge",
Expand All @@ -599,7 +601,7 @@ def test_guess_hydrogens(self, h):
def test_guess_hydrogens_empty_selection(self, h):

hydrogens = h.guess_hydrogens(select='all', min_charge=1.0)
assert hydrogens == ""
assert len(hydrogens) == 0

def test_guess_hydrogens_min_max_mass(self, h):

Expand Down
Loading