-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Align Phases of Isomorphous Structures #31
Comments
@kmdalton Is the issue that both structures look correct but are rotated and translated with respect to each other? If that’s the case, then a full phase origin search isn’t needed (except in P1). There’s a limited number of equivalent phase origins for each space group — P212121, for example, has only 4 — so one could get away with just mapping the phases of the second structure onto each of those equivalent origins and checking for consistency with the first. |
Yes, I think you've characterized the issue correctly, @apeck12. Do you know how we can generate the phase origins for a particular group programmatically? In cases like P21 or P61 without orthogonal symmetry elements, are there still finitely many phase origins, or are they restricted to a line? |
I can confirm that
Based on this, I think we need a hybrid approach to cover all space groups. Space groups with orthogonal symmetry axes should be handled by the approach in 1. I'm afraid we will need to learn the translation vector by a grid search and/or optimization in cases like P63. What edge cases am I missing? How can we write effective tests for these new algorithms? |
A few thoughts:
Here’s the code:
Here are plots for valid origin shifts versus a random shift (second row, last column) for a P212121 crystal: And for a P61 crystal: I examined a few other space groups with similar results, but this should be tested more exhaustively.
|
Here’s the Gemmi version of the
It seems like the main motivation for restricting the search is to avoid invalid origins since Eq. 6 is what I had in mind for the centric vs. expected values residual. I can try to write something for this. |
Updated version of the above code, which accounts for centering:
|
I can confirm that this code recovers
This is very cool, @apeck12. I will modify this slightly and add it to the branch. I think I might add a little optimization routine to refine the translation vector after the grid search. I think we will probably need that to get precise translation vectors for these groups like P61. |
Also, @apeck12, if you want to polish this up and submit a PR, I'd obviously be totally cool with that. |
On second thought -- my experiments lead me to believe this function is probably too rough to optimize easily. |
Curious, thanks for sharing this. I meant to look into doing optimization rather than a grid search some time ago but never got around to it, so this is interesting to see. We didn’t have better luck finding a common or crystallographic phase origin in real rather than reciprocal space, but our data might have been sufficiently different (very incomplete and not reduced) to be irrelevant to this case. And thanks for adding the |
in the real space - would it be similar to Phenix find_alt_orig_sym_mate and CCP4 csymmatch? |
@wojdyr, the goal is to align two electron density maps directly without a coordinate file. I cannot find a function that does this anywhere. If you know of one, please let us know. Thanks! |
Come to think, I guess this problem is pretty much molecular replacement without the rotation search component. |
I have a naive question: the equivalent origins here are not the same thing as equivalent origins mentioned in |
Yes, those are exactly what we're trying to account for. The problem is that many groups, take for instance Also, thanks for that webpage. I had been struggling to find something like that. |
I am afraid the residuals don't look much better in real space. These are real data from a pair of density modified experimental SAD maps from AutoSol. They are off by a This is just to say that real space doesn't seem any more promising unfortunately. |
I'm not quite sure I understand the distinction between red/black entries in the CCP4 table linked by @wojdyr. @apeck12's function for generating possible alternate origins seems to agree nicely with the table for from reciprocalspaceship.algorithms.phase import find_equivalent_origins # phalign branch
import gemmi
find_equivalent_origins(gemmi.SpaceGroup(19)) Output:
However, for find_equivalent_origins(gemmi.SpaceGroup(4), sampling_interval=0.25) Output:
Does the function miss some alternate origins? Or am I misunderstanding the distinction between red/black entries in the CCP4 table? |
@JBGreisman, thanks for catching this discrepancy. Examining space group 4 specifically, the space group diagram for the primitive cell indicates only a polar y axis and no alternative origins: http://img.chem.ucl.ac.uk/sgp/large/004ay1.htm For the unconventional B-type lattice, there’s an additional translation that corresponds to one of the alternative origins listed in the CCP4 table: http://img.chem.ucl.ac.uk/sgp/large/004dy1.htm I haven’t encountered this before, but according to the fourth slide here, monoclinic B is a transformation of the primitive monoclinic lattice with twice the unit cell volume. I’m not sure how to retrieve the full set of alternative origins listed in the CCP4 table from the space group’s symmetry operations alone. Perhaps the best approach would be to search the union of all valid nonpolar origins, amended for any polar axes, as @kmdalton described earlier in this issue. However, if some of the origins listed in that table correspond to unconventional lattices, would reindexing potentially be required to find the origin that aligns the phases? |
This has been dormant for a while, but I just wanted to summarize where I think things stand:
We could start out by restricting this function to non-polar spacegroups. I know how to classify spacegroups as polar or not in |
For reference, here is an updated link with alternate origins for each spacegroup: https://www.ccp4.ac.uk/html/alternate_origins_allgroups.html |
Here's a short snippet that uses import numpy as np
import gemmi
from cctbx import sgtbx
def is_polar(sg):
"""Returns whether spacegroup is polar"""
sym_ops = sg.operations().sym_ops
a = np.array([ op.rot for op in sym_ops ])
return ~(a < 0).any(axis=2).any(axis=0).all()
# Check against sgtbx
for sg in sgtbx.space_group_symbol_iterator():
sgtbx_polar = sgtbx.space_group(sg).info().number_of_continuous_allowed_origin_shifts() > 0
if sgtbx_polar != is_polar(gemmi.SpaceGroup(sg.universal_hermann_mauguin())):
# Will only print spacegroup if there's a disagreement
print(sg.universal_hermann_mauguin()) |
I realized that I had solved the 3D translation search problem with a simple iterative algorithm and forgot to post it publicly. I'm just going to put this code up here so it is public. @JBGreisman is working on putting this into import numpy as np
import reciprocalspaceship as rs
import gemmi
reference_mtz_file = "RTSAD_HEWL_refine_25.mtz"
mtz_file = "overall_best_denmod_map_coeffs.mtz"
ds = rs.read_mtz(reference_mtz_file).stack_anomalous().join(rs.read_mtz(mtz_file)).dropna()
dmin = -1.
ds = ds[ds.compute_dHKL().dHKL >= dmin]
H = ds.get_hkls().astype('float32')
phi_ref = ds['PHIF-model'].to_numpy()
phi = ds['PHWT'].to_numpy()
class TranslationSolver:
def __init__(self, H, phi_ref, phi):
"""
H : millers as an N x 3 numpy array
phi_ref : reference phases in degrees as a numpy array length N
phi : other phases in degrees as a numpy array length N
"""
self.H = H
self.phi_ref = phi_ref
self.phi = phi
self.t = None
self.D = (np.deg2rad(phi) - np.deg2rad(phi_ref)) / (2. * np.pi)
# Current voxel size in fractional coordinates
self.grid_size = 0.5
# Current bounding box which contains the solution
# 3 ranges in fractional coordinates
self.a_range = [0., 1.]
self.b_range = [0., 1.]
self.c_range = [0., 1.]
def _fit(self):
T = np.mgrid[
self.a_range[0] : self.a_range[1] : self.grid_size,
self.b_range[0] : self.b_range[1] : self.grid_size,
self.c_range[0] : self.c_range[1] : self.grid_size,
].reshape((3, -1)).T
loss = np.sum(np.square((self.D[:,None] - H@T.T) - np.round((self.D[:,None] - H@T.T))), axis=0)
tbest = T[np.argmin(loss)]
# Update grid size for next iteration
self.grid_size = self.grid_size / 2.
# Update bounding box around solution
a,b,c = tbest
self.a_range = [max(a-self.grid_size, 0.), min(a+self.grid_size, 1.)]
self.b_range = [max(b-self.grid_size, 0.), min(b+self.grid_size, 1.)]
self.c_range = [max(c-self.grid_size, 0.), min(c+self.grid_size, 1.)]
self.t = tbest
return tbest
def fit(self, maxiter=10):
for i in range(maxiter):
t = self._fit()
return t
ts = TranslationSolver(H, phi_ref, phi)
tbest = ts.fit()
out = rs.read_mtz(mtz_file)
H = out.get_hkls()
out['PHWT'] = rs.utils.canonicalize_phases(out['PHWT'] + np.rad2deg(2*np.pi*H@tbest))
out.write_mtz("translated.mtz") |
Here's an example using a fourier peak search to align phases (see here): import numpy as np
import reciprocalspaceship as rs
import gemmi
t_true = np.array([0.205, 0.66666666666666666, 0.7512])
# Read MTZ
reference_mtz_file = "4lzt_phases.mtz"
ds = rs.read_mtz(reference_mtz_file).dropna().hkl_to_asu().canonicalize_phases()
dmin = -1
ds = ds[ds.compute_dHKL().dHKL >= dmin]
ds_trans = ds.copy()
# Add jitter to phases
ds_trans["PHWT"] = ds_trans["PHWT"] + np.random.normal(scale=15.0, size=len(ds))
ds_trans.canonicalize_phases(inplace=True)
# Add jitter to structure factors
ds_trans["FP"] = ds_trans["FP"] + np.random.normal(scale=0.1 * ds_trans['FP'], size=len(ds))
# Translate Phi by t_true
H = ds.get_hkls().astype("float32")
ds_trans["PHWT"] = rs.utils.canonicalize_phases(
ds_trans["PHWT"] + np.rad2deg(2 * np.pi * H @ t_true)
)
phi_ref = ds["PHWT"].to_numpy()
phi = ds_trans["PHWT"].to_numpy()
# Based on: https://my.ece.msstate.edu/faculty/du/JSTARS-Registration.pdf
# Use phase correlations
dmin = None
sample_rate=3.
f_key = 'FP'
phi_key = 'PHWT'
ds['PC'] = np.conj(ds.to_structurefactor(f_key, phi_key)) * ds_trans.to_structurefactor(f_key, phi_key)
ds['PC'] = ds['PC'] / ds['PC'].abs()
if dmin is None:
dmin = ds.compute_dHKL().dHKL.min()
gridsize = np.ceil(
sample_rate * np.array(
ds.cell.get_hkl_limits(dmin)
)
).astype('int')
pc_grid = ds.to_reciprocalgrid('PC', gridsize)
real = np.abs(np.fft.fftn(pc_grid))
t = np.concatenate(np.where(real == real.max())) / real.shape
print(f"Translation vector from Fourier peak search: {t}") This is based on a lysozyme example from @JBGreisman. The I would love to do this with |
Just to close up this discussion -- as per the above, a good way to think of this is as an image registration problem. Conveniently, scikit-image supports this and the function, Though I have the outline of a solution here (#174), this seems better suited for |
Often times, I notice that SAD phasing solutions from isomorphous structures don't overlay in real space. This seems to have something to do with the choice of "phase origin". It'd be super useful if we had a function to make sure that two sets of structure factors have compatible origins. We should be able to implement something akin to @apeck12's method. After talking to @JBGreisman, we decided doing this in cases with good phases might boil down to solving a simple linear program. At any rate, this should absolutely be something we put in
rs.algorithms
.The text was updated successfully, but these errors were encountered: