Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Site shape analysis #166

Merged
merged 27 commits into from
Nov 20, 2023
Merged

Site shape analysis #166

merged 27 commits into from
Nov 20, 2023

Conversation

AstyLavrinenko
Copy link
Contributor

I conducted AIMD simulations using a forced symmetry structure, and I have uploaded the simulation files to the "shape_analysis" subfolder within the surfdrive directory. I've tried to apply rotational symmetry to this data, and I would appreciate if you could review my approach. I have used files created by MATLAB, which are also available on surfdrive. However, when attempting to integrate them with gemdat, I encountered an issue with the order of symmetric operations. Since I manually recorded these operations, it is difficult to integrate. I was thinking that it should be possible to extract the symmetry operations from the sites CIF file and apply them to each lithium atom at each time step. And then plot the positions of the lithium atoms around one of the specified sites within a cutoff distance.

I conducted AIMD simulations using a forced symmetry structure, and I have uploaded the simulation files to the "shape_analysis" subfolder within the surfdrive directory. I've tried to apply rotational symmetry to this data, and I would appreciate if you could review my approach. 
I have used files created by MATLAB, which are also available on surfdrive. However, when attempting to integrate them with gemdat, I encountered an issue with the order of symmetric operations. Since I manually recorded these operations, it is difficult to integrate. I was thinking that it should be possible to extract the symmetry operations from the sites CIF file and apply them to each lithium atom at each time step. And then plot the positions of the lithium atoms around one of the specified sites within a cutoff distance.
@stefsmeets
Copy link
Contributor

Thanks! I will play around with it a bit.

So, the goal for this code is to eventually extract the shape analysis bit and give it a place in gemdat.

@stefsmeets stefsmeets marked this pull request as draft October 18, 2023 13:00
@stefsmeets stefsmeets changed the title shape_analysis Site shape analysis Oct 24, 2023
@stefsmeets
Copy link
Contributor

Hi @AstyLavrinenko , I'm getting:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[11], line 3
      1 sim_data = sio.loadmat('/home/stef/md-analysis-matlab-example-short/shape_analysis/simulation_data.mat')
      2 sites = sio.loadmat('/home/stef/md-analysis-matlab-example-short/shape_analysis/sites.mat')
----> 3 fig = plot_radius_shape(sim_data, sites)

Cell In[7], line 17, in plot_radius_shape(sim_data, sites)
     14 sites_48h = np.where(np.array(sites_names) == '48h')[0]
     15 sites_48h2 = np.where(np.array(sites_names) == '48h2')[0]
---> 17 site_clusters[0] = transfer_48h(np.array(clusters_coords)[sites_48h])
     18 site_clusters[1] = transfer_48h(np.array(clusters_coords)[sites_48h2])
     19 site_clusters[2] = transfer_16e(np.array(clusters_coords)[sites_16e])

ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (224,) + inhomogeneous part.

Because clusters_coords consists of arrays with different shapes. Any ideas?

@AstyLavrinenko
Copy link
Contributor Author

clusters_coords

Wow, I didn't have this error. Okay, can we try to replace lines 17-19 with lines below?

site_clusters[0] = transfer_48h([clusters_coords[i] for i in sites_48h])
site_clusters[1] = transfer_48h([clusters_coords[i] for i in sites_48h2])
site_clusters[2] = transfer_16e([clusters_coords[i] for i in sites_16e])

@stefsmeets
Copy link
Contributor

stefsmeets commented Oct 24, 2023

That works, I think you are using a deprecated numpy feature that caused this: https://numpy.org/doc/stable/release/1.19.0-notes.html#deprecate-automatic-dtype-object-for-ragged-input

Now I run into this:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
AttributeError: 'float' object has no attribute 'sqrt'

The above exception was the direct cause of the following exception:

TypeError                                 Traceback (most recent call last)
Cell In[54], line 3
      1 sim_data = sio.loadmat('/home/stef/md-analysis-matlab-example-short/shape_analysis/simulation_data.mat')
      2 sites = sio.loadmat('/home/stef/md-analysis-matlab-example-short/shape_analysis/sites.mat')
----> 3 fig = plot_radius_shape(sim_data, sites)

Cell In[52], line 22, in plot_radius_shape(sim_data, sites)
     19 site_clusters[2] = transfer_16e([clusters_coords[i] for i in sites_16e])
     21 for i in range(3):
---> 22     distances = np.sqrt(np.sum(site_clusters[i].copy()**2, axis=1))
     23     print(['48h','48h2','16e'][i], ' MSD (A^2) = ',np.mean(distances.copy()**2),' std = ',np.std(distances.copy()**2))
     24     theta = np.linspace(0, 2 * np.pi, 100)  # 100 points around the circle

TypeError: loop of ufunc does not support argument 0 of type float which has no callable sqrt method

@AstyLavrinenko
Copy link
Contributor Author

AstyLavrinenko commented Oct 24, 2023

Well maybe, but I didn't change anything manually and I have quite new version of numpy same as for gemdat.

Anyway, what if you try:

distances = (np.sum(site_clusters[i].copy() ** 2, axis=1)) ** 0.5

instead of np.sqrt?

@stefsmeets
Copy link
Contributor

stefsmeets commented Oct 24, 2023

I have studied the code today and here is what I came up with.

The goal is to have a generalized algorithm that for all symmetrically equivalent cluster centers, find nearest atoms from trajectory, and transforms them back to the asymmetric unit. This helps the statistics for performing shape analysis and making plots.

As input:

  1. crystal or material structure
    • contains cluster centers
    • includes symmetry operations
  2. trajectory
    • typically P1
    • maybe a supercell of crystal structure
    • lattice can be triclinic (non-constrained in simulation)

Algorithm:

  1. load clusters (pymatgen structure)
  2. load trajectory (pymatgen trajectory)
  3. reduce supercell of trajectory to match clusters
    • assert trajectory and cluster lattices match
  4. for every unique cluster center,
    5. for every symmetry operation
    • apply next symmetry operation to cluster center
    • find all trajectory points within X distance
    • copy and map points back to asymmetric unit (reverse symmetry op)
    • subtract cluster center coords
  5. perform shape analysis
    • plots, fits, heat maps, msd, etc

@AstyLavrinenko Let me know what you think about this.

@AstyLavrinenko
Copy link
Contributor Author

@stefsmeets Yes, I think this should work. Although for step 4 there are two possible ways. The first one you described. Second:

  1. for each trajectory point find the closest cluster center,
    5. for every unique cluster center
    • extract symmetry operation of unique cluster center
    • copy and map points back to asymmetric unit (reverse symmetry op)
    • subtract cluster center coords

I'm not sure which one is easier to implement. Using the code I provided, I tried to follow the second option.

@stefsmeets stefsmeets removed their request for review October 24, 2023 13:49
@stefsmeets
Copy link
Contributor

I think I found a nice way to do this with pymatgen, still WIP. Just trying things out in a notebook for now.

@stefsmeets
Copy link
Contributor

stefsmeets commented Oct 25, 2023

TODO

  • Convert distances to Angstrom
  • Write tests
  • Refactor plotting code
  • Convert notebook code to gemdat submodule
  • pymatgen label patch -> Propagate site labels in SymmetrizedStructure() materialsproject/pymatgen#3423
  • simplify plots in notebook and fix bug
  • Check NaNs in notebook
  • Update docstrings in shape.py
  • Does api/naming make sense?
  • Consider dataclass as return from analyze() (like RDFData)
    • ShapeData?
  • Delete notebook from PR
from pathlib import Path

from gemdat import Trajectory
from gemdat.io import read_cif

workdir = Path('/home/stef/md-analysis-matlab-example-short/shape_analysis')

trajectory = Trajectory.from_vasprun(workdir / 'vasprun.xml')
diff_trajectory = trajectory.filter('Li')

structure = read_cif(workdir / 'argyrodite_48h48h16e.cif')

from gemdat.shape import ShapeAnalyzer

thresh = 1.0  # Å
supercell = (2, 1, 1)

sa = ShapeAnalyzer.from_structure(structure)

shapes = sa.analyze_trajectory(trajectory=diff_trajectory,
                               supercell=supercell,
                               threshold=thresh)

import matplotlib.pyplot as plt

from gemdat import plots

bins = np.linspace(-thresh, thresh, 50)

for shape in shapes:
    plots.shape(shape, bins=bins)
    plt.show()

image

@stefsmeets
Copy link
Contributor

stefsmeets commented Nov 2, 2023

Hi @AstyLavrinenko, I'm pretty happy with the way this works now. Could you have a look whether you find this intuitive to work with, and if it helps with your analysis?

Todo

  • Add example and update docs
  • Update demo notebook
  • Maybe add nicer repr for ShapeAnalyzer

@stefsmeets stefsmeets marked this pull request as ready for review November 2, 2023 15:52
@AstyLavrinenko
Copy link
Contributor Author

Hi @AstyLavrinenko, I'm pretty happy with the way this works now. Could you have a look whether you find this intuitive to work with, and if it helps with your analysis?

Todo

  • Add example and update docs
  • Update demo notebook
  • Maybe add nicer repr for ShapeAnalyzer

Hi @stefsmeets, the example looks impressive! I really like the idea of using a heatmap for visualization. I've started experimenting with code using some different data. I'm planning to dig deeper and play around a bit more. I'll let you know how it goes by the end of the week

@stefsmeets
Copy link
Contributor

Hi @AstyLavrinenko Thanks! Since there was no response I'm going to assume everything is working as intended. I will merge this now, let me know or open new issue with any comments or issues you may find.

@stefsmeets stefsmeets merged commit 46851a6 into main Nov 20, 2023
3 checks passed
@stefsmeets stefsmeets deleted the shape_analysis branch November 20, 2023 09:27
@AstyLavrinenko
Copy link
Contributor Author

Hi @stefsmeets I have a couple of things I'd like to discuss, and I think it would be more effective to do so in person. If I'm not mistaken, you will be at the reactor tomorrow. If that's the case, we could go over these

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants