[Dataset request] Defect GNN #54

Open
1 task
gpwolfe opened this issue Mar 28, 2024 · 1 comment

Comments

@gpwolfe
Collaborator

gpwolfe commented Mar 28, 2024

Name

Gregory Wolfe

Email

gw2338@nyu.edu

Dataset name

Defect GNN

Authors

Md Habibur Rahman, Prince Gollapalli, Panayotis Manganaris, Satyesh Kumar Yadav, Ghanshyam Pilania, Brian DeCost, Kamal Choudhary, Arun Mannodi-Kanakkithodi

Publication link

https://doi.org/10.1063/5.0176333

Data link

https://github.com/msehabibur/defect_GNN_gen_1

Additional links

No response

Dataset description

Here, we develop a framework for the prediction and screening of native defects and functional impurities in a chemical space of Group IV, III-V, and II-VI zinc blende (ZB) semiconductors, powered by crystal Graph-based Neural Networks (GNNs) trained on high-throughput density functional theory (DFT) data. Using an innovative approach of sampling partially optimized defect configurations from DFT calculations, we generate one of the largest computational defect datasets to date, containing many types of vacancies, self-interstitials, anti-site substitutions, impurity interstitials and substitutions, as well as some defect complexes.

We applied three types of established GNN techniques, namely Crystal Graph Convolutional Neural Network (CGCNN), Materials Graph Network (MEGNET), and Atomistic Line Graph Neural Network (ALIGNN), to rigorously train models for predicting defect formation energy (DFE) in multiple charge states and chemical potential conditions. We find that ALIGNN yields the best DFE predictions with root mean square errors around 0.3 eV, which represents a prediction accuracy of 98% given the range of values within the dataset, improving significantly on the state-of-the-art.
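
The description does not say how the 98% figure is derived from the RMSE. As a rough sketch, assuming it is a range-normalized accuracy (1 minus RMSE divided by the spread of DFE values), the relationship can be written as below. The function names and the normalization itself are assumptions for illustration, not taken from the publication.

```python
def range_normalized_accuracy(rmse, value_range):
    """Accuracy as 1 - RMSE / range (assumed definition, not from the paper)."""
    if value_range <= 0:
        raise ValueError("value_range must be positive")
    return 1.0 - rmse / value_range


def implied_range(rmse, accuracy):
    """Invert the assumed definition to get the value range it would imply."""
    if not 0.0 <= accuracy < 1.0:
        raise ValueError("accuracy must be in [0, 1)")
    return rmse / (1.0 - accuracy)


if __name__ == "__main__":
    # RMSE and accuracy are the figures quoted in the description above.
    spread = implied_range(rmse=0.3, accuracy=0.98)
    print(f"Implied DFE range under this assumption: {spread:.1f} eV")
```

Under this assumed definition, an RMSE of 0.3 eV at 98% accuracy would correspond to a DFE spread of roughly 15 eV. The actual range should be read from the dataset itself.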

File details

Zipped files for four datasets in the GitHub repository

Method

No response

Method (other)

DFT-PBE

Software

VASP

Software (other)

No response

Software version(s)

No response

Additional details

No response

Property types

No response

Other/additional property

No response

Property details

No response

Elements

No response

Number of Configurations

No response

Naming convention

No response

Configuration sets

No response

Configuration labels

No response

Distribution license

No response

Permissions

  • I confirm that I have the necessary permissions to submit this dataset
@gpwolfe
Collaborator Author

gpwolfe commented Jul 24, 2024

For each of the four datasets, I see compressed .cif files for the A-rich structures and, in a related .csv file, what I believe are defect formation energies for the B-rich structures. I don't see energies for the A-rich structures or structure files for the B-rich ones.
I have contacted the repo owner for clarification and possible access to the VASP files.
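
A minimal sketch of how the structure/energy mismatch could be checked for one dataset, using only the Python standard library. It assumes each .cif file name (without extension) matches an identifier column in the accompanying .csv. That naming link, the column name passed as `id_column`, and the helper names are all hypothetical and would need to be confirmed against the repo.

```python
import csv
import io
import zipfile
from pathlib import PurePosixPath


def cif_stems(zip_file):
    """Return base names (without .cif) of every CIF file in a zip archive."""
    with zipfile.ZipFile(zip_file) as archive:
        return {
            PurePosixPath(name).stem
            for name in archive.namelist()
            if name.lower().endswith(".cif")
        }


def csv_ids(csv_text, id_column):
    """Return the set of identifiers found in one column of CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[id_column].strip() for row in reader if row.get(id_column)}


def cross_check(structure_ids, energy_ids):
    """Split identifiers into matched, structure-only, and energy-only sets."""
    return {
        "matched": structure_ids & energy_ids,
        "structure_only": structure_ids - energy_ids,
        "energy_only": energy_ids - structure_ids,
    }
```

A non-empty `structure_only` set would point to structures without energies, and a non-empty `energy_only` set to energies without structure files, which is the gap described above.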
