[functionality] QOL utils (parallelised vasp import, compression, decompression etc.) #793

Draft: wants to merge 2 commits into base: main

@ligerzero-ai ligerzero-ai commented Jul 24, 2023

Parallelised quality-of-life utilities, including but not limited to:

  • Find folders with specified files
  • Extract tarballs
  • Compress tarballs
  • Import VASP calculations into a DataFrame in parallel with a single function call

e.g.

import argparse

from pyiron_contrib.utils.vasp import DatabaseGenerator


def main():
    parser = argparse.ArgumentParser(
        description='Build a database of VASP calculations found under a directory.'
    )
    parser.add_argument('directory', metavar='DIR', type=str, help='the directory to operate on')
    args = parser.parse_args()

    # Build the DataFrame of VASP results in parallel.
    datagen = DatabaseGenerator(args.directory)
    df = datagen.build_database(max_dir_count=2000)


if __name__ == '__main__':
    main()

Invoked on the current directory, appending the output to py.output:

python3 /home/562/hlm562/python_scripts/build_vasp_database.py $PWD >> py.output

In an HPC job with 96 cores, this decompresses and reads 12000 VASP directories in parallel in roughly 23 minutes.
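
For readers unfamiliar with the first three utilities in the list, here is a minimal standard-library sketch of what finding folders by file name, compressing a folder, and extracting a tarball can look like. The function names (`find_dirs_with_files`, `compress_dir`, `extract_tarball`) are hypothetical and are not taken from this PR; the actual implementation in `pyiron_contrib.utils` may differ.

```python
import os
import tarfile
import tempfile


def find_dirs_with_files(root, filenames):
    """Return the sorted directories under root that contain every name in filenames."""
    required = set(filenames)
    return sorted(
        dirpath
        for dirpath, _dirnames, files in os.walk(root)
        if required.issubset(files)
    )


def compress_dir(directory):
    """Pack directory into <directory>.tar.gz beside it and return the archive path."""
    directory = directory.rstrip(os.sep)
    archive = directory + ".tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # Store paths relative to the directory name, not the absolute path.
        tar.add(directory, arcname=os.path.basename(directory))
    return archive


def extract_tarball(archive, destination):
    """Unpack a gzip-compressed tarball into destination, creating it if needed."""
    os.makedirs(destination, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(destination)


if __name__ == "__main__":
    # Small demonstration in a throwaway directory.
    with tempfile.TemporaryDirectory() as tmp:
        calc = os.path.join(tmp, "calc_1")
        os.makedirs(calc)
        for name in ("INCAR", "OUTCAR"):
            open(os.path.join(calc, name), "w").close()
        print(find_dirs_with_files(tmp, ["INCAR", "OUTCAR"]))
        print(compress_dir(calc))
```

Each function works on a single path, so it can be handed to any parallel map over a list of directories or archives.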

output df:

[image: screenshot of the output DataFrame]

  • Chargemol utils for plotting bonds/bond analysis

Some notes:

  • Need to switch the file search from os.walk to pyfileindex
  • Need to switch to pympipool when I get around to it.
  • Adjust import statements to include pyiron_contrib
  • Add an option to specify the number of workers in parallelise
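
As an illustration of the last note, a worker-count option could look roughly like the sketch below, built on `concurrent.futures`. The signature shown here, including the `max_workers` and `executor_cls` parameters, is an assumption for illustration only; the `parallelise` helper in this PR may be structured differently.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def parallelise(func, items, max_workers=None, executor_cls=ProcessPoolExecutor):
    """Apply func to every item in parallel and return the results in input order.

    max_workers=None falls back to the number of CPUs reported by the OS.
    executor_cls is a hypothetical hook for swapping in another executor,
    e.g. ThreadPoolExecutor for I/O-bound work.
    """
    items = list(items)
    if not items:
        return []
    if max_workers is None:
        max_workers = os.cpu_count() or 1
    # Never start more workers than there are items to process.
    max_workers = max(1, min(max_workers, len(items)))
    with executor_cls(max_workers=max_workers) as pool:
        return list(pool.map(func, items))
```

With the default process pool, `func` must be defined at module level so it can be pickled, and the call should sit under an `if __name__ == '__main__':` guard on platforms that spawn worker processes.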

@github-actions
Contributor

Binder 👈 Launch a binder notebook on branch pyiron/pyiron_contrib/utils_and_table

@ligerzero-ai ligerzero-ai added the enhancement New feature or request label Jul 24, 2023
@ligerzero-ai ligerzero-ai requested a review from jan-janssen July 24, 2023 23:10
@ligerzero-ai ligerzero-ai marked this pull request as draft July 24, 2023 23:10
@github-actions
Contributor

Pull Request Test Coverage Report for Build 5650712108

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 12.945%

Totals Coverage Status
Change from base Build 5645181349: 0.0%
Covered Lines: 1872
Relevant Lines: 14461

💛 - Coveralls

@ligerzero-ai
Author

@jan-janssen

Notes:
There is no infrastructure for adding functions the way a pyiron table allows; this is purely for my own purpose, which is to import VASP calculations in parallel.

We will need to work out what the table functionality should look like as a standalone module, but I imagine a lot of these utilities can help with that.
