Add option for ALCF #24

Open
dylanmcreynolds opened this issue May 6, 2024 · 1 comment

@dylanmcreynolds
Contributor

We want to add an option to the 832 flows code that lets us move data to ALCF and launch reconstructions via Globus Flows. We already have code that does this in another repo, but want to bring it into the production flows.

I envision the general flow to look like:
A new JSON block will be created in Prefect that lets beamline staff turn this on (default off). If on, we copy to both ALCF and NERSC, and run the reconstruction at ALS.
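
A minimal sketch of how the toggle could be read, assuming the JSON block's value arrives as a plain dictionary. The key name `alcf_enabled` and the function name are hypothetical and not taken from the existing flows code.

```python
from typing import Mapping


def alcf_enabled(settings: Mapping[str, object]) -> bool:
    """Return True only when beamline staff have explicitly switched ALCF on.

    `settings` stands in for the value of the JSON block. The key name
    "alcf_enabled" is an assumption made for this sketch.
    """
    # Default off: a missing key, or anything other than a literal True,
    # leaves the ALCF branch disabled.
    return settings.get("alcf_enabled", False) is True
```

A flow could then branch with `if alcf_enabled(settings): ...` before calling the ALCF move task. Treating only a literal `True` as "on" avoids enabling the branch by accident from a string such as `"false"`.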

There is a bit of code for working with Globus Flows that we will want to add as generic utility code to https://github.com/als-computing/splash_flows_globus/blob/main/orchestration/globus.py. We will also create a new task in https://github.com/als-computing/splash_flows_globus/blob/main/orchestration/flows/bl832/move.py that can be called if we're running ALCF. The goal is to reuse the existing confidential client setup that we have in production, so that flows use the same Globus authentication configuration that we already have for data movement.
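
As a rough sketch of what that utility code might look like, the following uses the `globus_sdk` package (written against the 3.x API as I understand it) to build a flow client from confidential client credentials and start a run. The function names and the shape of `flow_input` are hypothetical; the real flow's input schema is not described in this issue.

```python
def make_flow_client(flow_id, client_id, client_secret):
    """Build a client for one flow, authenticated as a confidential client."""
    # Imported here so the helper below can be used without globus_sdk installed.
    import globus_sdk

    confidential_client = globus_sdk.ConfidentialAppAuthClient(client_id, client_secret)
    # Each flow has its own scope; an unauthenticated client can report it.
    flow_scope = globus_sdk.SpecificFlowClient(flow_id).scopes.user
    authorizer = globus_sdk.ClientCredentialsAuthorizer(confidential_client, flow_scope)
    return globus_sdk.SpecificFlowClient(flow_id, authorizer=authorizer)


def start_reconstruction_run(flow_client, flow_input, label):
    """Start a run of the flow and return its run ID.

    `flow_input` must match the flow's own input schema, which this
    sketch does not know; the caller is responsible for building it.
    """
    response = flow_client.run_flow(body=flow_input, label=label)
    return response["run_id"]
```

Keeping client construction separate from the run call means the task in move.py can be exercised in tests with a stand-in client object.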

@davramov
Contributor

File Flow

We met with Dula to confirm the following folder/file paths and naming conventions for the reconstruction data generated on ALCF.

We should also update reconstruction.py on ALCF so that it no longer listens for a .txt file; instead, its main function should accept the raw filename and the input/output folders on ALCF. The reconstruction_wrapper() function from the Globus Flow also needs to take the raw file path as input.
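
One possible shape for that entry point, sketched with `argparse`. The argument names (`raw_filename`, `--input-dir`, `--output-dir`) are assumptions for illustration and do not come from the current reconstruction.py.

```python
import argparse


def parse_args(argv=None):
    """Parse the raw filename and the ALCF input/output folders."""
    parser = argparse.ArgumentParser(description="Reconstruct a single raw scan.")
    parser.add_argument("raw_filename", help="name of the raw .h5 file to reconstruct")
    parser.add_argument("--input-dir", required=True, help="folder holding the raw file")
    parser.add_argument("--output-dir", required=True, help="folder for reconstruction output")
    # argv=None makes argparse fall back to sys.argv[1:].
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    # The actual reconstruction call would go here; this sketch only
    # reports what it was asked to do.
    print(f"reconstruct {args.input_dir}/{args.raw_filename} -> {args.output_dir}")
```

With explicit arguments, reconstruction_wrapper() can pass the raw file path straight through instead of writing a trigger file for the script to pick up.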

Naming conventions

  • Experiment folder:
    <Proposal Prefix>-<5 digit proposal number>_<first part of email address>/

  • Files in folder:
    <YYYYMMDD>_<HHMMSS>_<user defined string>.h5

  • Tiled scan:
    <YYYYMMDD>_<HHMMSS>_<user defined string>_x<##>y<##>.h5
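
These conventions could be checked with regular expressions along the following lines. This is a sketch under a few assumptions that the conventions above do not spell out: the proposal prefix is letters only, and `##` means exactly two digits. The example names in the usage note are made up.

```python
import re
from datetime import datetime
from typing import Optional

# <Proposal Prefix>-<5 digit proposal number>_<first part of email address>/
FOLDER_RE = re.compile(r"^(?P<prefix>[A-Za-z]+)-(?P<number>\d{5})_(?P<user>[^/@\s]+)/?$")

# <YYYYMMDD>_<HHMMSS>_<user defined string>[_x<##>y<##>].h5
FILE_RE = re.compile(
    r"^(?P<date>\d{8})_(?P<time>\d{6})_(?P<label>.+?)"
    r"(?:_x(?P<x>\d{2})y(?P<y>\d{2}))?\.h5$"
)


def parse_scan_filename(name: str) -> Optional[dict]:
    """Split a scan filename into timestamp, label, and optional tile index."""
    match = FILE_RE.match(name)
    if match is None:
        return None
    try:
        # Rejects impossible values such as month 13 or hour 25.
        acquired = datetime.strptime(match["date"] + match["time"], "%Y%m%d%H%M%S")
    except ValueError:
        return None
    tile = None
    if match["x"] is not None:
        tile = (int(match["x"]), int(match["y"]))
    return {"acquired": acquired, "label": match["label"], "tile": tile}
```

For instance, `parse_scan_filename("20240506_101500_sample_x01y02.h5")` yields the label `sample` with tile `(1, 2)`, while the same name without the `_x01y02` suffix yields `tile` set to `None`.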

ALCF Raw Destination

  • Proposed:
    /eagle/IRIBeta/als/bl832/raw/<proposal folder name convention>/<h5 files>

ALCF Recon Destination

  • Proposed:
    /eagle/IRIBeta/als/bl832/scratch/<proposal folder name convention>/

NERSC Destination

  • Proposed:
    /global/cfs/cdirs/als/data_mover/8.3.2/scratch/<folder name convention>
  • Prune reconstruction (NERSC) after 1 week

ALS Destination (data832)

  • Proposed:
    /data/scratch/globus_share/<folder name convention>/rec<dataset>/<tiffs>
  • Purge this, too, after 3 days. Keep a close eye on space
  • .zarr generation as soon as there’s a folder of tiffs/reconstruction is done
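
To keep the proposed locations in one place, the paths and retention periods above could be captured roughly as follows. The roots and the 1 week / 3 day periods come from the lists above; the function name, the dictionary keys, and the choice to use the raw file's stem as `<dataset>` are assumptions for this sketch.

```python
from datetime import timedelta
from pathlib import PurePosixPath

ALCF_RAW_ROOT = PurePosixPath("/eagle/IRIBeta/als/bl832/raw")
ALCF_RECON_ROOT = PurePosixPath("/eagle/IRIBeta/als/bl832/scratch")
NERSC_RECON_ROOT = PurePosixPath("/global/cfs/cdirs/als/data_mover/8.3.2/scratch")
ALS_RECON_ROOT = PurePosixPath("/data/scratch/globus_share")

# How long reconstructions are kept before pruning.
RETENTION = {
    "nersc": timedelta(weeks=1),
    "data832": timedelta(days=3),
}


def proposed_paths(proposal_folder: str, raw_filename: str) -> dict:
    """Build the proposed destination paths for one raw scan."""
    # Assumption: <dataset> is the raw filename without its .h5 extension.
    dataset = PurePosixPath(raw_filename).stem
    return {
        "alcf_raw": ALCF_RAW_ROOT / proposal_folder / raw_filename,
        "alcf_recon": ALCF_RECON_ROOT / proposal_folder,
        "nersc_recon": NERSC_RECON_ROOT / proposal_folder,
        "als_recon": ALS_RECON_ROOT / proposal_folder / f"rec{dataset}",
    }
```

`PurePosixPath` is used so the paths are built the same way regardless of where the flow code runs, since they all refer to remote POSIX filesystems.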
