Data catalogue and database #1270

Open
1 of 2 tasks
mandresm opened this issue Jan 17, 2025 · 0 comments
mandresm commented Jan 17, 2025

Intake and Database

This issue collects requirements and design ideas for the ESM-Tools database and catalogues.

Requirements

  • Easily find experiments
>>> import esm_tools.database
>>> db = esm_tools.database.Database(user, password)
>>> db.find(model_name="AWICM3", expid=".*piControl.*")
[List of SimulationResult] # ??
  • A "backporting" tool to register existing runs
$ esm-tools database register_run <path>

or from Python:

>>> import esm_tools.database
>>> db = ...
>>> db.register(path)  # Should contain a finished_config.yaml somewhere in path
  • Centralized database

    graph TD
      RunA --> UserA
      RunB --> UserA
      RunC --> UserA
      
      RunD --> UserB
      RunE --> UserB
    
      UserA --> GroupA
      UserB --> GroupA
      GroupB
      GroupA --> MachineA
      GroupB --> MachineA
      
      MachineA --> VM
      MachineB --> VM
    

    In Python:

    import intake
    cat = intake.open_catalog("http://global_catalog.yml")
    # On DKRZ:
    cat = intake.open_catalog(["https://dkrz.de/s/intake"])
    my_run = cat["RunA"]
    list(my_run)
    # ["/some/path/with/RunA/thetao.fesom.1850.01.nc"]
    ds = my_run.load()
  • Group by group, user, machine, or project
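
To make the requirements above more concrete, here is a minimal in-memory sketch of how `find`, `register` and grouping could fit together. Everything in it is an assumption for discussion: the `SimulationResult` fields, the `group_by` method, and the fact that metadata is passed in by hand instead of being read from `finished_config.yaml`. Authentication (`user`, `password`) and persistence are left out on purpose.

```python
import re
from collections import defaultdict
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class SimulationResult:
    """Hypothetical record for one registered run (field names are assumptions)."""
    expid: str
    model_name: str
    user: str
    group: str
    machine: str
    path: str


class Database:
    """In-memory stand-in for the proposed database (no login, no persistence)."""

    def __init__(self):
        self._runs = []

    def register(self, path, **metadata):
        """Register a run directory that contains a finished_config.yaml somewhere."""
        root = Path(path)
        if next(root.rglob("finished_config.yaml"), None) is None:
            raise FileNotFoundError(f"no finished_config.yaml found under {root}")
        # A real implementation would read the metadata from the config file;
        # here it is passed explicitly to keep the sketch dependency-free.
        run = SimulationResult(path=str(root), **metadata)
        self._runs.append(run)
        return run

    def find(self, **patterns):
        """Return runs whose fields fully match every given regular expression."""
        return [
            run
            for run in self._runs
            if all(
                re.fullmatch(pattern, getattr(run, field))
                for field, pattern in patterns.items()
            )
        ]

    def group_by(self, field):
        """Group registered runs by one field, e.g. "user", "group" or "machine"."""
        groups = defaultdict(list)
        for run in self._runs:
            groups[getattr(run, field)].append(run)
        return dict(groups)
```

With this, `db.find(model_name="AWICM3", expid=".*piControl.*")` returns a list of `SimulationResult` objects, and `db.group_by("machine")` returns a dictionary mapping each machine name to its runs. Full-match semantics are one possible choice; substring search would work as well.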

Ideas

  • Hook into existing tools (DKRZ? O2A?)
    • FREVA
  • Not including archiving

Nice to have

  • Track where a file ends up after generation (e.g. if it was moved)
  • S3 support
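
One possible way to answer "where is this file now?" is to store a content digest when a file is registered and search for that digest later. The sketch below is only an illustration of that idea, not something agreed in this issue: `file_digest` and `relocate` are hypothetical helper names, and hashing large model output may be too slow in practice, in which case size plus modification time could serve as a cheaper first filter.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def relocate(known_digest, search_root):
    """Return the first file under search_root matching known_digest, or None."""
    for candidate in sorted(Path(search_root).rglob("*")):
        if candidate.is_file() and file_digest(candidate) == known_digest:
            return candidate
    return None
```

The digest would be recorded at registration time; `relocate(digest, "/some/search/root")` can then find the file again even if it was renamed or moved into another directory.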

TODO

  • Miguel -> send FREVA emails to Paul
  • Paul -> ask FREVA whether intake catalogues are a good input for them