STAC catalog sprint: to-do items #1
Comments
I thought I left a comment here, but apparently not. The gist of it was about the item:
The |
Thanks, Tom. I definitely am. Once I get a little closer to figuring out what functionality we need, I'll follow up directly on |
This is a great list. I'd love to try to help however I can. A few comments.
We should fix this! We are still very early in this project. We don't need to worry about preserving bad decisions to maintain some vague backwards compatibility. We should get consistent by either changing the file paths or changing the spec.
Yes they can. You can have as many layers of nesting as you want, by linking to other catalogs from catalogs. See STAC catalog overview.
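As a rough illustration of nesting by linking catalogs to catalogs, here is a minimal sketch that builds three levels as plain dictionaries. It uses only the core Catalog fields from the STAC spec (`type`, `stac_version`, `id`, `description`, `links`). The ids, descriptions, and relative hrefs are hypothetical placeholders, and `make_catalog` is a helper invented for this example, not part of any library.

```python
import json

STAC_VERSION = "1.0.0"  # assumed spec version for this sketch


def make_catalog(catalog_id, description, child_hrefs=()):
    """Build a minimal STAC Catalog dict with one `child` link per href."""
    return {
        "type": "Catalog",
        "stac_version": STAC_VERSION,
        "id": catalog_id,
        "description": description,
        "links": [{"rel": "child", "href": href} for href in child_hrefs],
    }


# Hypothetical three-level layout: root -> project -> feedstock.
feedstock = make_catalog("example-feedstock", "Hypothetical feedstock-level catalog")
project = make_catalog(
    "swot-adac",
    "Hypothetical project-level catalog",
    ["./example-feedstock/catalog.json"],
)
root = make_catalog(
    "pangeo-forge",
    "Hypothetical root catalog",
    ["./swot-adac/catalog.json"],
)

print(json.dumps(root, indent=2))
```

Each extra level is just another catalog referenced through a `child` link, so the depth is not fixed by the spec.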
👍 from me
👍 from me
Can pystac load Zarr STAC items? |
Yeah! Figured this out somewhere along the way too. My current thought (formal write-up to follow, which can tie in your ADR notes above) ... is that our mapping should be:
This is the pattern I've followed in https://github.com/pangeo-forge/pangeo-forge-catalog/tree/dev/stac (which I just pushed, for discussion purposes).
It requires a few lines (or wrapping them in a function), but yeah! Check this out: https://nbviewer.jupyter.org/github/cisaacstern/stac-notebooks/blob/gigatl-reg01-surf-fma/example_notebook.ipynb (where |
A prototyped version of this was completed by pangeo-forge/pangeo-forge-vue-website#6 and should be usable at https://pangeo-forge.org/catalog once Netlify rebuilds. As noted in my last comment on that PR, this code will likely need to be refactored once STAC Browser components become installable directly from
With Ryan's thumbs up on using GitHub to start, I'm going to consider that the agreed-upon approach for the time being. This is where I'm directing our STAC Browser to the Pangeo Forge root catalog: https://github.com/pangeo-forge/pangeo-forge-vue-website/blob/main/src/main.js#L14. This can be changed over to a database-backed API whenever we see fit.
I've recorded a number of specific points for consideration on this topic here: pangeo-forge/pangeo-forge-vue-website#6 (comment). Ok! More to follow tomorrow. That was a big push and I think I'm ready for a break. 😅 |
Well deserved! 🏆 for pushing this difficult and uncertain task forward with minimal guidance. 👏 👏 👏 |
Following pangeo-forge/pangeo-forge-vue-website#7, prototype catalog is now up at https://pangeo-forge.org/catalog#/. |
This is great, @cisaacstern 🎊. Apologies for the radio silence while you have been building this out; I've been out of the office (STAC is the one area of this project where I actually have some experience and might be able to contribute :]). I'll try to address points in the order described in your initial comment.
|
Thanks for this incredibly helpful perspective, @sharkinsspatial. There's a lot to dig into, but one small point of clarification to start. Option A below is the Collection naming scheme as proposed in your table above. Is it indeed a STAC Best Practice to not store a Collection object within a subdirectory of its enclosing Catalog? Option B seems more intuitive to me, but of course just want to do whatever is considered mainstream within the ecosystem.
|
I may be off-base, but radiantearth/stac-api-spec#159 might be related to the catalog / collections layout discussion. |
@cisaacstern Apologies, that is a typo in my comment. Your nested |
Thanks to Tom for stac-utils/xstac#11 (comment) which will be of great help in generating STAC Items. |
I believe the best way to build this is as a standalone GitHub Action, to be called following completion of https://github.com/pangeo-forge/feedstock-creation-action here: staged-recipes/create-feedstock.yaml. A standalone Action repo should make local testing with https://github.com/nektos/act easier and means we can maintain/update/etc. Here is the WIP repo for this Action: https://github.com/pangeo-forge/stac-creation-action. Updates to follow shortly.
{prefix}/pangeo-forge/{feedstock_name}/{recipe_name}.{dataset_type}
pangeo-forge level should be represented by a STAC Catalog
pangeo-forge and {feedstock_name} corresponding to the project or, we might say, "collection" to which the recipe belongs. For example, in the case of the swot-adac project, each feedstock (i.e., the PR repo corresponding to a particular model output) is stored with a target path in the style of pangeo-forge/swot-adac/{feedstock_name}/{recipe_name}.zarr
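To make the path template above concrete, here is a minimal sketch of splitting a target path into its named components. `TargetPath`, `parse_target_path`, and the example bucket and names are hypothetical, introduced only for illustration; this is not existing Pangeo Forge code.

```python
import re
from typing import NamedTuple


class TargetPath(NamedTuple):
    prefix: str
    feedstock_name: str
    recipe_name: str
    dataset_type: str


# Mirrors {prefix}/pangeo-forge/{feedstock_name}/{recipe_name}.{dataset_type}
_PATTERN = re.compile(
    r"^(?P<prefix>.*)/pangeo-forge/"
    r"(?P<feedstock_name>[^/]+)/"
    r"(?P<recipe_name>[^/]+)\.(?P<dataset_type>[^./]+)$"
)


def parse_target_path(path: str) -> TargetPath:
    """Split a target path into the components named in the template."""
    match = _PATTERN.match(path)
    if match is None:
        raise ValueError(f"path does not follow the expected template: {path!r}")
    return TargetPath(**match.groupdict())


# Hypothetical example path.
parts = parse_target_path(
    "s3://example-bucket/pangeo-forge/example-feedstock/example-recipe.zarr"
)
print(parts)
```

A path with an extra project segment, in the style of pangeo-forge/swot-adac/{feedstock_name}/{recipe_name}.zarr, does not match this pattern and raises `ValueError`, which is one way to see the inconsistency between that layout and the template.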
build_stac method or similar within the XarrayZarrRecipe class, and that this method is something which we'll want to call at recipe build time (after finalize_target), so that it can draw upon additional metadata fields (either pre-existing, or which we will add for this purpose) in the recipe's meta.yaml file. While certain cataloging information can be extracted from the zarr metadata directly via xarray (e.g., spatiotemporal extent, variable names, etc.), other key metadata for building expressive STAC objects will need to be fed from outside the dataset (i.e., long-form description, provider url, license, etc.). Much of this is already captured in meta.yaml, so this is a natural place to pull it from.
pystac-backed STAC objects, https://github.com/TomAugspurger/xstac is a good baseline, but may need extension for our specific use case(s)
pystac's type checker), which can be opened in a notebook as demonstrated here. This loading syntax is referenced from https://planetarycomputer.microsoft.com/dataset/daymet-monthly-hi#Example-Notebook and discussed further in the next to-do item.
pystac-based loading linked above currently appears to be the best way to open zarr datasets from STAC
intake-stac is under discussion here: Load collection-level assets into xarray intake/intake-stac#90
cc @rabernat, @sharkinsspatial, @TomAugspurger so you're aware of current progress on this
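As a rough sketch of the metadata split described in the build_stac item above (some fields derived from the dataset, others supplied by meta.yaml), the function below combines two dictionaries into a minimal STAC Collection-shaped dict. The function name, the choice of a Collection rather than an Item, the key names in `derived` and `meta`, and all values are assumptions made for illustration; they do not reflect the actual meta.yaml schema or any existing method.

```python
def build_collection(collection_id, derived, meta):
    """Combine dataset-derived and externally supplied metadata (hypothetical keys)."""
    return {
        "type": "Collection",
        "stac_version": "1.0.0",  # assumed spec version
        "id": collection_id,
        # Supplied from outside the dataset (e.g. a recipe's meta.yaml).
        "description": meta["description"],
        "license": meta["license"],
        "providers": [{"name": meta["provider_name"], "url": meta["provider_url"]}],
        # Derivable from the dataset itself (e.g. via xarray).
        "extent": {
            "spatial": {"bbox": [derived["bbox"]]},
            "temporal": {"interval": [[derived["start"], derived["end"]]]},
        },
        "links": [],
    }


# Placeholder values standing in for what xarray and meta.yaml would provide.
derived = {
    "bbox": [-180.0, -90.0, 180.0, 90.0],
    "start": "2000-01-01T00:00:00Z",
    "end": "2000-12-31T00:00:00Z",
}
meta = {
    "description": "Hypothetical long-form description",
    "license": "CC-BY-4.0",
    "provider_name": "Example Provider",
    "provider_url": "https://example.org",
}

collection = build_collection("example-recipe", derived, meta)
print(collection["extent"])
```

Keeping the two inputs separate makes it explicit which fields would have to be added to meta.yaml and which can be computed after finalize_target.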