-
That makes a lot of sense to me! I have no idea how to go about implementing that, though!
-
Some answers from the CMR-STAC team
Based on this, I think aligning earthaccess with CMR-STAC is doable and will bring great flexibility and re-usability to existing workflows.
-
Thanks for documenting all these ideas @betolink ! Yes, my suggestion was to support the same search keywords and values as the STAC API (e.g.
Based on this, I can see why the above suggestion might be hard! In practice I find this lack of full compliance super difficult as a user. I’ve tried to use CMR-STAC multiple times over the last couple years but unfortunately each time find that it falls short and I stop using it:
So I wonder instead of the proxy approach, what is preventing hosting STAC metadata directly in CMR (at least just for the new cloudHosted datasets)?
Providing fsspec objects is nice for consumption by downstream tools. I'll caution that intake-stac worked well for the case of an arbitrary single file (CSV, HDF, TIF, etc.) to give back a Python object, but because searches can return pretty complicated collections of data, returning a composite data object in an efficient and accurate way is pretty hard. You start getting into fsspec cache settings, and other settings based on the tool that ultimately does I/O (e.g. rasterio/GDAL). This is where development of intake-stac basically stalled. If you restrict the possible data formats it does become much more tractable. For example, just dealing with COGs is complicated enough with stackstac and odc-stac :)
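To make the "same search keywords and values as the STAC API" suggestion above a bit more concrete, here is a minimal sketch of translating a few STAC API item-search keywords (`bbox`, `datetime`, `collections`, `limit`) into CMR granule-search parameters. The function name `stac_to_cmr_params` is hypothetical and not part of earthaccess or python_cmr, and treating STAC collection ids as CMR collection concept ids is an assumption made purely for illustration.

```python
def stac_to_cmr_params(bbox=None, datetime_range=None, collections=None, limit=None):
    """Translate a few STAC API search keywords into CMR-style granule parameters.

    Hypothetical helper for illustration only.
    """
    params = {}
    if bbox is not None:
        # STAC bbox is [west, south, east, north]; CMR bounding_box takes the
        # same order as a comma-separated string.
        params["bounding_box"] = ",".join(str(v) for v in bbox)
    if datetime_range is not None:
        # STAC uses "start/end" with ".." for an open end; CMR temporal uses
        # "start,end" with an empty string for an open end.
        start, sep, end = datetime_range.partition("/")
        if not sep:
            end = start  # a single instant becomes a zero-length range
        start = "" if start == ".." else start
        end = "" if end == ".." else end
        params["temporal"] = f"{start},{end}"
    if collections is not None:
        # Assumption: the caller passes CMR concept ids as STAC collection ids.
        params["collection_concept_id"] = list(collections)
    if limit is not None:
        params["page_size"] = int(limit)
    return params


cmr_params = stac_to_cmr_params(
    bbox=[-106, 35, -105, 36],
    datetime_range="2021-02-01T00:00:00Z/..",
    collections=["C2021957657-LPCLOUD"],
    limit=10,
)
print(cmr_params)
```

A thin translation layer like this would only cover the simple keywords; anything that depends on how collections are identified on each side would need input from the CMR-STAC team.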
-
Hi @abbottry -- is this a decent place to report other issues about the EarthData STAC? I wanted to note that, in addition to the issues mentioned above, the static JSON versions could be improved, e.g. for better consistency with stac-browser. Currently stac-browser does not recognize either the pagination or the STAC API, which (rendered example) makes it appear as if there are only 10 entries. As NASA is already providing both static JSON and dynamic API endpoints, I understand the API should be mentioned in the JSON using the
Examples of this can be seen on, e.g., the Planetary Computer STAC (note the API box at top and the added search abilities that come with this; see source json for comparison).
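As background on the pagination point: STAC API responses advertise further pages through a link whose `rel` is `next`, and a client that ignores it only ever sees the first page. Below is a minimal sketch of following those links, assuming GET-style paging where the link `href` alone identifies the next page. `iter_items` and the in-memory `pages` dict are hypothetical stand-ins, not part of any STAC client or of the EarthData catalog.

```python
def iter_items(first_page, fetch):
    """Yield every feature across pages by following rel="next" links.

    `fetch` is any callable that maps a link href to the next page (a dict).
    """
    page = first_page
    while page is not None:
        yield from page.get("features", [])
        next_href = next(
            (link["href"] for link in page.get("links", []) if link.get("rel") == "next"),
            None,
        )
        # Stop when the page carries no "next" link.
        page = fetch(next_href) if next_href else None


# Hypothetical two-page result set held in memory instead of fetched over HTTP.
pages = {"page-2": {"features": [{"id": "c"}], "links": []}}
first = {
    "features": [{"id": "a"}, {"id": "b"}],
    "links": [{"rel": "next", "href": "page-2"}],
}
all_ids = [item["id"] for item in iter_items(first, pages.__getitem__)]
print(all_ids)
```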
-
It's been a while and I thought I'd revive this discussion. I'm still a bit confused by (1) NASA CMR already directly supporting STAC as an output type versus (2) the NASA-CMR STAC proxy effort. If the cmr-stac proxy is the future of development that would be good to know - as then
However, another possible approach is to just expose the STAC output type from CMR. I just opened an issue about that here: nasa/python_cmr#81. This would make it straightforward to take advantage of other great tools like
The example below would require a small change to

```python
import ast

import geopandas as gpd
from cmr import GranuleQuery

api = GranuleQuery()
search = api.parameters(
    point=(-105.78, 35.79),
    temporal=('2021-02-01', '2021-03-01'),
    collection_concept_id='C2021957657-LPCLOUD',  # Required for STAC search
)
items = search.format("stac").get()
# python_cmr returns a list with a string-wrapped dict, so parse the first entry
features = ast.literal_eval(items[0])
gf = gpd.GeoDataFrame.from_features(features, crs='EPSG:4326')
```
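One caveat on the string-unwrapping step above: `ast.literal_eval` only accepts Python literals, so it raises on JSON text containing `null`, `true`, or `false`. Whether CMR's STAC output ever contains those values is not something this thread confirms, so the sketch below is just a defensive variant that tries `json.loads` first and falls back to `ast.literal_eval`. The helper name `parse_stac_text` and the trimmed-down `raw` payload are hypothetical.

```python
import ast
import json


def parse_stac_text(text):
    """Parse a string-wrapped STAC response, preferring JSON over Python literals."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Fall back for strings that are Python reprs (e.g. single-quoted keys).
        return ast.literal_eval(text)


# Hypothetical stand-in for one element of the list returned by python_cmr.
raw = (
    '{"type": "FeatureCollection", "features": '
    '[{"type": "Feature", "geometry": null, "properties": {"eo:cloud_cover": null}}]}'
)
parsed = parse_stac_text(raw)
print(parsed["type"], len(parsed["features"]))
```

The parsed dict can then be handed to `gpd.GeoDataFrame.from_features` in the same way as in the snippet above.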
-
Perhaps orthogonal to the issue of a STAC API, but this sounds really compelling to me: https://cloudnativegeo.org/blog/2024/08/introduction-to-stac-geoparquet/ Specifically, it's really nice to be able to see the entire catalog metadata as data like this, rather than having to pound the heck out of the API...
-
Based on what the GIS/Pangeo communities are doing, I think earthaccess could be more useful if it were aligned with the STAC specification. One idea would be to support both CMR and CMR-STAC access patterns; @scottyhq already suggested something like this in #167.
It would be interesting to explore intake-stac to stream the results, with earthaccess only providing the authenticated fsspec sessions.
This integration will make earthaccess more flexible, but there are some things that we need to verify before trying to implement this:
DOI/short_name to search for data collections?