Reading STAC output from satsearch library #256
Comments
Full traceback: ---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-27-8e028b95925b> in <module>
23
24 # KeyError: 'links'
---> 25 cat = pystac.read_file('my-s2-l2a-cogs.json')
~/miniconda3/envs/intake-stac-gui/lib/python3.7/site-packages/pystac/__init__.py in read_file(href)
69 by the JSON read from the file located at HREF.
70 """
---> 71 return STACObject.from_file(href)
72
73
~/miniconda3/envs/intake-stac-gui/lib/python3.7/site-packages/pystac/stac_object.py in from_file(cls, href)
521
522 if cls == STACObject:
--> 523 o = STAC_IO.stac_object_from_dict(d, href=href)
524 else:
525 o = cls.from_dict(d, href=href)
~/miniconda3/envs/intake-stac-gui/lib/python3.7/site-packages/pystac/serialization/__init__.py in stac_object_from_dict(d, href, root)
35
36 if info.object_type == STACObjectType.CATALOG:
---> 37 return Catalog.from_dict(d, href=href, root=root)
38
39 if info.object_type == STACObjectType.COLLECTION:
~/miniconda3/envs/intake-stac-gui/lib/python3.7/site-packages/pystac/catalog.py in from_dict(cls, d, href, root)
780 @classmethod
781 def from_dict(cls, d, href=None, root=None):
--> 782 catalog_type = CatalogType.determine_type(d)
783
784 d = deepcopy(d)
~/miniconda3/envs/intake-stac-gui/lib/python3.7/site-packages/pystac/catalog.py in determine_type(stac_json)
54 self_link = None
55 relative = False
---> 56 for link in stac_json['links']:
57 if link['rel'] == 'self':
58 self_link = link
KeyError: 'links' |
So it looks like the error is because the single file STAC that sat-search saves does not have a links object, which it should because it's supposed to be a valid STAC catalog. I can cut a 0.3.1 release for this (probably need to add |
Thanks @matthewhanson. Following the example search results in this repo and the test file here, I was able to determine that the following must be added to sat-search results at a minimum:
{
"links": [],
"id": "sat-search-results",
"stac_version": "1.0.0-beta.2",
"description": "sat search results",
"stac_extensions": [
"single-file-stac"
],
"type": "FeatureCollection",
.
.
.
Note there are other validation errors showing up against 1.0.0-beta.2, so I'd suggest a test over in sat-search that does the following (example using modified search output):
sfs = pystac.read_file('my-s2-l2a-cogs.json')
sfs.validate()  # ValidationError: [-180, -90, 180, 90] is not of type 'object'
I added stac_version |
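The minimum additions listed in the comment above can be sketched with the standard library alone. This is only an illustration: the helper name `patch_search_results` is hypothetical (it is not part of sat-search or pystac), and the default values are copied from the comment rather than from either library. It only addresses the missing top-level keys (such as the `links` key behind the KeyError); as noted above, other validation errors against 1.0.0-beta.2 would remain.

```python
import copy
import json

# Top-level fields the comment above lists as the minimum additions.
# Values are taken from that comment, not from sat-search itself.
MINIMUM_FIELDS = {
    "links": [],
    "id": "sat-search-results",
    "stac_version": "1.0.0-beta.2",
    "description": "sat search results",
    "stac_extensions": ["single-file-stac"],
}


def patch_search_results(feature_collection: dict) -> dict:
    """Return a shallow copy with any missing minimum fields filled in.

    Hypothetical helper: existing keys are never overwritten, and the
    input dict is left unmodified.
    """
    patched = dict(feature_collection)
    for key, default in MINIMUM_FIELDS.items():
        if key not in patched:
            patched[key] = copy.deepcopy(default)
    return patched


# Usage with an in-memory stand-in for the saved search output.
raw = {"type": "FeatureCollection", "features": []}
patched = patch_search_results(raw)
print(json.dumps(patched, indent=2))
```

For a file on disk, the same function could be applied between `json.load` and `json.dump` before handing the result to `pystac.read_file`.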
@lossyrob I'm still a bit perplexed on the pystac side about why single-file-stac behaves differently from pystac.Catalog, specifically when iterating over items contained in the search results:
test_url = 'https://raw.githubusercontent.com/stac-utils/pystac/develop/tests/data-files/examples/1.0.0-beta.2/extensions/single-file-stac/examples/example-search.json'
sfs = pystac.read_file(test_url)
sfs.validate()
print(type(sfs))  # <class 'pystac.catalog.Catalog'>
# Does not iterate Items in SFS catalog
for f in sfs.get_all_items():
    print(f.id)
# Get Items from 'features' attribute instead
for f in sfs.ext['single-file-stac'].features:
    print(f.id)  # LC80370332018039LGN00, LC80340332018034LGN00
@scottyhq the single file stac extension was changed at some point in the spec from being an independent object (an ItemCollection, back when it was part of the core spec) to inheriting from Catalog. I don't think there was a lot of thought put into how a single file stac catalog should override the behaviors of Catalog, and there are some complexities around it that make the expected implementation a bit tricky. E.g. - since a single file STAC is a catalog, can it contain links to child items as well? What does that mean for what
We had a STAC call today where single file STACs got brought up and there was a question about why they inherit from Catalog, and if that should be the case. I feel like this is an instance where a single-file-stac extension clashes with the core Catalog types enough to make me feel like it shouldn't be a Catalog extension.
Keeping it as an extension, I'm not sure what to do here - the core functionality of "get_all_items" looks for any child links that are items and returns them. In this case, the items are populated in an extension field, and are considered distinct. To modify
Perhaps a better way to approach it would be to consider a single file stac as an ItemCollection and not a Catalog, and bake ItemCollection in as an additional STACObject type in PySTAC (which it was until it was dropped from the spec). We could have conversion logic from ItemCollection -> Catalog and Catalog -> ItemCollection, which would be more straightforward.
This would make PySTAC a bit out of sync with the spec (e.g. it doesn't currently include any concepts from stac-api-spec, and single-file-stac is currently a Catalog extension but the way we would work with it is not extension-like). This might be better to go into a separate library - you're already using sat-search to get at the item collection, and there's been a lot of talk about a PySTAC-based stac API client that could serve those same needs.
Given that ItemCollection is a STAC API concept, I think my ideal conclusion would be:
In the short term, though, I'm not sure if you are blocked without a workaround, which I think we can come up with - besides having to iterate over the features and not with |
Thanks @lossyrob for the thoughtful details. I opened up this issue originally while trying to switch between a
From a 'new user' perspective who hasn't been up to speed on the discussion, it feels like the STAC search APIs should return something that tools designed around the core spec should be able to consume. But if I'm following you'd need something like the
No rush here really, iterating over features works, but I did have to spend a bit of time looking at the tests in this library to figure out how to do it. I expected |
Tying up loose ends:
I think the only thing missing from @lossyrob's suggestions is a convenience method to convert between ItemCollections and Catalogs. @scottyhq is that still something that would be useful? Or can we consider this issue fixed? FYSA here's what your example looks like with pystac-client:
import pystac
import pystac_client
import json
from pystac_client import Client
from pystac import ItemCollection
from pystac.validation import validate_dict
print(pystac.__version__)
print(pystac_client.__version__)
bbox = [35.48, -3.24, 35.58, -3.14]
dates = "2020-07-01/2020-08-15"
URL = "https://earth-search.aws.element84.com/v0"
client = Client.open(URL)
results = client.search(
    collections=["sentinel-s2-l2a-cogs"],
    datetime=dates,
    bbox=bbox,
)
# 18 items found
items = results.item_collection()
print(len(items))
items.save_object("my-s2-l2a-cogs.json")
# validating an ItemCollection doesn't make sense, as there isn't a jsonschema for it.
item_collection = ItemCollection.from_file("my-s2-l2a-cogs.json") |
Thanks for checking @gadomski ! Feel free to close this issue. But I do think a Catalog <-> ItemCollection utility would be very useful. See discussion of why here : gjoseph92/stackstac#86 |
Cool, thanks for pointing me at that issue. If we decide it should be a PySTAC thing we can make a new issue to capture. 🥂 |
It appears pystac cannot currently read in the results from the https://github.com/sat-utils/sat-search library. Using the validation code in the pystac docs does not return any error, but I'm guessing the JSON returned by sat-search is in fact not valid for some reason, @matthewhanson or @lossyrob?