Add STAC catalog #297
E2E Test Results
DACCS-iac Pipeline Results
Build URL : http://daccs-jenkins.crim.ca:80/job/DACCS-iac-birdhouse/1830/
Result : failure
BIRDHOUSE_DEPLOY_BRANCH : add_stac
DACCS_CONFIGS_BRANCH : stac_populator
PAVICS_E2E_WORKFLOW_TESTS_BRANCH : master
PAVICS_SDI_BRANCH : master
DESTROY_INFRA_ON_EXIT : true
PAVICS_HOST : https://host-140-133.rdext.crim.ca
Infrastructure deployment failed. Instance has not been destroyed.
@matprov
Quick review pass.
Will review again once the platform works with those configs.
birdhouse/components/stac/config/canarie-api/_docker-compose-extra.yml
@mishaschwartz Now possible via
|
Just a minor edit left to fix.
Is there anything still blocking this PR?
Only the approvals of @mishaschwartz and @tlvu.
@matprov |
First off, sorry for my late review. I have had too much on my plate lately and completely forgot about this PR, given that it has been open for about 6 months and activity only picked up recently.
I found a few things to fix here, but that's alright: they are all in the new code, so there is no regression at all.
@@ -15,7 +15,32 @@
[Unreleased](https://github.com/bird-house/birdhouse-deploy/tree/master) (latest)
@matprov No bumpversion? FYI, the release procedure is described at https://github.com/bird-house/birdhouse-deploy/blob/master/birdhouse/README.rst#release-procedure
Planning on a quick subsequent PR?
services:
  stac:
    container_name: stac
    image: ghcr.io/crim-ca/stac-app:main
Not using an exact version for reproducibility?
I'm guessing this first? crim-ca/stac-app#1
    environment:
      - POSTGRES_USER=${STAC_POSTGRES_USER}
      - POSTGRES_PASS=${STAC_POSTGRES_PASSWORD}
      - POSTGRES_DBNAME=postgis
Just curious, is `postgis` hardcoded somewhere? Why not simply call the DB `stac`?
  stac-browser:
    container_name: stac-browser
    image: ghcr.io/crim-ca/stac-browser:docker_image_push
Versioned image for reproducibility?
      - PGDATABASE=postgis
    volumes:
      - stac-db:/var/lib/postgresql/data
    healthcheck:
Nice to have a healthcheck here. It would be nice for the other 2 containers to have some sort of healthcheck as well.
      retries: 5

  # extend proxy with endpoint and config for STAC API access
  proxy:
Duplicate fragment with the existing file `birdhouse/components/stac/config/proxy/docker-compose-extra.yml`!
@@ -4,4 +4,5 @@
proxy_set_header X-Forwarded-Proto $real_scheme;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Host $host:$server_port;
proxy_set_header Forwarded "proto=https;host=${PAVICS_FQDN}"; # Helps the STAC component to craft URLs containing the full PAVICS_FQDN
Better to use `PAVICS_FQDN_PUBLIC` for anything public facing.
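For context on the `Forwarded` header in the hunk above: a minimal Python sketch, assuming a backend reads the `proto` and `host` parameters in the `key=value;key=value` form of RFC 7239, of how a public base URL could be derived from it. The function names and the `example.org` host are hypothetical, and this is not necessarily how stac-app itself handles the header.

```python
from typing import Dict, Optional


def parse_forwarded(header: str) -> Dict[str, str]:
    """Parse the first element of a Forwarded header into a dict.

    Simplification: quoted values containing ',' or ';' are not supported.
    """
    first_hop = header.split(",")[0]  # only the first proxy hop is considered
    params: Dict[str, str] = {}
    for pair in first_hop.split(";"):
        if "=" not in pair:
            continue
        key, value = pair.split("=", 1)
        # Parameter names are case-insensitive; values may be quoted.
        params[key.strip().lower()] = value.strip().strip('"')
    return params


def base_url_from_forwarded(header: str) -> Optional[str]:
    """Return '<proto>://<host>' if both parameters are present, else None."""
    params = parse_forwarded(header)
    if "proto" in params and "host" in params:
        return f"{params['proto']}://{params['host']}"
    return None


if __name__ == "__main__":
    # Hypothetical value, mirroring "proto=https;host=${PAVICS_FQDN}".
    print(base_url_from_forwarded("proto=https;host=example.org"))
```

Whatever hostname variable is substituted into the header is the hostname that ends up in the generated URLs, which is why the choice of variable matters for public-facing links.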
  # populates STAC catalog with sample collection items
  stac-populator:
    container_name: stac-populator
    image: ghcr.io/crim-ca/stac-populator:master
Versioned image for reproducibility?
      - STAC_ASSET_GENERATOR_TIMEOUT=${STAC_ASSET_GENERATOR_TIMEOUT}
      - STAC_HOST=http://stac:8000/stac # STAC API internally accessed to avoid Twitcher authentication
    command: >
      bash -c "./wait-for-it.sh stac:8000 -t 30 && ./populate.sh"
Noob question about the stac-populator: does it just populate once and exit, or does it stay in the background, listen for new data, and repopulate?
Another noob question about the stac-populator: how does it know which collection to crawl in order to populate the stac-db? Is the path https://pavics.ouranos.ca/twitcher/ows/proxy/thredds/catalog/datasets/catalog.html hardcoded? This should be configurable.
Or are we crawling directly on disk? But then I do not see any volume mount.
@tlvu The pipeline for populating STAC is being handled in a separate repo (https://github.com/crim-ca/stac-populator). I think this is just for testing.
@@ -0,0 +1,6 @@
version: "3.4"
This entire file should be in `birdhouse/optional-components/stac-public-access/config/magpie/docker-compose-extra.yml` to follow the new layout, I think.
@mishaschwartz Oh, this one is a special case! If this file moves to `birdhouse/optional-components/stac-public-access/config/magpie/docker-compose-extra.yml`, then there is no `docker-compose-extra.yml` file at the root of the component `birdhouse/optional-components/stac-public-access/`!
Would the "inner" docker-compose fragment file be discovered even if there is no file at the root of the component?
This works... but if we want to follow the pattern we use elsewhere, we should really add this to `birdhouse/optional-components/stac-public-access/config/magpie/docker-compose-extra.yml` so that it will only be set if magpie is enabled as well.
We're almost certainly going to make magpie a required component but it would be nicer to keep the pattern we've already established.
Thanks for finding that @tlvu
Yes agreed. The new directory layout pattern is not only for looking nice and tidy, it's to allow 100% flexible deployment.
I think we need to document this pattern here https://github.com/bird-house/birdhouse-deploy/blob/master/birdhouse/README.rst and explain the proper reason behind it.
## Overview

This PR includes some changes that were suggested in a review for #297. But because the PR was already merged they are included here:

- removes extra block to include in docker compose files (no longer needed)
- moves docker compose file in `stac-public-access` component to the correct location
- uses `PAVICS_FQDN_PUBLIC` for public facing URLs in all places

## Changes

**Non-breaking changes**

- code reorganization

**Breaking changes**

None

## Related Issue / Discussion

- Related to #297

## Additional Information
## Overview

`./components/stac` is added to `EXTRA_CONF_DIRS`.

## Changes

- Service `stac` (API) gets added with endpoints `/twitcher/ows/proxy/stac` and `/stac`.
- STAC catalog can be explored via the `stac-browser` component, available under `/stac-browser`.
- Image `crim-ca/stac-app` is a STAC implementation based on `stac-utils/stac-fastapi`.
- Image `crim-ca/stac-browser` is a fork of `radiantearth/stac-browser`.
- Adds `Magpie` permissions and service for `stac` endpoints.
- Uses stac-populator to populate STAC catalog with sample collection items via CEDA STAC Generator, employed in sample CMIP Dataset Ingestion Workflows.

## Demo Instance

STAC API : https://stac-dev.crim.ca/stac/
STAC Browser : https://stac-dev.crim.ca/stac-browser/
Note that by default the STAC API returns 10 items to reduce payload size. This limit can be changed by adding `?limit=200` to the URL in order to query 200 items. The response payload contains a link referring to the `next` items, which adds a token to the query params so that the STAC API returns the next results.

Sample STAC API collection query using a CLI

Remove the `-c` flag for a global query across all collections.

Sample STAC API global query using CQL via cURL call

Note that the operators are described here: https://portal.ogc.org/files/96288
Get the queryables of the CMIP6 collection, statically created at collection creation
https://stac-dev.crim.ca/stac/collections/c604ffb6d610adbb9a6b4787db7b8fd7
Get the queryables of the CMIP6 collection, dynamically created at query time
https://stac-dev.crim.ca/stac/collections/c604ffb6d610adbb9a6b4787db7b8fd7/queryables
Get the queryables of the union of the CMIP5 and CMIP6 collections, dynamically created at query time
https://stac-dev.crim.ca/stac/queryables?collections=0798aa197d54eb4332767a5a4077fb0f,c604ffb6d610adbb9a6b4787db7b8fd7
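The paging behaviour noted earlier (10 items by default, `?limit=200`, and a `next` link carrying a token) can be sketched with the Python standard library. This is only an illustration: the `/search` path is assumed from the STAC API item-search convention, the token value in the usage example is made up, and the helper names `with_limit` and `next_link` are hypothetical. No request is sent; the helpers only build the URL and read a response payload that has already been decoded from JSON.

```python
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_limit(url: str, limit: int) -> str:
    """Return `url` with its `limit` query parameter set to `limit`."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))
    query["limit"] = str(limit)  # overrides any existing limit
    return urlunsplit(parts._replace(query=urlencode(query)))


def next_link(payload: dict) -> Optional[str]:
    """Return the href of the link with rel == "next", or None on the last page."""
    for link in payload.get("links", []):
        if link.get("rel") == "next":
            return link.get("href")
    return None


if __name__ == "__main__":
    # Assumed search endpoint on the demo instance.
    print(with_limit("https://stac-dev.crim.ca/stac/search", 200))
    # Hypothetical response fragment; the real token format may differ.
    page = {"links": [{"rel": "next", "href": "https://stac-dev.crim.ca/stac/search?limit=200&token=abc"}]}
    print(next_link(page))
```

A client would request the first URL, then keep following `next_link(...)` on each response until it returns `None`.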
daccs_configs_branch: stac_populator
daccs_skip_ci: true
fyi @huard @mishaschwartz