Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable Strict Mode for Queryables / Search #163

Open
bitner opened this issue Feb 7, 2023 · 9 comments
Open

Enable Strict Mode for Queryables / Search #163

bitner opened this issue Feb 7, 2023 · 9 comments

Comments

@bitner
Copy link
Collaborator

bitner commented Feb 7, 2023

Enable mode to only allow search properties that are in the queryables table. When a queryable has a range or enum, return an error when a search requests values outside of that range.

@drnextgis
Copy link
Collaborator

Having this feature would be greatly appreciated: stac-utils/stac-fastapi-pgstac#61

@drnextgis
Copy link
Collaborator

drnextgis commented Sep 24, 2023

@bitner I would like to dive into implementing this feature. I've conducted some initial testing, which you can review in this commit. However, I have a few uncertainties. Firstly, is this the appropriate place to put this functionality?

I did some tests with pure SQL and with stac-fastapi. At first glance it seems to be working.

SQL

postgis=# SELECT cql2_query('{"op": "gte", "args": [{"property": "eo:cloud_cover"}, 0.59]}');
                            cql2_query                             
-------------------------------------------------------------------
 to_int(content->'properties'->'eo:cloud_cover') >= to_int('0.59')
(1 row)

postgis=# SELECT cql2_query('{"op": "gte", "args": [{"property": "xxx"}, 0.59]}');
ERROR:  Term xxx is not found in queryables.
CONTEXT:  PL/pgSQL function cql2_query(jsonb,text) line 110 at RAISE

postgis=# SELECT cql2_query('{"op": "gte", "args": [{"property": "gsd"}, 0.59]}');
ERROR:  Term gsd is not found in queryables.
CONTEXT:  PL/pgSQL function cql2_query(jsonb,text) line 110 at RAISE

STAC API

Using http requests it also seems to be working.

$ echo '{"filter-lang": "cql2-json", "filter": {"op": "gte", "args": [{"property": "eo:cloud_cover"}, 0.59]}}' | http http://0.0.0.0:8082/search | jq .context
{
  "limit": 10,
  "returned": 0
}
$ echo '{"filter-lang": "cql2-json", "filter": {"op": "gte", "args": [{"property": "xxx"}, 0.59]}}' | http http://0.0.0.0:8082/search | jq
{
  "code": "RaiseError",
  "description": "Term xxx is not found in queryables."
}

But this example is kind of weird because gsd doesn't exist in queryables (but unlike xxx it exists in STAC items):

$ echo '{"filter-lang": "cql2-json", "filter": {"op": "gte", "args": [{"property": "gsd"}, 0.59]}}' | http http://0.0.0.0:8082/search | jq .context
{
  "limit": 10,
  "returned": 10
}

If I modify the value, it starts functioning as anticipated 🤔

$ echo '{"filter-lang": "cql2-json", "filter": {"op": "gte", "args": [{"property": "gsd"}, 0.0]}}' | http http://0.0.0.0:8082/search
{
  "code": "RaiseError",
  "description": "Term gsd is not found in queryables."
}

@drnextgis
Copy link
Collaborator

Alright, the mystery has been unraveled. It turns out that the issue stemmed from the fact that the search query was stored in the searches table cache. Once it was removed from there, everything functioned as anticipated:

$ echo '{"filter-lang": "cql2-json", "filter": {"op": "gte", "args": [{"property": "gsd"}, 0.59]}}' | http http://0.0.0.0:8082/search
{
  "code": "RaiseError",
  "description": "Term gsd is not found in queryables."
}

So, please confirm if I'm headed in the right direction, and if so, I'll proceed with crafting a PR.

@bitner
Copy link
Collaborator Author

bitner commented Sep 26, 2023

Hey @drnextgis I'll look through this and get back to you later today

@bitner
Copy link
Collaborator Author

bitner commented Sep 28, 2023

@drnextgis I think that your approach looks like a good place to put this. Let me know if there is anything that I can do to help you get a PR up.

With 0.8.0 I tried to simplify things a bit to allow PRs with "unreleased" rather than version tagged migrations, so you will not need to create a new version tag in order to get the tests to run any more, but you do need to run scripts/stageversion which will create the database migrations for you.

Do make sure to add some tests either in the basic sql tests (this is just a sql file that gets run and another file that is just the expected output from running that sql file) or using the pgtap tests.

Let me know if there is anything that you'd like to get on a call to pair with, I'd be expected to help get more contributors!

@mmcfarland
Copy link
Collaborator

I'd look forward to this feature as well. Just want to make sure we're considering it an optional mode (defaulted to off) configured in pgstac.settings so that searches work before users can get their queryables configured.

@bitner
Copy link
Collaborator Author

bitner commented Sep 28, 2023

@mmcfarland @drnextgis yes, we should set this as a configuration parameter that should also control whether queryables advertises "additional_fields":true or not.

@drnextgis
Copy link
Collaborator

Thank you for the guidance @bitner! I've prepared a PR, please review it: #216

@drnextgis
Copy link
Collaborator

@bitner, after reviewing the documentation, I've come to understand that that the queryables schema serves an informative role and isn't intended for validation purposes. Additionally, the validation process is contingent on the specific operator being used. For instance, in the case of the != operator, none of the checks you mentioned (range or enum) can be executed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants