Reorg0.7.0 #160
bitner commented Feb 6, 2023 (edited)
- Reorganize code base to create clearer separation between pgstac sql code and pypgstac.
- Move Python tooling to use hatch with all python project configuration in pyproject.toml
- Rework testing framework to not rely on pypgstac or migrations. This allows tests to run on any code updates without creating a version first. If a new version has been staged, the tests will still run through all incremental migrations to make sure they pass as well.
- Add pre-commit to run formatting as well as the tests appropriate to the files that have changed.
- Add a query queue to allow for deferred processing of steps that do not change the ability to get results, but enhance performance. The query queue allows pg_cron or a similar scheduler to run tasks that are placed in the queue.
- Modify triggers to allow the use of the query queue for building indexes, adding constraints that are used solely for constraint exclusion, and updating partition and collection spatial and temporal extents. The use of the queue is controlled by the new configuration parameter "use_queue", which can be set as the pgstac.use_queue GUC or by setting it in the pgstac_settings table.
- Reorganize how partitions are created and updated to maintain more metadata about partition extents and better tie the constraints to the actual temporal extent of a partition.
- Add "partitions" view that shows stats about number of records, the partition range, constraint ranges, actual date range and spatial extent of each partition.
- Add ability to automatically update the extent object on a collection using the partition metadata via triggers. This is controlled by the new configuration parameter "update_collection_extent", which can be set as the pgstac.update_collection_extent GUC or by setting it in the pgstac_settings table. This can be combined with "use_queue" to defer the processing.
- Add many new tests.
- Migrations now make sure that all objects in the pgstac schema are owned by the pgstac_admin role. The "SECURITY DEFINER" marking has been moved to the lower-level functions responsible for creating/altering partitions and adding records to the search/search_wheres tables. This should open the door for approaches to using Row Level Security.
- Set search_path and application_name after connection in pypgstac rather than as a kwarg parameter, for compatibility with RDS. Fixes "Make connection command-line parameters optional" #156.
- Allow pypgstac loader to load data on pgstac databases that have the same major version even if the minor version differs. Fixes "Allow use of pypgstac loader within same major version." #162. Cherry-picked from "Allow use of pypgstac loader within same minor version" #164 (Thanks @drnextgis).
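As a rough illustration of the two new configuration parameters described above, the sketch below builds session-level SET statements for the pgstac.use_queue and pgstac.update_collection_extent GUCs. The helper name queue_settings_sql is hypothetical and not part of pypgstac, and the 'true' / 'false' text values are an assumption about what the settings accept.

```python
from typing import List


def queue_settings_sql(use_queue: bool, update_collection_extent: bool) -> List[str]:
    """Build session-level SET statements for the two GUCs named in the PR description.

    Hypothetical helper for illustration only; it is not part of pypgstac.
    """

    def as_text(flag: bool) -> str:
        # Assumption: the settings accept the text values 'true' / 'false'.
        return "true" if flag else "false"

    return [
        f"SET pgstac.use_queue = '{as_text(use_queue)}';",
        f"SET pgstac.update_collection_extent = '{as_text(update_collection_extent)}';",
    ]


if __name__ == "__main__":
    # Print the statements that would enable both behaviours for one session.
    for statement in queue_settings_sql(use_queue=True, update_collection_extent=True):
        print(statement)
```

The statements only affect the session that runs them; a persistent setting would go through the pgstac_settings table instead, which this sketch does not cover.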
src/pypgstac/pypgstac/db.py
@@ -86,7 +87,7 @@ def get_pool(self) -> ConnectionPool:
             num_workers=settings.db_num_workers,
             kwargs={
                 "options": "-c search_path=pgstac,public"
-                " -c application_name=pypgstac"
+                " -c application_name=pypgstac",
@bitner could you please make PgstacDB more robust by not using command-line options? As @captaincoordinates pointed out, PgstacDB cannot work properly with Amazon RDS Proxy in its current state. A possible solution would be to remove them and, once the connection is established, execute:
self.connection.execute(
"set search_path = pgstac,public; set application_name=pypgstac"
)
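A minimal sketch of that suggestion, assuming only that the connection object exposes an execute(sql) method (as psycopg 3 connections do). The names apply_session_settings and RecordingConnection are hypothetical; the stand-in connection exists only so the sketch runs without a database, and this is not the change that was actually committed.

```python
from typing import Any, List

# Session settings applied after connecting, instead of via the "options" kwarg.
SESSION_STATEMENTS = (
    "SET search_path = pgstac, public;",
    "SET application_name = 'pypgstac';",
)


def apply_session_settings(connection: Any) -> None:
    """Run each session-level SET statement on an already-open connection."""
    for statement in SESSION_STATEMENTS:
        connection.execute(statement)


class RecordingConnection:
    """Hypothetical stand-in for a database connection that records executed SQL."""

    def __init__(self) -> None:
        self.executed: List[str] = []

    def execute(self, sql: str) -> None:
        self.executed.append(sql)


if __name__ == "__main__":
    conn = RecordingConnection()
    apply_session_settings(conn)
    for sql in conn.executed:
        print(sql)
```

Issuing the statements one at a time keeps each call to a single command, which avoids relying on multi-statement support in the driver or proxy.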
@drnextgis @captaincoordinates, I've made a change for this in the latest commit. Can you verify that this works with RDS Proxy?
@bitner thanks for adding this change. Unfortunately, it will be a little while before I can test this on RDS Proxy: I'm currently using pypgstac via stac-fastapi, and it doesn't support 0.7.0. I'll have to create a new project and some new AWS infrastructure for it to use before I can verify.