Verify that demo database is in sync with production database. #96

Closed
dblodgett-usgs opened this issue Jan 4, 2023 · 12 comments

@dblodgett-usgs
Member

The demo database seems to be out of sync with production. We need to verify that the demo database is in sync with production and operates the same way.

@gzt5142
Collaborator

gzt5142 commented Jan 4, 2023

Received. Not able to assign myself... should I be a member of the org?

@dblodgett-usgs
Member Author

You are a member of the repo now.

@gzt5142
Collaborator

gzt5142 commented Jan 4, 2023

Characterizing the issue:

The existing crawler inserts geometry into the shape column; the demo database has both shape and location geometry columns. The production database, if it was created with the tables.sql script, has only a location column.
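
To confirm what each environment actually contains, the PostGIS geometry_columns view can be queried; a quick sketch, assuming PostGIS is installed and the feature tables live in the nldi_data schema (the host is a placeholder):

    psql --dbname=nldi --username=nldi_schema_owner --host=<demo-or-prod-host> -c \
        "SELECT f_table_name, f_geometry_column
           FROM geometry_columns
          WHERE f_table_schema = 'nldi_data';"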

@gzt5142
Collaborator

gzt5142 commented Jan 4, 2023

This has implications for porting the crawler to Python. I'm hoping to replicate the behavior I see in the Java version (where shape holds the geometry) and test it against the demo DB. Whether a successful demo test will also work in production is still an open question.

@gzt5142
Collaborator

gzt5142 commented Jan 25, 2023

OK... so I'm jammed up here on how the Docker images are used to create the demo database. It looks like the database is just restored from a snapshot taken at some point with the feature table populated. That's the part to redo.

So -- if this repo will be transitioning to another technology (i.e. Python + SQLAlchemy + Alembic) to automate database creation, we can just keep track of this detail for that work. But if we're sticking with Liquibase et al., then I will need to figure out how to push this change into the existing framework.
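
For reference, here is a minimal sketch of how a snapshot like that could be restored when the demo container initializes; the script name, path, and flags are assumptions, not the repo's actual setup:

    #!/bin/bash
    # Hypothetical init script, e.g. docker-entrypoint-initdb.d/99-restore-demo.sh
    set -e
    # Unpack the snapshot artifact and restore it into the demo database
    gunzip -c /docker-entrypoint-initdb.d/nldi_data.crawler_source.pgdump.gz \
        > /tmp/nldi_data.crawler_source.pgdump
    pg_restore --dbname=nldi --username=nldi_schema_owner --no-owner \
        /tmp/nldi_data.crawler_source.pgdump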

@dblodgett-usgs
Member Author

This is probably a topic to bring up with others working on USGS databases. I think Liquibase is probably what we are sticking with.

Can you just fix up and re-up the snapshot for now, and we can take up automating the demo database at a later time?

@gzt5142
Collaborator

gzt5142 commented Jan 26, 2023

Can you just fix up and re-up the snapshot for now, and we can take up automating the demo database at a later time?

Short answer is yes... I can create a snapshot to use in place of the current one. The Dockerfile specifies the location of the pgdump snapshots to use; for example:

https://github.com/internetofwater/nldi-db/releases/download/artifacts-1.0.0/nldi_data.crawler_source.pgdump.gz

We can either create a new release (1.0.1, say) or overwrite the existing release. I'm not finding any documentation in the repo from Ethan on the workflow for these artifacts. I'll need to do some digging -- so while I think the final answer is going to be easy, it likely won't be fast.
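
Whatever the documented workflow turns out to be, something like the GitHub CLI could publish a refreshed snapshot as a new release tag; a sketch only, not the established process:

    # Hypothetical: attach a refreshed snapshot to a new release tag
    gzip -k /tmp/nldi_data.crawler_source.pgdump
    gh release create artifacts-1.0.1 \
        /tmp/nldi_data.crawler_source.pgdump.gz \
        --repo internetofwater/nldi-db \
        --title "Database artifacts 1.0.1" \
        --notes "Refreshed crawler_source snapshot for the demo database"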

@gzt5142
Collaborator

gzt5142 commented Jan 26, 2023

Creating the dump is easy enough:

pg_dump  \
    --dbname=nldi \
    --host=172.18.0.1 \
    --port=5432 \
    --username=nldi_schema_owner \
    --table=nldi_data.crawler_source \
    --format=custom \
    --file /tmp/nldi_data.crawler_source.pgdump

... assuming that the demo database is spruced up the way we want it. The rest of the work is creating the release and artifacts without breaking anything else.
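
For the round trip, the matching restore into the demo database would look roughly like this (connection details copied from the dump command above; --clean/--if-exists are an assumption about wanting to replace the existing table):

    pg_restore \
        --dbname=nldi \
        --host=172.18.0.1 \
        --port=5432 \
        --username=nldi_schema_owner \
        --clean --if-exists \
        /tmp/nldi_data.crawler_source.pgdump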

@dblodgett-usgs
Member Author

@gzt5142 -- I think I am going to take care of this in #100

My plan is to dump the tables that need to be loaded for the demo database and load them using the same mechanism as you would for the production database. As part of that, I'm thinking I'll create one dump file that contains several tables, so that in demo we can load a number of demo feature sources and on prod we can load a different set.

Not quite sure how that's going to work yet, but it seems like the right concept. Any thoughts?
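
One possible shape for that multi-table dump, sketched with hypothetical feature table names (the actual set per environment is the open question here):

    # Dump several tables into one custom-format archive (table names are examples)
    pg_dump \
        --dbname=nldi \
        --username=nldi_schema_owner \
        --format=custom \
        --table=nldi_data.crawler_source \
        --table=nldi_data.feature_demo_a \
        --table=nldi_data.feature_demo_b \
        --file=/tmp/nldi_data.demo_load.pgdump

    # On demo, restore everything; on prod, pick only the tables that apply
    pg_restore \
        --dbname=nldi \
        --username=nldi_schema_owner \
        --schema=nldi_data \
        --table=crawler_source \
        /tmp/nldi_data.demo_load.pgdump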

@gzt5142
Collaborator

gzt5142 commented Apr 19, 2023

Agree conceptually... a pg_dump (or similar) of a developer's minimum load would be useful, along with the fully populated version of the same tables in a more production-like system, or production itself. I put some comments in another issue that I now see should have gone here. So in summary, dev needs:

  • The full crawler_source table
  • The full nhdplus dataset
  • A full feature table for one of the smallest sources
  • A minimal feature table for the other sources

The NHD pieces are needed to ensure that ingested features will be spatially matched to a comid from NHD. With a subset, we risk not getting a match for newly ingested features.
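
For the minimal feature tables, one way a trimmed copy could be produced before dumping; a sketch only, and the source and subset table names are guesses, not the actual schema:

    # Copy a small sample of features from a large source into a demo-sized table
    psql --dbname=nldi --username=nldi_schema_owner -c "
        CREATE TABLE nldi_data.feature_demo_subset AS
        SELECT * FROM nldi_data.feature_large_source LIMIT 100;"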

@dblodgett-usgs
Member Author

Right -- so the dump that contains several tables could accommodate the variation in requirements you have in your dot points. I'll at least get this roughed in as a pattern so we can iterate on the details.

dblodgett-usgs self-assigned this May 8, 2023
@dblodgett-usgs
Member Author

Should be good with #112
