Extended Loading Time for Mainstem Data from CSV #111
Comments
This is because it is downloading and loading the national table. I believe it's grabbing this file: https://code.usgs.gov/wma/nhgf/reference-hydrofabric/-/raw/main/workspace/data/mainstem_lookup.csv.gz?inline=false Would it be helpful to check in one for the demo database there, or should we do it elsewhere?
I believe the file is downloaded as a .gz during the docker build, so the length of this step is just the time to import mainstem_lookup.csv. What confuses me is that this step isn't indexing any columns in the mainstem_lookup table. Docker Desktop now has a handy memory plotter; this is what I see while starting nldi-db:demo from scratch
This becomes somewhat a question of ensuring we establish a clear understanding of the different flavors of nldi-db being published (re: #100), because a Yahara CSV of mainstems would be a lot smaller, and faster to load, than the full-scale mainstem lookup table.
@dblodgett-usgs For the demo database at least, I think a smaller subset of mainstem_lookup.csv makes sense. The demo database is supposed to work out of the box, so I think it is fine to only have the subset of comids that are in the Yahara basin. Based on our conversations about the future controls put on the NLDI Crawler, this would mean the demo database would only index data from Yahara. This SQL returns 192 rows and could easily be put into an artifact for the demo database Dockerfile
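As an illustration of the subsetting idea (this is not the SQL mentioned above), here is a minimal Python sketch that filters a gzipped lookup CSV down to a given set of IDs. The function name `subset_lookup`, the key column name `comid`, and the file paths are assumptions made for the example; the real column names in mainstem_lookup.csv may differ.

```python
import csv
import gzip
from typing import Iterable


def subset_lookup(src_gz: str, dst_csv: str, keep_ids: Iterable[str],
                  key_column: str = "comid") -> int:
    """Copy rows whose key_column value is in keep_ids; return rows written.

    key_column="comid" is an assumed header name, not confirmed by the thread.
    """
    keep = {str(i) for i in keep_ids}
    written = 0
    # Stream the gzipped source row by row so the full national table
    # never has to be held in memory.
    with gzip.open(src_gz, "rt", newline="") as src, \
            open(dst_csv, "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            if row[key_column] in keep:
                writer.writerow(row)
                written += 1
    return written
```

Calling something like `subset_lookup("mainstem_lookup.csv.gz", "mainstem_lookup_yahara.csv", yahara_ids)` (with `yahara_ids` being a hypothetical list of Yahara basin comids) would produce a small file that could be shipped as the demo artifact.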
Nice!
I will create artifacts for the release then!
Description:
Issue Summary:
The loading process for the mainstem data from a CSV file is taking an unusually long time, significantly impacting overall system performance. This issue aims to investigate and optimize the loading time to improve the efficiency of the process.
Steps to Reproduce:
Expected Behavior:
The loading of the mainstem data for the demo should be completed within a reasonable time frame, similar to other data loading processes.
Actual Behavior:
The loading of the mainstem data from the CSV file is taking a considerable amount of time, as indicated in the following logs:
As seen, the loading process took approximately 218 seconds.