This is an EPIC card. As items from this list are addressed, their active cards will be linked.
This is an evolving document. Expect many changes over the next few weeks.
Overall Design / Tasks
Source Data -> Process & Prepare Data -> Publish Services
Learning from previous Ras2Fim processing
Phase 1: Start with example dataset
Step 3: Source Code Analysis
Step 4: Processing, data integration, and loading for Viz.
Step 5: Publish and run various tests; not public, internal use only.
Feedback from leadership
Feedback from end-users
After Phase 1
Scale the amount of data (1/3 of total)
Fine-tune hardware and software
Increase the amount of data (2/3 of total)
Fine-tune hardware and software
Full scale testing
DEV(TI) -> UAT -> PRD
[1] Sensitive Information Locations [WILL NOT be visible in GitHub]
Secret manager
Google Drive
[2] Infrastructure Setup for Development Environment
S3 Bucket(s)
S3 bucket created for development
Folder naming conventions
Optimizing Performance
Measuring Performance
EC2 Instance(s) for design evaluation and testing
Windows
[In progress] Linux for Viz and FIM dev. Multiple EC2s of different sizes for benchmark analysis (a simple S3 throughput benchmark sketch follows at the end of this section).
[In progress] Optimizing Performance (low volume)
[In progress] Measuring Performance (low volume)
Folder structure conventions
Data
Measuring
Optimizing
Software
Software conventions
Install Locations
Code version control (likely in the HydroVIS GitHub; code versioning is not currently in use for HydroVIS)
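For the low-volume performance measurements noted above, a crude read-throughput benchmark run on each candidate EC2 size may be enough to start. The sketch below assumes boto3; the bucket name and object key are placeholders, not real locations.

```python
"""Sketch: crude S3 read-throughput benchmark for comparing EC2 instance sizes.
The bucket and key below are placeholders; run the same script on each candidate instance."""
import time
import boto3

def measure_download(bucket, key, local_path="/tmp/bench.bin"):
    """Download one object and report elapsed seconds and MiB/s."""
    s3 = boto3.client("s3")
    size_bytes = s3.head_object(Bucket=bucket, Key=key)["ContentLength"]
    start = time.perf_counter()
    s3.download_file(bucket, key, local_path)
    elapsed = time.perf_counter() - start
    return elapsed, (size_bytes / (1024 * 1024)) / elapsed

# elapsed, rate = measure_download("hv-fim-dev", "benchmarks/sample_1gb.bin")  # hypothetical bucket/key
# print(f"{elapsed:.1f} s, {rate:.1f} MiB/s")
```

The same pattern can be repeated for uploads and local disk reads to cover the disk, memory, and network items above.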
[3] Source Data
*** Maybe change: this section is more about tool / data analysis and should perhaps be pulled out to another card?
Heidi Safa has been evaluating and analyzing Ripple data.
Heidi is copying example datasets into the S3 bucket
[in progress] Deciphering which data we need (can we pre-filter some of it? Maybe a subfolder for HV consumption, with the rest kept for debugging?)
Download all Dewberry FIM_30 Ripple model files and folders to the FIM/HV S3 buckets. Note: 485 models available; total size could be over 1 TiB, numbers unconfirmed, i.e. determine volumes.
Build a script to pull all or filtered data from RTX to the HV S3 buckets (a sketch follows at the end of this section). While it is possible to call their buckets remotely, Rob strongly advises against it for multiple reasons, including: permissions, moving to other environments (UAT, PROD across multiple regions), and pre-processing if required.
[in progress] Investigate gaps in data reaches. Note: a full replacement dataset from RTX is coming, including re-adding FIM_10 data that was missed in current releases.
Evaluate an internal versioning system to reflect which version of a dataset we receive. We may not want to automatically tie it to the Ripple public release name of "Ripple 3.0". Maybe use subfolders in our S3 to distinguish differences, e.g. the current FIM_30 version, the replacement FIM_30 dataset coming, and the FIM_60 dataset coming. Internal dataset name/number convention TBD.
Example to start with:
ble_12030106_EastForkTrinity
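A minimal sketch of what the RTX-to-HV pull script above could look like, assuming boto3 and a server-side copy. All bucket names, prefixes, and the suffix filter are hypothetical placeholders; the pre-filter logic would be replaced by whatever is decided above.

```python
"""Sketch: copy selected Ripple model objects from the RTX bucket into an HV bucket.
All bucket names, prefixes, and filters below are hypothetical placeholders."""
import boto3

s3 = boto3.client("s3")

SRC_BUCKET = "rtx-ripple-models"   # placeholder source bucket
DST_BUCKET = "hv-fim-dev"          # placeholder HV dev bucket
SRC_PREFIX = "fim_30/"             # placeholder prefix for the FIM_30 release
KEEP_SUFFIXES = (".gpkg", ".csv", ".json", ".tif")  # placeholder pre-filter

def copy_filtered(src_bucket, src_prefix, dst_bucket, dst_prefix="ripple/fim_30/"):
    """Walk the source prefix and server-side copy only the files we think HV needs."""
    paginator = s3.get_paginator("list_objects_v2")
    copied = 0
    for page in paginator.paginate(Bucket=src_bucket, Prefix=src_prefix):
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if not key.endswith(KEEP_SUFFIXES):
                continue  # skip files outside the pre-filter
            dst_key = dst_prefix + key[len(src_prefix):]
            s3.copy({"Bucket": src_bucket, "Key": key}, dst_bucket, dst_key)
            copied += 1
    return copied

if __name__ == "__main__":
    print(f"copied {copy_filtered(SRC_BUCKET, SRC_PREFIX, DST_BUCKET)} objects")
```

Server-side copy avoids routing the data through a local machine, but the cross-account permissions concern noted above still applies.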
[4A] Process Data
Part One: Data flow with small sample (i.e. one or two HUCs)
[in progress] Access to the flows2fim.exe software in Windows/Linux environments. Maybe in the Lambdas?
Strategy for choosing between mip and ble datasets when both are available.
Lookup strategy for picking stage, extent, and depths.
flows2fim.exe controls
flows2fim.exe fim -lib EXTENT
flows2fim.exe fim -lib DEPTH (HOLD: Scope for this project does not include depth tasks).
Evaluate Cross Walk strategies (this is an idea)
Geometry processing and partitioning.
Benchmark
Disk Speeds
Memory Usage
Network Usage
Disk Size
Should this part be a sub-section somewhere or its own card? Not sure. It is mostly a part of the HV integration, but we can use a py command-line twin of it for debugging / developing.
Develop Misc tools
(Rob) "search by HUC" code block for HydroVIS - Create a HUC S3 search code (not a tool) that can take a HUC number in, and pull down from an S3 bucket, just the files and folders required for processing or paths. Can optionally get just s3 paths or download files or both. Needed for HV code, but a variant of it for FIM. Basic py code already exists in ras2fim and can be ported and adjusted. It has a S3 wildcard system in it.
(Rob) "search by HUC" for FIM: tool for standard command line use for FIM debugging / Testing. Same as HV system in logic, using the ras2fim S3 wildcard search system.
Part Two: Data flow upscaling
Lessons learnt from part one
Dynamic Cross Walk to handle a ~75x increase over the previous ras2fim data volume? TBD
Apply upscaling to resources
Performance and processing analysis, and scale with more HUCs.
Windows
Relying on ESRI tools for processing (TBD)
Ubuntu
Using QGIS and open source for processing (TBD)
[In Discussion] Convert Raster datasets to Polygon datasets (see the conversion sketch after this list)
Benchmark
Disk Speeds
Memory Usage
Network Usage
Disk Size
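If the raster-to-polygon conversion under discussion above goes the open-source route, the core step could look like the sketch below using rasterio and shapely rather than QGIS directly. The input path and the assumption that inundated cells carry a value of 1 are placeholders.

```python
"""Sketch: convert a FIM extent raster to a single (multi)polygon with rasterio + shapely.
Assumes inundated cells have a value of 1; adjust to the real raster encoding."""
import rasterio
from rasterio.features import shapes
from shapely.geometry import shape
from shapely.ops import unary_union

def extent_raster_to_polygon(tif_path):
    """Vectorize all inundated cells and dissolve them into one geometry."""
    with rasterio.open(tif_path) as src:
        band = src.read(1).astype("uint8")  # shapes() needs an integer/float32 dtype; wet = 1 assumed
        mask = band == 1
        polys = [
            shape(geom)
            for geom, value in shapes(band, mask=mask, transform=src.transform)
            if value == 1
        ]
    return unary_union(polys)

# poly = extent_raster_to_polygon("ble_12030106_EastForkTrinity_extent.tif")  # hypothetical file
```

Benchmarking this step against the ESRI path on the same HUCs would feed the disk, memory, and network numbers above.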
[4B] Loading Static data
Ripple dataset at 1-3 releases per year.
Get performance ready for large volumes of data uploaded and available for dynamic processing by HV, interacting with HAND data. FIM_60 next fall?
Make a FIM-to-HV deployment tool. It can look through multiple source Ripple model folders and pull out just the folders/files that need to be sent to the HV deployment bucket for automated processing. TBD... might not be needed, depending on pre-filtering or additional processing between Ripple and HV integration.
[5] Publish Data
Part One: Testing workflows to process data for publication
Lambda tests??
(Note: the items below are TBD as part of the integration design process evaluations)
Windows
Relying on ESRI tools for processing
Ubuntu
Using QGIS and open source for processing
Convert Raster datasets to Polygon datasets
Benchmark
Disk Speeds
Memory Usage
Network Usage
Disk Size
Note: We are talking to RTX about pre-building CSVs with vectors in them so we don't have to convert the TIFs to extents. TBD (a loading sketch follows below).
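If RTX does pre-build CSVs with vector geometries as noted above, loading them could be as simple as the sketch below. The column names, WKT encoding, and CRS are assumptions until the format is agreed.

```python
"""Sketch: load a hypothetical pre-built extents CSV (geometry stored as WKT) into a GeoDataFrame.
Column names ('geometry_wkt') and CRS are assumptions pending the agreed RTX format."""
import geopandas as gpd
import pandas as pd
from shapely import wkt

def load_extents_csv(csv_path, crs="EPSG:4326"):
    """Parse WKT strings into shapely geometries and wrap the table as a GeoDataFrame."""
    df = pd.read_csv(csv_path)
    df["geometry"] = df["geometry_wkt"].apply(wkt.loads)
    return gpd.GeoDataFrame(df.drop(columns=["geometry_wkt"]), geometry="geometry", crs=crs)

# gdf = load_extents_csv("ble_12030106_extents.csv")  # hypothetical file
```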
Part Two: Testing with small samples (i.e. one or two HUCs)
Lessons learnt from part one
Part Three: Scaling all available Ripple data.
Lessons learnt from part two
Integration for Ripple data to HV changes
This could cover all steps required to run the system. Now that we have an idea of what we have in flows2fim.exe and its inputs, and we know we want to do some Lambda steps, we are getting an idea of possible HV integration steps. Some of the HV steps are already pre-determined, such as morphing ras2fim code into Ripple code.
Ripple Boundary Service
The current system has a separate service called ras2fim Boundaries. It is simply the WBD HUC8 boundaries for each HUC8 that has some ras2fim (and soon Ripple) data in it.
i.e. current ras2fim v2 example:
1066: Ripple Task Details: Ripple Boundary Service: building out the Ripple Boundary service (replacing the ras2fim boundary service)
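A minimal sketch of how the replacement Ripple boundary layer could be assembled, assuming GeoPandas, a local WBD HUC8 layer, and a list of HUC8s known to have Ripple data. All paths, layer names, and field names are placeholders, not the actual HydroVIS service build.

```python
"""Sketch: build the Ripple boundary layer by filtering WBD HUC8 polygons
down to the HUC8s that actually have Ripple data. Paths and field names are assumptions."""
import geopandas as gpd

def build_ripple_boundaries(wbd_path, hucs_with_data, huc_field="HUC8"):
    """Return only the WBD HUC8 polygons whose HUC8 code has Ripple data."""
    wbd = gpd.read_file(wbd_path)
    return wbd[wbd[huc_field].isin(set(hucs_with_data))].copy()

# boundaries = build_ripple_boundaries("WBD_National.gpkg", ["12030106"])  # hypothetical inputs
# boundaries.to_file("ripple_boundaries.gpkg", driver="GPKG")
```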