
Commit acdab76

Documentation (#20)
* added readme and cleaned requirements
* linted
* updating docs
* added publish.md
* added rabbit mq docs
* added post processing docs
* added the last of the docs/ files
* added the rest of the README.mds
* updated compose
1 parent 28638c5 commit acdab76

28 files changed: +307 -35 lines

Source/RnR/Dockerfile.docs

-14
This file was deleted.

Source/RnR/INSTALL.md

-3
This file was deleted.

Source/RnR/README.md

+1 -1

@@ -9,7 +9,7 @@ Key Features of Replace and Route:
- Uses docker compose to manage architecture
- Uses a jupyter-notebook to view and manage code through a container

-<img src="docs/API_spec.png" alt="isolated" width="750"/>
+<img src="docs/photos/API_spec.png" alt="isolated" width="750"/>

## Installation

Source/RnR/compose.yaml

-1
@@ -68,7 +68,6 @@ services:
    build:
      context: .
      dockerfile: Dockerfile.app
-   # image: ghcr.io/taddyb33/hydrovis/rnr:0.0.1
    volumes:
      - type: bind
        source: ./data

Source/RnR/data/README.md

+17 -1

@@ -1,3 +1,19 @@
# Data dir

-This dir's purpose is to store data files generated by replace and route
+This dir's purpose is to store inputs and data files generated by replace and route. The folder structure is as follows:

```
- plots/
  - All hydrograph plots generated by RnR
- logs/
  - Internal service logs for the publisher app and consumer
- replace_and_route/
  - Post-processed outputs for replace and route
- rfc_channel_forcings/
  - T-Route inputs generated from RFC forecasts
- rfc_geopackage_data/
  - T-Route gpkg files containing spatial domain information for each RFC point
- troute_output/
  - T-Route output .nc files
- troute_restart/
  - Restart files generated by T-Route for warmstarts
```

Source/RnR/docs/assimilation.md

+16
# Assimilation

Based on the PWS requirements (2.3.4.1.4), we must assimilate forecasts that have flooding:

```
Assimilate RFC forecasts that have values within the forecast horizon that are at or above
flood stage as defined by the local NWS field offices. RFC forecasts shall be assimilated
upstream of the RFC forecast location at a distance greater than 2 miles, but not to exceed 5
miles
```

To account for this, we insert flow at the catchment directly upstream of the RFC point, using the enterprise hydrofabric connectivity. Since there is a many-to-one relationship between catchments and nexus points, we only need the catchment directly upstream. A minimal sketch of this connectivity lookup is shown after the image below.

Below is an image of the assimilation in action. The grey catchments are areas where there is no flow, and the green catchments are areas where there is flow.

![alt text](photos/assimilation.png)
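
For illustration only, here is a minimal sketch of that many-to-one connectivity lookup. The catchment and nexus IDs below are made up, and the real implementation reads this relationship from the enterprise hydrofabric rather than a hard-coded mapping.

```python
# Purely illustrative: IDs are made up, and the real connectivity comes from
# the enterprise hydrofabric, not a hard-coded dict.
to_nexus = {"cat-101": "nex-9", "cat-102": "nex-9", "cat-103": "nex-7"}


def upstream_catchments(nexus_id: str) -> list[str]:
    """Catchments that drain directly to the nexus of an RFC point."""
    return [cat for cat, nex in to_nexus.items() if nex == nexus_id]


# Forecast flow is inserted at the catchment(s) directly upstream of the point
print(upstream_catchments("nex-9"))  # ['cat-101', 'cat-102']
```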

Source/RnR/docs/common-errors.md

+18
# Common Errors:

## NoDownstreamLID Error

If there is no downstream RFC point detected, we cannot determine how far to route the flow.

<img src="photos/error_example_1.png" alt="isolated" width="750"/>

## NoForecastError

If the NWPS response is empty, we can't route flow.

<img src="photos/error_example_2.png" alt="isolated" width="750"/>

## LID not detected

Since we are using a mock database with only 170 of the 3000+ RFC points, it is very common for a downstream RFC point to be missing from the mock database. Thus, when running RnR with docker compose, you will often see a Pydantic `ValidationError` for a LID that was not detected (a `None` where there should be downstream RFC data).

Source/RnR/docs/consumer.md

+44
# Consumer

![alt text](photos/consumer.png)

## What is its job?

The data consumer pulls messages from the message queue and runs a series of microservices on the message's JSON body:
1. Reads in the message
2. Processes the forecast into T-Route inputs
3. Determines the HYFeatures ID
4. Runs T-Route
5. Post-processes and plots the data

## How is this accessed

The consumer is an asynchronous task that is spun up by docker compose and awaits messages from the queue (a minimal sketch of the consumer loop follows the compose entry below):

```yaml
consumer:
  build:
    context: .
    dockerfile: Dockerfile.app
  restart: always
  volumes:
    - type: bind
      source: ./data
      target: /app/data
  environment:
    - PIKA_URL=rabbitmq
    - RABBITMQ_HOST=rabbitmq
    - SQLALCHEMY_DATABASE_URL=postgresql://{}:{}@{}/{}
    - DB_HOST=mock_db
    - REDIS_URL=redis
    - SUBSET_URL=http://hfsubset:8000/api/v1
    - TROUTE_URL=http://troute:8000/api/v1
  command: sh -c ". /app/.venv/bin/activate && python src/rnr/app/consumer_manager.py"
  depends_on:
    redis:
      condition: service_started
    troute:
      condition: service_healthy
    rabbitmq:
      condition: service_healthy
```
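
For reference, here is a minimal sketch of what such a consumer loop could look like with `aio_pika`. The queue name and the dispatch step are illustrative assumptions, not the exact code in `consumer_manager.py`.

```python
# A minimal sketch of an asynchronous queue consumer; queue name and
# processing steps are illustrative, not the exact RnR implementation.
import asyncio
import json

import aio_pika


async def main() -> None:
    # Connect to the broker named by the RABBITMQ_HOST environment variable
    connection = await aio_pika.connect_robust(host="rabbitmq")
    async with connection:
        channel = await connection.channel()
        queue = await channel.declare_queue("priority_queue", durable=True)  # hypothetical name
        async with queue.iterator() as messages:
            async for message in messages:
                async with message.process():  # acks on success
                    body = json.loads(message.body)
                    # Steps 1-5: build T-Route inputs, find the HYFeatures ID,
                    # run T-Route, then post-process and plot (omitted here)
                    print(f"received forecast for LID {body.get('lid')}")


if __name__ == "__main__":
    asyncio.run(main())
```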

Source/RnR/docs/docker-compose.md

Whitespace-only changes.

Source/RnR/docs/front-end.md

+5
# Front end Web Services

A web frontend will be set up as part of the Docker compose process, with the endpoint /frontend/v1/plot/, which can be accessed in a browser. This allows searching of all the PNG plot files and the Replace and Route NetCDF data generated by the API. First a LID needs to be selected. Then any files that fall within the date range selected by the user will display on screen, in a tabbed interface grouping them by feature ID, and then by forecast date. The search can be narrowed down to any date range desired (though result sets may take significantly longer to load for large date ranges or for LIDs with forecasts for numerous feature IDs). There is also a "Download Zip" option that will take the currently displayed result set and package it in a Zip file for the end user.

![alt text](photos/front_end.png)

Source/RnR/docs/hydrofabric-usage.md

+33
# Hydrofabric usage

We are using flat files generated by the HFsubset endpoint prior to RnR being run. The HFsubset endpoint, similar to T-Route, is managed by the following compose entry:

```yaml
hfsubset:
  image: ghcr.io/taddyb33/hfsubset-legacy:0.0.4
  ports:
    - "8008:8000"
  volumes:
    - type: bind
      source: ./data/rfc_geopackage_data
      target: /app/data
  command: sh -c ". /app/.venv/bin/activate && uvicorn src.hfsubset.app.main:app --host 0.0.0.0 --port 8000"
  healthcheck:
    test: curl --fail -I http://localhost:8000/health || exit 1
    interval: 30s
    timeout: 5s
    retries: 3
    start_period: 5s
```

## What is its job?

HFsubset is used to create subsets of the hydrofabric v20.1 geopackages that end at a specified location. We make both upstream and downstream RFC subsets, then use them for determining catchments of origin, connectivity, etc. These flat files are fed to T-Route using a shared volume.

## How is this accessed

Subsets are generated by the following endpoints, documented at localhost:8008/docs (an example request is sketched at the end of this section):
- /api/v1/subset/
- /api/v1/downstream

If you want the whole dataset, you can ping the following endpoint from localhost:8000/docs:
- /api/v1/rfc/build_rfc_geopackages
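
Here is a hedged example of calling the subset endpoint from Python. The HTTP method and the query parameter name are assumptions; check localhost:8008/docs for the actual signature.

```python
# Hedged example: the GET method and the "lid" parameter name are assumptions;
# see localhost:8008/docs for the real endpoint signature.
import requests

resp = requests.get(
    "http://localhost:8008/api/v1/subset/",
    params={"lid": "CAGM7"},  # hypothetical parameter name
    timeout=60,
)
resp.raise_for_status()
print(resp.json())
```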

Source/RnR/docs/iac.md

+3
# IaC

For all information regarding IaC, visit `Source/RnR/terraform` and read its README.md file.
+23
# Message Broker and Cache

The message broker takes message bodies from the Publisher, sorts them, and posts them to the appropriate queue. Redis caching is used to make sure we only run the workflow when there are new forecasts.

![alt text](photos/message_broker_and_cache.png)

## What is its job?

Hold messages in a queue based on priority so the consumer can successfully route them, or cache data that has already been routed. There are three queues:
1. Priority:
    - For locations that are experiencing flooding
2. Base:
    - For all other locations
3. Error:
    - For all locations that cause errors / trigger exceptions

## How is this accessed

The publisher calls the message broker internally; the same applies to caching. A minimal sketch of the caching check is shown at the end of this section.

## What is the port?
- To view the RabbitMQ management portal, go to localhost:15672

![alt text](photos/rabbit_mq.png)
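
To make the caching idea concrete, here is a minimal sketch using the redis-py client. The key format and expiry are illustrative assumptions, not RnR's actual cache schema.

```python
# Illustrative caching check; the key format and expiry are assumptions.
import redis

cache = redis.Redis(host="redis", port=6379)


def is_new_forecast(lid: str, issued_time: str) -> bool:
    """True only the first time a given forecast issuance is seen."""
    key = f"forecast:{lid}:{issued_time}"  # hypothetical key format
    # nx=True stores the key only if it does not already exist, so repeat
    # forecasts return None here and the publisher can skip them
    return bool(cache.set(key, 1, nx=True, ex=86_400))
```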
576 KB (file name not captured in this view): File renamed without changes.

Source/RnR/docs/photos/consumer.png

416 KB (binary image)

Binary images added: 413 KB, 317 KB, and 331 KB (file names not captured in this view).

Source/RnR/docs/photos/front_end.png

233 KB (binary image)

Source/RnR/docs/photos/rabbit_mq.png

186 KB (binary image)

Source/RnR/docs/post-processing.md

+32
# Post Processing

Post processing is performed on all T-Route outputs to include required metadata.

## What metadata is added:

In addition to what is generated by T-Route, and the computed stage, we add the following metadata (here `ds` is the output dataset and `json_data` holds the forecast and location metadata for the RFC point):

```python
ds = ds.assign_attrs(assimilated_rfc_point=assimilated_point)
ds = ds.assign_attrs(
    observed_flood_status=json_data["status"]["observed"]["floodCategory"]
)
ds = ds.assign_attrs(
    forecasted_flood_status=json_data["status"]["forecast"]["floodCategory"]
)
ds = ds.assign_attrs(RFC_location_id=json_data["lid"])
ds = ds.assign_attrs(upstream_RFC_location_id=json_data["upstream_lid"])
ds = ds.assign_attrs(downstream_RFC_location_id=json_data["downstream_lid"])
ds = ds.assign_attrs(RFC=json_data["rfc"]["abbreviation"])
ds = ds.assign_attrs(WFO=json_data["wfo"]["abbreviation"])
ds = ds.assign_attrs(USGS=json_data["usgs_id"])
ds = ds.assign_attrs(county=json_data["county"])
ds = ds.assign_attrs(state=json_data["state"]["abbreviation"])
ds = ds.assign_attrs(Latitude=json_data["latitude"])
ds = ds.assign_attrs(Longitude=json_data["longitude"])
ds = ds.assign_attrs(Last_Forecast_Time=json_data["times"][stage_idx])
```

## How is this accessed

The consumer app calls the post processing when running the ReplaceAndRoute service.
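
As a quick check, the added attributes can be read back from a post-processed file with xarray; the file path below is hypothetical.

```python
# The attribute names match the assign_attrs calls above; the path is hypothetical.
import xarray as xr

ds = xr.open_dataset("data/replace_and_route/example_rnr_output.nc")
print(ds.attrs["RFC_location_id"], ds.attrs["forecasted_flood_status"])
```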

Source/RnR/docs/publisher.md

+24
# Publisher

The data publisher section of the RnR event-driven architecture is shown below.

![alt text](photos/data_publisher.png)

## What is its job?

This application is spun up by docker compose and is tasked with:
1. Requesting RFC information from a DB
2. Pulling forecasts from the RFC points
3. Formatting the forecasts into a JSON message body
4. Posting the forecasts to the RabbitMQ message queue to be processed by the consumer

A minimal sketch of the publishing step is shown at the end of this section.

## How is this accessed

The publisher is pinged by the following localhost endpoints:
- /api/v1/publish/start
    - Runs the publish endpoint for all RFCs
- /api/v1/publish/{lid}
    - Runs the publish endpoint for a specific RFC location ID

## What is the port?
- localhost:8000/docs
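
For reference, here is a minimal sketch of the publishing step using pika (suggested by the PIKA_URL setting). The queue name and message fields are illustrative assumptions rather than the exact ones RnR uses.

```python
# Illustrative publish step; queue name and message body are assumptions.
import json

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(host="rabbitmq"))
channel = connection.channel()
channel.queue_declare(queue="priority_queue", durable=True)  # hypothetical name

message = {"lid": "CAGM7", "status": "flooding"}  # illustrative body
channel.basic_publish(
    exchange="",
    routing_key="priority_queue",
    body=json.dumps(message),
    properties=pika.BasicProperties(delivery_mode=2),  # persist the message
)
connection.close()
```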

Source/RnR/docs/t-route-usage.md

+52
# T-Route Usage

Our docker compose file uses a prebuilt docker image of T-Route to call T-Route as a service for river routing:

```yaml
troute:
  image: ghcr.io/taddyb33/t-route-dev:0.0.2
  ports:
    - "8004:8000"
  volumes:
    - type: bind
      source: ./data/troute_output
      target: /t-route/output
      bind:
        selinux: z
    - type: bind
      source: ./data
      target: /t-route/data
      bind:
        selinux: z
  command: sh -c ". /t-route/.venv/bin/activate && uvicorn app.main:app --host 0.0.0.0 --port 8000"
  healthcheck:
    test: curl --fail -I http://localhost:8000/health || exit 1
    interval: 30s
    timeout: 5s
    retries: 3
    start_period: 5s
```

This pulls the t-route-dev image from the `ghcr.io/taddyb33/` container registry.
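
A quick way to confirm the service is reachable from the host is sketched below, assuming the requests library; port 8004 maps to the container's 8000, and /health is the path used by the healthcheck above.

```python
# Uses the host port mapping (8004 -> 8000) and the /health path from the
# compose healthcheck above.
import requests

resp = requests.get("http://localhost:8004/health", timeout=5)
print(resp.status_code)  # 200 once the container reports healthy
```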
## Why an API?

T-Route is used in many contexts for hydrological river routing:
- NGEN
- Scientific Python
- Replace and Route (RnR)

In the latest PR for RnR, there is a requirement to run T-Route as a service. This service requires an easy way to dynamically create config files, restart flow from initial conditions, and run T-Route. To satisfy this requirement, a FastAPI endpoint was created in `/src/app` along with code to dynamically create T-Route configs.

## Why use shared volumes?

Since T-Route is running in a docker container, there has to be a connection between the directories on your machine and the directories within the container. We're sharing the following folders by default:
- `data/rfc_channel_forcings`
    - For storing RnR RFC channel domain forcing files (T-Route inputs)
- `data/rfc_geopackage_data`
    - For storing HYFeatures gpkg files
    - Indexed by the NHD COMID, also called hf_id. Ex: 2930769 is the hf_id for the CAGM7 RFC forecast point.
- `data/troute_restart`
    - For storing T-Route restart files
- `data/troute_output`
    - For outputting results from the T-Route container

Source/RnR/mock_db/README.md

+19
# Mock DB:
- This dir contains information from PI-2 to set up a mock database to read RFC information from.
- The `rnr_schema.dump` file is missing from the repo as it is too large for Git.

To create the mock DB, you can run
`docker build -t mock_db -f Dockerfile.mock_db .`

or you can reference the GitHub container registry image, similar to how compose does it:
```yaml
mock_db:
  image: ghcr.io/taddyb33/hydrovis/mock_database:0.0.1
  environment:
    - POSTGRES_PASSWORD=pass123
    - POSTGRES_USER=postgres
    - POSTGRES_DB=vizprocessing
  ports:
    - "5432:5432"
```
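
As a sanity check, you can connect to the running mock DB from Python with SQLAlchemy using the credentials in the compose entry above; listing the tables is just a quick way to confirm the schema dump loaded (requires a Postgres driver such as psycopg2).

```python
# Connects with the compose credentials above (user/password/db from the
# environment section); host and port follow the "5432:5432" mapping.
from sqlalchemy import create_engine, inspect

engine = create_engine("postgresql://postgres:pass123@localhost:5432/vizprocessing")
print(inspect(engine).get_table_names())  # confirms the dump was loaded
```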

Source/RnR/terraform/README.md

+4 -2

@@ -176,7 +176,7 @@ A security group is created to allow SSH access and open the necessary ports for
The EC2 instance is configured via the instance user_data with necessary software including Docker, Docker Compose, and the AWS CLI. It will automatically clone the RnR application code, sync data from S3, and start the services defined in the Docker Compose configuration via a systemd service called rnr-app.

Example of checking the status via systemd. The app(s) can be stopped and started similarly via systemctl stop and start.
-```sh
+```shell
systemctl status rnr-app

● rnr-app.service - Docker Compose Application

@@ -196,4 +196,6 @@ Aug 29 20:38:29 ip-10-6-0-139.ngwpc.com docker[39123]: Container rnr-app-1 Started
Aug 29 20:38:30 ip-10-6-0-139.ngwpc.com docker[39123]: Container rnr-jupyterlab-1 Started
Aug 29 20:38:30 ip-10-6-0-139.ngwpc.com docker[39123]: Container rnr-consumer-1 Started
Aug 29 20:38:30 ip-10-6-0-139.ngwpc.com docker[39123]: Container rnr-app-1 Started
-```sh
+```
+
+*Note:* If you would like to see the files generated by RnR, go to the `/app/hydrovis/Source/RnR/data/` directory on the EC2 instance. `/app/hydrovis/Source/RnR` is the location of all RnR code delivered as IaC.
