Commit 26e0773

Merge pull request #2 from eoap/develop
Develop
2 parents b867915 + bd1b2f7 commit 26e0773

13 files changed

+908
-45
lines changed

.github/workflows/docs.yaml

+1-1
Original file line numberDiff line numberDiff line change
@@ -22,5 +22,5 @@ jobs:
2222
uses: actions/setup-python@v2
2323
with:
2424
python-version: 3.x
25-
- run: pip install mkdocs-material mkdocs-mermaid2-plugin
25+
- run: pip install mkdocs-material mkdocs-mermaid2-plugin mkdocs_puml
2626
- run: mkdocs gh-deploy --force

.gitignore

+3
@@ -0,0 +1,3 @@
1+
.project
2+
no-commit/*
3+
*.TIF

docs/argo-events.md

+76
@@ -0,0 +1,76 @@
1+
# Argo Events
2+
3+
## Introduction
4+
5+
Argo Events provides a Kubernetes-native event-driven automation framework.
6+
7+
In this setup, it is used to trigger workflows based on events generated by Redis.
8+
9+
This section explores the key components of Argo Events: the Jetstream event bus, the Redis event source, and the event sensor that drives the execution of the water bodies detection pipeline.
10+
11+
## Key Features of Argo Events
12+
13+
* Event-Driven Execution: Automates workflows in response to real-time events.
14+
* Scalable Architecture: Decouples event sources from workflow execution.
15+
* Modularity: Supports multiple event sources, including HTTP, Redis, and more.
16+
* Native Integration: Seamlessly integrates with Argo Workflows to trigger pipelines.
17+
18+
## Components of Argo Events
19+
20+
1. **Event Bus**
21+
22+
The event bus serves as the messaging backbone, enabling communication between event sources and sensors. In this setup, a Jetstream event bus is used for reliable, high-performance event delivery.
23+
24+
Why Jetstream?
25+
26+
* High throughput and reliability.
27+
* Built-in message persistence for fault tolerance.
28+
29+
2. **Event Source**
30+
31+
The event source monitors Redis for changes and forwards relevant events to the event bus.
32+
33+
Redis simulates incoming events by querying a STAC endpoint and publishing results.
34+
35+
Key Features:
36+
37+
* Channel-Based Filtering: Listens to specific channels for relevant events.
38+
* Secure Connections: Supports secret-based password authentication.
39+
40+
3. **Event Sensor**
41+
42+
The event sensor listens for events on the Jetstream event bus and triggers the execution of workflows when criteria are met.
43+
44+
Key Features:
45+
46+
* Dependencies: Defines the event source and type to listen for.
47+
* Trigger Templates: Configures the workflow to be executed upon an event.
48+
49+
## Event Flow
50+
51+
1. Event Generation:
52+
53+
Redis generates events by querying a STAC endpoint for new geospatial data.
54+
55+
Events are published to the stac-events channel.
56+
57+
2. Event Source Handling:
58+
59+
The Redis event source monitors the stac-events channel and forwards messages to the Jetstream event bus.
60+
61+
3. Sensor Activation:
62+
63+
The sensor listens to the Jetstream event bus.
64+
65+
Upon receiving an event, it triggers the water-bodies-detection workflow.
66+
67+
4. Workflow Execution:
68+
69+
Argo Workflows orchestrates the pipeline to process geospatial data and detect water bodies.
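The four steps above can be sketched as a small in-memory simulation. This is only an illustration of the hand-offs, not how Argo Events is implemented: a `queue.Queue` stands in for the Jetstream event bus, the `href` field and the example URL are hypothetical, and `trigger_workflow` is a placeholder for the real workflow submission.

```python
import json
import queue

CHANNEL = "stac-events"  # channel name taken from the text above


def event_source(channel_messages, bus):
    """Forward every message seen on the Redis channel to the event bus."""
    for raw in channel_messages:
        bus.put({"source": CHANNEL, "body": json.loads(raw)})


def sensor(bus, trigger):
    """Drain the bus and call the trigger once per event."""
    triggered = []
    while not bus.empty():
        triggered.append(trigger(bus.get()))
    return triggered


def trigger_workflow(event):
    # Placeholder for submitting the water-bodies-detection workflow.
    return {"workflow": "water-bodies-detection", "input": event["body"]["href"]}


bus = queue.Queue()  # stands in for the Jetstream event bus
messages = [json.dumps({"href": "https://example.com/stac/items/item-1"})]
event_source(messages, bus)
runs = sensor(bus, trigger_workflow)
print(runs)
```

The event source and the sensor never call each other directly; they only share the bus, which is the decoupling the list above describes.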
70+
71+
## Why Use Argo Events?
72+
73+
* Seamless Workflow Triggers: Effortlessly connects events with workflows.
74+
* Flexibility: Supports various event sources, making it adaptable to different scenarios.
75+
* Scalability: Can handle high event throughput with minimal latency.
76+
* Kubernetes-Native: Fully integrates with Kubernetes for a unified ecosystem.

docs/argo-workflows.md

+67
@@ -0,0 +1,67 @@
1+
# Workflow Orchestration with Argo Workflows
2+
3+
## Introduction
4+
Argo Workflows is a Kubernetes-native workflow orchestration tool that excels at automating complex pipelines.
5+
6+
This section explains how Argo Workflows is utilized in this setup to execute the water bodies detection algorithm and related preprocessing tasks.
7+
8+
It focuses on the two workflow templates at the core of this system:
9+
10+
* CWL Execution Template
11+
12+
* Water Bodies Detection Template
13+
14+
These templates form the backbone of the processing pipeline, enabling the execution of tasks based on events triggered by Argo Events.
15+
16+
## Key Features of Argo Workflows
17+
18+
* Container-Native: Each step of the workflow runs in its own container, ensuring isolation and scalability.
19+
* Declarative Workflows: Defined in YAML, allowing easy customization and version control.
20+
* Scalability: Leverages Kubernetes to run workflows efficiently across distributed resources.
21+
* Integration: Supports external tools like Calrissian for CWL execution, making it ideal for Earth Observation Application Packages workflows.
22+
23+
## Workflow Templates
24+
25+
This setup uses two primary workflow templates, each designed for specific tasks:
26+
27+
1. CWL Execution Template
28+
29+
This template executes generic CWL workflows using Calrissian, a lightweight CWL runner optimized for Kubernetes.
30+
31+
**Purpose:**
32+
33+
To process general data preparation tasks or execute modular components of the detection pipeline.
34+
35+
2. Water Bodies Detection Template
36+
37+
This template implements the core water bodies detection algorithm.
38+
39+
It processes geospatial data retrieved from the STAC endpoint and identifies water bodies using the defined logic.
40+
41+
**Purpose:**
42+
43+
To apply the detection algorithm to geospatial datasets, generating actionable results.
44+
45+
## Workflow Execution Flow
46+
47+
1. **Triggered by Events:**
48+
Workflows are initiated when Argo Events detects an event matching the criteria.
49+
50+
2. **Data Preparation:**
51+
52+
The CWL Execution Template runs preprocessing steps, such as data transformation or tiling.
53+
54+
3. **Algorithm Execution:**
55+
56+
The Water Bodies Detection Template processes the prepared data and generates results.
57+
58+
4. **Result Handling:**
59+
60+
Output data, such as GeoJSON files, is stored or published for further analysis.
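As a rough illustration of the first step, the manifest a trigger submits could look like the dictionary built below. This is a sketch under assumptions: it supposes the detection template is registered as a WorkflowTemplate named `water-bodies-detection`, and the parameter name `item` and the example URL are hypothetical. The manifest fields themselves (`workflowTemplateRef`, `arguments.parameters`) are standard Argo Workflows fields.

```python
import json


def build_workflow(template_name, parameters):
    """Return a Workflow manifest that instantiates a WorkflowTemplate."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Workflow",
        "metadata": {"generateName": f"{template_name}-"},
        "spec": {
            "workflowTemplateRef": {"name": template_name},
            "arguments": {
                "parameters": [
                    {"name": name, "value": value}
                    for name, value in parameters.items()
                ]
            },
        },
    }


manifest = build_workflow(
    "water-bodies-detection",  # assumed WorkflowTemplate name
    {"item": "https://example.com/stac/items/item-1"},  # hypothetical parameter
)
print(json.dumps(manifest, indent=2))
```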
61+
62+
## Why Use Argo Workflows?
63+
64+
* Ease of Use: Declarative YAML syntax simplifies workflow definition.
65+
* Extensibility: Easily integrate custom containers and tools like Calrissian.
66+
* Kubernetes-Native: Leverages Kubernetes' orchestration capabilities for resource efficiency.
67+
* Event-Driven Compatibility: Works seamlessly with Argo Events for real-time pipeline automation.

docs/components.md

+38
@@ -0,0 +1,38 @@
1+
# Architecture
2+
3+
```puml
4+
@startuml
5+
6+
frame "Producer Network" {
7+
component "Collector" as publisher
8+
database "Redis" as redis
9+
10+
publisher -r-> redis : Add new entries
11+
}
12+
13+
frame "Platform Network" {
14+
artifact "Sensor" as sensor
15+
component "Sensor\nController" as sensorc
16+
control "Sensor\nDeployment" as sensord
17+
component "<$node>\nWorkflow" as container
18+
19+
sensorc -u-> sensor : Watch
20+
sensorc -d-> sensord : Create
21+
sensord -r-> container : Trigger
22+
23+
artifact "Event Source" as es
24+
component "Event Source\nController" as esc
25+
control "Event Source\nDeployment" as esd
26+
27+
esc -u-> es : Watch
28+
esc -d-> esd : Create
29+
esd -l-> redis : Listen
30+
31+
queue "Event Bus" as evbus
32+
33+
esd -d-> evbus : Write\nEvents
34+
evbus -u-> sensord : Read\nEvents
35+
}
36+
37+
@enduml
38+
```

docs/flow.md

+87
@@ -0,0 +1,87 @@
1+
# Integrating Event-Driven Execution with Argo Workflows
2+
3+
## Introduction
4+
5+
This page details the end-to-end integration and execution flow of the water bodies detection system.
6+
7+
By combining Argo Workflows and Argo Events, the system achieves seamless automation triggered by geospatial data events.
8+
9+
## Integration Architecture
10+
11+
The integration involves three main layers:
12+
13+
* Event Source Layer: Monitors Redis for events simulated by querying the STAC endpoint.
14+
* Event Routing Layer: The Jetstream event bus routes events from the Redis source to the sensor.
15+
* Workflow Execution Layer: Argo Workflows processes geospatial data and detects water bodies.
16+
17+
## End-to-End Execution Flow
18+
19+
### Step 1: Event Generation
20+
21+
Description: Redis acts as an intermediary for event generation. Events are created by querying a STAC API endpoint for geospatial data.
22+
23+
Details:
24+
* Queries can include filters such as specific time ranges or geographic areas.
25+
* Redis publishes events to the stac-events channel.
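A filtered query of this kind could be expressed as a STAC API search request. The sketch below only builds the request and does not send it. The endpoint URL, the collection name, the bounding box, and the date range are all placeholder assumptions; `collections`, `bbox`, `datetime`, and `limit` are standard STAC API search parameters.

```python
import json
import urllib.request


def build_stac_search(endpoint, collection, bbox, datetime_range, limit=10):
    """Build (but do not send) a POST request for a STAC API item search."""
    body = {
        "collections": [collection],
        "bbox": list(bbox),          # [west, south, east, north]
        "datetime": datetime_range,  # closed interval "start/end"
        "limit": limit,
    }
    return urllib.request.Request(
        url=endpoint.rstrip("/") + "/search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_stac_search(
    "https://example.com/stac",       # hypothetical endpoint
    "sentinel-2-l2a",                 # hypothetical collection id
    (-121.4, 39.8, -120.9, 40.2),     # hypothetical area of interest
    "2024-06-01T00:00:00Z/2024-06-30T23:59:59Z",
)
print(request.full_url)
print(request.data.decode("utf-8"))
```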
26+
27+
### Step 2: Event Source Monitoring
28+
29+
Description: The Redis event source listens for new messages on the stac-events channel.
30+
31+
Details:
32+
* When a new event is detected, it forwards it to the Jetstream event bus.
33+
* Events include metadata such as STAC item IDs, collection details, and timestamps.
34+
35+
### Step 3: Event Routing
36+
37+
Description: The Jetstream event bus acts as a high-performance router for events.
38+
39+
Details:
40+
* Ensures reliable delivery to the event sensor.
41+
* Supports scalable handling of multiple concurrent events.
42+
43+
### Step 4: Sensor Activation
44+
45+
Description: The event sensor listens to the Jetstream bus and triggers workflows.
46+
47+
Details:
48+
* Matches events based on defined criteria (e.g., specific STAC item properties).
49+
* Passes event metadata to the triggered workflow as input parameters.
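The matching and parameter-passing logic can be pictured with the sketch below. It is a plain-Python analogy, not the sensor's actual filter implementation, and the event shape, the `platform` property, and the parameter names `item-id` and `collection` are assumptions made for illustration.

```python
def matches(event, criteria):
    """True when every criterion equals the corresponding event property."""
    properties = event.get("properties", {})
    return all(properties.get(key) == value for key, value in criteria.items())


def to_parameters(event):
    """Map event metadata onto workflow input parameters."""
    return [
        {"name": "item-id", "value": event["id"]},
        {"name": "collection", "value": event["collection"]},
    ]


# Hypothetical event carrying STAC item metadata.
event = {
    "id": "item-1",
    "collection": "sentinel-2-l2a",
    "properties": {"platform": "sentinel-2a"},
}

if matches(event, {"platform": "sentinel-2a"}):
    print(to_parameters(event))
```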
50+
51+
### Step 5: Workflow Execution
52+
53+
Description: The Argo Workflow executes the pipeline to process geospatial data.
54+
55+
Details:
56+
57+
* The workflow includes two templates:
58+
* Calrissian Template: Runs a CWL pipeline to pre-process data.
59+
* Detection Template: Executes the water bodies detection algorithm.
60+
* Outputs include a GeoTIFF file with the detected water bodies, described as a STAC Item, along with logs and diagnostic data.
61+
62+
### Step 6: Results Delivery
63+
64+
Description: The workflow stores outputs in a predefined storage location and updates the STAC catalog with results.
65+
66+
Details:
67+
68+
* Results are made accessible via STAC API endpoints.
69+
* Users or downstream applications can retrieve the outputs for analysis.
70+
71+
## Integration Diagram
72+
73+
TODO (Insert a flowchart or diagram illustrating the flow: Redis → Event Source → Jetstream → Event Sensor → Workflow Execution → Result Storage)
74+
75+
## Key Benefits of the Integration
76+
77+
* Automation: Fully automates the data processing pipeline from event generation to result delivery.
78+
* Scalability: Supports high-throughput event handling and parallel workflow execution.
79+
* Modularity: Easy to extend with additional event sources or processing workflows.
80+
* Real-Time Processing: Responds to geospatial data changes in near real-time.
81+
82+
## How to Test the System
83+
84+
1. Simulate Events: Publish STAC query results to the Redis channel manually or via automation.
85+
2. Monitor Workflow Execution: Use the Argo Workflows UI to track pipeline progress.
86+
3. Validate Outputs: Verify that the workflow generates correct water bodies detection results and updates the STAC catalog.
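For the first step, a manual publish could be sketched as follows. The helper takes any client exposing `publish(channel, message)`, which is the call redis-py provides on `redis.Redis`. The `RecordingClient` test double and the event fields are assumptions used here so the example runs without a Redis server.

```python
import json


def publish_stac_event(client, item, channel="stac-events"):
    """Serialize a STAC item summary and publish it to the given channel."""
    message = json.dumps(item, sort_keys=True)
    return client.publish(channel, message)


class RecordingClient:
    """Test double with the same publish(channel, message) call as redis-py."""

    def __init__(self):
        self.sent = []

    def publish(self, channel, message):
        self.sent.append((channel, message))
        return 1  # pretend one subscriber received the message


client = RecordingClient()
receivers = publish_stac_event(
    client, {"id": "item-1", "collection": "sentinel-2-l2a"}  # hypothetical event
)
print(client.sent)
```

With redis-py installed, passing something like `redis.Redis(host="localhost", port=6379)` instead of `RecordingClient()` would send the message to a real server; the host and port shown are assumptions about the local setup.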
87+

docs/index.md

+63-13
@@ -1,28 +1,78 @@
1-
# Event driven processing of Application Packages with Argo Events and Workflows
1+
# Event-Driven Water Bodies Detection Using Argo Workflows and Argo Events
22

3+
4+
## Introduction
5+
6+
This learning resource demonstrates an event-driven system for detecting water bodies using cloud-native technologies. The system leverages Argo Events to handle and react to external event sources and Argo Workflows to execute data processing pipelines, including a water bodies detection algorithm encoded in the Common Workflow Language (CWL).
7+
8+
The workflow is triggered by events simulated through Redis, an external event source, which queries a SpatioTemporal Asset Catalog (STAC) endpoint. The STAC endpoint provides geospatial data, which serves as the input for the detection algorithm. The automation is achieved using Kubernetes-native tools, making the setup scalable, modular, and suitable for Earth observation and geospatial applications.
39
This project demonstrates an event-driven workflow for detecting water bodies in Sentinel-2 satellite imagery using Argo Events, Argo Workflows, and Calrissian.
410

511
It utilizes a Redis stream to trigger workflows that process Sentinel-2 imagery and generate outputs using CWL (Common Workflow Language).
612

7-
## What Are Argo Workflows and Argo Events?
813

9-
### Argo Workflows
14+
## Key Components
15+
16+
This setup integrates the following technologies and concepts:
1017

11-
Argo Workflows is a Kubernetes-native workflow engine that lets you define and run multi-step processes in the form of Directed Acyclic Graphs (DAGs).
18+
### Argo Workflows
1219

13-
Each "step" in a workflow can be a container or an operation, enabling automation of complex tasks like data processing, ETL jobs, or ML model training.
20+
* Automates the execution of tasks using predefined workflow templates.
21+
* Supports the execution of CWL workflows using Calrissian, a lightweight executor for CWL in Kubernetes.
22+
* Includes two templates:
23+
* CWL Execution Template: Executes general CWL workflows, such as preprocessing tasks.
24+
* Water Bodies Detection Template: Encodes the algorithm for detecting water bodies in geospatial data.
1425

1526
### Argo Events
1627

17-
Argo Events is an event-driven automation framework for Kubernetes. It allows workflows and other tasks to be triggered in response to various events, such as webhooks, message queues, or timers.
28+
* Provides an event-driven architecture for triggering workflows.
29+
* Uses a Jetstream Event Bus to handle event communication.
30+
* Includes:
31+
* Redis Event Source: Queries the STAC endpoint to generate simulated events.
32+
* Event Sensor: Listens for Redis events and triggers the water bodies detection workflow.
33+
34+
### Redis as an Event Source
35+
36+
* Simulates events by querying the STAC endpoint.
37+
* Acts as a lightweight, flexible mechanism to mimic real-time event streams.
38+
* Ensures seamless integration with Argo Events via an event source configuration.
39+
40+
### STAC Endpoint
41+
42+
* Serves as the primary data source, providing geospatial data in a standardized format.
43+
* Enables the workflow to focus on processing relevant datasets for water bodies detection.
44+
45+
## High-Level Architecture
46+
47+
The system is designed to handle the following flow:
48+
49+
1. Event Generation:
50+
51+
* Redis queries the STAC endpoint and generates events containing metadata about geospatial assets (e.g., imagery of specific regions).
52+
53+
2. Event Propagation:
54+
55+
* The Redis Event Source forwards events to the Jetstream Event Bus.
56+
57+
3. Event Sensing and Workflow Triggering:
58+
59+
* The Event Sensor monitors the Jetstream Event Bus for relevant events.
60+
* Upon detecting an event, the sensor triggers the execution of the water bodies detection workflow.
61+
62+
4. Workflow Execution:
63+
64+
* The Argo Workflow templates process the event's input data using the CWL-based algorithm.
65+
* The water bodies detection results are stored or published for further use.
66+
67+
## Why Use This Setup?
68+
69+
This setup showcases the power of combining event-driven paradigms with container-native workflows for scalable geospatial analysis.
1870

19-
By connecting event sources (like a Redis stream) to sensors that listen for events, you can trigger automated workflows.
71+
It is particularly suited for Earth observation and scientific workflows because:
2072

21-
## Overview
73+
* Scalability: Kubernetes ensures workflows can handle varying loads effectively.
74+
* Modularity: Components can be easily reused or replaced for other applications.
75+
* Automation: Events trigger workflows without manual intervention, enabling real-time processing.
2276

23-
The project workflow involves:
77+
Through this resource, you'll learn to implement a cloud-native pipeline for water bodies detection, which can be extended to other geospatial or scientific applications.
2478

25-
* Event Source: A Redis stream that holds Sentinel-2 image acquisitions.
26-
* Sensor: Listens for events from the Redis stream and triggers a workflow to detect water bodies.
27-
* Argo Workflow: Executes a CWL-based workflow using Calrissian to process Sentinel-2 images.
28-
* Calrissian: An execution engine for running CWL workflows in Kubernetes, integrated with Argo Workflows.
