Metadata storage for bigflow jobs/workflows
There are several use cases for a simple document/key-value storage:

1. Save (append) information about executed workflows/jobs: ID, run time, docker hash, execution time, cost estimate, result, etc. Essentially structured logs, which may be used to inspect execution history and do some (manual) cost estimation; see the record sketch after this list.

2. Query running workflows/jobs and their status (history and/or currently running workflows):

   bigflow history -w workflow_id

   Such a CLI API might be a first step towards an "airflow-free" solution (i.e. the ability to replace Airflow with a custom cron-like service).

3. Communicate between tasks/workflows. In some rare cases one workflow might want to check the status of another. A workflow might also check whether another instance is currently running. This is especially important for dev-like environments, where workflows are executed locally (via bigflow run).

4. Persist some information between tasks/jobs, like 'last-processed-id' (for incremental processing) or last time-per-batch (to auto-adjust batch size).
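For use case 1, the stored record could be a flat structure with exactly the fields listed above. A minimal sketch; field names and types here are assumptions, not a settled schema:

```python
import datetime
from dataclasses import dataclass, asdict

@dataclass
class JobRunRecord:
    """One structured-log entry per executed workflow/job (hypothetical schema)."""
    run_id: str                    # unique ID of this execution
    workflow_id: str               # which workflow/job was executed
    started_at: datetime.datetime  # run time
    docker_image_hash: str         # docker hash of the deployed image
    execution_time_s: float        # wall-clock duration
    cost_estimate_usd: float       # rough cost estimate
    result: str                    # e.g. "SUCCESS" / "FAILED"

# Serialising to a plain dict keeps the record backend-agnostic:
# anything that stores JSON-like documents can hold it.
record = asdict(JobRunRecord(
    run_id="2021-01-01T10:00:00-my_workflow",
    workflow_id="my_workflow",
    started_at=datetime.datetime.utcnow(),
    docker_image_hash="sha256:abc123",
    execution_time_s=132.5,
    cost_estimate_usd=0.42,
    result="SUCCESS",
))
```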
Database: anything works for 1; BigQuery / any SQL-like DB covers 1/2/3/4.
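If BigQuery were picked as the backend, appending and querying such records is straightforward with the official google-cloud-bigquery client. A rough sketch, where the project/dataset/table names are made up for illustration:

```python
from google.cloud import bigquery

client = bigquery.Client()
table_id = "my-project.bigflow_metadata.job_runs"  # hypothetical dataset/table

# One-time setup: columns mirroring the record fields from use case 1.
client.query(f"""
    CREATE TABLE IF NOT EXISTS `{table_id}` (
        run_id STRING, workflow_id STRING, started_at TIMESTAMP,
        docker_image_hash STRING, execution_time_s FLOAT64,
        cost_estimate_usd FLOAT64, result STRING
    )
""").result()

# Use case 1: append one row per executed workflow/job.
errors = client.insert_rows_json(table_id, [{
    "run_id": "2021-01-01T10:00:00-my_workflow",
    "workflow_id": "my_workflow",
    "started_at": "2021-01-01T10:00:00Z",
    "docker_image_hash": "sha256:abc123",
    "execution_time_s": 132.5,
    "cost_estimate_usd": 0.42,
    "result": "SUCCESS",
}])
assert not errors, errors

# Use case 2: `bigflow history -w my_workflow` could reduce to a query like this.
rows = client.query(
    f"SELECT * FROM `{table_id}` WHERE workflow_id = 'my_workflow' "
    "ORDER BY started_at DESC"
).result()
for row in rows:
    print(row["run_id"], row["result"])
```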
Client-visible API: TBD.
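Since the issue marks the API as TBD, the following is purely a strawman; none of these names exist in bigflow today. A thin key-value/document facade over whichever backend is chosen would be enough to cover use cases 3 and 4:

```python
# Strawman only: every name here is hypothetical; the issue leaves the API TBD.

class MetadataStore:
    """Hypothetical client-visible facade over the chosen backend."""

    def append_run(self, record: dict) -> None: ...          # use case 1
    def history(self, workflow_id: str) -> list[dict]: ...   # use case 2
    def is_running(self, workflow_id: str) -> bool: ...      # use case 3
    def get(self, key: str, default=None): ...               # use case 4
    def set(self, key: str, value) -> None: ...              # use case 4

# Use case 4 in practice: incremental processing between runs.
def process_batch(store: MetadataStore) -> None:
    last_id = store.get("last-processed-id", default=0)
    new_last_id = last_id  # ... process rows with id > last_id ...
    store.set("last-processed-id", new_last_id)

# Use case 3: refuse to start if another instance is already running
# (relevant for local `bigflow run` in dev-like environments).
def guard(store: MetadataStore, workflow_id: str) -> None:
    if store.is_running(workflow_id):
        raise RuntimeError(f"{workflow_id} is already running")
```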