Do we need a DB? #678
Great timing for this question, because we actually removed the need for an external database in 0.11, released last week. Arroyo now supports SQLite for storing its configuration.

The database is roughly equivalent to ZooKeeper (ZK) for Flink in HA mode: it stores persistent configuration for the set of jobs that are expected to be running, so that they can be recovered if the controller is restarted or fails. Our experience is that most users don't want to lose their pipelines when the controller goes down, so we've built that persistence into the core of the control plane. It also powers features like the API (which can run separately from the controller for HA), stored tables, configurations, etc.

By also supporting SQLite, I think we now get the best of both worlds: we provide persistence to the control plane, while also making it easy to run small-scale pipelines without needing any extra infrastructure.
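To make the recovery idea concrete, here is a minimal sketch of a control-plane store that records which jobs are expected to be running, so a restarted controller can look them up again. It is not Arroyo's actual schema or code: the `jobs` table, its columns, and the helper names (`open_store`, `submit_job`, `stop_job`, `jobs_to_recover`) are all hypothetical, and it uses Python's standard `sqlite3` module purely for illustration.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the SQLite file holding the desired job set.

    The schema is hypothetical and exists only for this illustration.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        " id TEXT PRIMARY KEY,"
        " query TEXT NOT NULL,"
        " desired_state TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def submit_job(conn, job_id, query):
    """Record that a job is expected to be running."""
    conn.execute(
        "INSERT OR REPLACE INTO jobs (id, query, desired_state) "
        "VALUES (?, ?, 'running')",
        (job_id, query),
    )
    conn.commit()


def stop_job(conn, job_id):
    """Record that a job should no longer be running."""
    conn.execute(
        "UPDATE jobs SET desired_state = 'stopped' WHERE id = ?",
        (job_id,),
    )
    conn.commit()


def jobs_to_recover(conn):
    """Return (id, query) for every job that should be running.

    A restarted controller would call this to decide what to reschedule.
    """
    rows = conn.execute(
        "SELECT id, query FROM jobs "
        "WHERE desired_state = 'running' ORDER BY id"
    )
    return rows.fetchall()
```

If the store lived only in memory, `jobs_to_recover` would return nothing after a restart. Because the file persists, the controller can pick up where it left off, and a single local file needs no extra infrastructure.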
Does this project really need to rely on Postgres? I have used Flink in the past and it does not rely on a DB; it uses ZooKeeper for persistence, or just memory. I think it would be nice not to depend on a SQL data store. IMHO, when a job gets loaded, why can't it just live in memory? It would be nice to remove a layer of complexity and make the project easier to use.