OneArmy Database Migration #1441
-
Thanks for taking the time to write this up. It is a very useful starting point for discussion. What is the end goal of this migration? I see the terms Firebase and Firestore used interchangeably throughout this document, but the scope of work here is unclear to me. Can we clarify what is out of scope? For example, we currently use the Firebase (sub)products for authentication and file storage. Would those be tackled separately? My assumption is that we are only talking about migrating the data persistence layer rather than the other features. Given that context, what additional complexity do we introduce by moving data storage away whilst the others (auth, file storage, hosting and serverless functions) remain with Google Firebase?
-
It would be useful to sketch out a schema for our existing system that would be compatible with relational data stores. Both
-
Hello everyone! I have some thoughts about this discussion. a) Choosing between Supabase and your own relational DB plus backend code, the second option looks much better in your situation: you already have complex code in the frontend repository and functions, so supporting your own DB wouldn't be a problem, and as a benefit you would never run into Supabase's vendor limitations or lock-in. b) My idea for how the migration to your own DB could be carried out:
Best, Andrew
-
There is also https://pocketbase.io/, which people seem to favor over Supabase.
-
Before choosing a new database, we should consider a Multi-Tenant approach first. Why would a multi-tenant solution be beneficial?
Edit:
-
For point 1, Firebase now supports SQL (I guess?) https://youtu.be/vYk6Uh2WGto?t=32 https://www.youtube.com/watch?v=7OdVatEI85o https://firebase.google.com/products/data-connect EDIT: it seems to be in early access / private preview. The lowest price seems to be around 12 euros just for the database, which seems kinda lame. https://cloud.google.com/products/calculator-legacy/#id=7ae4087c-7184-403e-b71e-4502591a3e24
-
What
The platform was originally built around Firebase’s Firestore DB as the primary backend database, but has since evolved to also include a local offline/cache database (Dexie) and an additional server cache (Firebase Realtime Database).
The existing setup has helped us move quickly whilst keeping server costs minimal. However, as we scale to additional platform instances and features, it is struggling to keep pace.
It’s time for a change!
Why
The unstructured nature of the DB means that it will generally accept any data thrown at it. Whilst this can be mitigated through type-checking and scripting, data migrations have to be considered more carefully, as it is hard to keep track of all historic changes and any existing inconsistencies in the data. This has made us more averse to making changes when they are required.
Billing is directly proportional to the number of read ops, which strongly discourages us from introducing features that might increase reads (such as generating summaries from a collection of documents, e.g. all howtos). Whilst designing around read counts is not the worst practice ever, it often leads to overly complex structures or anti-patterns in how we organize data and queries.
Local development typically requires access to server instances, which makes it harder to test operations such as updating and deleting data without having a knock-on effect for other developers. It also means exposing various API credentials for test projects, which could be exploited. Firebase does provide emulators, but they are incomplete and can be buggy (e.g. issues sharing seed data across Windows/Linux instances).
Vendor lock-in to the Google/Firebase ecosystem (less flexible and generally opposed to our open principles)
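As a rough illustration of the "type-checking and scripting" mitigation mentioned in the first point, here is a minimal sketch of a runtime check that could run before a document is written. The HowtoDoc shape and its field names are assumptions made up for this example, not the platform's actual schema.

```typescript
// Hypothetical shape of a "howto" document. The field names are
// assumptions for illustration, not the platform's actual schema.
interface HowtoDoc {
  _id: string;
  title: string;
  steps: { text: string }[];
}

// Returns a list of problems; an empty list means the document is valid.
function validateHowto(raw: unknown): string[] {
  if (typeof raw !== "object" || raw === null) {
    return ["document is not an object"];
  }
  const doc = raw as Record<string, unknown>;
  const problems: string[] = [];

  const id = doc["_id"];
  if (typeof id !== "string" || id.length === 0) {
    problems.push("_id must be a non-empty string");
  }
  if (typeof doc["title"] !== "string") {
    problems.push("title must be a string");
  }
  const steps = doc["steps"];
  if (!Array.isArray(steps)) {
    problems.push("steps must be an array");
  } else {
    steps.forEach((step: unknown, i: number) => {
      const text =
        typeof step === "object" && step !== null
          ? (step as Record<string, unknown>)["text"]
          : undefined;
      if (typeof text !== "string") {
        problems.push("steps[" + i + "].text must be a string");
      }
    });
  }
  return problems;
}

// Type guard so that validated documents can be used as HowtoDoc.
function isHowtoDoc(raw: unknown): raw is HowtoDoc {
  return validateHowto(raw).length === 0;
}
```

A check like this only covers new writes. It does nothing for inconsistent documents that are already stored, which is the harder part of the problem described above.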
How
A planned migration away from Firebase to other open database platforms, which can be hosted either on our own servers or via hosting providers. This will be a gradual process that will initially focus only on the database (not hosting, storage etc.), with the aim to run both in parallel before fully switching over.
I would propose using Supabase as the alternative DB, as it is open source and is currently trying to position itself as a direct alternative to Firebase (with a roadmap to replicate most of Firebase’s core features).
This would address the issues above broadly by:
Supabase uses a structured Postgres DB underneath, so not only will it require defined table structures moving forwards, it can also help us identify inconsistencies when migrating legacy data.
With hosted Supabase, billing is based on DB size, not requests. The free tier allows 500MB (our current DB is more like 5MB, I think). If self-hosting, performance may still be a factor to consider for larger operations, but the consequences of poor optimisation are much less drastic (possibly requiring a DB reboot instead of a large bill). It also gives us options for things like replication if we did want to run some more intensive tasks, either on a local or server clone.
Whilst I would still suggest keeping a dev server running to make things easier for people only interested in working on frontend code, those working on the backend could quite quickly run their own instances via Docker Desktop, which will accurately mimic how things are run on a server.
Supabase is built around Docker containers, so it can be deployed to any infrastructure that supports them, which these days is just about everything (even a Raspberry Pi). If using the Supabase hosted service (which I think is built on top of AWS), future migration is still easily possible (there are plenty of readily available methods to clone one DB to another in Postgres). There is also planned future support for Kubernetes, which could help if we ever need to build for a larger scale.
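To make the first point more concrete, the sketch below shows one way legacy documents could be scanned for fields whose value types vary between documents, since those are the fields that cannot map onto a single typed column without cleanup. It works on plain objects only; how the documents are exported from Firestore is left out, and the function names are hypothetical.

```typescript
type LegacyDoc = Record<string, unknown>;

// Coarse type label for a value, distinguishing null and arrays from objects.
function describeType(value: unknown): string {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  return typeof value;
}

// For every top-level field seen in any document, list the distinct value
// types observed. A document that lacks the field counts as "missing".
function fieldTypeReport(docs: LegacyDoc[]): Record<string, string[]> {
  const fields: string[] = [];
  docs.forEach((doc) => {
    Object.keys(doc).forEach((key) => {
      if (fields.indexOf(key) === -1) fields.push(key);
    });
  });

  const report: Record<string, string[]> = {};
  fields.forEach((field) => {
    const seen: string[] = [];
    docs.forEach((doc) => {
      const label = Object.prototype.hasOwnProperty.call(doc, field)
        ? describeType(doc[field])
        : "missing";
      if (seen.indexOf(label) === -1) seen.push(label);
    });
    report[field] = seen.sort();
  });
  return report;
}

// Fields with more than one observed type need a decision before they can
// become a typed (or NOT NULL) column.
function inconsistentFields(docs: LegacyDoc[]): string[] {
  const report = fieldTypeReport(docs);
  return Object.keys(report)
    .filter((field) => report[field].length > 1)
    .sort();
}
```

For example, a collection where `votes` is sometimes a number and sometimes a string, or where `tags` exists on only some documents, would have both fields flagged.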
Proposed Roadmap
Phase 0 - Prepare
Clear backlog of existing PRs to try and provide a bit more space for making larger changes. Existing issues/PRs that impact on the DB will be prioritized, and future issues/PRs that impact on the DB will be temporarily put on hold
Phase 1 - Deploy
Deploy 2 supabase instances for staging and production
Create firebase functions that will allow one-time migration of legacy data
Create firebase functions that will allow us to replicate data from firestore to supabase in an ongoing way
Get Supabase up and running in full replication of Firestore, with multiple test/CI projects on a single staging instance and a single initial production site on the production DB
Use the replicated database as a means to check existing data for inconsistencies and apply a small bit of housekeeping
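The one-time migration step in this phase could be planned along the following lines: convert each legacy document to a row, set aside the ones that cannot be converted, and group the rest into batches for insertion. This is only a sketch under assumptions: the LegacyHowto and HowtoRow shapes are invented for the example, and the actual reads from Firestore and inserts into Supabase are deliberately left to the caller.

```typescript
// Hypothetical legacy document and target row shapes (assumptions).
interface LegacyHowto {
  _id?: unknown;
  title?: unknown;
  _created?: unknown; // assumed to be an ISO date string in legacy data
}

interface HowtoRow {
  id: string;
  title: string;
  created_at: string;
}

interface MigrationPlan {
  batches: HowtoRow[][];
  skipped: { id: string; reason: string }[];
}

// Converts one legacy document to a row, or returns the reason it cannot be.
function toRow(doc: LegacyHowto): HowtoRow | string {
  const id = doc._id;
  const title = doc.title;
  const created = doc._created;
  if (typeof id !== "string" || id === "") return "missing _id";
  if (typeof title !== "string") return "title is not a string";
  if (typeof created !== "string" || isNaN(Date.parse(created))) {
    return "_created is not a parseable date";
  }
  return { id: id, title: title.trim(), created_at: new Date(created).toISOString() };
}

// Splits convertible documents into insert batches and records the rest.
function planMigration(docs: LegacyHowto[], batchSize: number): MigrationPlan {
  if (Math.floor(batchSize) !== batchSize || batchSize < 1) {
    throw new Error("batchSize must be a positive integer");
  }
  const plan: MigrationPlan = { batches: [], skipped: [] };
  let current: HowtoRow[] = [];
  docs.forEach((doc) => {
    const result = toRow(doc);
    if (typeof result === "string") {
      const rawId = doc._id;
      plan.skipped.push({ id: typeof rawId === "string" ? rawId : "(unknown)", reason: result });
      return;
    }
    current.push(result);
    if (current.length === batchSize) {
      plan.batches.push(current);
      current = [];
    }
  });
  if (current.length > 0) plan.batches.push(current);
  return plan;
}
```

Keeping the skipped list separate means the housekeeping step has a concrete worklist, rather than failing the whole migration on the first bad document.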
Phase 2 - Read
Move db read ops to supabase
Replace direct DB write ops with cloud functions (still via firestore, groundwork for next phase)
Phase 3 - Write
Update cloud functions to support writes to supabase
Add support for data triggers to replace those used by firestore
Enable direct writes to Supabase alongside Firebase (instead of triggered writes), comparing the outputs of both writes to ensure they are kept in sync
Phase 4 - Transition
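The comparison of both writes could look something like the sketch below, which takes the record as stored by each backend and reports the fields that differ. Performing the writes themselves is out of scope here, and the names (compareWrites, ignoreFields) are hypothetical. Fields that are expected to differ, such as backend-generated timestamps, can be excluded.

```typescript
type StoredRecord = Record<string, unknown>;

interface FieldMismatch {
  field: string;
  firestore: unknown;
  supabase: unknown;
}

// JSON-like string with sorted object keys, so that key order alone
// never counts as a difference.
function canonical(value: unknown): string {
  if (Array.isArray(value)) {
    return "[" + value.map((item) => canonical(item)).join(",") + "]";
  }
  if (typeof value === "object" && value !== null) {
    const obj = value as Record<string, unknown>;
    const parts = Object.keys(obj)
      .sort()
      .map((key) => JSON.stringify(key) + ":" + canonical(obj[key]));
    return "{" + parts.join(",") + "}";
  }
  const json = JSON.stringify(value);
  return json === undefined ? "undefined" : json;
}

// Lists the fields whose stored values differ between the two backends.
function compareWrites(
  firestore: StoredRecord,
  supabase: StoredRecord,
  ignoreFields: string[] = []
): FieldMismatch[] {
  const fields: string[] = [];
  [firestore, supabase].forEach((record) => {
    Object.keys(record).forEach((key) => {
      if (fields.indexOf(key) === -1 && ignoreFields.indexOf(key) === -1) {
        fields.push(key);
      }
    });
  });
  return fields
    .sort()
    .filter((field) => canonical(firestore[field]) !== canonical(supabase[field]))
    .map((field) => ({ field: field, firestore: firestore[field], supabase: supabase[field] }));
}
```

An empty result means the two writes agree. Anything else could be logged during the parallel-running period so drift is noticed before the final switch.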
Add support and documentation for local development with Supabase
Move all DB ops to supabase
Sticking Points
Firestore supports nested documents (subcollections), but Postgres does not. An alternative way of structuring and querying this data will need to be considered.
Supabase’s support for triggers is quite new and possibly subject to change
Assuming the migration doesn't happen all at once, there will likely be some extra overhead to keep any Supabase-backed branch/fork in line with main.
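On the subcollections point, one common relational approach is to give each subcollection its own table and link child rows back to their parent by id. The sketch below flattens a nested export into such rows. The NestedDoc export format, the table naming scheme (parent table name plus subcollection name) and the example "howtos"/"comments" names are all assumptions for illustration.

```typescript
// Hypothetical nested export of a Firestore document with its subcollections.
interface NestedDoc {
  id: string;
  data: Record<string, unknown>;
  subcollections?: Record<string, NestedDoc[]>;
}

// One flat row: the table it belongs to, its own id, and a reference
// to its parent row (null for top-level documents).
interface FlatRow {
  table: string;
  id: string;
  parent_table: string | null;
  parent_id: string | null;
  data: Record<string, unknown>;
}

function flattenDocs(
  table: string,
  docs: NestedDoc[],
  parent: { table: string; id: string } | null = null
): FlatRow[] {
  const rows: FlatRow[] = [];
  docs.forEach((doc) => {
    rows.push({
      table: table,
      id: doc.id,
      parent_table: parent ? parent.table : null,
      parent_id: parent ? parent.id : null,
      data: doc.data,
    });
    const subs = doc.subcollections || {};
    Object.keys(subs).forEach((name) => {
      // Child table name is derived from the parent, e.g. "howtos_comments".
      const childRows = flattenDocs(table + "_" + name, subs[name], { table: table, id: doc.id });
      childRows.forEach((row) => rows.push(row));
    });
  });
  return rows;
}
```

In Postgres the parent_id column would then become a foreign key, and what used to be a subcollection read becomes a filtered query or a join on that key.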
Supporting Dev
As this would be quite a large, ongoing project, I would recommend defining a set of bounties to be released at key milestones (e.g. phases), with work ideally coordinated between at least 1 core maintainer and 1 feature bounty developer.
Alternatives
Of the initial issues, (1 – multi projects) and (2 – unstructured) could be solved reasonably well within the existing architecture. There is nothing stopping us from hosting multiple sites on the same database (we already do this when testing in CI), and there are various tools that can help enforce a more rigorous structure on the database (e.g. ORMs such as sequelize, typeorm, fireorm, or simpler data validators like joi or yup).
I would expect (4 – local DB emulators) to improve over time, but (3 – Billing ops) and (5 – Vendor lock-in) are quite fundamental to firebase.
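For the idea of hosting multiple sites on the same database, one simple option is to scope every collection name by a site identifier. The sketch below is an assumption about how that could be done, not a description of the current CI setup; the site ids and the underscore separator are made up for the example. Disallowing underscores in site ids keeps the prefix unambiguous.

```typescript
// Site ids are restricted to lowercase letters, digits and hyphens so that
// the "_" separator can never appear inside the id itself.
const SITE_ID_PATTERN = /^[a-z0-9-]+$/;

// Builds the stored collection name for a given site, e.g. "pk_howtos".
function scopedCollection(siteId: string, collection: string): string {
  if (!SITE_ID_PATTERN.test(siteId)) {
    throw new Error("invalid site id: " + siteId);
  }
  return siteId + "_" + collection;
}

// Given all collection names in the database, returns the unscoped names
// that belong to one site.
function collectionsForSite(siteId: string, allCollections: string[]): string[] {
  const prefix = scopedCollection(siteId, "");
  return allCollections
    .filter((name) => name.indexOf(prefix) === 0)
    .map((name) => name.slice(prefix.length));
}
```

This keeps sites isolated by naming convention only. It does not address billing or lock-in, which is why those two issues remain the stronger arguments for moving.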
Instead of Supabase there are plenty of alternatives (e.g. mongodb, pure mysql/postgres, fauna etc.). My main issue with mongodb is the way in which the company typically tries to put useful features behind paywalls (e.g. atlas, realm), which could prove problematic when trying to migrate more features such as automated triggers, local sync etc. As Supabase is built on top of Postgres anyway, I see an advantage in working with their existing ecosystem of tools, as many are highly useful to us (e.g. triggers, functions, storage, auth etc.).
Any thoughts?