Add support for arbirary database connection string #5

ColdenCullen · 2023-08-31T20:21:07Z

It would be nice to be able to pass DATABASE_URL as an environment variable to the docker container in order to use an external database for enhanced robustness. This would allow us to run this tool in the cloud without requiring any on-disk persistence.

It seems like this should be possible by:

Replace usages of sqlx::SqlitePool with sqlx::AnyPool
Refactor the --database parameter to accept the full connection string, not just the part after sqlite:
Add an ENV entry to the Dockerfile to set a default and allow for overrides

I gave it a shot locally, but I don't know enough about sqlx to make the swap. I ran into issues with the query_scalar macro returning a QueryScalar for the Sqlite driver and not the Any driver (I think this has to do with the files in the .sqlx folder?).

The text was updated successfully, but these errors were encountered:

jorgenpt · 2023-09-16T21:24:38Z

@ColdenCullen Interesting idea, I didn't realize it would be so straight forward to support in sqlx. I'll take a peek at this (maybe today or tomorrow if I can). Out of curiosity, how are you running the Docker container?

FWIW, I think you might need to run cargo sqlx prepare -- --lib with the DATABASE_URL set with your database type to update the .sqlx files.

jorgenpt · 2023-09-16T21:32:50Z

Ah, it's a bit of a problem, it looks like -- the query macros don't support the Any connection type. (launchbadge/sqlx#964)

That means that there are a handful of other paths forward:

Make the database type a compile-time setting and publish images for two-three different database types (which one would you be looking at supporting? pgsql or mysql?)
Stop using the query macros and use the query functions instead
Switch (the default) from sqlite to a non-local database

I really like how trivial it is to spin up an instance with sqlite, and performance wise it seems totally reasonable for smaller query volumes (we're ~30-40 developers and it seems relatively low load -- plus we run it in our own Kubernetes cluster, rather than on e.g. AWS Elastic Container Service, so we would actually have to spin up a db for it), so that third option is probably off the table.

ColdenCullen · 2023-09-21T21:02:54Z

This all makes sense to me!

We run our container on an EC2 machine (not ECS) deployed by a CDK stack. The problem is that sometimes the CDK can decide that a machine requires replacement if there's been a significant enough OS update. The advantage of also having our CDK stack spin up an RDS instance to connect this to is that the machine running RUGS can be completely ephemeral, and we can apply a backup strategy only to the database, and not care if the machine running RUGS itself goes up, down, or sideways, since it can be trivially re-created.

Let me know if there's anything else I can do to help make this work!

jorgenpt · 2023-10-07T05:16:24Z

Hm, interesting. Does it not persist Docker volumes when it replaces the container? That seems a little scary! I figured the whole point of using Docker-managed volumes (and not the bind mounts) was that the volumes were separate from the container and managed by the Docker tools.

Do you have a sense for which database backend you'd want to use it with?

ColdenCullen · 2023-10-07T19:56:26Z

It does persist docker volumes between container, but CDK will sometimes replace the entire EC2 instance. The idea is that the instance piece is just ephemeral compute for data stored elsewhere. That isn’t always a practical model, but it’s the model we’re stuck with if we want easy infrastructure-as-code.

We default to Postgres on RDS unless we have a reason to use something else, so Postgres as the engine would be my first choice.

jorgenpt · 2023-10-07T21:06:11Z

Ok! Looking into this a little more, I think a potential path forward is to switch to SeaORM. It specifically abstracts it away in a manner where it supports dynamically picking the database connection at runtime. Overall I don't think the conversion will be too bad, since we use so few queries in RUGS, but I need to figure out a good way to handle migrations from existing databases -- creating a matching schema, then making sure that SeaORM doesn't think it needs migrating further. I also need to remove the dependency on the sqlite-specific COLLATE NOCASE. I see that PGSQL has citext, but I don't know that SeaORM supports it.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add support for arbirary database connection string #5

Add support for arbirary database connection string #5

ColdenCullen commented Aug 31, 2023

jorgenpt commented Sep 16, 2023 •

edited

Loading

jorgenpt commented Sep 16, 2023

ColdenCullen commented Sep 21, 2023

jorgenpt commented Oct 7, 2023

ColdenCullen commented Oct 7, 2023

jorgenpt commented Oct 7, 2023

Add support for arbirary database connection string #5

Add support for arbirary database connection string #5

Comments

ColdenCullen commented Aug 31, 2023

jorgenpt commented Sep 16, 2023 • edited Loading

jorgenpt commented Sep 16, 2023

ColdenCullen commented Sep 21, 2023

jorgenpt commented Oct 7, 2023

ColdenCullen commented Oct 7, 2023

jorgenpt commented Oct 7, 2023

jorgenpt commented Sep 16, 2023 •

edited

Loading