Support multiple types of DBs at once #3278

Snapstromegon · 2024-06-10T21:11:44Z

This PR is meant as a starting point and baseline for discussions.

It implements support for multiple types of DBs by using multiple "DATABASE_URL_*" environment variables, falling back to the current "DATABASE_URL" variable.

This doesn't allow using multiple DBs of the same type. It does allow using two different DBs as long as they don't share a driver (e.g. Postgres and SQLite).
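The lookup order described above could be sketched roughly like this. This is only an illustration, not the code in this PR: the function name `resolve_database_url` and the idea of passing the driver as a suffix string are assumptions, and the environment is passed in as a map so the sketch stays easy to test.

```rust
use std::collections::HashMap;

/// Resolve the database URL for one driver (hypothetical helper).
///
/// `suffix` is the per-driver part of the variable name, e.g. "POSTGRES"
/// or "SQLITE". The driver-specific variable wins; the plain
/// `DATABASE_URL` is only used as a fallback.
pub fn resolve_database_url(env: &HashMap<String, String>, suffix: &str) -> Option<String> {
    let specific = format!("DATABASE_URL_{suffix}");
    env.get(&specific)
        .or_else(|| env.get("DATABASE_URL"))
        .cloned()
}
```

In a real build the map would presumably be filled from the process environment (for example with `std::env::vars().collect()`) before the macros resolve their URL.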

fixes #121

Signed-off-by: Raphael Höser <raphael@hoeser.info>

Snapstromegon · 2024-06-11T21:07:07Z

This PR right now only holds a rough draft.
I don't have much experience working with Rust macros, so I chose the easy way for some first discussions.

Right now I only implemented the query! macro, but the others should work the same way.
It also uses stringly typed selection of the DB driver. I don't like that, but I wasn't able to get the procedural macros to work with actual types.

This is how it currently looks:

query!(
    "SQLite",
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)"
).execute(&sqlite).await.unwrap();
let row = query!("PostgreSQL", "SELECT 1 as ID").fetch_one(&pg).await.unwrap();

If you now set the environment variables DATABASE_URL_SQLITE and DATABASE_URL_POSTGRES, both queries will get checked at compile time and work at once.

If you have ideas for improving this, please let me know.

esmevane · 2024-06-12T13:47:27Z

This is cool to see!

I'm wondering two things about the DATABASE_URL_DRIVER format. I don't want to make it sound like I'm suggesting you do anything differently, I'm mostly just curious and wondered if maybe you've considered these approaches to the design already:

1. Is it possible to get any DATABASE_URL_* string from the environment? Say by pulling up the environment as a whole and just plucking matches?
2. Is it possible to infer the driver via the protocol? mysql:// or postgres:// or sqlite:// etc. In other words, let the interior of the env var describe the kind of DB, vs. the var name? (I think this might already be something sqlx does?)

The reason I ask these is because I bet you if these things are possible, it would then be possible to allow the caller to define any variables they like, even having multiple databases with the same driver. E.g., DATABASE_URL_MEMORY and DATABASE_URL_CACHE, both as sqlite.

Snapstromegon · 2024-06-12T14:01:38Z

> Is it possible to get any DATABASE_URL_* string from the environment? Say by pulling up the environment as a whole and just plucking matches?

Yes, this is totally possible and I want to implement this in the future. It should even be possible to infer the expected variable at compile time from the type of the database driver (via Database::NAME).
Finding some wildcard solution is also important for supporting third-party DB drivers (e.g. for DuckDB or something similar).

> Is it possible to infer the driver via the protocol? mysql:// or postgres:// or sqlite:// etc. In other words, let the interior of the env var describe the kind of DB, vs. the var name? (I think this might already be something sqlx does?)

This is also possible, and I thought about it too. Right now this PR already kind of does this, because the names are only for differentiating and the code doesn't check that a DATABASE_URL_POSTGRES actually holds a Postgres URL. You could just as well (using the pattern matching from point 1) use DATABASE_URL_FILESYSTEM and DATABASE_URL_SERVER for e.g. SQLite and Postgres.
When resolving the type to a URL, it does exactly this check: it tests whether any of the vars contains a URL that matches one of the URL schemes the DB supports.

> The reason I ask these is because I bet you if these things are possible, it would then be possible to allow the caller to define any variables they like, even having multiple databases with the same driver. E.g., DATABASE_URL_MEMORY and DATABASE_URL_CACHE, both as sqlite.

Sadly IMO this will not be possible (at least as I implemented it right now), because it will always use the first URL that matches the scheme for the DB when resolving the URL, so you can't have two sqlite DBs.
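To make the limitation concrete, here is a rough sketch of a "first matching scheme wins" lookup. It is an illustration under assumptions, not the PR's actual code: the function name `first_url_for_schemes` is made up, and the environment is passed in as a list of name/value pairs in the order it would be scanned.

```rust
/// Return the first URL among the `DATABASE_URL*` variables whose scheme
/// is one of `schemes` (hypothetical helper).
pub fn first_url_for_schemes<'a>(
    vars: &'a [(String, String)],
    schemes: &[&str],
) -> Option<&'a str> {
    vars.iter()
        // Only variables following the DATABASE_URL* naming pattern count.
        .filter(|(name, _)| name.starts_with("DATABASE_URL"))
        .map(|(_, url)| url.as_str())
        .find(|url| {
            // The scheme is everything before the first ':'.
            let scheme = url.split(':').next().unwrap_or("");
            schemes.contains(&scheme)
        })
}
```

With DATABASE_URL_MEMORY and DATABASE_URL_CACHE both holding sqlite URLs, a lookup for the "sqlite" scheme only ever returns whichever one is scanned first, so the second database can never be selected.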

IMO the most urgent use case is supporting multiple types of DBs (as this is my biggest pain point). Maybe we can find a way to support multiple DBs of the same type at a later point in time.

Snapstromegon · 2024-06-14T13:28:06Z

I'm working on a version that uses types instead of strings for selecting the DB backend here: Snapstromegon#1

I'm still running into an issue when multiple backends are compatible with a query and any help is welcome.

Snapstromegon · 2024-06-21T14:46:27Z

Hi all, I just merged my implementation for a type driven query provider selection.

With this change you can use the query! macro like this:

query!(
  Sqlite,
  "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)"
).execute(&sqlite).await.unwrap();
query!(
  Postgres,
  "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)"
).execute(&pg).await.unwrap();

I only implemented the declarative macro rule changes for query!, but it should work basically the same for all other macros.
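For readers who want a feel for how a type can carry the information the macro needs, here is a minimal, self-contained sketch using associated constants. The `Driver` trait and the `Sqlite` and `Postgres` structs below are stand-ins defined just for this example (they are not the sqlx types), and `accepts_url` is a hypothetical helper.

```rust
/// Stand-in for a database driver type carrying compile-time metadata.
pub trait Driver {
    /// Human-readable driver name.
    const NAME: &'static str;
    /// URL schemes this driver accepts.
    const URL_SCHEMES: &'static [&'static str];
}

pub struct Sqlite;
pub struct Postgres;

impl Driver for Sqlite {
    const NAME: &'static str = "SQLite";
    const URL_SCHEMES: &'static [&'static str] = &["sqlite"];
}

impl Driver for Postgres {
    const NAME: &'static str = "PostgreSQL";
    const URL_SCHEMES: &'static [&'static str] = &["postgres", "postgresql"];
}

/// Check whether `url` uses one of the schemes of driver `D`.
pub fn accepts_url<D: Driver>(url: &str) -> bool {
    // The scheme is everything before the first ':'.
    let scheme = url.split(':').next().unwrap_or("");
    D::URL_SCHEMES.contains(&scheme)
}
```

Selecting by type instead of by string means a typo in the driver name becomes a compile error rather than a failed lookup.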

I'd love to have some feedback from someone related to the project about what would need to be added to this PR so a feature like this can be merged.

Snapstromegon · 2024-09-18T09:19:44Z

@abonander I saw that you were commenting in the similar PR #3397. What do you think about the approach presented here?
IMO this is one of the easiest ways to support what is, in my experience, the most common use case: using multiple different DBs as backends.

Maybe we can introduce a solution based on this for the "simple" cases and one based on the toml files for the more complex ones?

abonander · 2024-09-18T19:26:29Z

Sorry, I just don't think this is the right approach.

Having multiple ways to do something is just more to teach and more possible confusion for the user.

This would also be rather annoying to use because you'd have to remember to specify the driver every time.

I'm not even sure how usage of two different drivers in the same context is supposed to work; does one of the queries just become a no-op or what? That would be a nightmare to teach.

With the sqlx.toml approach, you would have a separate sub-crate per driver, which could be separately compiled in online or offline mode, and you could rename the DATABASE_URL variable whatever you want to suit the application. You wouldn't have to remember to specify the driver every time. And you would have the opportunity to configure many more things besides.

Later on, we can support the ability to create shadowed versions of the macros with a prefixed name, using a specific config. This would let you mix and match drivers and databases within the same crate to your heart's content, in a way that IDE autocompletion could actually possibly assist with.

> I'm still running into an issue when multiple backends are compatible with a query and any help is welcome.

That's covered by configuring the macros to emit code using the Any database so the driver can be chosen at runtime. That's been a long-requested feature and would be the approach that I'd recommend. It would also be enabled by the sqlx.toml work.

The sqlx.toml solution is what I've arrived at after a long time of thinking about it. It covers so many different use-cases in one feature, and it's highly extensible. I'm focusing what time I have to spend on SQLx on getting it implemented so people can start playing with it.

I appreciate the effort, but unfortunately because I fundamentally disagree with the direction here, I'm going to close.

Snapstromegon · 2024-09-19T06:53:40Z

> Sorry, I just don't think this is the right approach.
>
> Having multiple ways to do something is just more to teach and more possible confusion for the user.

I fully agree here.

> This would also be rather annoying to use because you'd have to remember to specify the driver every time.

With my implementation, you only have to specify the driver if you actually want to use multiple drivers. It's an extra step only for people who use this feature, and IMO they would need to do it anyway.

> I'm not even sure how usage of two different drivers in the same context is supposed to work; does one of the queries just become a no-op or what? That would be a nightmare to teach.

No, if the query gets called, it gets executed. This could be useful e.g. for tools that migrate data from a SQLite DB to a Postgres one.
Aside from that, if you want your program to support multiple DBs, you could select one of the backends at runtime.

> With the sqlx.toml approach, you would have a separate sub-crate per driver, which could be separately compiled in online or offline mode, and you could rename the DATABASE_URL variable whatever you want to suit the application. You wouldn't have to remember to specify the driver every time. And you would have the opportunity to configure many more things besides.
>
> Later on, we can support the ability to create shadowed versions of the macros with a prefixed name, using a specific config. This would let you mix and match drivers and databases within the same crate to your heart's content, in a way that IDE autocompletion could actually possibly assist with.
>
> > I'm still running into an issue when multiple backends are compatible with a query and any help is welcome.
>
> That's covered by configuring the macros to emit code using the Any database so the driver can be chosen at runtime. That's been a long-requested feature and would be the approach that I'd recommend. It would also be enabled by the sqlx.toml work.
>
> The sqlx.toml solution is what I've arrived at after a long time of thinking about it. It covers so many different use-cases in one feature, and it's highly extensible. I'm focusing what time I have to spend on SQLx on getting it implemented so people can start playing with it.
>
> I appreciate the effort, but unfortunately because I fundamentally disagree with the direction here, I'm going to close.

If the sqlx.toml gets introduced, why not remove the env var entirely and move everything into the toml? To me it feels again like two ways of doing things if the env var persists.

Aside from that I'll need to take a look at how exactly it will work with the sub-crates and your current solution around sqlx.toml.

Personally I love that there's some movement around this, as this is right now one of my main blockers for an app I'm building.

abonander · 2024-09-20T01:56:33Z

> If the sqlx.toml gets introduced, why not completely remove the env var entirely and move everything into the toml? To me it feels again like two ways of doing things if the env var persists.

No, because the sqlx.toml is for permanent, global configuration changes that are meant to be checked into version control, while DATABASE_URL is environment-specific.

Snapstromegon · 2024-09-20T08:58:23Z

Okay, I didn't completely state the idea.

Instead of having a sqlx.toml that defines the env var to use, it could have a place for putting the db url that supports using env vars for templating (like with the .env syntax). That way you could define in the simplest case: db_url = $DATABASE_URL or you could also have it like db_url = postgres://$DB_USER:$DB_PASSWORD@$DB_HOST:5432/mydb.
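As a rough sketch of what such templating could look like, the function below expands `$NAME` placeholders from a map of variables. This is purely illustrative: `expand_env_template` is a made-up name, the `db_url` key is only the idea from this comment and not an existing sqlx.toml option, and the `${NAME}` form from .env syntax is not handled.

```rust
use std::collections::HashMap;

/// Expand `$NAME` placeholders in `template` using `env` (hypothetical helper).
/// A name is the longest run of ASCII letters, digits and '_' after the '$'.
pub fn expand_env_template(
    template: &str,
    env: &HashMap<String, String>,
) -> Result<String, String> {
    let mut out = String::new();
    let mut chars = template.chars().peekable();
    while let Some(c) = chars.next() {
        if c != '$' {
            out.push(c);
            continue;
        }
        // Collect the variable name that follows the '$'.
        let mut name = String::new();
        while let Some(&next) = chars.peek() {
            if next.is_ascii_alphanumeric() || next == '_' {
                name.push(next);
                chars.next();
            } else {
                break;
            }
        }
        if name.is_empty() {
            // A lone '$' is kept as a literal character.
            out.push('$');
            continue;
        }
        match env.get(&name) {
            Some(value) => out.push_str(value),
            None => return Err(format!("environment variable {name} is not set")),
        }
    }
    Ok(out)
}
```

Returning an error for an unset variable, instead of substituting an empty string, keeps a half-filled URL from reaching the database driver.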

Either way, I'm fine with it.

Snapstromegon added 3 commits June 10, 2024 23:08

feat: support multiple types of DBs at once

018a28e

Signed-off-by: Raphael Höser <raphael@hoeser.info>

fix: make fallback env var the actual fallback

726a6f9

Signed-off-by: Raphael Höser <raphael@hoeser.info>

Add working implementation with two drivers at once just for the quer…

21c9479

…y! macro. Signed-off-by: Raphael Höser <raphael@hoeser.info>

Snapstromegon mentioned this pull request Jun 11, 2024

Question: best way to use sqlx with connections to two different databases? #121

Open

Remove incorrect type bound that was added accidentally

3af2664

Signed-off-by: Raphael Höser <raphael@hoeser.info>

Snapstromegon added 2 commits June 12, 2024 20:46

allow using any DATABASE_URL* env var for configuring DBs.

97b2faf

Signed-off-by: Raphael Höser <raphael@hoeser.info>

Switch to a type based implementation

d1d8c25

Signed-off-by: Raphael Höser <raphael@hoeser.info>

Snapstromegon and others added 4 commits June 14, 2024 16:45

Step further, but doesn't compile because the driver_type can't be re…

ebbe979

…ferenced Signed-off-by: Raphael Höser <raphael@hoeser.info>

Working version using types via associated const on DBs

a5d5685

Signed-off-by: Raphael Höser <raphael@hoeser.info>

Merge pull request #1 from Snapstromegon/121-multiple-typed-dbs

f21274b

Switch to a type based implementation

Merge branch 'main' into 121-multiple-dbs

16cea4a

abonander closed this Sep 18, 2024
