Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proxy support for the multitenancy context strategy (pgbouncer and Supavisor) #141

Open
dev-guy opened this issue Apr 20, 2023 · 3 comments
Labels
enhancement New feature or request

Comments

@dev-guy
Copy link

dev-guy commented Apr 20, 2023

Is your feature request related to a problem? Please describe.
The Supavisor proxy for Postgres offers several benefits, the main one being the elimination of long-lived idle database connections. It can also route requests to different Postgres instances, satisfying both scalability and regulatory needs. Unfortunately, Ash's multitenancy apparently doesn't work with this feature because, while Ash's multitenancy context strategy can change the database schema name at runtime, it is apparently unable to similarly alter the Postgres username.

Describe the solution you'd like
Allow Postgres multitenancy to dynamically modify the Postgres username with a tenant slug, similar to the way that Ash Postgres' context multitenancy strategy uses a different schema name based on the tenant at runtime. Supabase requires the user name format "tenant.usename". See the Supavisor docs for more information.

Describe alternatives you've considered

I've heard that "dynamic Ecto repositories" exists but have not researched this.

Express the feature either with a change to resource syntax, or with a change to the resource interface

To avoid hard-coding a solution, perhaps a function could be provided to the 'multitenancy' macro. I would want this to be more DRY than the following but this is the general idea. Note in the following example, the schema name wouldn't need to change, but IT DEPENDS, and thus this is an opportunity to customize behavior that is currently hard-coded. 'username' and 'schema' are optional.

  multitenancy do
    strategy :context # This feature might also be applicable to attribute multitenancy.. requires more thought
    username: fn (org_id, username)->"#{org_id}.#{username}" end
    schema: fn(org_id, schema)->schema end
  end

Being able to provide a function to map a tenant to all database credentials (host, user, password, schema) would be powerful in its own right and would eliminate the need for supabase and pgbouncer for many use cases.

Additional context
I have not researched what would be needed to support the Postgres proxy pgbouncer. Ideally, one feature would address the needs of both pgbouncer and Supavisor.

There would be a table of tenants in a master Postgres database instance in the table organizations. This would be used to generate and track unique "org_id" values. This represents a single point of failure which seems unavoidable.

There would not be a foreign key from user to organization because these tables could be located on different physical Postgres instances. Or perhaps in the tenant databases there would also be an organization table but containing only one row.

@dev-guy dev-guy added enhancement New feature or request needs review labels Apr 20, 2023
@dev-guy dev-guy changed the title Proxy-based multitenancy Proxy support for the multitenancy context strategy (pgbouncer and Supavisor) Apr 20, 2023
@zachdaniel
Copy link
Contributor

Definitely worth exploring :)

@dev-guy
Copy link
Author

dev-guy commented Apr 20, 2023

@zachdaniel I can work on a PR but need some more advice from you

@zachdaniel
Copy link
Contributor

So the guide on setting up a dynamic repository is here: https://hexdocs.pm/ecto/replicas-and-dynamic-repositories.html

The main issue with us doing it this way in ash_postgres (i.e every query maybe getting a dynamic repo) in place of changing the schema is that we'd be constantly establishing new connections for each query for actions that may call many queries.

What we would need to do is something like this:

Add a hook for data layers to wrap the entire execution of an action for a given tenant, returning context to be set on all queries/changesets for that tenant, storing that context (keyed by the tenant) in the process dictionary context(Ash.set_context() which deep-merges). Whenever we execute a query where the tenant is different than the one before, we run this hook and carry its contents down to all other queries. This has to be implemented at the action invocation level, not using something like Ash.Query.around_action which just wraps the individual query call.

Whenever we run an action, we check for context for the tenant of that action, and set it on to the current query. We already have the ability to configure the repo dynamically like so:

`Ash.Query.set_context(query, %{data_layer: %{:repo, RepoName}})`. 

So this new hook for ash_postgres would just be setting a new repo. And then we'd configure it like this:

postgres do
  multitenant_context fn config, tenant -> 
    Keyword.put(config, :username, "#{tenant}.#{config[:username]}")
  end
end

The default implementation of that function would be Keyword.put(config, :prefix, tenant).

And if all that is changed in that keyword is the prefix option then we won't set a dynamic repo. If anything else changes, then we will.

There are a fair amount of moving parts here, but it can be done. We will also likely find a few places where we aren't carrying down parent context from one query to the next. We'll just have to iterate on it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants