Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dynamically Connect to Multiple Databases #6089

Open
rowanmiller opened this issue Jul 15, 2016 · 21 comments
Open

Dynamically Connect to Multiple Databases #6089

rowanmiller opened this issue Jul 15, 2016 · 21 comments

Comments

@rowanmiller
Copy link
Contributor

From @GSPP' comment in #5626:

In the age of cloud it becomes more common to shard across databases based on tenant, geography and scaling needs. EF should support querying across databases better.

It's not enough to change the connection string because one query might need to pull from multiple databases. Example: Join customer-specific data with master data.

I think it's really required that EF allows us to specify the database name per table reference. A strawman:

from o in db.Orders.WithDatabase("NewYork-Tenant1234")
join c in db.Customers.WithDatabase("NewYork-master") on ...
select new { o, c }

This is supposed to be a scenario where the database name depends on a geographical sharding as well as on the tenant name. Both are dynamic and can change for every query. Connection strings cannot do this. Statically configured database names cannot do that either.

This query would translate to:

 select *
 from [NewYork-Tenant1234].dbo.Orders o
 join [NewYork-master].dbo.Customers c on ...

The same configuration mechanism would lend itself well to configuring hints such as NOLOCK or the common UPDLOCK, ROWLOCK, HOLDLOCK combination. This is unrelated to this issue but I'm mentioning this to justify with the WithDatabase syntax makes sense.

A simple extension to this would be to allow completely dynamic table names. WithTableName(database, schema, name) would map to database.schema.name on the SQL side. This is useful for example when sharding by date with one table per day. SQL Server Standard Edition does not have partitioning.

What is the team's stance on supporting such scenarios?

@rowanmiller
Copy link
Contributor Author

More notes from @GSPP

A few more points:

  1. Since my issue has been merged into this (I agree with that) I want to stress this point: The table location (db, schema, name) must be changeable per table reference. Not per query or per application run. Not even per query is enough because I might want to union over all existing tenants (e.g. db.Orders.WithDatabase("Tenant1").Concat(db.Orders.WithDatabase("Tenant2")).Concat(db.Orders.WithDatabase("Tenant3"))...; this expression would, of course, be built through a loop and not hard-coded like that).
  2. Navigation properties should be supported across databases. I am using this pattern right now with a large L2S application. The use case is that one database is small and on different storage. The other one is for "bulk data" (all kinds of logs) and is on different storage. It is super convenient to hide this split from the application. Of course there can't be any FKs spanning the databases.
  3. Navigation properties should be configurable as well. If not we'd be forced to use explicit joins a lot. Strawman syntax: from o in db.Orders.WithDatabase("Tenant") select WithDatabase(o.Customer, "Master"). Semantically, a WithDatabase wrapping call would be interpreted as returning the first argument unmodified but with location meta-data attached. This function call would only be valid inside of an expression tree. The same wrapping could be done for sequence navigation properties. If extension syntax is wanted this requires an extension method on System.Object alas. This smells, I admit. So it could be a static method QueryHelpers.WithDatabase.
  4. There's still a need to configure location statically. This is useful in the scenario described under (2). Sometimes, there's a fixed number of databases and each table resides in exactly one database. This should be conveniently supported.
  5. I think this is a relational concept that not every provider can support. It can be a runtime error to use this with a provider not liking this.

Since there are so many different use cases there should be a hierarchy of location configuation:

  1. An explicit WithDatabase inside of the query takes precedence over everything.
  2. There should be a per-query default that places every table in a specific database and schema (e.g. myQuery.WithDefaultTableLocation(db, schema).ToList()).
  3. There should be a per-context default.
  4. Per-table static configuration.
  5. Connection string.

For each of these levels I can present you a real-world use case that I have had... If that hierarchy is properly document I don't think it's hard to understand or hard to work with.

Is there not an issue tracking table hints such as NOLOCK? There should be. It's important. Surely, the team is tracking it internally.

@rowanmiller rowanmiller added this to the Backlog milestone Jul 15, 2016
@rowanmiller
Copy link
Contributor Author

Split out from #4898

@Antaris
Copy link

Antaris commented Sep 30, 2016

I tackled this a different way based on my requirements around building modular applications (not specifically multi-tenant).

My approach was to define what I call logical groupings, e.g. "Core", "Commerce", "Media", etc.

Several modules can belong to the same logical group, i.e. my Security module (Users, Roles, etc.) belongs in Core, as does my Settings module.

I defined an attribute which can be used at the context level:

[LogicalGrouping("Core")]
public class SecurityDbContext : DbContext
{

}

This LogicalGrouping attribute is used when determining which connection string to use. The LogicalGrouping is also used through my entity type configurations, so I can explicitly say that an entity belongs to a specific database. This enables me to support cross-database navigation properties. I.e., I could define a User object in my Core logical group, but then include it in a different context:

[LogicalGrouping("Commerce")]
public class OrdersDbContext : DbContext
{
  public DbSet<User> Users { get; set; }
}

This is handled through overriding the SqlServerQuerySqlGenerator::VisitTable method:

/// <inheritdoc />
public override Expression VisitTable(TableExpression tableExpression)
{
    // MA - We need to resolve the database name.
    string database = _databaseLookupCache.GetOrAdd($"{tableExpression.Schema ?? "dbo"}.{tableExpression.Table}", k => GetDatabaseName(tableExpression.Schema ?? "dbo", tableExpression.Table));

    if (!string.IsNullOrWhiteSpace(database))
    {
        Sql.Append(SqlGenerator.DelimitIdentifier(database))
            .Append(".");
    }

    if (tableExpression.Schema != null)
    {
        Sql.Append(SqlGenerator.DelimitIdentifier(tableExpression.Schema))
            .Append(".");
    }

    Sql.Append(SqlGenerator.DelimitIdentifier(tableExpression.Table))
        .Append(" AS ")
        .Append(SqlGenerator.DelimitIdentifier(tableExpression.Alias));

    return tableExpression;
}

/// <summary>
/// Resolves the database name through the entity type builder for the given schema/table using connection string data.
/// </summary>
/// <param name="schema">The schema name.</param>
/// <param name="table">The table name.</param>
/// <returns>The database name.</returns>
private string GetDatabaseName(string schema, string table)
{
    var modelBuilder = _modelBuilderCache.GetEntityTypeBuilder(schema, table);
    if (modelBuilder == null)
    {
        return null;
    }

    // MA - The logical group can be used to identify connection string info for the target database.
    string logicalGroup = modelBuilder.LogicalGroup;

    // MA - Discover the connection string and return the database.
    return _connectionStringProvider.GetForLogicalGroup(logicalGroup)?.Database;
}

I know this is quite a specialised implementation, but I'd be interested to know what plumbing is changed to support the above request, should it affect my current implementation.

@rigofunc
Copy link

rigofunc commented Jan 20, 2017

@rowanmiller my team are hard to look for this feature, yes, we need this feature too.
Our scenarios is:

  1. one table per month;
  2. one database per year;

@grahamehorner
Copy link

Just like to vote up this feature; we are looking at providing a multi-tenant restful API which switches database context / schema based on URI / JWT user claims; it would be great in to have some way of resolving / dynamically creating a context / connection based on some predicate filter, storing / pooling the resulting instance so it can be reused.

@seriouz
Copy link

seriouz commented May 3, 2017

@rowanmiller Any news on this topic?

Edit: I just found this link, and it seems this is the solution for the problem: http://benfoster.io/blog/aspnet-core-multi-tenancy-data-isolation-with-entity-framework

@seblucas
Copy link

seblucas commented Sep 28, 2017

I also like to vote for this feature (or at least better documentation on the subject).

I want to have one database per customer (to make perfectly sure no data can be crossed between customers)

I can load a specific connection string based on some information coming from a JWT token (basically what Ben Foster proposed), I'm going to make a POC about it but I'm confident it'll work. I can also use what's explained in #9616. That means having one DbContext that can be different every time.

Now with EF Core 2.0, we have DbContext pooling ... which is a very good news for performance and all but I have absolutely no idea of how it'll work when every Context created can be totally different from each other.

On the other hand I can create 30~40 different DBContext (DBContextCustomerA, DBContextCustomerB, ...) and having a Factory choosing between everyone of them ...... But it's against every principle I have :(

I guess there should be a proper documentation of what is feasible with EF Core and what's the recommanded way of handling many usecase.

@stumpykilo
Copy link

Did the POC for what @seblucas suggested get created? Specifically the "load a specific connection string based on some information coming from a JWT token" part? Would be interested in how to do this.

@ralmsdeveloper ralmsdeveloper mentioned this issue May 3, 2018
@pilouk
Copy link

pilouk commented Oct 24, 2018

What's up with this ? This feature is important to us , we would use this to group some client together instead of using a 1 bucket database.. Our client need to be grouping
while maintaining their data integrity

@ajcvickers
Copy link
Member

@pilouk This issue is in the Backlog milestone. This means that it is not going to happen for the 3.0 release. We will re-assess the backlog following the 3.0 release and consider this item at that time. However, keep in mind that there are many other high priority features with which it will be competing for resources.

@KLuuKer
Copy link

KLuuKer commented Sep 13, 2019

For us a way to specify\construct a different sql connection string would be enough.
Because of scaling issues we need to split our customers over separate databases AND physical database machines

Preferably WITH db context pooling support of course

@santosh2812
Copy link

Hi,
I am trying to implement Multi-Tenant -per Tenant separate Schema (Database single Instance).
https://gunnarpeipman.com/aspnet-core-database-per-tenant/ Taking help from this tutorial. Now I have one Situation That the Owner of the Application wants to have a dashboard which will show data from all the Tenant - Is this possible to connect with different DBContext and get data, the number of Tenant is not fixed it can increase at run time.
Apart from that -
Is this possible to create database at runtime - when the send metadata is added in DB - That time if we can have a button to initiate the the process.
Need some suggestion- which way should I go.

thanks

@seriouz
Copy link

seriouz commented Apr 2, 2020

@pilouk My orga has this as a selling point: Database per customer

@pilouk
Copy link

pilouk commented Apr 2, 2020

@pilouk My orga has this as a selling point: Database per customer

I understand, there are many benefits to database per customer.
However, cross query on multiple context /database is not really performant.
Maybe it's possible to use the elastic search on top ?

@seriouz
Copy link

seriouz commented Apr 2, 2020

In my case its easy because the customers are not connected and every customer has its own instance of the .net app. I don't know about @santosh2812

@pilouk
Copy link

pilouk commented Apr 2, 2020

In my case its easy because the customers are not connected and every customer has its own instance of the .net app. I don't know about @santosh2812

So you don't need to cross query on all context right?

@seriouz
Copy link

seriouz commented Apr 2, 2020

No we decided against this idea because of the problems named in the issue. But it would be great if we do not have to run the exact same app 80 times.

@mrlife
Copy link

mrlife commented Jul 8, 2020

@ajcvickers Possible to provide an update or workaround for supporting a query that accesses 2 databases?

@ajcvickers
Copy link
Member

@mrlife I'm not aware of any workarounds that would allow a single EF context to access multiple databases in the same query.

@roji
Copy link
Member

roji commented Jul 10, 2020

It's possible to have a single LINQ query that performs a client-side set operation (Concat/Union) or some sort of join over data coming from two different databases (and therefore from two different DbContext instances). Not sure how helpful that is though.

@mrlife
Copy link

mrlife commented Jul 14, 2020

@mrlife I'm not aware of any workarounds that would allow a single EF context to access multiple databases in the same query.

@ajcvickers Thanks for letting me know. For implementing in EF Core source code, it must be more complicated than adding the name of the database onto beginning of a table reference, e.g. [db-name].[schema].[table-name]?

@roji Thanks, that is worth mentioning. The performance would be a consideration, but it could work for some situations.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests